o3 beats 99.8% of competitive coders
So apparently a 2727 Elo rating on Codeforces corresponds to the 99.8th percentile. Source: https://codeforces.com/blog/entry/126802 submitted by /u/user0069420
Note: o1 was evaluated manually using ChatGPT. So far, it has only been scored on coding tasks. https://livebench.ai/#/ submitted by /u/user0069420
The new Alibaba QwQ 32B is exceptional for its size and is pretty much SOTA in terms of benchmarks. We had DeepSeek R1 Lite a few days ago, which should be 15B parameters if it's like the last DeepSeek Lite. It got me thinking what would happen if w…
I think "hallucination" in LLMs is what we call the output when we don't like it, and "creativity" is what we call it when we do, since the models really "think" their response is correct based on their training data and the context provided. Wha…