Which LLM is king right now? I ran a creative stress-test on GPT-5, Claude Opus 4.1, o3-pro, Grok 4, and Gemini 2.5 Pro
With GPT-5 and Claude Opus 4.1 launching recently, the obvious question is: which of the strongest LLMs is actually the best right now? I put five top models (GPT-5, Claude Opus 4.1, o3-pro, Grok 4, Gemini 2.5 Pro) through the same ultimate stress-tes…