/u/amareshadak

Claude Sonnet 4.5 Hits 77.2% on SWE-bench + Microsoft Agent Framework: AI Coding Agents Are Getting Seriously Competent

/u/amareshadak October 13, 2025

The AI landscape just shifted dramatically. Three major releases dropped that could fundamentally change how developers work: Claude Sonnet 4.5 achieved 77.2% on SWE-bench Verified (vs. 48.1% for Sonnet 3.5). We're talking about real-world debuggin…
