Can AI Replace Developers? Princeton and University of Chicago’s SWE-bench Tests AI on Real Coding Issues
Can AI make software development easier? SWE-bench, an evaluation framework, tests language models' ability to resolve real programming issues collected from GitHub. Interestingly, even top-performing models manage only the simplest problems,…