SmartGPT: Major Benchmark Broken – 89.0% on MMLU + Exam’s Many Errors