The length of tasks that generalist frontier model agents can complete autonomously with 50% reliability has been doubling approximately every 7 months.
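
To make the doubling claim concrete, here is a minimal sketch of what a constant 7-month doubling time implies if it is simply extrapolated as an exponential. The function names, the starting task length of 60 minutes, and the assumption that the trend continues unchanged are all illustrative and not taken from the source; only the 7-month doubling period is.

```python
import math

# Doubling period stated in the text (approximate).
DOUBLING_MONTHS = 7.0


def horizon_after(start_length: float, months: float,
                  doubling_months: float = DOUBLING_MONTHS) -> float:
    """Task length at 50% reliability after `months`, assuming pure
    exponential growth: start_length * 2 ** (months / doubling_months)."""
    return start_length * 2 ** (months / doubling_months)


def months_to_reach(start_length: float, target_length: float,
                    doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months needed for the task length to grow from start_length to
    target_length under the same assumption."""
    return doubling_months * math.log2(target_length / start_length)


if __name__ == "__main__":
    start = 60.0  # hypothetical starting task length, in minutes
    for m in (0, 7, 14, 21):
        print(f"after {m:2d} months: {horizon_after(start, m):.0f} minutes")
    print(f"months to reach 8x: {months_to_reach(start, 8 * start):.0f}")
```

Under these assumptions, two doubling periods (14 months) quadruple the task length and three (21 months) multiply it by eight. The sketch says nothing about whether the trend will hold; it only restates the arithmetic of a fixed doubling time.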