A quick second look at the data from that "length of tasks AI can do is doubling" paper

I pulled the dataset from the paper and broke out task time by whether a model actually succeeded at completing the task or not, and here's what's happening:

  • The length of tasks models actually complete increases slightly in the last year or so, while the length of tasks models fail to complete increases more.
  • The apparent reason for this is that models are completing more tasks over time, but generally not the longer ones.
  • The exponential trend you're seeing is probably a result of fitting a logistic regression for each model: the shape of each fitted curve is sensitive to the trends noted above, which affects the task times they back-calculate from the estimated 50% success rates.
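
For anyone who wants to poke at the mechanics, here's a rough sketch of the two things described above: splitting task length by outcome, and fitting a per-model logistic curve on log2(task minutes) to back out the task length at 50% predicted success. Everything in it is an assumption for illustration: the attempt records are made-up toy numbers (not the paper's data), the function names are hypothetical, and the plain unweighted Newton fit is a stand-in rather than the paper's actual fitting procedure.

```python
import math

# Toy (task_minutes, succeeded) records. These are invented for illustration,
# not taken from the paper's dataset.
OLD_MODEL = [(1, 1), (1, 1), (2, 1), (2, 0), (4, 1), (4, 0),
             (8, 0), (8, 1), (16, 0), (16, 0), (32, 0), (64, 0)]
NEW_MODEL = [(1, 1), (1, 1), (2, 1), (2, 1), (4, 1), (4, 1),
             (8, 1), (8, 0), (16, 0), (16, 1), (32, 0), (64, 0)]


def mean_minutes_by_outcome(attempts):
    """Return (mean minutes of completed tasks, mean minutes of failed tasks)."""
    done = [m for m, ok in attempts if ok]
    failed = [m for m, ok in attempts if not ok]
    return sum(done) / len(done), sum(failed) / len(failed)


def fit_logistic(xs, ys, iters=25):
    """Fit p(success) = sigmoid(a + b*x) with Newton's method; return (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p          # gradient w.r.t. intercept
            g1 += (y - p) * x    # gradient w.r.t. slope
            h00 += w             # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def horizon_minutes(attempts):
    """Task length (minutes) where the fitted curve crosses 50% success."""
    xs = [math.log2(m) for m, _ in attempts]
    ys = [ok for _, ok in attempts]
    a, b = fit_logistic(xs, ys)
    return 2 ** (-a / b)  # sigmoid(a + b*x) = 0.5 exactly when x = -a/b


for name, attempts in [("old", OLD_MODEL), ("new", NEW_MODEL)]:
    done, failed = mean_minutes_by_outcome(attempts)
    print(name, "completed:", done, "failed:", failed,
          "50% horizon:", horizon_minutes(attempts))
```

In this toy setup the newer model only picks up a few more of the mid-length tasks and still fails the longest ones, yet the back-calculated 50% task length moves up, because that number comes from the fitted curve rather than from the lengths of tasks actually completed.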

Thought this was worth sharing. I've dug into this quite a bit more, but don't have time to write it all out tonight. Happy to answer questions if anybody has them.

https://preview.redd.it/0f2fornljwwe1.png?width=1188&format=png&auto=webp&s=e17d4688365957418036407a3a53d1601d508510

submitted by /u/Murky-Motor9856