Is It Scaling or is it Learning that will Unlock AGI? Did Jensen Huang hint at when AGI will become possible? What is Scaling actually good for?

I've made the argument for a while now that LLMs are static and that this is a fundamental problem in the quest for AGI. Anyone who doubts it or thinks it's no big deal should really watch the excellent podcast episode in which Dwarkesh Patel interviews Francois Chollet.

Most of the conversation was about the ARC challenge, and specifically why today's LLMs aren't capable of doing well on the test. What a child would handle easily, a multi-million-dollar trained LLM cannot. The premise of the argument is that LLMs aren't very good at dealing with things that are new and unlikely to have been in their training set.
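To make that concrete, here is a toy task in the spirit of ARC: a few input/output grid pairs demonstrate a rule, and the solver has to apply that rule to a new grid. This particular task (mirror each row) is made up for illustration and is not an actual ARC task, though real ARC tasks use similar small grids of digits:

```python
# Toy ARC-style task: demonstration pairs plus one test input.
# The hidden rule here is "flip each row left-to-right" (invented for illustration).
toy_task = {
    "train": [
        {"input": [[1, 0, 0],
                   [0, 2, 0]],
         "output": [[0, 0, 1],
                    [0, 2, 0]]},
        {"input": [[3, 3, 0],
                   [0, 0, 4]],
         "output": [[0, 3, 3],
                    [4, 0, 0]]},
    ],
    "test": [
        {"input": [[5, 0, 6],
                   [0, 7, 0]]}  # expected output: [[6, 0, 5], [0, 7, 0]]
    ],
}

# A child infers "flip each row" from two examples; an LLM that never saw this
# exact rule in its training data often cannot.
def apply_rule(grid):
    return [list(reversed(row)) for row in grid]

assert all(apply_rule(p["input"]) == p["output"] for p in toy_task["train"])
print(apply_rule(toy_task["test"][0]["input"]))
```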

The specific part of the interview of interest here is at this timestamp:

https://youtu.be/UakqL6Pj9xo?si=zFNHMTnPLCILe7KG&t=819

Now, the key point here is that Jack Cole was able to score 35% on the test with only a 230 million parameter model by using what Francois calls "active inference" or "active/dynamic fine-tuning". Meaning, the notion that a model can update its knowledge on the fly is a very valuable attribute for an intelligent agent: never having seen something before, yet being able to adapt and react to it, study it, learn it, and retain that knowledge for future use.
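As a rough sketch of what active/dynamic fine-tuning can look like in practice, the snippet below takes a few gradient steps on the demonstration pairs of a brand-new task before predicting on its test input. The model name, prompt format, and hyperparameters are assumptions for illustration only; this is not the actual setup Jack Cole used:

```python
# Minimal sketch of test-time ("active/dynamic") fine-tuning:
# before answering, the model briefly trains on the demonstration pairs of the
# new task it has just been shown, then predicts on the test input.
# Assumptions: "gpt2" as a stand-in small causal LM, a simple Input/Output
# prompt format, and arbitrary hyperparameters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for any small (hundreds of millions of params) model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def adapt_on_task(demonstrations, steps=8):
    """Take a few gradient steps on the demonstration pairs of an unseen task."""
    model.train()
    for _ in range(steps):
        for inp, out in demonstrations:
            text = f"Input: {inp}\nOutput: {out}"
            batch = tokenizer(text, return_tensors="pt")
            loss = model(**batch, labels=batch["input_ids"]).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

def solve(test_input, max_new_tokens=32):
    """Predict the output for the test input after adaptation."""
    model.eval()
    prompt = f"Input: {test_input}\nOutput:"
    batch = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        ids = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.decode(ids[0], skip_special_tokens=True)

# Usage: adapt on the few examples the new task provides, then answer.
adapt_on_task([("1 2 3", "3 2 1"), ("4 5 6", "6 5 4")])
print(solve("7 8 9"))
```

The point of the sketch is the ordering: the weight update happens at inference time, driven by the handful of examples the task itself supplies, rather than being frozen after a one-time pre-training run.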

Another case in point, closely related to this topic, is the interview Jensen Huang gave months earlier at the 2024 SIEPR Economic Summit at Stanford University. Another excellent video to watch. In it, Jensen makes this statement: https://youtu.be/cEg8cOx7UZk?si=Wvdkm5V-79uqAIzI&t=981

"... the interactions, that it's just continuously improving itself. The learning process and the training process and the inference process, the training process and the deployment process, the application process will just become one. Well, that's exactly what we do, you know, we don't have, like, between ..."

He's clearly speaking directly to Francois's point. In the future, say 10 years out, we will be able to accomplish the exact thing that Jack is doing today, albeit with a very tiny model.

To me this is clear as day, but nobody is really discussing it. What is scaling actually good for? To me, the value and the path to AGI lie in the learning mechanism. Scaling is just the G in AGI.

Somewhere along the line, someone wrote down a rule, a law really, stating that in order to have ASI you must have something that is general purpose, and thus we must all build AGI.

This dogma, I believe, is the fundamental reason we keep pushing scaling as the beacon of hope that ASI [AGI] will come.

submitted by /u/Xtianus21