/u/PianistWinter8293

Theoretical Feasibility of Reaching AGI through Scaling Compute

There is an open question of whether LLMs can get us to AGI by scaling up current paradigms. I believe we have gone far and are now near the end of scaling compute in the pre-training phase, as acknowledged by Sam Altman. Post-training is now…

This new paper poses a real threat to scaling RL

https://www.arxiv.org/abs/2504.13837 One finding of this paper is that as we scale RL, there will be problems the model gets worse and worse at solving. GRPO and other exact-reward RL methods get stuck in local optima due to their lack of explo…
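To make the failure mode concrete, here is a minimal sketch of the group-relative advantage normalization GRPO uses (names and the binary reward are my own illustration, not from the paper): when every sampled completion in a group gets the same exact-match reward, the advantages all collapse to zero and no gradient signal is left, which is one way progress on hard problems can stall.

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages in the style of GRPO: each sampled
    completion's reward is normalized against the group's mean and
    standard deviation, so only *relative* differences drive updates."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero std
    return [(r - mean) / std for r in rewards]

# With a binary exact-match reward, a group where every sample fails
# (or every sample succeeds) yields zero advantage for every sample.
print(grpo_advantages([0.0, 0.0, 0.0, 0.0]))  # [0.0, 0.0, 0.0, 0.0]
print(grpo_advantages([1.0, 0.0]))            # [1.0, -1.0]
```

On an all-fail group the update is exactly zero, so without some exploration bonus the model never gets feedback on problems it cannot yet solve at all.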

Can’t we solve Hallucinations by introducing a Penalty during Post-training?

Currently, reasoning models like DeepSeek R1 use outcome-based reinforcement learning, meaning the model receives a reward of 1 if its answer is correct and 0 if it is wrong. We could very easily extend this to 1 for correct, 0 if the model says it doesn'…
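Since the post is cut off, here is one way the proposed scheme might look as a sketch. I am assuming the intended shape is: +1 for correct, 0 when the model abstains, and a negative reward for a confident wrong answer, so guessing is no longer free; the function name and penalty value are illustrative.

```python
def outcome_reward(answer, correct_answer, wrong_penalty=-1.0):
    """Sketch of an outcome reward with a hallucination penalty.

    Assumed scheme (original post is truncated): +1 for a correct
    answer, 0 when the model abstains, negative for a wrong answer.
    """
    if answer == "I don't know":   # abstention: neutral reward
        return 0.0
    if answer == correct_answer:   # correct: full reward
        return 1.0
    return wrong_penalty           # confident but wrong: penalized

print(outcome_reward("Paris", "Paris"))         # 1.0
print(outcome_reward("I don't know", "Paris"))  # 0.0
print(outcome_reward("Lyon", "Paris"))          # -1.0
```

Under a 1/0 scheme a wrong guess and an honest "I don't know" earn the same reward, so the model is pushed to guess; the penalty flips that incentive toward calibrated abstention.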

Google’s Coscientist finds what took Researchers a Decade

The article at https://www.techspot.com/news/106874-ai-accelerates-superbug-solution-completing-two-days-what.html highlights a Google AI CoScientist project featuring a multi-agent system that generates original hypotheses without any gradient-based t…

The stochastic parrot was just a phase, we will now see the ‘Lee Sedol moment’ for LLMs

The biggest criticism of LLMs is that they are stochastic parrots, not capable of understanding what they say. With Anthropic's research, it has become increasingly evident that this is not the case and that LLMs have real-world understanding. Howe…

From now to AGI – What will be the key advancements needed?

Please comment on what you believe will be a necessary development to reach AGI. To start, I'll try to frame what we have now so that, if we compare AI to human intelligence, it becomes apparent what is missing and how we might a…

DeepMind Drops AGI Bombshell: Scaling Alone Could Get Us There Before 2030

I've been digging into that Google DeepMind AGI safety paper (https://arxiv.org/html/2504.01849v1). As someone trying to make sense of potential timelines from within the research trenches, their Chapter 3, outlining core development assumptions, c…

How do you deal with uncertainty?

I think life has never been as uncertain as it is now. The ever-increasing pace of change and the prospect of AGI in the coming years make it hard to adapt. Nobody knows exactly how the world will change; as a young person I don't know what to do…

[D] Why Bigger Models Generalize Better

There is still a lingering belief from classical machine learning that bigger models overfit and thus don't generalize well. This is captured by the bias-variance trade-off, but it no longer holds in the modern regime of machine learning. This is empi…
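For reference, the classical decomposition the post is referring to, for squared error at a point $x$ (standard textbook form, not from the post itself):

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The classical story is that growing model capacity trades bias for variance; the modern "double descent" observation is that past the interpolation threshold, test error can fall again as models keep growing.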

The Difference Between Human and AI Reasoning

Older AI models showed some capacity for generalization, but pre-o1 models weren't directly incentivized to reason. This fundamentally differs from humans: our limbic system can choose its reward function and reward us for making correct reasoning …