/u/PianistWinter8293

AGI won’t create new jobs and here is why

If we define AGI as something that performs as well as humans on all currently economically valuable tasks, then it could theoretically be true that new tasks will be created that the AGI is not good at, which humans could then claim as their new niche. In t…

A nuanced take on current progress

We've been hearing that AI might be in a bubble, that we might be hitting some wall. This all might be true, and yet a large proportion of people insist we are actually moving towards AGI rather quickly. These two diverging views…

My Take on Ilya’s Interview: A path forward for RL

A while back I posted about a fundamental problem facing the current paradigm, and it got some negative backlash. In light of Ilya's latest interview, I think things have become clearer. The way RL is currently done is not enough to reach AGI….

Trying to create a community of people interested in AI and cognition and the societal aspects of it

Posting it here since I believe other communities far too often have people with too narrow a lens. They either focus too much on the engineering / math (data scientists), too much on the empirical (psychologists), or too much on the practical (politic…

Theoretical Feasibility of reaching AGI through scaling Compute

There is the open question of whether or not LLMs can get us to AGI by scaling up current paradigms. I believe we have gone far and are now near the end of scaling compute in the pre-training phase, as acknowledged by Sam Altman. The post-training is now…

This new paper poses a real threat to scaling RL

https://www.arxiv.org/abs/2504.13837 One finding of this paper is that as we scale RL, there will be problems that the model gets worse and worse at solving. GRPO and other exact-reward RL methods get stuck on local optima due to their lack of explo…
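To make the exploration concern concrete, here is a minimal, hypothetical sketch (my illustration, not from the paper) of the group-relative advantage GRPO uses with a binary exact-match reward. When the model never samples a correct answer for a hard prompt, every advantage in the group is zero, so the update can only sharpen behaviour the model already produces:

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantage: each completion's reward is normalized by the
    mean and standard deviation of its own sampling group."""
    rewards = np.asarray(rewards, dtype=float)
    mean, std = rewards.mean(), rewards.std()
    if std == 0.0:
        # Every completion got the same binary reward, so all advantages are 0
        # and this prompt contributes no gradient signal at all.
        return np.zeros_like(rewards)
    return (rewards - mean) / std

# Exact-match reward: 1 if the sampled answer is correct, 0 otherwise.
hard_prompt = [0, 0, 0, 0]   # the model never explores into a correct answer
easy_prompt = [1, 1, 0, 1]   # the model already solves this most of the time

print(group_relative_advantages(hard_prompt))  # [0. 0. 0. 0.] -> no learning signal
print(group_relative_advantages(easy_prompt))  # non-zero -> reinforces what it already does
```

Under that (assumed) setup, problems the base model never solves by chance stay unsolved, which is one way the lack of exploration can show up as getting stuck on a local optimum.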

Can’t we solve Hallucinations by introducing a Penalty during Post-training?

Currently, reasoning models like Deepseek R1 use outcome-based reinforcement learning, which means the model is rewarded 1 if its answer is correct and 0 if it's wrong. We could very easily extend this to 1 for correct, 0 if the model says it doesn'…
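The excerpt cuts off, but assuming the proposal continues in the obvious way (reward 0 when the model admits it doesn't know, and a negative reward for a confident wrong answer), a minimal sketch of such an outcome-based reward could look like this; the abstention check and the exact penalty value are my assumptions, not from the post:

```python
def outcome_reward(answer: str, reference: str, wrong_penalty: float = -1.0) -> float:
    """Hypothetical outcome-based reward with an abstention option.

    +1            : answer matches the reference
     0            : model abstains ("I don't know")
    wrong_penalty : confident but wrong answer (-1 here is an assumed value)
    """
    if answer.strip().lower() in {"i don't know", "i do not know"}:
        return 0.0  # no reward, but no penalty, for admitting uncertainty
    return 1.0 if answer.strip() == reference.strip() else wrong_penalty
```

With this shaping, guessing on questions the model is unsure about has negative expected reward, so abstaining becomes the reward-maximizing behaviour rather than hallucinating an answer.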

Google’s Coscientist finds what took Researchers a Decade

The article at https://www.techspot.com/news/106874-ai-accelerates-superbug-solution-completing-two-days-what.html highlights a Google AI CoScientist project featuring a multi-agent system that generates original hypotheses without any gradient-based t…

The stochastic parrot was just a phase; we will now see the ‘Lee Sedol moment’ for LLMs

The biggest criticism of LLMs is that they are stochastic parrots, not capable of understanding what they say. With Anthropic's research, it has become increasingly evident that this is not the case and that LLMs have real-world understanding. Howe…

From now to AGI – What will be the key advancements needed?

Please comment on what you believe will be a necessary development to reach AGI. To start, I'll try to frame what we have now, compared to human intelligence, in a way that makes it apparent what is missing and how we might a…