/u/simulated-souls

Google’s Aletheia AI Agent Autonomously Solves 6/10 Novel FirstProof Math Problems

Abstract: We report the performance of Aletheia (Feng et al., 2026b), a mathematics research agent powered by Gemini 3 Deep Think, on the inaugural FirstProof challenge. Within the allowed timeframe of the challenge, Aletheia autonomously solved 6 pro…

Mira Murati’s Thinking Machines seeks $50 billion valuation in funding talks

The startup was last valued at $12 billion in July, after it raised about $2 billion. It launched* its first product, Tinker, which helps fine-tune language models, in October. *There is currently a waitlist to gain access.

Models Will Continue to Improve, Even If AI Research Hits a Complete Wall

TLDR: Better data will lead to better models, even if nothing else changes. Suppose that, starting now: compute scaling stops improving models; better architectures stop improving models; training and inference algorithms stop improving models; RL (outsid…

Language Models Don’t Just Model Surface Level Statistics, They Form Emergent World Representations

A lot of people in this sub and elsewhere on Reddit seem to assume that LLMs and other ML models are only learning surface-level statistical correlations. An example of this thinking is that the term "Los Angeles" is often associated with the…

Inside the Secret Meeting Where Mathematicians Struggled to Outsmart AI (Scientific American)

30 renowned mathematicians spent 2 days in Berkeley, California, trying to come up with problems that OpenAI's o4-mini reasoning model could not solve… they only found 10. Excerpt: By the end of that Saturday night, Ono was frustrated with …