Zero-data training approach still produces manipulative behavior inside the model
Not sure if this has been posted before, and the paper is on the heavily technical side, so here is a 20-minute video rundown: https://youtu.be/X37tgx0ngQE
Paper itself: https://arxiv.org/abs/2505.03335
TL;DR: The paper introduces Absolute Zero Reasoner …