<span class="vcard">/u/kanzenryu</span>

GPT-5 can’t play a game of Zendo

I have a fairly good test for LLMs: playing a simple game of Zendo using digits. At first things went quite well, but later GPT-5 struggled to make even simple observations accurately. To be fair, it later offered to generate a Python script for rig…