GPT-5 can’t play a game of Zendo
I have a fairly good test for LLMs: playing a simple game of Zendo using digits. At first things went quite well, but later GPT-5 struggled to make simple observations accurately. To be fair, it later offered to generate a Python script for rig…