| I've spent 18 months building agent infrastructure and watched a lot of impressive demos. Here's the uncomfortable pattern: the demo works beautifully, the founder posts it, everyone claps and then it touches real users and quietly dies. Not because GPT-5 / Claude / whatever isn't smart enough. The model is almost never the problem anymore. It dies for three boring reasons nobody wants to talk about because they're not sexy: 1. AMNESIA. Your agent forgets everything the moment the process restarts. Crash, redeploy, pod cycle gone. So everyone hacks together a pickle file or a Postgres table, and it works until they have more than one agent and the memory needs to be shared. Then it's a mess. 2. SUICIDE BY LOOP. An agent has no idea it's in a loop. It will call the same tool with the same args 400 times and cheerfully burn $200 of tokens overnight, because it has no metacognition. It literally cannot detect its own failure. The defense has to live OUTSIDE the agent and almost nobody builds that. 3. NO BLACK BOX. The agent does something weird in front of a customer. They ask "why did it do that?" and you stare at logs that show inputs and outputs but no chain of reasoning. You have no answer. Trust evaporates. The whole industry is obsessed with the brain (the model and ignoring the nervous) system (memory, the immune system (loop detection), and the flight recorder (audit).) The unsexy truth: the next wave of agent winners won't have better prompts. They'll have better infrastructure. The model is commoditising. The reliability layer is where the actual moat is. I got annoyed enough about this that I built the layer myself persistent memory, automatic loop detection, and a tamper-evident audit trail, framework-agnostic (LangChain/CrewAI/AutoGen/OpenAI/MCP. It's at) octopodas.com if you want to tear it apart genuinely want feedback from people who've shipped agents and hit this wall. But honestly even if you never touch my thing: stop optimising the prompt and start thinking about what happens when your agent restarts, loops, or gets asked "why." [link] [comments] |