/u/docybo

We added cryptographic approval to our AI agent… and it was still unsafe

We’ve been working on adding “authorization” to an AI agent system. At first, it felt solved:
– every action gets evaluated
– we get a signed ALLOW / DENY
– we verify the signature before execution
Looks solid, right? It wasn’t. We hit a few problems a…
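The flow above could look roughly like this. A minimal sketch: `sign_verdict`, `verify_and_execute`, and the shared HMAC key are illustrative names, not a real library, and a real system would also bind the verdict to a nonce/expiry.

```python
import hmac, hashlib, json

SECRET = b"shared-policy-key"  # assumed shared between policy engine and executor

def sign_verdict(action: dict, decision: str) -> dict:
    """Policy engine side: emit a signed ALLOW / DENY verdict for one action."""
    payload = json.dumps({"action": action, "decision": decision}, sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return {"action": action, "decision": decision, "sig": sig}

def verify_and_execute(verdict: dict, execute_tool) -> bool:
    """Executor side: verify the signature before execution; DENY means the tool never runs."""
    payload = json.dumps(
        {"action": verdict["action"], "decision": verdict["decision"]}, sort_keys=True
    ).encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, verdict["sig"]):
        raise ValueError("tampered verdict")  # signature check failed
    if verdict["decision"] != "ALLOW":
        return False  # DENY: blocked before execution
    execute_tool(verdict["action"])
    return True
```

Note the verdict only says the action *was* allowed at signing time, which is exactly where the “still unsafe” problems start.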

Built a demo where an agent can provision 2 GPUs, then gets hard-blocked on the 3rd call

Policy:
– budget = 1000
– each `provision_gpu(a100)` call = 500
Result:
– call 1 -> ALLOW
– call 2 -> ALLOW
– call 3 -> DENY (`BUDGET_EXCEEDED`)
Key point: the 3rd tool call is denied before execution. The tool never runs. Also emits:
– …
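The demo’s policy can be sketched in a few lines. `BudgetPolicy` and the stand-in `provision_gpu` wrapper are illustrative; the point is only that the deny happens before the side effect.

```python
class BudgetPolicy:
    """Stateful budget check: ALLOW while cost fits, DENY:BUDGET_EXCEEDED after."""
    def __init__(self, budget: int):
        self.remaining = budget

    def evaluate(self, cost: int) -> str:
        if cost > self.remaining:
            return "DENY:BUDGET_EXCEEDED"
        self.remaining -= cost
        return "ALLOW"

def provision_gpu(policy: BudgetPolicy, gpu: str = "a100", cost: int = 500) -> str:
    decision = policy.evaluate(cost)
    if decision != "ALLOW":
        return decision  # denied before execution; the tool never runs
    # ... actual provisioning side effect would happen here ...
    return "ALLOW"
```

With budget = 1000 and cost = 500 per call, calls 1 and 2 return ALLOW and call 3 returns `DENY:BUDGET_EXCEEDED`, matching the demo.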

This OpenClaw paper shows why agent safety is an execution problem, not just a model problem

Paper: https://arxiv.org/abs/2604.04759
This OpenClaw paper is one of the clearest signals so far that agent risk is architectural, not just model quality. A few results stood out:
– poisoning Capability / Identity / Knowledge pushes attack success fro…

LLM agents can trigger real actions now. But what actually stops them from executing?

We ran into a simple but important issue while building agents with tool calling: the model can propose actions, but nothing actually enforces whether those actions should execute. That works fine… until the agent controls real side effects: APIs, infra…

What actually prevents execution in agent systems?

Ran into this building an agent that could trigger API calls. We had validation, tool constraints, retries… everything looked “safe”. Still ended up executing the same action twice due to stale state + retry. Nothing actually prevented execution. It on…
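The stale-state + retry failure is easy to reproduce, and one common fix is an idempotency key recorded before the side effect, so a retry becomes a no-op. A minimal sketch with illustrative names (`charge_api`, the module-level `executed` set; a real system would persist this, not keep it in memory):

```python
executed: set[str] = set()  # idempotency keys we've already acted on
calls: list[str] = []       # stands in for the real side effect

def charge_api(request_id: str, amount: int) -> str:
    """Execute at most once per request_id, even across retries."""
    if request_id in executed:
        return "DUPLICATE_SUPPRESSED"  # retry arrives: nothing executes
    executed.add(request_id)           # record *before* the side effect
    calls.append(f"charge:{amount}")   # the real side effect
    return "EXECUTED"
```

The key design point: the retry is rejected by the execution layer itself, not by validation upstream, which is what was missing in our setup.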

Where should the execution boundary actually live in Agent systems?

following up on a discussion from earlier

a pattern that keeps showing up in real systems: most control happens after execution
– retries
– state checks
– monitoring
– idempotency patches
but the actual decision to execute is often implicit if the agen…
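One way to make that decision explicit: a single dispatch choke point that every tool call must pass, deny-by-default. A sketch with assumed names (`authorize`, the allowlist contents):

```python
def authorize(tool: str, args: dict) -> bool:
    """Stand-in policy: deny-by-default with a hypothetical allowlist."""
    allowed = {"read_logs", "list_instances"}
    return tool in allowed

def dispatch(tool: str, args: dict, registry: dict):
    """The execution boundary: the only path from a proposed action to a side effect."""
    if not authorize(tool, args):
        raise PermissionError(f"DENY: {tool}")  # blocked before execution
    return registry[tool](**args)
```

The retries / monitoring / idempotency patches listed above then become defense-in-depth behind this boundary instead of the only line of defense.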

AI agents can trigger real-world actions. Why don’t we have execution authorization yet?

While experimenting with autonomous agents recently, I keep running into a pattern that feels oddly familiar from distributed systems history. A lot of current discussion around agent reliability focuses on:
– better prompting
– model alignment
– sandboxed …

Building AI agents taught me that most safety problems happen at the execution layer, not the prompt layer. So I built an authorization boundary

Something I kept running into while experimenting with autonomous agents is that most AI safety discussions focus on the wrong layer. A lot of the conversation today revolves around:
• prompt alignment
• jailbreaks
• output filtering
• sandboxing
Those…

We’re building a deterministic authorization layer for AI agents before they touch tools, APIs, or money

Most discussions about AI agents focus on planning, memory, or tool use. But many failures actually happen one step later: when the agent executes real actions. Typical problems we've seen:
– runaway API usage
– repeated side effects from retries
– recur…
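A deterministic pre-execution gate for the first two failure modes above might look like this. The limits and names (`ExecutionGate`, per-tool call caps) are assumptions for the sketch, not our actual design:

```python
class ExecutionGate:
    """Deterministic checks that run before any tool call: dedup + per-tool cap."""
    def __init__(self, max_calls_per_tool: int = 3):
        self.max_calls = max_calls_per_tool
        self.counts: dict[str, int] = {}  # calls executed per tool
        self.seen: set[str] = set()       # request ids already acted on

    def check(self, tool: str, request_id: str) -> str:
        if request_id in self.seen:
            return "DENY:DUPLICATE"       # retry of an already-executed action
        if self.counts.get(tool, 0) >= self.max_calls:
            return "DENY:RATE_LIMIT"      # runaway usage cut off
        self.seen.add(request_id)
        self.counts[tool] = self.counts.get(tool, 0) + 1
        return "ALLOW"
```

Because the gate is deterministic, the same sequence of requests always produces the same ALLOW / DENY trace, which is what makes it auditable.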