LLM agents can trigger real actions now. But what actually stops them from executing?

We ran into a simple but important issue while building agents with tool calling:

the model can propose actions,
but nothing actually enforces whether those actions should execute.

That works fine… until the agent controls real side effects:

  • APIs
  • infrastructure
  • payments
  • workflows

Example

Same model, same tool, same input:

#1 provision_gpu -> ALLOW
#2 provision_gpu -> ALLOW
#3 provision_gpu -> DENY

The key detail:

the third call is blocked before execution

  • No retry
  • No partial execution
  • No side effect

The underlying problem

Most setups look like this:

model -> tool -> execution 

Even with:

  • validation
  • retries
  • guardrails

…the model still indirectly controls when execution happens.
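A minimal sketch of that direct-dispatch pattern (the tool registry and names here are illustrative, not from any particular framework):

```python
# Hypothetical direct-dispatch setup: the model's tool call is
# looked up and executed. Validation or retries can wrap this,
# but nothing in the path can veto the call itself.
TOOLS = {"provision_gpu": lambda **kwargs: "provisioned"}

def handle_tool_call(name, args):
    # If the model proposes it, it runs.
    return TOOLS[name](**args)

print(handle_tool_call("provision_gpu", {}))  # provisioned
```

The point isn't that this code is wrong; it's that the decision to execute lives implicitly in the model's output.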

What changed

We tried a different approach:

proposal -> (policy + state) -> ALLOW / DENY -> execution 

Key constraint:

no authorization -> no execution path 

So a denied action doesn’t just “fail”; it never reaches the tool at all.
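A minimal sketch of such a gate, assuming a simple stateful quota policy (the class, names, and limit are illustrative, not the actual policy from the linked repo):

```python
class PolicyGate:
    """Sits between the model's proposal and tool execution.

    Hypothetical policy: a per-tool call quota. In practice this
    could be any check evaluated against shared state.
    """

    def __init__(self, max_calls_per_tool):
        self.max_calls = max_calls_per_tool
        self.counts = {}  # shared state the policy consults

    def authorize(self, tool_name):
        used = self.counts.get(tool_name, 0)
        if used >= self.max_calls:
            return "DENY"
        self.counts[tool_name] = used + 1
        return "ALLOW"


def provision_gpu():
    # Stand-in for a real side-effecting tool.
    return "provisioned"


gate = PolicyGate(max_calls_per_tool=2)
for i in range(1, 4):
    decision = gate.authorize("provision_gpu")
    # A denied proposal never reaches the tool function at all.
    if decision == "ALLOW":
        provision_gpu()
    print(f"#{i} provision_gpu -> {decision}")
# #1 provision_gpu -> ALLOW
# #2 provision_gpu -> ALLOW
# #3 provision_gpu -> DENY
```

The key structural difference: the tool function is only reachable through the gate, so there is no code path from proposal to side effect without an ALLOW.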

Demo

https://github.com/AngeYobo/oxdeai/tree/main/examples/openai-tools

Why this feels important

Once agents move from “thinking” to “acting”,
the risk is no longer the output; it’s the side effect.

And right now, most systems don’t have a clear boundary there.

Question

How are you handling this?

  • Do you gate execution before tool calls?
  • Or rely on retries / monitoring after the fact?