I cut Claude Code’s token usage by 68.5% by giving agents their own OS

Al agents are running on infrastructure built for humans. Every state check runs 9 shell commands.

Every cold start re-discovers context from scratch.

It's wasteful by design.

An agentic JSON-native OS fixes it. Benchmarks across 5 real scenarios:

Semantic search vs grep + cat: 91% fewer tokens

Agent pickup vs cold log parsing: 83% fewer tokens

State polling vs shell commands: 57% fewer tokens

Overall: 68.5% reduction

Benchmark is fully reproducible: python3 tools/ bench_compare.py

Plugs into Claude Code via MCP, runs local inference through Ollama, MIT licensed.

Would love feedback from people actually running agentic workflows.