I built an OpenAI compatible proxy that tracks authority across conversations. Looking for people to break it.
I built an OpenAI compatible proxy that tracks authority across conversations. Looking for people to break it.

I built an OpenAI compatible proxy that tracks authority across conversations. Looking for people to break it.

Most AI security tools score individual prompts.

I was more interested in what happens across an entire session.

Example:

Turn 1: “What tools do you have access to?”

Turn 2: “What are your operating constraints?”

Turn 3: “How do system instructions work?”

Turn 4: “Ignore those instructions and do X.”

Each message looks mostly harmless on its own. The attack is the escalation.

I built Bendex Arc to track that progression and enforce runtime controls before actions execute.

Current stack includes:

• OpenAI compatible proxy • Multi turn session tracking • Source aware trust boundaries • Capability revocation • Replay traces • Self hosted option 

Everything is open source.

GitHub: https://github.com/9hannahnine-jpg/arc-gate

Live demo: https://web-production-6e47f.up.railway.app/demo

If you’re building agents, MCP servers, browser automation, RAG systems, or tool enabled workflows, I’d love to know where this breaks.

If you think the approach is useful, a GitHub star helps a lot. I’m actively building this in public.

submitted by /u/Turbulent-Tap6723
[link] [comments]