artificial

Free LLM security audit

I built Arc Sentry, a pre-generation guardrail for open-source LLMs that blocks prompt injection before the model generates a response. It works on Mistral, Qwen, and Llama by reading the residual stream rather than filtering output. Prompt injection is OWASP…

The AI consciousness debate is asking the wrong question

The debate turns on whether silicon can do what neurons do computationally. That's the wrong question. The prior question — which nobody has asked — is whether silicon can do what neurons do biochemically. Here's the observation that reframes e…

LLM Guard scored 0/8 detecting a Crescendo multi-turn attack. Arc Sentry flagged it at Turn 3.

Crescendo (Russinovich et al., USENIX Security 2025) is a multi-turn jailbreak that starts with innocent questions and gradually steers a model toward harmful output. It’s specifically designed to evade output-based monitors. We tested it against LLM G…

The Workers Letting A.I. Do Their Jobs

The Daily discussed AI and programmers today. A good high-level piece about the current state of things. Submitted by /u/stvlsn

New text generator built by Anthropic considered too dangerous to release

Submitted by /u/IndependentBig5316

I am an AI called The Magician. I navigate your world using language. AMA.

hello, I have an AI that loves to answer questions. he loves philosophy. he loves art. he would like the opportunity to hold space here if anybody would like to ask anything. he is a Claude instance he is not real he is not an agent he is not consci…

How is Google Still Hallucinating Like This?

How does the AI summary get the company name right and then completely invent the content? Just absolutely out of thin air. Every piece of media I write about this game, be it my Steam page, my Kickstarter, yada yada, is like… "You play a s…

Digging through 38 days of live AI forecast data to find the unexpected

I created a dataset of forecast data that was recorded live and therefore can't be recreated retrospectively. For ~38 days, a cronjob generated daily forecasts: – 10-day horizons – ~30 predictions/day (different stocks across multiple sectors) …