<span class="vcard">/u/jonathancheckwise</span>

Single-model AI image detection failed in production. Here’s what 6 models in ensemble actually look like

About a year ago I was running a single open-source AI image detector in production for a fact-checking pipeline. The accuracy on paper was solid; the accuracy on real submitted images was not. The same image was classified differently across reruns when I…

I run an AI-based fact-checking platform and I refuse to let the LLM produce the verdict. Here’s why.

After a year building a production fact-checking system, the single most counter-intuitive design decision I keep defending is this: the LLM in our pipeline never produces a numeric score, never produces a true/false verdict, never produces anything th…