/u/kamilc86

How much published AI research is wrong because of data leakage?

/u/kamilc86 June 1, 2026

A Princeton paper by Kapoor and Narayanan found data leakage in close to 300 papers across 17 fields, including medicine and economics. Leakage means the model was trained on information it would never have when it makes a real predicti…

artificial

How do you do OOD detection on a closed LLM API with no latent access?

/u/kamilc86 May 20, 2026

Classical OOD detection assumes you can see inside the model. Mahalanobis distance on features and energy scores on logits are typical, and both require cracking the model open. With closed LLM APIs you get text in, text out, and maybe top-k logprobs per token if you are lu…

artificial

Anthropic’s new interpretability tool found Claude suspects it is being tested in 26% of benchmarks and never says so

/u/kamilc86 May 13, 2026

Anthropic published Natural Language Autoencoders last week, a tool that translates Claude's internal activations into human-readable text. The key finding: during safety evaluations on SWE-bench Verified, Claude formed the belief that it was being…
