<span class="vcard">/u/Disastrous_Room_927</span>

Anyone else following the drama behind the TurboQuant paper?

A few hours ago, the first author of a paper that played a significant role in the TQ paper posted about some ongoing issues: “In May 2025, our emails directly raised the theoretical and empirical issues; Majid wrote that he had informed his co-authors…”

Construct Validity in Large Language Model Benchmarks

If you’re unfamiliar with the term, “construct validity” is a psychometric term for whether a test measures the theoretical concept it’s intended to measure: “We reviewed 445 LLM benchmarks from the proceedings of top AI conferences. We found many measurement challenges, …”