/u/sirjoaco

Distribution of favorite movies among 100 AI models

I ran an analysis by prompting 100 language models for their favorite movie, had them answer with only the movie title, and persuaded them to give an answer (a lot of them wanted to say they don't have a preference). Got very vanilla results, a…

I created a website (rival.tips) to view how the new models compare in one-shot challenges

https://reddit.com/link/1j12vc6/video/5qrwwq0tq3me1/player The last few weeks were a bit crazy with the whole new generation of models; this makes it a bit easier to compare the models against each other. I was particularly surprised at how badly R1 performed to my likin…