Launching my own benchmarks!

I'm launching my own benchmarks for AI chatbots (non-reasoning and reasoning models for now). I'm calling it the SaiNest Test!

I ask each model questions from a set of categories. Non-reasoning categories: normal, simple search questions, basic conversations, error handling, and programming. Reasoning categories: logical questions, pattern recognition, and programming.

Then I rate each answer out of 5, trying to be as unbiased as I can. I total the ratings for a category and convert that total to a score out of 10. That is the model's score in that category.
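Here's a rough sketch of how that scoring could work in Python. The post doesn't spell out the exact conversion, so this assumes the total is scaled against the maximum possible total (5 points per question) and then mapped onto 10. The function name `category_score` and the sample ratings are made up for illustration.

```python
def category_score(ratings, max_rating=5):
    """Turn per-answer ratings (each 0..max_rating) into a score out of 10.

    Assumption: the score is the rating total divided by the maximum
    possible total, scaled to 10 and rounded to one decimal place.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 0 or r > max_rating for r in ratings):
        raise ValueError(f"ratings must be between 0 and {max_rating}")

    total = sum(ratings)                      # points the model earned
    possible = max_rating * len(ratings)      # points it could have earned
    return round(10 * total / possible, 1)


# Hypothetical ratings for four questions in one category.
example_ratings = [4, 5, 3, 4]
print(category_score(example_ratings))  # 16 of 20 possible points -> 8.0
```

Scaling by the number of questions keeps categories comparable even when they don't have the same number of questions.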

Then I post the results on X (@SaiNemani1) and sometimes here!

submitted by /u/SaiCraze