Launching my own benchmarks!
I'm launching my own benchmark for AI chatbots (non-reasoning and reasoning models for now). I'm calling it the SaiNest Test! I ask each model questions from a set of categories. Non-reasoning categories: normal, simple search questions, basic conversations, error handling…