When you have no dataset, how do you create something reliable enough to evaluate a system in its early stages?
We were blocked on evaluating our multi-agent AI system for a while because we assumed we needed a complete dataset before we could trust any results. What finally unblocked us was starting with something much smaller and more practical. We picked one wo…