/u/Important-Front429

Evals, benchmarking, and more

This is more of a general question for the entire community (developers, end users, curious individuals). How do you view evals and benchmarking? Are they really relevant to your decision to use a particular AI model? Are AI model releases (such as Llam…