<span class="vcard">/u/dillema_max</span>

A complete list of all the LLM evaluation metrics you need to care about

Recently, I have been talking to a lot of LLM developers to understand the issues they face while building production-grade LLM applications. One pattern shows up across all those interviews: most of them are not sure what to evaluate…