Evaluating Multimodal Interactive Agents

In this paper, we assess the merits of these existing evaluation metrics and present a novel approach to evaluation called the Standardised Test Suite (STS). The STS uses behavioural scenarios mined from real human interaction data.