Challenges in Detoxifying Language Models

September 15, 2021

In our paper, we focus on language models (LMs) and their propensity to generate toxic language. We study how effectively different methods mitigate LM toxicity and what side effects they have, and we investigate the reliability and limits of classifier-based automatic toxicity evaluation.