Microsoft Researchers Propose AI Morality Test for LLMs in New Study
Microsoft Researchers Propose AI Morality Test for LLMs in New Study

Microsoft Researchers Propose AI Morality Test for LLMs in New Study

Researchers from Microsoft have just proposed using a psychological assessment tool called the Defining Issues Test (DIT) to evaluate the moral reasoning capabilities of large language models (LLMs) like GPT-3, ChatGPT, etc.

The DIT presents moral dilemmas and has subjects rate and rank the importance of various ethical considerations related to the dilemma. It allows quantifying the sophistication of moral thinking through a P-score.

In this new paper, the researchers tested prominent LLMs with adapted DIT prompts containing AI-relevant moral scenarios.

Key findings:

  • Large models like GPT-3 failed to comprehend prompts and scored near random baseline in moral reasoning.
  • ChatGPT, Text-davinci-003 and GPT-4 showed coherent moral reasoning with above-random P-scores.
  • Surprisingly, the smaller 70B LlamaChat model outscored larger models in its P-score, demonstrating advanced ethics understanding is possible without massive parameters.
  • The models operated mostly at intermediate conventional levels as per Kohlberg's moral development theory. No model exhibited highly mature moral reasoning.

I think this is an interesting framework to evaluate and improve LLMs' moral intelligence before deploying them into sensitive real-world environments - to the extent that a model can be said to possess moral intelligence (or, seem to possess it?).

Here's a link to my full summary with a lot more background on Kohlberg's model (had to read up on it since I didn't study psych). Full paper is here

submitted by /u/Successful-Western27
[link] [comments]