AI researchers put LLMs into a Minecraft server and said Claude Opus was a harmless goofball, but Sonnet was terrifying – "the closest thing I’ve seen to Bostrom-style catastrophic AI misalignment ‘irl’."

Submitted by /u/MetaKnowing