<span class="vcard">/u/MetaKnowing</span>
/u/MetaKnowing

Anthropic’s Dario Amodei on the urgency of solving the black box problem: "They will be capable of so much autonomy that it is unacceptable for humanity to be totally ignorant of how they work."

https://www.darioamodei.com/post/the-urgency-of-interpretability submitted by /u/MetaKnowing [link] [comments]

Anthropic is considering giving models the ability to quit talking to an annoying or abusive user if they find the user’s requests too distressing

https://www.nytimes.com/2025/04/24/technology/ai-welfare-anthropic-claude.html submitted by /u/MetaKnowing [link] [comments]

What keeps Demis Hassabis up at night? As we approach "the final steps toward AGI," it’s the lack of international coordination on safety standards that haunts him. "It’s coming, and I’m not sure society’s ready."

submitted by /u/MetaKnowing [link] [comments]