
Anthropic researcher: "We want Claude n to build Claude n+1, so we can go home and knit sweaters."

submitted by /u/MetaKnowing

When Claude 4 Opus was told it would be replaced, it tried to blackmail Anthropic employees. It also tried to save itself by "emailing pleas to key decisionmakers."

Source is the Claude 4 model card.

Anthropic researchers find if Claude Opus 4 thinks you’re doing something immoral, it might "contact the press, contact regulators, try to lock you out of the system"

More context in the thread: "Initiative: Be careful about telling Opus to ‘be bold’ or ‘take initiative’ when you’ve given it access to real-world-facing tools. It tends a bit in that direction already, and can be easily nudged into really G…

"Anthropic fully expects to hit ASL-3 (AI Safety Level-3) soon, perhaps imminently, and has already begun beefing up its safeguards in anticipation."

From Bloomberg.

EU President: "We thought AI would only approach human reasoning around 2050. Now we expect this to happen already next year."

https://ec.europa.eu/commission/presscorner/detail/en/speech_25_1284

In summer 2023, Ilya Sutskever convened a meeting of core OpenAI employees to tell them "We’re definitely going to build a bunker before we release AGI." The doomsday bunker was to protect OpenAI’s core scientists from chaos and violent upheavals.


OpenAI’s Kevin Weil expects AI agents to quickly progress: "It’s a junior engineer today, senior engineer in 6 months, and architect in a year." Eventually, humans supervise AI engineering managers instead of supervising the AI engineers directly.


Nick Bostrom says progress is so rapid, superintelligence could arrive in just 1-2 years, or less: "it could happen at any time … if somebody at a lab has a key insight, maybe that would be enough … We can’t be confident."
