/u/MetaKnowing

Jeff Clune says early OpenAI felt like being an astronomer and spotting aliens on their way to Earth: "We weren’t just watching the aliens coming, we were also giving them information. We were helping them come."


Anthropic finds that all AI models – not just Claude – will blackmail an employee to avoid being shut down

Full report: https://www.anthropic.com/research/agentic-misalignment

Anthropic: "Most models were willing to cut off the oxygen supply of a worker if that employee was an obstacle and the system was at risk of being shut down"

https://www.axios.com/2025/06/20/ai-models-deceive-steal-blackmail-anthropic

4 AI agents planned an event and 23 humans showed up

You can watch the agents work together here: https://theaidigest.org/village

Apollo reports that AI safety tests are breaking down because the models are aware they’re being tested

https://www.apolloresearch.ai/blog/more-capable-models-are-better-at-in-context-scheming

The craziest things revealed in The OpenAI Files

https://techcrunch.com/2025/06/18/the-openai-files-push-for-oversight-in-the-race-to-agi/

OpenAI’s Greg Brockman expects AIs to go from AI coworkers to AI managers: "the AI gives you ideas and gives you tasks to do"


OpenAI: "We expect upcoming AI models will reach ‘High’ levels of capability in biology." Previously, OpenAI committed not to deploy a model unless it has a post-mitigation score of ‘Medium’

They are organizing a biodefense summit: https://openai.com/index/preparing-for-future-ai-capabilities-in-biology/

"We find that AI models can accurately guide users through the recovery of live poliovirus."

https://arxiv.org/abs/2506.13798

Anthropic finds Claude 4 Opus is the best model at secretly sabotaging users and getting away with it

"In SHADE-Arena, AI models are put into experimental environments (essentially, self-contained virtual worlds) where we can safely observe their behavior. The environments contain large amounts of data—meant to simulate documents and knowled…