| Many have speculations on how the Anthropic saga with Fable will end. Prediction markets cover it too, giving a <50% chance of a re-release by July 1. This post isn't about my conclusion. Instead I want to share how AI can be used for world-modeling such situations, and gesture towards what the world will look like with autonomous AI systems get better at this than humans. I see three challenges with modeling the Anthropic situation:
Claude generated the image here of the causal graph that models this all out, starting with (a) Scenarios for what happened so far, (b) Moves each side can make, and (c) Outcomes. (I did this mostly by hand, my choice of key scenarios and outcomes, but in the future it shouldn't be too hard for an LLM-agent system to do this part.) I ended up with a large combination of unconditional and conditional forecasting questions, in total 33 I consider critical, to get an answer. Then I had to forecast. LLM agents can shine here as AI forecasters are about as good as human crowds now (e.g. see ForecastBench). And anyway 33 forecasts at the quality of crowds of humans would take 100+ hours, so it's not an option for a fast-moving situation. I used FutureSearch for all of these. The forecasts have reasoning like:
These are already very good forecasts, and will only get better. The final step was to reconcile everything. All the research done in all the forecasts were done independently by LLM agents, and were not consistent with each other. I did this by raising all the inconsistencies in Claude Code and addressing them manually, but again you can imagine a world-model-reconciliation module that uses a new set of LLM agents that fix up all the inconsistencies. More detail on the process, and all the results, are in https://www.lesswrong.com/posts/zhRe3tdBpsZbGCdDK/world-modeling-the-us-vs-anthropic-standoff-on-claude-fable [link] [comments] |