Older AI models showed some capacity for generalization, but pre-O1 models weren't directly incentivized to reason. This is a fundamental difference from humans: our limbic system can, in effect, choose its own reward function and reward us for making correct reasoning steps. The key distinction is that older models only received RLHF rewards based on outcomes, not on the reasoning process itself.
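To make that outcome-vs-process distinction concrete, here's a minimal Python sketch (not anyone's actual training code): `outcome_reward` scores only the final answer, the way classic RLHF does, while `process_reward` scores each intermediate step via a hypothetical `judge_step` evaluator.

```python
from typing import Callable, List

def outcome_reward(final_answer: str, reference: str) -> float:
    """Outcome-only reward: score nothing but the final answer (classic RLHF-style signal)."""
    return 1.0 if final_answer.strip() == reference.strip() else 0.0

def process_reward(steps: List[str], judge_step: Callable[[str], float]) -> float:
    """Process reward: score every intermediate reasoning step and average the results.

    `judge_step` is a hypothetical evaluator (e.g. a learned process reward model)
    returning a score in [0, 1] for a single step.
    """
    if not steps:
        return 0.0
    return sum(judge_step(s) for s in steps) / len(steps)

# Toy usage: two reasoning steps followed by a final answer.
steps = ["2 + 2 = 4", "4 * 3 = 12"]
answer = "12"

print(outcome_reward(answer, "12"))                               # 1.0: only the outcome is judged
print(process_reward(steps, lambda s: 1.0 if "=" in s else 0.0))  # 1.0: each step is judged
```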
The current gap between humans and O1 models centers on flexibility: AI can't choose its own reward function. This limitation impacts higher-level capabilities like creativity and autonomous goal-setting (e.g. maximizing profit). We're essentially turning these models into reasoning engines.
However, there are notable similarities between humans and AI:
- Both use "System 1" thinking: we generate sequences of pattern-matched data. In humans we call this imagination; in models we call it output. Imagination is essentially predicted output that isn't physically present, which is exactly what models produce (this relates to the Thousand Brains theory of cortical columns).
- Both can potentially train on generated data. Models can use their own outputs for further training (though this might require an evaluator function; see the sketch after this list). Humans might do something similar during sleep.
- Both can improve System 1 thinking through evaluation. With an evaluator function, models can increase their generation performance to match their evaluation capabilities. This makes sense because it's typically easier to validate an answer than to generate a good one initially. Humans can do this too.
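Here's the sketch referenced above: a toy self-improvement loop in Python. `generate`, `evaluate`, and `finetune` are all hypothetical stand-ins (the stubs at the bottom exist only so the example runs). The point is the shape of the loop: sample candidates, keep the ones the evaluator likes, train on those, so generation gets pulled toward what the evaluator can already recognize as good.

```python
import random
from typing import Callable, List, Tuple

def self_improvement_round(
    generate: Callable[[str, int], List[str]],          # sample n candidate outputs for a prompt
    evaluate: Callable[[str, str], float],              # score a candidate (usually easier than generating it)
    finetune: Callable[[List[Tuple[str, str]]], None],  # update the generator on accepted (prompt, output) pairs
    prompts: List[str],
    n_samples: int = 8,
    threshold: float = 0.8,
) -> None:
    """One round of training a model on its own filtered outputs.

    Only candidates the evaluator rates highly become training data, so over
    repeated rounds generation quality gets pulled toward evaluation quality.
    """
    accepted: List[Tuple[str, str]] = []
    for prompt in prompts:
        candidates = generate(prompt, n_samples)
        best = max(candidates, key=lambda c: evaluate(prompt, c))
        if evaluate(prompt, best) >= threshold:
            accepted.append((prompt, best))
    if accepted:
        finetune(accepted)

# Toy stand-ins so the sketch runs end to end; a real setup would use an LLM
# sampler, a learned verifier/reward model, and an actual fine-tuning step.
random.seed(0)
generate = lambda prompt, n: [f"{prompt} -> draft scoring {random.random():.2f}" for _ in range(n)]
evaluate = lambda prompt, cand: float(cand.split()[-1])   # "validating" here is just reading the score off
finetune = lambda pairs: print(f"fine-tuning on {len(pairs)} accepted pair(s)")

self_improvement_round(generate, evaluate, finetune, ["prove that 1 + 1 = 2"], n_samples=4, threshold=0.5)
```

This is roughly the flavor of rejection-sampling-style self-training; how far a model can bootstrap this way depends on how good the evaluator actually is.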
The key aspect here is that while models are becoming more sophisticated reasoning engines, they still lack the flexible, self-directed reward systems that humans possess through their limbic systems.