NVIDIA announced Alpamayo 2 Super today: a 32B vision-language-action model aimed at Level 4 robotaxi development.
The interesting part is not only the model size but also the shape of the stack NVIDIA is pushing:
- a larger open "teacher" model for perception, reasoning, planning and action
- 360-degree surround perception instead of front-camera-only reasoning
- high-level "meta-actions" like yield, lane change and stop, not just trajectory prediction
- reasoning auto-labeling to turn driving clips into causal training data
- AlpaGym for closed-loop reinforcement learning in simulation
- OmniDreams for generating rare / long-tail driving scenarios
That feels like the bigger story: autonomy is moving away from "train on recorded driving and predict a trajectory" toward foundation-model-style reasoning systems that can be trained, critiqued, distilled and tested inside simulation loops.
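To make the open-loop vs. closed-loop distinction concrete, here is a toy sketch of what "testing inside a simulation loop" with high-level meta-actions could look like. Everything in it is made up for illustration: `MetaAction`, `State`, `policy`, `step` and `rollout` are hypothetical names, the 1-D physics is deliberately crude, and none of it reflects the actual AlpaGym or Alpamayo interfaces, which the press release does not describe at this level.

```python
from dataclasses import dataclass
from enum import Enum


class MetaAction(Enum):
    """Hypothetical high-level decisions, in place of a raw trajectory."""
    KEEP_LANE = "keep_lane"
    YIELD = "yield"
    STOP = "stop"


@dataclass
class State:
    ego_pos: float       # metres along the lane
    ego_speed: float     # metres per second
    obstacle_pos: float  # metres along the lane (stationary obstacle)


# Assumed acceleration (m/s^2) applied for each meta-action.
ACCEL = {
    MetaAction.KEEP_LANE: 1.0,
    MetaAction.YIELD: -2.0,
    MetaAction.STOP: -6.0,
}


def policy(state: State) -> MetaAction:
    """Stand-in for the model: pick a meta-action from the current gap."""
    gap = state.obstacle_pos - state.ego_pos
    if gap < 15.0:
        return MetaAction.STOP
    if gap < 40.0:
        return MetaAction.YIELD
    return MetaAction.KEEP_LANE


def step(state: State, action: MetaAction, dt: float = 0.1) -> State:
    """Advance the toy simulator by one time step."""
    speed = max(0.0, state.ego_speed + ACCEL[action] * dt)
    return State(state.ego_pos + speed * dt, speed, state.obstacle_pos)


def rollout(state: State, steps: int = 300) -> dict:
    """Closed loop: each decision changes the state the next decision sees."""
    min_gap = state.obstacle_pos - state.ego_pos
    for _ in range(steps):
        state = step(state, policy(state))
        min_gap = min(min_gap, state.obstacle_pos - state.ego_pos)
        if min_gap <= 0.0:
            return {"collided": True, "min_gap": min_gap}
    return {"collided": False, "min_gap": min_gap}


if __name__ == "__main__":
    print(rollout(State(ego_pos=0.0, ego_speed=10.0, obstacle_pos=60.0)))
    print(rollout(State(ego_pos=0.0, ego_speed=30.0, obstacle_pos=20.0)))
```

The point of the sketch is only that the policy is scored on the outcome of its own decisions over a whole episode (collision or not, minimum gap), rather than on how closely a predicted trajectory matches a recorded one.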
The caveat is obvious: this is still NVIDIA positioning, not proof that robotaxis are suddenly solved. Model weights are expected this summer, and real-world validation is the hard part.
But if open AV foundation models become normal, smaller autonomy teams may stop rebuilding the same perception/planning infrastructure from scratch and start competing on data, safety validation, deployment constraints and closed-loop testing.
Source: NVIDIA press release https://investor.nvidia.com/news/press-release-details/2026/NVIDIA-Launches-Alpamayo-2-Super-Open-Reasoning-Model-for-Robotaxis/default.aspx