For years, foundation models in robotics have relied primarily on vision-language pretraining as a stepping stone toward scale, allowing us to transfer the benefits of semantic generalization from existing large multimodal models. What has been missing is a way to effectively scale large multimodal model training in the domain of robotics itself: to establish scaling laws showing that robot intelligence improves consistently (and predictably) with more compute and data, as has underpinned progress in other domains such as LLMs. This requires an architecture, training procedure, and data engine that push new sensorimotor capabilities, provide behavioral generalization, and grow with the vast and ever-expanding experience generated by interacting with the real physical world.

To this end, we’re introducing GEN-0, a new class of embodied foundation models built for multimodal training directly on high-fidelity raw physical interaction. Its architecture builds on the strengths of vision and language models while also going beyond them: it is natively designed to capture human-level reflexes and physical commonsense. One core feature is Harmonic Reasoning, in which the models are trained to think and act simultaneously and seamlessly.

We’ve shared a glimpse of the capabilities of early precursors in our prior videos. Today we are sharing that GEN-0 not only has breakthrough fundamental capabilities, but that these capabilities are scaling:

Surpassing the Intelligence Threshold – In an unprecedented high-data regime for robotics, we observe a phase transition at 7B: smaller models exhibit ossification, while larger ones continue to improve. We’ve since scaled GEN-0 to 10B+ model sizes and observe fast adaptation to new tasks with progressively less post-training.
Scaling Laws – GEN-0 models exhibit strong scaling laws, in which more pretraining data and compute consistently (and predictably) improve downstream post-training performance of the model acros...