Olmo 3: Charting a path through the model flow to lead open-source AI

https://news.ycombinator.com/rss Hits: 21
Summary

Language models are often treated as snapshots: brief captures of a long and carefully curated development process. But sharing only the end result obscures the context needed to modify, adapt, and extend a model's capabilities. Many meaningful adjustments require integrating domain-specific knowledge deep within the development pipeline, not merely at the final stage. To truly advance open AI development and research, the entire model flow, not just its endpoint, should be accessible and customizable. The model flow is the full lifecycle of an LM: every stage, checkpoint, dataset, and dependency required to create and modify it. By exposing this complete process, the goal is to build greater trust and enable more effective adaptation, collaboration, and innovation.

With today's release of Olmo 3, we're empowering the open-source community with not only state-of-the-art open models, but the entire model flow and full traceability back to training data.

At its center is Olmo 3-Think (32B), the best fully open 32B-scale thinking model, and the first that lets you inspect intermediate reasoning traces and trace those behaviors back to the data and training decisions that produced them. Olmo 3 is a family of compact, dense models at 7 billion and 32 billion parameters that can run on everything from laptops to research clusters.

Olmo 3-Base (7B, 32B) is our most powerful base model yet. Evaluated on our expanded, diverse evaluation suite, Olmo 3-Base delivers the strongest performance among fully open base models (those whose training data, code, and weights are all publicly available, like Stanford's Marin and Swiss AI's Apertus) and is competitive with some of the best open-weights base models of comparable size and architecture, including Qwen 2.5 and Gemma 3.
Achieving strong results in programming, reading comprehension, and math problem solving, Olmo 3-Base maintains performance at extended context lengths (up to ~65K tokens), providing...
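As a rough illustration of what a ~65K-token context budget means in practice, the sketch below trims an input so it fits a fixed window before being sent to a model. Whitespace splitting is used here as a crude stand-in for a real tokenizer (an assumption for illustration; it is not Olmo's tokenizer, whose token counts will differ):

```python
# Crude illustration of fitting a document into a fixed context budget.
# Whitespace "tokens" stand in for a real tokenizer (assumption).
CONTEXT_BUDGET = 65_000  # ~65K tokens, per the Olmo 3-Base context length


def fit_to_context(text: str, budget: int = CONTEXT_BUDGET) -> str:
    """Truncate `text` to at most `budget` whitespace-delimited tokens."""
    tokens = text.split()
    if len(tokens) <= budget:
        return text
    return " ".join(tokens[:budget])


doc = "word " * 70_000            # 70K pseudo-tokens, over budget
trimmed = fit_to_context(doc)
print(len(trimmed.split()))       # 65000
```

In real use, token counts should come from the model's own tokenizer, since whitespace counts can under- or over-estimate the true length substantially.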

First seen: 2025-11-21 08:07

Last seen: 2025-11-22 04:11