Skywork-OR1: new SOTA 32B thinking model with open weight

https://news.ycombinator.com/rss Hits: 7
Summary

✊ Unleashing the Power of Reinforcement Learning for Math and Code Reasoners 🤖

🔥 News

April 13, 2025: We release the Skywork-OR1 (Open Reasoner 1) series of models, including Skywork-OR1-Math-7B, Skywork-OR1-32B-Preview, and Skywork-OR1-7B-Preview. We open-source:

🤗 Model weights: Skywork-OR1-Math-7B, Skywork-OR1-32B-Preview, Skywork-OR1-7B-Preview
🤗 Training data: Skywork-OR1-RL-Data (Coming Soon)
🧑‍💻 Code: Skywork-OR1

We also release a Notion Blog sharing detailed training recipes and extensive experimental results, analysis, and insights, dedicated to helping the community better research, understand, and push the frontier of open reasoning models.

📖 Overview

(Figure: AIME24 scores versus training steps of Skywork-OR1-Math-7B in our multi-stage training pipeline.)

The Skywork-OR1 (Open Reasoner 1) model series consists of powerful math and code reasoning models trained using large-scale rule-based reinforcement learning with carefully designed datasets and training recipes. The series includes two general-purpose reasoning models, Skywork-OR1-7B-Preview and Skywork-OR1-32B-Preview, along with a math-specialized model, Skywork-OR1-Math-7B.

- Skywork-OR1-Math-7B is specifically optimized for mathematical reasoning, scoring 69.8 on AIME24 and 52.3 on AIME25, well ahead of all models of similar size.
- Skywork-OR1-32B-Preview delivers the performance of the 671B-parameter DeepSeek-R1 on math tasks (AIME24 and AIME25) and coding tasks (LiveCodeBench).
- Skywork-OR1-7B-Preview outperforms all similarly sized models in both math and coding scenarios.

The final release version will be available in two weeks.
📊 Evaluation

We evaluate our models on...

First seen: 2025-04-13 17:01

Last seen: 2025-04-13 23:02