Qwen3-Coder: Agentic Coding in the World

Summary

Today, we're announcing Qwen3-Coder, our most agentic code model to date. Qwen3-Coder is available in multiple sizes, but we're excited to introduce its most powerful variant first: Qwen3-Coder-480B-A35B-Instruct, a 480B-parameter Mixture-of-Experts model with 35B active parameters. It natively supports a context length of 256K tokens, extendable to 1M tokens with extrapolation methods, and offers exceptional performance on both coding and agentic tasks. Qwen3-Coder-480B-A35B-Instruct sets new state-of-the-art results among open models on Agentic Coding, Agentic Browser-Use, and Agentic Tool-Use, comparable to Claude Sonnet 4.

Alongside the model, we're also open-sourcing a command-line tool for agentic coding: Qwen Code. Forked from Gemini Code, Qwen Code has been adapted with customized prompts and function-calling protocols to fully unleash the capabilities of Qwen3-Coder on agentic coding tasks. Qwen3-Coder also works seamlessly with the community's best developer tools. As a foundation model, we hope it can be used anywhere across the digital world: Agentic Coding in the World!

Qwen3-Coder

Pre-Training

There's still room to scale in pretraining, and with Qwen3-Coder we're advancing along multiple dimensions to strengthen the model's core capabilities:

- Scaling Tokens: 7.5T tokens (70% code ratio), excelling at coding while preserving general and math abilities.
- Scaling Context: Natively supports 256K context and can be extended up to 1M with YaRN, optimized for repo-scale and dynamic data (e.g., Pull Requests) to empower Agentic Coding (see the configuration sketch below).
- Scaling Synthetic Data: Leveraged Qwen2.5-Coder to clean and rewrite noisy data, significantly improving overall data quality.

Post-Training

Scaling Code RL: Hard to Solve, Easy to Verify

Unlike the prevailing community focus on competition-level code generation, we believe all code tasks are naturally well-suited to execution-driven, large-scale reinforcement learning. That's why we scaled up Code RL t...
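The "hard to solve, easy to verify" framing above rests on the fact that code correctness can be checked by execution. As a rough illustration only (not Qwen's actual training pipeline), here is a minimal sketch of an execution-driven reward: run a candidate solution against its unit tests in a scratch directory and score pass/fail. The pytest-based harness and the helper name execution_reward are assumptions for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def execution_reward(candidate_code: str, test_code: str, timeout_s: float = 10.0) -> float:
    """Binary execution-driven reward: 1.0 if the candidate passes its tests, else 0.0."""
    with tempfile.TemporaryDirectory() as tmp:
        # Write the model's candidate solution and its verifying tests side by side.
        Path(tmp, "solution.py").write_text(candidate_code)
        Path(tmp, "test_solution.py").write_text(test_code)
        try:
            proc = subprocess.run(
                [sys.executable, "-m", "pytest", "-q", "test_solution.py"],
                cwd=tmp,
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return 0.0  # hangs and infinite loops count as failures
        return 1.0 if proc.returncode == 0 else 0.0

# Example: a trivial task whose verification is simply running the tests.
reward = execution_reward(
    "def add(a, b):\n    return a + b\n",
    "from solution import add\n\ndef test_add():\n    assert add(2, 3) == 5\n",
)
print(reward)  # 1.0 if pytest is installed and the tests pass
```

A real pipeline would sandbox execution far more strictly and use richer signals (per-test pass rates, timeouts, resource limits), but the binary pass/fail reward is the core of the "easy to verify" idea.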
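On the "Scaling Context" point above: for Qwen's earlier long-context models, YaRN is enabled by overriding the checkpoint's rope_scaling configuration. A minimal sketch, assuming Qwen3-Coder-480B-A35B-Instruct follows the same convention and is published under the Hugging Face id used below; a factor of 4.0 stretches the native 256K window toward 1M tokens, and static YaRN can degrade short-context quality, so it is best enabled only when long inputs are actually needed.

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Coder-480B-A35B-Instruct"  # assumed checkpoint name

# Assumption: the checkpoint follows Qwen's documented YaRN rope_scaling scheme.
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                               # 256K * 4 = ~1M tokens
    "original_max_position_embeddings": 262144,  # native 256K window
}
config.max_position_embeddings = 1048576

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```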
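Finally, on working "seamlessly with the community's best developer tools": most agent frameworks speak the OpenAI chat-completions protocol, so a served instance of the model can be driven with standard function-calling requests. A sketch assuming a local OpenAI-compatible server (e.g. vLLM) at the URL below; the read_file tool is a hypothetical example for illustration, not part of Qwen Code.

```python
from openai import OpenAI

# Assumption: an OpenAI-compatible endpoint is serving the model locally;
# base_url, api_key, and the model name are placeholders.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "read_file",  # hypothetical tool, defined by the calling agent
        "description": "Read a UTF-8 text file from the workspace.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

resp = client.chat.completions.create(
    model="Qwen/Qwen3-Coder-480B-A35B-Instruct",
    messages=[{"role": "user", "content": "Summarize what main.py does."}],
    tools=tools,
)
# If the model decides to use a tool, the structured call arrives here.
print(resp.choices[0].message.tool_calls)
```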

First seen: 2025-07-22 22:52

Last seen: 2025-07-23 08:53