Gemini 2.5 Flash

https://news.ycombinator.com/rss

Summary

Today we are rolling out an early version of Gemini 2.5 Flash in preview through the Gemini API via Google AI Studio and Vertex AI. Building upon the popular foundation of 2.0 Flash, this new version delivers a major upgrade in reasoning capabilities, while still prioritizing speed and cost. Gemini 2.5 Flash is our first fully hybrid reasoning model, giving developers the ability to turn thinking on or off. The model also allows developers to set thinking budgets to find the right tradeoff between quality, cost, and latency. Even with thinking off, developers can maintain the fast speeds of 2.0 Flash, and improve performance.

Our Gemini 2.5 models are thinking models, capable of reasoning through their thoughts before responding. Instead of immediately generating an output, the model can perform a "thinking" process to better understand the prompt, break down complex tasks, and plan a response. On complex tasks that require multiple steps of reasoning (like solving math problems or analyzing research questions), the thinking process allows the model to arrive at more accurate and comprehensive answers. In fact, Gemini 2.5 Flash performs strongly on Hard Prompts in LMArena, second only to 2.5 Pro. 2.5 Flash has comparable metrics to other leading models for a fraction of the cost and size.

Our most cost-efficient thinking model

2.5 Flash continues to lead as the model with the best price-to-performance ratio. Gemini 2.5 Flash adds another model to Google’s Pareto frontier of cost to quality.*

Fine-grained controls to manage thinking

We know that different use cases have different tradeoffs in quality, cost, and latency. To give developers flexibility, we’ve enabled setting a thinking budget that offers fine-grained control over the maximum number of tokens a model can generate while thinking. A higher budget allows the model to reason further to improve quality. Importantly, though, the budget sets a cap on how much 2.5 Flash can think, but the model does not use the fu...
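
To make the thinking-budget idea concrete, here is a minimal sketch of how a request body carrying a budget might be assembled. It makes no network call. The helper name build_request and the constant ASSUMED_MAX_THINKING_BUDGET are hypothetical, and both the field names (generationConfig, thinkingConfig, thinkingBudget) and the upper bound are assumptions based on the public Gemini API documentation rather than on this summary, so check them against the current API reference before relying on them.

```python
import json

# Assumption: upper bound for the thinking budget; not stated in this summary.
ASSUMED_MAX_THINKING_BUDGET = 24576


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent-style JSON body with a capped thinking budget.

    A budget of 0 is meant to correspond to "thinking off"; larger values
    raise the cap on thinking tokens, not the amount actually used.
    """
    if thinking_budget < 0:
        raise ValueError("thinking_budget must be non-negative")
    # Clamp to the assumed maximum instead of sending an out-of-range value.
    budget = min(thinking_budget, ASSUMED_MAX_THINKING_BUDGET)
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        # Assumed field names; verify against the API reference.
        "generationConfig": {"thinkingConfig": {"thinkingBudget": budget}},
    }


if __name__ == "__main__":
    # Print the body that would be sent for a moderate budget.
    print(json.dumps(build_request("Solve 12 * 17 step by step.", 1024), indent=2))
```

Passing 0 would correspond to turning thinking off, while a larger value only raises the cap, which matches the summary's point that the budget is a maximum and not a target.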

First seen: 2025-04-17 20:13

Last seen: 2025-04-18 20:18

