Google Unveils Gemini 2.5 Flash with Advanced Reasoning Capabilities

Google has released Gemini 2.5 Flash in preview mode, bringing significant improvements to its fast, cost-efficient AI model. This new version introduces hybrid reasoning capabilities that allow developers to control the model's thinking process while maintaining speed and efficiency. The preview is now available in Google AI Studio, Vertex AI, and the Gemini app, with general availability planned for early June 2025.
Google has launched a preview version of Gemini 2.5 Flash, the latest iteration of its efficient AI model designed to balance performance with speed and cost-effectiveness.

Building upon the foundation of Gemini 2.0 Flash, the new 2.5 Flash model delivers a major upgrade in reasoning capabilities while maintaining its reputation for efficiency. Google describes it as "a major upgrade in reasoning capabilities, while still prioritizing speed and cost."

The standout feature of Gemini 2.5 Flash is its hybrid reasoning system. It's Google's "first fully hybrid reasoning model, allowing developers to turn thinking on or off, and set thinking budgets to optimize the balance between quality, cost, and latency." This gives developers direct control over how much reasoning the model applies to a given task.

In practice, this means developers can specify a "thinking budget" that controls how much reasoning the model performs. They can adjust "the number of tokens a model can generate while thinking" from 0 to 24,576 tokens using a slider in Google AI Studio and Vertex AI, or through an API parameter. When the thinking budget is set to zero, the model matches Gemini 2.0 Flash's cost and latency.
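As a rough illustration of the API parameter mentioned above, the sketch below builds a request body that carries a thinking budget and rejects values outside the 0 to 24,576 range quoted in the article. The helper `build_request` is hypothetical, and the field names `generationConfig`, `thinkingConfig`, and `thinkingBudget` are assumptions about the request shape that should be checked against Google's current API reference. Nothing is sent over the network.

```python
import json

# Range quoted in the article for the thinking budget, in tokens.
MIN_THINKING_BUDGET = 0
MAX_THINKING_BUDGET = 24_576


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a hypothetical request body that carries a thinking budget.

    The field names below are assumptions for illustration, not a
    confirmed schema; verify them against the official documentation.
    """
    if not MIN_THINKING_BUDGET <= thinking_budget <= MAX_THINKING_BUDGET:
        raise ValueError(
            f"thinking_budget must be between {MIN_THINKING_BUDGET} "
            f"and {MAX_THINKING_BUDGET}, got {thinking_budget}"
        )
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            # A budget of 0 turns thinking off, per the article.
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Thinking disabled: the article says this matches 2.0 Flash cost and latency.
    print(json.dumps(build_request("Summarize this paragraph.", 0), indent=2))
```

Validating the budget on the client side is a design choice made for this sketch only; the article does not say how the service handles out-of-range values.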

Pricing reflects this flexibility: input tokens cost 15 cents per million, and output tokens cost 60 cents per million with thinking turned off. With thinking turned on, the output price rises to $3.50 per million tokens.
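To make the price difference concrete, here is a small cost estimator built on the figures quoted above. It assumes the $3.50 rate applies to output tokens when thinking is enabled and that the input price is unchanged; the function name `estimate_cost_usd` is hypothetical, and preview prices may change.

```python
# Prices in USD per one million tokens, as quoted in the article.
PRICE_INPUT = 0.15
PRICE_OUTPUT_NO_THINKING = 0.60
# Assumption: this rate applies to output tokens when thinking is on.
PRICE_OUTPUT_THINKING = 3.50


def estimate_cost_usd(input_tokens: int, output_tokens: int, thinking: bool) -> float:
    """Estimate the cost of a workload from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    output_price = PRICE_OUTPUT_THINKING if thinking else PRICE_OUTPUT_NO_THINKING
    return (input_tokens * PRICE_INPUT + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    for thinking in (False, True):
        cost = estimate_cost_usd(1_000_000, 1_000_000, thinking)
        print(f"thinking={thinking}: ${cost:.2f}")
```

Under these assumptions, a workload of one million input tokens and one million output tokens costs about $0.75 with thinking off and about $3.65 with thinking on.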

Benchmark tests show impressive results. Gemini 2.5 Flash "performs strongly on Hard Prompts in LMArena, second only to 2.5 Pro" and "has comparable metrics to other leading models for a fraction of the cost and size," continuing "to lead as the model with the best price-to-performance ratio."

Google describes 2.5 Flash as its "most efficient workhorse model designed for speed and low-cost," noting that it has "improved across key benchmarks for reasoning, multimodality, code and long context while getting even more efficient, using 20-30% less tokens" in evaluations.

The new model is currently available in preview mode through multiple channels. It's rolling out "in Google AI Studio (developers), Vertex AI (enterprise), and the Gemini app (everyone)." According to Google's I/O 2025 announcements, the updated version will be "generally available in Google AI Studio for developers and in Vertex AI for enterprises in early June," with Gemini 2.5 Pro following "soon after."

As Google continues to expand its AI capabilities, Gemini 2.5 Flash represents a significant step forward in making advanced reasoning more accessible and cost-effective for developers and users alike.
