Google has announced that Gemini 2.5 Flash and 2.5 Pro are now stable and generally available, giving organizations the reliability and scalability needed to deploy advanced AI capabilities in mission-critical applications. Alongside these releases, the company has introduced Gemini 2.5 Flash-Lite in preview, described as its most cost-efficient and fastest 2.5 model yet.
Gemini 2.5 Flash-Lite is a reasoning model whose thinking budget can be controlled dynamically through an API parameter. Unlike other models in the 2.5 family, Flash-Lite is optimized for cost and speed, with "thinking" turned off by default. Despite these optimizations, it supports all native tools, including Grounding with Google Search, Code Execution, and URL Context, in addition to function calling.
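As a sketch of that API parameter, the snippet below uses the Python Gen AI SDK (`google-genai`) to enable thinking on Flash-Lite by allotting a token budget via `ThinkingConfig`; the prompt and budget value are illustrative, and the client expects a `GEMINI_API_KEY` in the environment.

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

# Thinking is off by default on Flash-Lite; granting a budget turns it on.
response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",
    contents="Classify this support ticket: 'My invoice total looks wrong.'",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=512)  # tokens the model may spend thinking
    ),
)
print(response.text)
```

Setting `thinking_budget=0` keeps thinking disabled, matching Flash-Lite's default behavior.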
Flash-Lite outperforms 2.0 Flash-Lite across coding, math, science, reasoning, and multimodal benchmarks, and runs 1.5 times faster than 2.0 Flash at a lower cost. It is specifically designed for high-volume, latency-sensitive workloads such as translation, classification, intelligent routing, and other cost-sensitive operations at scale.
Like other models in the Gemini 2.5 family, Flash-Lite retains the family's core capabilities: thinking that can be enabled at different budgets, connections to tools such as Grounding with Google Search and code execution, multimodal input, and a 1 million-token context length. To keep responses current and factual, Flash-Lite can use Google Search as a built-in tool, intelligently deciding when to augment its knowledge with Search.
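Enabling the built-in Google Search tool is a one-line config change in the same SDK. A minimal sketch, assuming the `google-genai` Python package and an API key in the environment; the question is illustrative:

```python
from google import genai
from google.genai import types

client = genai.Client()

# Declaring the GoogleSearch tool lets the model decide when to ground
# its answer in fresh search results rather than parametric knowledge.
response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",
    contents="What is the current stable release of the Linux kernel?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    ),
)
print(response.text)
```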
Beyond Flash-Lite's introduction, Google has announced that Gemini 2.5 Pro has become the world-leading model on both the WebDev Arena and LMArena leaderboards. It tops the WebDev Arena coding leaderboard with an Elo score of 1415 and leads across all leaderboards in LMArena, which measures human preferences along multiple dimensions. Additionally, Google has infused LearnLM directly into Gemini 2.5, making it, in the company's words, the world's leading model for learning. According to Google's latest report, Gemini 2.5 Pro outperformed competitors on every category of learning-science principles, with educators and pedagogy experts preferring it over other offerings across a range of learning scenarios.
Gemini 2.5 Flash-Lite is now available in preview in Google AI Studio and Vertex AI, alongside the stable versions of 2.5 Flash and Pro. Developers can access the model (gemini-2.5-flash-lite-preview-06-17) through the Google Gen AI SDK, which provides a unified interface to the Gemini 2.5 model family through both the Gemini Developer API and the Vertex AI Gemini API.
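The "unified interface" means the same SDK call works against either backend; only client construction differs. A sketch assuming the `google-genai` Python package, with `my-project` and `us-central1` as placeholder Vertex AI project and region values:

```python
from google import genai

# Gemini Developer API: authenticates with GEMINI_API_KEY from the environment.
dev_client = genai.Client()

# Vertex AI Gemini API: same SDK and call surface, different client construction.
vertex_client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Identical request shape regardless of which client is used.
response = dev_client.models.generate_content(
    model="gemini-2.5-flash-lite-preview-06-17",
    contents="Translate 'good morning' to French.",
)
print(response.text)
```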