Google is extending its 'thinking budgets' feature to Gemini 2.5 Pro, the company's most advanced AI reasoning model, after introducing it in Gemini 2.5 Flash earlier this year.
The thinking budgets feature gives developers direct control over how much reasoning a model performs on a complex problem, and therefore over what a request costs. Users can specify the maximum number of tokens the model may use for internal reasoning before it generates a response, or turn thinking off completely for simpler tasks.
"We launched 2.5 Flash with thinking budgets to give developers more control over cost by balancing latency and quality. And we're extending this capability to 2.5 Pro," Google stated in its announcement. The company confirmed that Gemini 2.5 Pro with budgets will be generally available for stable production use in the coming weeks.
This development addresses a fundamental tension in today's AI marketplace, where more sophisticated reasoning typically increases both latency and cost. For instance, with Gemini 2.5 Flash, enabling reasoning raises the output price nearly sixfold, from $0.60 to $3.50 per million tokens. With thinking budgets, businesses can tune their AI deployments to specific use cases and enable reasoning only when it is needed.
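To make the arithmetic concrete, here is a minimal Python sketch built on the two Gemini 2.5 Flash output prices quoted above. The function names and the workload figures are illustrative assumptions, and the sketch makes the simplifying assumption that every output token from a thinking-enabled request is billed at the higher rate; actual billing may differ.

```python
# Output prices quoted in the article for Gemini 2.5 Flash (USD per million tokens).
PRICE_NO_THINKING = 0.60
PRICE_THINKING = 3.50


def output_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of `tokens` output tokens at the given per-million price."""
    return tokens / 1_000_000 * price_per_million


def blended_cost(total_output_tokens: int, thinking_share: float) -> float:
    """Cost when only a fraction of output tokens come from thinking-enabled requests.

    `thinking_share` is a hypothetical workload parameter between 0.0 and 1.0.
    Simplifying assumption: all tokens in that share are billed at the thinking rate.
    """
    if not 0.0 <= thinking_share <= 1.0:
        raise ValueError("thinking_share must be between 0.0 and 1.0")
    thinking_tokens = int(total_output_tokens * thinking_share)
    plain_tokens = total_output_tokens - thinking_tokens
    return (output_cost(thinking_tokens, PRICE_THINKING)
            + output_cost(plain_tokens, PRICE_NO_THINKING))


# Ratio between the two quoted prices: the "nearly sixfold" increase.
price_ratio = PRICE_THINKING / PRICE_NO_THINKING

if __name__ == "__main__":
    tokens = 10_000_000  # illustrative monthly output volume
    print(f"price ratio: {price_ratio:.2f}x")
    print(f"thinking off everywhere: ${blended_cost(tokens, 0.0):.2f}")
    print(f"thinking on for 20%:     ${blended_cost(tokens, 0.2):.2f}")
    print(f"thinking on everywhere:  ${blended_cost(tokens, 1.0):.2f}")
```

Under these assumptions, the gap between the all-off and all-on cases is the room a business has to work with when it enables reasoning selectively.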
The feature is particularly valuable for enterprise customers who need to manage AI deployment costs carefully while still accessing advanced capabilities. For simple queries such as language translation or basic information retrieval, thinking can be disabled for maximum cost efficiency. For complex tasks that require multi-step reasoning, such as mathematical problem-solving or nuanced analysis, thinking can be enabled and its budget adjusted to suit the task.
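One way an application might apply this split is to pick a thinking budget per task type before sending a request. The Python sketch below is an illustration only: the task categories, the token values, and the function name are hypothetical, and none of it comes from Google's documentation. A budget of 0 stands in here for "thinking disabled".

```python
# Hypothetical mapping from task type to a thinking budget, in tokens.
# All categories and values are illustrative assumptions, not recommendations.
THINKING_BUDGETS = {
    "translation": 0,   # simple task: thinking disabled
    "retrieval": 0,     # basic information lookup: thinking disabled
    "analysis": 4096,   # nuanced analysis: moderate budget
    "math": 8192,       # multi-step problem solving: larger budget
}

# Fallback for task types the table does not list (also an assumption).
DEFAULT_BUDGET = 1024


def choose_thinking_budget(task_type: str, max_budget: int = 8192) -> int:
    """Return a thinking budget for the task, capped at `max_budget` tokens."""
    budget = THINKING_BUDGETS.get(task_type, DEFAULT_BUDGET)
    return min(budget, max_budget)


if __name__ == "__main__":
    for task in ("translation", "math", "summarization"):
        print(task, choose_thinking_budget(task))
```

The returned number would then be passed to whichever request parameter the model API exposes for the thinking budget. The cap lets an operator put a ceiling on reasoning spend without editing the per-task table.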
As AI becomes increasingly embedded in business workflows, Google's customizable approach to reasoning reflects a maturing market in which cost optimization and performance tuning are becoming as important as raw capabilities, signaling a new phase in the commercialization of generative AI technologies.