Google is transforming its most advanced AI model, Gemini 2.5 Pro, into what it calls a 'world model': a system that can understand and simulate aspects of reality in ways that mirror human brain function.
According to Google DeepMind's announcements at Google I/O 2025, this capability is meant to let Gemini plan ahead and imagine new experiences by simulating aspects of its environment, a major evolution in AI's ability to reason about and interact with the world.
"This is why we're working to extend our best multimodal foundation model, Gemini 2.5 Pro, to become a 'world model' that can make plans and imagine new experiences by understanding and simulating aspects of the world, just as the brain does," Google stated in its official blog.
The world model approach builds on Google's extensive research in training AI agents to master complex games and create interactive simulations. Evidence of these capabilities is already emerging in Gemini's ability to use world knowledge and reasoning to represent natural environments, understand intuitive physics, and teach robots to follow instructions and adapt on the fly.
Central to this evolution is the new Deep Think feature, an experimental enhanced reasoning mode for Gemini 2.5 Pro. Deep Think enables the model to consider multiple hypotheses before responding, significantly improving performance on complex math and coding tasks. The feature has already achieved impressive scores on challenging benchmarks like the 2025 USAMO math competition and LiveCodeBench for coding.
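Google has not published how Deep Think works internally, but the "multiple hypotheses" description resembles the well-known best-of-n pattern, in which several candidate answers are generated and a verifier keeps the strongest one. The sketch below is a generic illustration of that pattern only, not Google's implementation; the `generate` and `score` functions are hypothetical stand-ins for a real model call and an answer checker.

```python
import random
from typing import Callable, List


def best_of_n(generate: Callable[[str], str],
              score: Callable[[str, str], float],
              prompt: str,
              n: int = 4) -> str:
    """Sample n candidate answers and return the highest-scoring one."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))


# Toy stand-ins: a real system would call a model and run a verifier.
def toy_generate(prompt: str) -> str:
    return f"hypothesis-{random.randint(1, 100)} for: {prompt}"


def toy_score(prompt: str, answer: str) -> float:
    return random.random()  # a real verifier would check correctness


if __name__ == "__main__":
    print(best_of_n(toy_generate, toy_score,
                    "Prove that the sum of two odd numbers is even."))
```

The practical point is that spending more compute on parallel candidate answers, then selecting among them, tends to help most on exactly the kinds of math and coding tasks the Deep Think benchmarks target.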
Gemini 2.5 Pro with Deep Think is currently available to trusted testers via the Gemini API, with Google conducting additional safety evaluations before wider release. Meanwhile, the standard Gemini 2.5 Pro model is expected to be generally available by late June 2025, following the earlier release of Gemini 2.5 Flash.
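For developers who do have access, calling Gemini 2.5 Pro through the Gemini API follows the standard Google Gen AI SDK pattern. The snippet below is a minimal sketch assuming the Python `google-genai` package; the exact model identifier shown is a preview-style placeholder, and Deep Think itself remains gated to trusted testers, so availability depends on your account.

```python
# pip install google-genai  (Google Gen AI SDK for Python)
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    # Placeholder model ID; check the current Gemini 2.5 Pro identifier.
    model="gemini-2.5-pro-preview-05-06",
    contents="Explain intuitive physics to a robotics team in three sentences.",
)
print(response.text)
```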
This advancement toward world modeling is part of Google's broader vision to create a universal AI assistant that can understand context, plan effectively, and take action across devices, ultimately transforming how humans interact with AI systems.