Google Brings Project Astra's Visual Powers to Gemini Live

Google has integrated Project Astra's advanced visual understanding capabilities into Gemini Live, enabling the AI assistant to see and interpret the world through users' smartphone cameras and screens. This significant upgrade, announced at Google I/O 2025, allows Gemini to provide real-time assistance by analyzing visual information during conversations. The feature, previously limited to paid subscribers, is now available to all Android and iOS users, marking a major step toward Google's vision of creating a universal AI assistant.

At Google I/O 2025 on May 20, Google announced that Project Astra's capabilities are being integrated into Gemini Live, a notable step toward making AI assistants more visually aware.

Project Astra, first unveiled at Google I/O 2024, embodies Google DeepMind's vision of a "universal AI assistant that can be truly helpful in everyday life." The ultimate goal is to transform the Gemini app into such an assistant: one that performs everyday tasks, handles mundane administration, and surfaces personalized recommendations to make users more productive and enrich their lives. That effort starts with capabilities first explored in Project Astra, such as video understanding, screen sharing, and memory.

Google announced that Project Astra, the company's low-latency, multimodal AI experience, will power an array of new experiences in Search, the Gemini AI app, and products from third-party developers. Most notably, Project Astra powers a new Search Live feature in Google Search. When using AI Mode or Lens, users can tap the "Live" button to ask questions about what they're seeing through their smartphone's camera. Project Astra streams live video and audio into an AI model, which responds with little to no latency.
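
For a concrete sense of this streaming pattern, here is a rough Python sketch using the google-genai SDK's Live API, which Google makes available to developers (more on that below). Search Live itself is not a developer surface, so treat this as an illustration only; the model name, API key placeholder, and file names are assumptions, not details from Google's announcement.

    import asyncio
    from google import genai
    from google.genai import types

    # Illustrative values: replace the key placeholder; the model name is
    # an assumption about a Live-capable model, not confirmed by the article.
    client = genai.Client(api_key="YOUR_API_KEY")
    MODEL = "gemini-2.0-flash-live-001"
    CONFIG = {"response_modalities": ["TEXT"]}

    async def ask_about_frame(jpeg_bytes: bytes, question: str) -> None:
        # Open a low-latency, bidirectional streaming session with the model.
        async with client.aio.live.connect(model=MODEL, config=CONFIG) as session:
            # Send one camera frame plus a question as a single user turn.
            await session.send_client_content(
                turns=types.Content(
                    role="user",
                    parts=[
                        types.Part.from_bytes(data=jpeg_bytes, mime_type="image/jpeg"),
                        types.Part.from_text(text=question),
                    ],
                ),
                turn_complete=True,
            )
            # The reply streams back incrementally rather than as one blob.
            async for response in session.receive():
                if response.text is not None:
                    print(response.text, end="")

    if __name__ == "__main__":
        with open("frame.jpg", "rb") as f:
            asyncio.run(ask_about_frame(f.read(), "What am I looking at?"))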

In the Gemini app, Google says Project Astra's real-time video and screen-sharing capabilities are coming to all users. While Project Astra already powers Gemini Live's low-latency conversations, visual input was previously reserved for paid subscribers. Gemini Live has proven popular: its conversations run five times longer than text-based interactions on average, because it offers new ways to get help, whether troubleshooting a broken appliance or getting personalized shopping advice. That's why, starting today, Google is making Gemini Live with camera and screen sharing available to everyone on Android and iOS for free.

The practical applications are impressive. Google demoed Project Astra's capabilities with a video of Gemini Live helping with everyday activities, like fixing a bike. In the video, the user asks Project Astra to find the manual for the bike they're repairing. The AI browses the web, locates the document, and asks what the user wants to see next. The user then tells Project Astra to scroll through the document until it finds the section about brakes, and the Android phone's screen shows it doing exactly that. This kind of agentic behavior suggests Project Astra will be able to access specific information online, even within documents.

Over the past year, Google has been integrating these capabilities into Gemini Live for more people to experience. The company continues to refine them and explore new ones, including more natural voice output with native audio, improved memory, and computer control. Google is now gathering feedback on these capabilities from trusted testers and working to bring them to Gemini Live, new experiences in Search, the Live API for developers, and new form factors like glasses.
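
As a rough illustration of the native audio direction, the Python sketch below asks the same Live API for spoken rather than text responses and saves the streamed audio to a WAV file. Again, the model name and the assumed output format (mono, 16-bit PCM at 24 kHz) come from current SDK documentation rather than from Google's announcement.

    import asyncio
    import wave
    from google import genai

    client = genai.Client(api_key="YOUR_API_KEY")
    MODEL = "gemini-2.0-flash-live-001"          # assumed Live-capable model
    CONFIG = {"response_modalities": ["AUDIO"]}  # ask for speech, not text

    async def speak(prompt: str) -> None:
        async with client.aio.live.connect(model=MODEL, config=CONFIG) as session:
            await session.send_client_content(
                turns={"role": "user", "parts": [{"text": prompt}]},
                turn_complete=True,
            )
            # Audio arrives as a stream of raw PCM chunks in response.data;
            # mono 16-bit samples at 24 kHz is an assumption from the docs.
            with wave.open("reply.wav", "wb") as wf:
                wf.setnchannels(1)
                wf.setsampwidth(2)
                wf.setframerate(24000)
                async for response in session.receive():
                    if response.data is not None:
                        wf.writeframes(response.data)

    asyncio.run(speak("Briefly describe what Project Astra is."))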

This integration marks a meaningful advance in making AI assistants more contextually aware and useful in everyday scenarios, letting the technology better understand users' environments and provide more relevant assistance.
