In a significant milestone for artificial intelligence, both OpenAI and Google DeepMind have demonstrated gold medal-level mathematical reasoning at the 2025 International Mathematical Olympiad (IMO), the world's most prestigious competition for young mathematicians.
Both companies' AI models achieved identical scores of 35 out of a possible 42 points, solving five of the six problems perfectly. That score matched this year's gold medal threshold, which only 67 of the 630 human contestants (roughly 11%) reached.
Google DeepMind's advanced version of Gemini with Deep Think was officially graded and certified by IMO coordinators, with IMO President Gregor Dolinar noting that its solutions were "astonishing in many respects" and "clear, precise and most of them easy to follow." This marks a clear step up from last year, when DeepMind reached silver medal level using the specialized systems AlphaProof and AlphaGeometry 2.
OpenAI evaluated its experimental reasoning model on the same problems under identical competition conditions—two 4.5-hour exam sessions without internet access or tools. While OpenAI wasn't part of the official IMO evaluation process, the company had its solutions independently graded by three former IMO medalists.
The timing of the announcements created some tension between the companies. OpenAI published its results on July 19, while Google DeepMind waited until July 21, honoring the IMO Board's request that results be shared only after official verification and after the human medalists had been celebrated.
Junehyuk Jung, a math professor at Brown University and visiting researcher at Google DeepMind, believes this achievement suggests AI is less than a year away from helping mathematicians tackle unsolved research problems at the frontier of mathematics. "I think the moment we can solve hard reasoning problems in natural language will enable the potential for collaboration between AI and mathematicians," Jung told Reuters.
While impressive, some experts caution that IMO problems, though difficult, are conceptually simpler than frontier research mathematics. The achievement demonstrates AI's growing reasoning capabilities but doesn't necessarily indicate readiness for all aspects of mathematical research.