[Google DeepMind] The AI co-mathematician also achieves state-of-the-art results on hard problem-solving benchmarks, including scoring 48% on FrontierMath Tier 4, a new high score among all AI systems evaluated.
Summary
Google DeepMind's AI co-mathematician achieves state-of-the-art results on hard problem-solving benchmarks, scoring 48% on FrontierMath Tier 4, the highest among all AI systems evaluated.
Similar Articles
Humans outperform AI at this highly rigorous mathematics test
The First Proof test evaluated four AI systems on novel research-level math problems, with the top model scoring only 6 out of 10, demonstrating that current AI still lags behind top mathematicians in rigorous reasoning.
Google DeepMind's AI agent autonomously solved 9 of 353 open Erdős problems in mathematics, at a cost of a few hundred dollars per problem
Google DeepMind's AI agent autonomously solved 9 of 353 open Erdős problems in mathematics at a cost of a few hundred dollars per problem.
AI Co-Mathematician: Accelerating Mathematicians with Agentic AI
This paper introduces the AI Co-Mathematician, a workbench that uses agentic AI to support mathematicians in open-ended research tasks like ideation and theorem proving. Early tests show the system achieving state-of-the-art results on hard problem-solving benchmarks, including a 48% score on FrontierMath Tier 4.
Advanced Gemini with Deep Think Achieves Gold Medal Standard at International Mathematical Olympiad
Google DeepMind's advanced Gemini with Deep Think achieved gold-medal standard at the International Mathematical Olympiad 2025, solving 5 out of 6 problems for 35 points—a significant advance over last year's silver-medal performance, operating end-to-end in natural language within competition time limits.
@GoogleDeepMind: We evaluated AI’s impact by looking beyond test scores to behavioral shifts. Over eight weeks, results suggest students…
Google DeepMind's study in Sierra Leone shows that using Gemini as a pedagogical tool improved math scores and student engagement, with students increasingly using AI to understand concepts rather than just find answers.