@FinanceYF5: Google's new paper: getting an LLM to solve math competition problems, with accuracy jumping from 10% to 70%. [LEAP framework] Instead of having the model write a complete proof at once, it breaks the problem down into a goal tree, learns step by step from Lean verifier feedback, and reuses proven lemmas. Result: all 12 problems of Putnam 2025 solved, IMO style…

X AI KOLs Timeline Papers

Summary

A new Google paper proposes the LEAP framework, which decomposes math problems into goal trees, learns from Lean verifier feedback, and reuses proven lemmas, raising LLM accuracy on math competition problems from 10% to 70%. It solves all 12 problems of Putnam 2025 and surpasses dedicated gold-medal-level systems on IMO-style benchmarks.

Google's new paper: getting an LLM to solve math competition problems, with accuracy jumping from 10% to 70%. [LEAP framework] Instead of having the model write a complete proof at once, it breaks the problem down into a goal tree, learns step by step from Lean verifier feedback, and reuses proven lemmas. Result: all 12 problems of Putnam 2025 solved, and dedicated gold-medal-level systems surpassed by 48% on IMO-style benchmarks. The model's capabilities are unchanged; the structure changed, and so did the upper bound. https://t.co/aY2IEGePO9

Cached at: 06/05/26, 09:09 AM

Google’s new paper enables LLMs to solve math competition problems, with accuracy jumping from 10% to 70%.

[LEAP Framework] Instead of having the model write a complete proof in one go, it breaks the problem into a goal tree, learning on the fly from Lean verifier feedback and reusing proven lemmas.

Result: All 12 problems of Putnam 2025 solved, surpassing the dedicated gold-medal-level system by 48% on IMO-style benchmarks.

The model’s capability didn’t change; the structure did, and the ceiling shifted. https://t.co/aY2IEGePO9
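The goal-tree loop described above can be illustrated with a small Python sketch. This is not the paper's implementation: `Goal`, `prove`, `toy_propose`, and `toy_verify` are hypothetical names, the verifier is a toy stand-in for Lean, and the "model" is a stub that only succeeds after it has seen an error message. The sketch only shows the three ideas the post names: decomposing into subgoals, retrying with verifier feedback, and reusing lemmas that are already proven.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Goal:
    """A node in the goal tree: a statement plus the subgoals it depends on."""
    statement: str
    subgoals: List["Goal"] = field(default_factory=list)


# Hypothetical interfaces: a proposer maps (statement, feedback so far) to a
# candidate proof; a verifier returns None on success or an error message.
Proposer = Callable[[str, List[str]], str]
Verifier = Callable[[str, str], Optional[str]]


def prove(goal: Goal, propose: Proposer, verify: Verifier,
          lemmas: Dict[str, str], max_attempts: int = 3) -> bool:
    """Prove a goal bottom-up, caching every verified statement in `lemmas`."""
    if goal.statement in lemmas:          # reuse an already proven lemma
        return True
    for sub in goal.subgoals:             # close all subgoals first
        if not prove(sub, propose, verify, lemmas, max_attempts):
            return False
    feedback: List[str] = []
    for _ in range(max_attempts):         # retry, feeding verifier errors back
        candidate = propose(goal.statement, feedback)
        error = verify(goal.statement, candidate)
        if error is None:
            lemmas[goal.statement] = candidate
            return True
        feedback.append(error)
    return False


# Toy stand-ins (assumptions, not Lean and not an LLM).
calls: List[str] = []


def toy_verify(statement: str, candidate: str) -> Optional[str]:
    calls.append(statement)               # record each verifier invocation
    return None if candidate == f"proof({statement})" else "error: goal not closed"


def toy_propose(statement: str, feedback: List[str]) -> str:
    # The first attempt fails; any feedback leads to a correct candidate.
    return f"proof({statement})" if feedback else "sorry"


# "lemma_A" appears under both steps, so it should be verified only once.
root = Goal("main", [
    Goal("step_1", [Goal("lemma_A")]),
    Goal("step_2", [Goal("lemma_A")]),
])
lemmas: Dict[str, str] = {}
solved = prove(root, toy_propose, toy_verify, lemmas)
```

In this toy run, each goal needs two verifier calls (one failed attempt, one corrected attempt), while the second occurrence of `lemma_A` is served from the cache without calling the verifier at all.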

Similar Articles

LEAP: Supercharging LLMs for Formal Mathematics with Agentic Frameworks

arXiv cs.AI

LEAP is an agentic framework that enables general-purpose LLMs to achieve state-of-the-art performance in formal theorem proving in Lean, solving all 12 problems from the 2025 Putnam Competition and boosting formal solve rates from below 10% to 70% on a new benchmark (Lean-IMO-Bench), surpassing specialized systems.

@berryxia: Honestly, only truly brilliant people dare to say such things! An undergraduate student can handle the math training of LLMs! In a recent interview, Terence Tao laid out the core mystery of LLMs directly. A winner of the Fields Medal, the highest honor in mathematics and often called the Nobel Prize of math, and one of the top contemporary…

X AI KOLs Timeline

Terence Tao pointed out that the math behind current LLMs is actually very simple, but the real puzzle lies in the intermediate zone of natural language data, which leads to unpredictable model behavior.