@0xLogicrw: MiniMax Developer Relations Lead Ryan Lee announced that MaxProof, a test-time scaling framework for large language model mathematical proofs, has been officially open-sourced, along with a companion technical paper. MaxProof restructures mathematical proof during inference into an evolutionary search system, enabling inference scaling through verification, repair, and elimination mechanisms.

X AI KOLs Timeline 06/12/26, 07:17 AM Tools

Summary

MiniMax open-sourced MaxProof, a test-time scaling framework for LLM mathematical proofs, and released a companion paper. The framework uses an evolutionary search mechanism to enable the M3 model to achieve gold-medal scores on both the IMO 2025 and USAMO 2026 test sets.

MiniMax Developer Relations Lead Ryan Lee announced that MaxProof, a test-time scaling framework for large language model mathematical proofs, has been officially open-sourced, along with a companion technical paper. MaxProof restructures mathematical proof during inference into an evolutionary search system, enabling inference scaling through verification, repair, and elimination mechanisms. Under the MaxProof framework, the MiniMax-M3 model achieved scores of 35 and 36 (out of 42) on the International Mathematical Olympiad (IMO 2025) and the USA Mathematical Olympiad (USAMO 2026) test sets, respectively, both reaching the gold-medal threshold.

In algorithm design, the development team built a multi-layered verification mechanism by integrating three expert capabilities: generation, verification, and repair. The generation expert is trained with long-horizon reinforcement learning, guided by primary reward signals from a generative verifier. The verification expert focuses on explicit error detection to reduce false positive rates. The repair expert is fine-tuned, conditioned on critiques, to correct proofs that have been flagged as erroneous. The three expert capabilities are eventually merged into the released M3 model.

During inference, MaxProof recasts the proof derivation process as an evolutionary search. The M3 model is decoupled into four roles: generator, verifier, optimizer, and scorer. The system first constructs a pool of candidate proofs as the population, mutates candidates through local repair patches and exploratory rewrites, and finally selects the best derivation through a tournament mechanism. The evolutionary search mechanism turns the model's best@K capability on mathematical proofs into more stable pass@1 performance.
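The gap between best@K and pass@1 can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each sampled proof is correct independently with probability `p`, and that a hypothetical selector picks a correct proof with probability `selector_accuracy` whenever at least one exists. Neither the independence assumption nor any number here comes from the MaxProof paper.

```python
def best_at_k(p: float, k: int) -> float:
    """Chance that at least one of k independent samples is correct.

    This is the ceiling an ideal (oracle) selector could reach.
    """
    return 1.0 - (1.0 - p) ** k


def system_pass_at_1(p: float, k: int, selector_accuracy: float) -> float:
    """pass@1 of a system that samples k proofs and submits exactly one.

    Hypothetical model: when a correct proof exists in the pool, the
    selector submits a correct one with probability `selector_accuracy`.
    """
    return selector_accuracy * best_at_k(p, k)


if __name__ == "__main__":
    p = 0.2  # illustrative per-sample success rate, not a reported figure
    for k in (1, 4, 16):
        ceiling = best_at_k(p, k)
        realized = system_pass_at_1(p, k, selector_accuracy=0.9)
        print(f"K={k:2d}  best@K={ceiling:.3f}  system pass@1={realized:.3f}")
```

Under this toy model, the single submitted answer approaches the best@K ceiling only as the selector becomes more reliable, which is why verification and scoring quality matter as much as raw sampling.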

Cached at: 06/12/26, 12:58 PM

Ryan Lee, Head of Developer Relations at MiniMax, announced that MaxProof, a test-time scaling framework for large language model mathematical proofs, has been officially open-sourced, along with an accompanying technical paper.

MaxProof reframes the mathematical proof process during inference as an evolutionary search system, achieving inference-time scaling through verification, repair, and elimination mechanisms.

With the MaxProof framework, the MiniMax-M3 model scored 35 and 36 points (out of a possible 42) on the International Mathematical Olympiad (IMO 2025) and the United States of America Mathematical Olympiad (USAMO 2026) test sets, respectively, both reaching the gold-medal threshold.

On the algorithmic design side, the development team constructed a multi-layered verification defense by integrating three expert capabilities: generation, verification, and repair. The generation expert is trained with long-horizon reinforcement learning, guided by the primary reward signal from the generative verifier. The verification expert focuses on explicit error detection to reduce the false positive rate. The repair expert is fine-tuned, conditioned on critiques, to correct proofs that have been flagged as erroneous. These three expert capabilities are ultimately merged into the released M3 model.
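One way to picture how generation, verification, and repair might interact on a single proof attempt is the loop below. It is a minimal sketch, not MaxProof's implementation: the function names, the `Critique` type, and the string-based toy stand-ins are all hypothetical, and in the released system the three capabilities live inside one merged model rather than in separate functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Critique:
    """Verifier output: a verdict plus the explicit errors it found."""
    accepted: bool
    issues: List[str] = field(default_factory=list)


def prove_with_repair(
    problem: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], Critique],
    repair: Callable[[str, str, Critique], str],
    max_repairs: int = 3,
) -> Optional[str]:
    """Generate a proof, then alternate verification and critique-conditioned repair."""
    proof = generate(problem)
    for _ in range(max_repairs):
        critique = verify(problem, proof)
        if critique.accepted:
            return proof
        # The repair step sees the critique, not just the failed proof.
        proof = repair(problem, proof, critique)
    # Check the last repaired proof; reject rather than return an unverified one.
    return proof if verify(problem, proof).accepted else None


# Toy stand-ins (assumptions for illustration only).
def toy_generate(problem: str) -> str:
    return "step1; GAP; step3"


def toy_verify(problem: str, proof: str) -> Critique:
    issues = ["unjustified step"] if "GAP" in proof else []
    return Critique(accepted=not issues, issues=issues)


def toy_repair(problem: str, proof: str, critique: Critique) -> str:
    return proof.replace("GAP", "step2", 1)
```

Returning `None` when no verified proof is found mirrors the stated goal of lowering false positives: an unverified proof is treated as a failure instead of being passed through.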

During inference, MaxProof recasts the proof derivation process as an evolutionary search. The M3 model is decoupled into four roles: generator, verifier, optimizer, and scorer. The system first constructs a pool of candidate proofs as the population, mutates candidates via local repair patches and exploratory rewrites, and finally selects the best derivation through a tournament mechanism. This evolutionary search mechanism converts the model’s best@K capability on mathematical proofs into more stable pass@1 performance.
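The population, mutation, and tournament steps described above can be sketched as a generic evolutionary loop. Everything in this example is an assumption made for illustration: the toy "proof" is a tuple of booleans marking which steps are valid, the 50/50 split between patching and rewriting is arbitrary, and the elitist elimination rule is one simple choice among many. The four callables loosely correspond to the generator, verifier, optimizer (patch and rewrite), and scorer roles.

```python
import random
from typing import Callable, List, Tuple

Proof = Tuple[bool, ...]  # toy proof: True marks a valid step


def tournament(pool: List[Proof], score: Callable[[Proof], float],
               rng: random.Random) -> Proof:
    """Single elimination: pair candidates off and keep the higher-scored one."""
    contenders = pool[:]
    rng.shuffle(contenders)
    while len(contenders) > 1:
        contenders = [max(contenders[i:i + 2], key=score)
                      for i in range(0, len(contenders), 2)]
    return contenders[0]


def evolve(problem, generate, verify, patch, rewrite, score,
           pop_size: int = 8, generations: int = 4, seed: int = 0) -> Proof:
    rng = random.Random(seed)
    population = [generate(problem, rng) for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for proof in population:
            issues = verify(proof)
            if not issues:
                children.append(proof)                  # verified: keep as is
            elif rng.random() < 0.5:
                children.append(patch(proof, issues))   # local repair patch
            else:
                children.append(rewrite(problem, rng))  # exploratory rewrite
        # Elimination: keep the top-scoring candidates among parents and children.
        population = sorted(population + children, key=score, reverse=True)[:pop_size]
    return tournament(population, score, rng)


# Toy role implementations (hypothetical stand-ins for model calls).
def toy_generate(problem, rng: random.Random) -> Proof:
    return tuple(rng.random() < 0.6 for _ in range(5))


def toy_verify(proof: Proof) -> List[int]:
    return [i for i, ok in enumerate(proof) if not ok]


def toy_patch(proof: Proof, issues: List[int]) -> Proof:
    i = issues[0]  # fix only the first flagged step
    return proof[:i] + (True,) + proof[i + 1:]


def toy_score(proof: Proof) -> float:
    return sum(proof) / len(proof)


if __name__ == "__main__":
    best = evolve("toy problem", toy_generate, toy_verify,
                  toy_patch, toy_generate, toy_score)
    print(best, toy_score(best))
```

Because parents compete with their children during elimination, the best score in the pool never decreases from one generation to the next. A real system would replace the toy scorer with model-based judgments, where pairwise comparisons can be noisy and the tournament does more work.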

RyanLee (@RyanLeeMiniMax): With the MaxProof framework, M3 exceeded the human gold-medal threshold on both sets. In this paper, we go deeper into the technical path behind our progress in mathematical proof: improving the base model, aligning a verifier, building refinement capability, and designing the
