@stingning: We’re releasing a 30B-A3B reasoning model that reaches gold-medal level across both physics and math Olympiad evaluatio…
Summary
Researchers release SU-01, a 30B-A3B reasoning model achieving gold-medal-level performance on physics and math Olympiad problems using a unified scaling recipe for proof search.
Cached at: 05/15/26, 05:06 PM
We’re releasing a 30B-A3B reasoning model that reaches gold-medal level across both physics and math Olympiad evaluations: IPhO directly, and IMO/USAMO with test-time self-verification and refinement.
A simple, unified scaling recipe for proof search.
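The "test-time self-verification and refinement" mentioned above generally means looping generate → check → revise until a verifier accepts the solution or a budget runs out. A minimal sketch of that loop under assumed interfaces (`generate`, `verify`, and `refine` are hypothetical stand-ins, not the paper's actual API):

```python
def solve_with_self_verification(problem, generate, verify, refine, max_rounds=4):
    """Generate a candidate solution, then repeatedly self-check and
    refine it until the verifier accepts or the round budget is spent."""
    candidate = generate(problem)
    for _ in range(max_rounds):
        ok, feedback = verify(problem, candidate)
        if ok:
            return candidate
        candidate = refine(problem, candidate, feedback)
    return candidate  # best effort after exhausting the budget

# Toy stand-ins: "solving" means building the string the verifier wants.
gen = lambda p: "draft"
ver = lambda p, c: (c == "draft!!", "add '!'")
ref = lambda p, c, fb: c + fb[-2]
print(solve_with_self_verification("p", gen, ver, ref))  # → draft!!
```

The key design point is that the verifier's feedback flows back into the next refinement round, rather than simply resampling from scratch.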
https://t.co/yc2ZlLVbD2
Paper page - Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling
Source: https://huggingface.co/papers/2605.13301 Published on May 13
#1 Paper of the day
Abstract
A systematic approach transforms post-trained reasoning models into rigorous olympiad-level solvers through reverse-perplexity curriculum, two-stage reinforcement learning, and test-time scaling, achieving gold-medal performance on mathematical and physics competitions.
Recent progress in reasoning models has substantially advanced long-horizon mathematical and scientific problem solving, with several systems now reaching gold-medal-level performance on International Mathematical Olympiad (IMO) and International Physics Olympiad (IPhO) problems. In this paper, we introduce a simple and unified recipe for converting a post-trained reasoning backbone into a rigorous olympiad-level solver. The recipe first uses a reverse-perplexity curriculum for SFT to instill rigorous proof-search and self-checking behaviors, then scales these behaviors through a two-stage RL pipeline that progresses from RL with verifiable rewards to more delicate proof-level RL, and finally boosts solving performance with test-time scaling. Applying this recipe, we train a 30B-A3B backbone with SFT on around 340K sub-8K-token trajectories followed by 200 RL steps. The resulting model, SU-01, supports stable reasoning on difficult problems with trajectories exceeding 100K tokens, while achieving gold-medal-level performance on mathematical and physical olympiad competitions, including IMO 2025/USAMO 2026 and IPhO 2024/2025. It also demonstrates strong generalization of scientific reasoning to domains beyond mathematics and physics.
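The abstract's "reverse-perplexity curriculum" orders SFT data by how surprising each trajectory is to the base model. The page does not specify the exact scoring or schedule, so this is only one plausible reading, sketched with hypothetical names (`Trajectory`, `reverse_perplexity_curriculum`): train first on the trajectories the backbone finds least familiar.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    text: str
    perplexity: float  # perplexity of this trajectory under the base model

def reverse_perplexity_curriculum(trajectories):
    """Order SFT examples from highest to lowest base-model perplexity,
    i.e. the reverse of an easy-first perplexity curriculum (an assumed
    interpretation; the paper's actual schedule is not given here)."""
    return sorted(trajectories, key=lambda t: t.perplexity, reverse=True)

data = [
    Trajectory("easy algebra derivation", 4.2),
    Trajectory("hard olympiad geometry proof", 19.7),
    Trajectory("medium combinatorics argument", 9.1),
]

for t in reverse_perplexity_curriculum(data):
    print(f"{t.perplexity:5.1f}  {t.text}")
```

In practice the perplexity scores would come from a forward pass of the base model over each trajectory; the sorting step itself is the curriculum.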
Get this paper in your agent:
hf papers read 2605.13301
Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Models citing this paper: 1
#### Simplified-Reasoning/SU-01
Reinforcement Learning • 31B • Updated 1 day ago • 21 • 2
Datasets citing this paper: 0
No dataset links to this paper yet.
Cite arxiv.org/abs/2605.13301 in a dataset README.md to link it from this page.
Spaces citing this paper: 0
No Space links to this paper yet.
Cite arxiv.org/abs/2605.13301 in a Space README.md to link it from this page.
Collections including this paper: 2
Similar Articles
Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling
A paper presenting SU-01, a 30B-A3B reasoning model that achieves gold-medal-level performance on IMO and IPhO problems via reverse-perplexity curriculum, two-stage reinforcement learning, and test-time scaling.
@ClementDelangue: Paper of the day! https://huggingface.co/papers/2605.13301…
A paper introduces a unified recipe (SU-01) that combines reverse-perplexity curriculum, two-stage reinforcement learning, and test-time scaling to achieve gold-medal-level performance on IMO and IPhO problems using a 30B-A3B backbone.
Introducing OpenAI o1
OpenAI released o1, a new series of reasoning-focused AI models that outperform previous models on complex tasks in science, coding, and mathematics. The preview model solved 83% of problems on a qualifying exam for the International Mathematics Olympiad, compared to GPT-4o's 13%, and reached the 89th percentile in competitive coding.
OpenAI o3-mini
OpenAI releases o3-mini, a cost-efficient reasoning model with strong STEM capabilities, available in ChatGPT and API with support for function calling, structured outputs, and three reasoning effort levels. The model matches o1 performance in math and coding while being faster and cheaper, with free plan users gaining access to a reasoning model for the first time.