MIT and the IMO released MathNet, the world’s largest dataset of International Math Olympiad problems and solutions. MathNet is 5x larger than previous datasets and is sourced from over 40 countries across 4 decades.
Summary
MIT and the IMO have released MathNet, a dataset of International Math Olympiad problems and solutions that spans 40 years and more than 40 countries and is 5x larger than previous datasets.
Similar Articles
MIT scientists build the world’s largest collection of Olympiad-level math problems, and open it to everyone
MIT researchers, in collaboration with KAUST and HUMAIN, have released MathNet, the largest open-source dataset of Olympiad-level math problems, containing over 30,000 expert-authored problems from 47 countries.
MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval
MathNet is a large-scale multilingual multimodal benchmark of 30,676 Olympiad-level math problems spanning 47 countries and 17 languages, designed to evaluate mathematical reasoning and retrieval in generative and embedding-based models. Even state-of-the-art models like Gemini and GPT-5 struggle with the benchmark, highlighting significant room for improvement in mathematical AI.
CrowdMath: A Dataset of Crowdsourced Mathematical Research Discussions
Introduces CrowdMath, a dataset of 164 expert-annotated progress chains from the MIT PRIMES–AoPS CrowdMath program, capturing collaborative mathematical problem-solving. Benchmarks six frontier models, finding they achieve 83-88% accuracy on next-post prediction but only 0.42 macro-F1 on post-role classification, highlighting a gap in understanding collaborative progress.
MathAtlas: A Benchmark for Autoformalization in the Wild
MathAtlas is a large-scale benchmark for autoformalization of graduate-level mathematics, containing ~52k theorems and definitions extracted from 103 textbooks, with a mathematical dependency graph of ~178k relations. Experiments show state-of-the-art models achieve at most 9.8% correctness, highlighting the difficulty.
VAMPS: Visual-Assisted Mathematical Problem Solving Benchmark
VAMPS is a new benchmark of 1,168 multimodal bilingual math problems designed to evaluate whether LLMs can benefit from constructing and reasoning over graphs/visualizations. Key finding: direct analytical solving surprisingly outperforms tool-enabled visual solving even on problems where plotting is a natural strategy.