automated-reasoning

#automated-reasoning

RMA: an Agentic System for Research-Level Mathematical Problems

arXiv cs.AI · 2026-05-25

Research Math Agents (RMA) is an agentic framework for automated reasoning on research-level mathematical problems. It achieves state-of-the-art results on the First Proof benchmark, solving 8 of 10 problems and outperforming strong baselines such as GPT-5.2R and Aletheia.

#automated-reasoning

Formal Conjectures: An Open and Evolving Benchmark for Verified Discovery in Mathematics

arXiv cs.AI · 2026-05-14

This paper introduces Formal Conjectures, an evolving benchmark of 2615 mathematical statements formalized in Lean 4. It includes open research conjectures for proof discovery and solved problems for auto-formalization, and is designed to evaluate automated reasoning systems with zero contamination.
