Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

Hugging Face Daily Papers Papers

Summary

Lean Refactor presents a retrieval-augmented agentic framework for multi-objective, controllable, and version-robust refactoring of Lean proofs, achieving significant compression and compilation-time reduction.

We present Lean Refactor, a plug-and-play retrieval-augmented agentic framework for multi-objective, controllable, and version-robust refactoring of Lean proofs. LLM-generated proofs are notoriously correct-but-verbose and brittle across library versions, yet existing refactoring works overlook three practical challenges: 1) Lean refactoring is natively multi-objective (proof length, compilation cost, and version compatibility are often in tension); 2) Lean repositories have fragile compatibility, whereas LLM releases are unaware of Lean/Mathlib versions; 3) Training-based pipelines require repeated fine-tuning with each new LLM release, scaling neither with model churn nor with Lean's release cycle. Lean Refactor steers a frozen agentic LLM with retrievals from a curated database of multi-objective refactoring strategies, each densely annotated with metadata such as supported Lean/Mathlib versions and expected compilation-cost reduction. Experiments show over 70% token-level compression on competition benchmarks, over 20% on research repositories, and up to 60% compilation-time reduction, outperforming prior work and Claude Code. Version-filtered retrieval further improves compression on the target Lean version, and refactored miniF2F proofs exhibit stronger zero-shot version transfer to future Lean releases than their unrefactored counterparts.
Original Article
View Cached Full Text

Cached at: 05/22/26, 02:20 PM

Paper page - Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

Source: https://huggingface.co/papers/2605.20244

Abstract

Lean Refactor presents a retrieval-augmented agentic framework that improves Lean proof refactoring by addressing multi-objective optimization, version compatibility, and scalability challenges through curated strategy databases and version-filtered retrieval.

We present Lean Refactor, a plug-and-playretrieval-augmented agentic frameworkfor multi-objective, controllable, andversion-robust refactoringofLean proofs. LLM-generated proofs are notoriously correct-but-verbose and brittle across library versions, yet existing refactoring works overlook three practical challenges: 1) Lean refactoring is natively multi-objective (proof length, compilation cost, and version compatibility are often in tension); 2) Lean repositories have fragile compatibility, whereas LLM releases are unaware of Lean/Mathlib versions; 3) Training-based pipelines require repeated fine-tuning with each new LLM release, scaling neither with model churn nor with Lean’s release cycle. Lean Refactor steers a frozen agentic LLM with retrievals from a curated database of multi-objective refactoring strategies, each densely annotated with metadata such as supported Lean/Mathlib versionsand expected compilation-cost reduction. Experiments show over 70%token-level compressionon competition benchmarks, over 20% on research repositories, and up to 60% compilation-time reduction, outperforming prior work and Claude Code. Version-filtered retrieval further improves compression on the target Lean version, and refactored miniF2F proofs exhibit strongerzero-shot version transferto future Lean releases than their unrefactored counterparts.

View arXiv pageView PDFProject pageAdd to collection

Get this paper in your agent:

hf papers read 2605\.20244

Don’t have the latest CLI?curl \-LsSf https://hf\.co/cli/install\.sh \| bash

Models citing this paper0

No model linking this paper

Cite arxiv.org/abs/2605.20244 in a model README.md to link it from this page.

Datasets citing this paper0

No dataset linking this paper

Cite arxiv.org/abs/2605.20244 in a dataset README.md to link it from this page.

Spaces citing this paper0

No Space linking this paper

Cite arxiv.org/abs/2605.20244 in a Space README.md to link it from this page.

Collections including this paper0

No Collection including this paper

Add this paper to acollectionto link it from this page.

Similar Articles

Discover and Prove: An Open-source Agentic Framework for Hard Mode Automated Theorem Proving in Lean 4

arXiv cs.CL

This paper introduces Discover and Prove (DAP), an open-source agentic framework for automated theorem proving in Lean 4 that tackles 'Hard Mode' problems where the answer must be discovered independently before formal proof construction. The work releases new Hard Mode benchmark variants and achieves state-of-the-art results while revealing a significant gap between LLM answer accuracy (>80%) and formal prover success (<10%).

OProver: A Unified Framework for Agentic Formal Theorem Proving

Hugging Face Daily Papers

OProver is a unified framework for agentic formal theorem proving in Lean 4 that iteratively improves proof generation through training with verified proofs and compiler feedback, achieving state-of-the-art results on multiple benchmarks.

Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning

Hugging Face Daily Papers

ACTS (Agentic Chain-of-Thought Steering) formulates LLM reasoning control as a Markov decision process where a controller agent adaptively steers a frozen reasoner during inference using reasoning strategies and steering phrases. The approach achieves comparable accuracy to full-thinking models with significant token savings, enabling controllable accuracy-efficiency trade-offs.

R-APS: Compositional Reasoning and In-Context Meta-Learning for Constrained Design via Reflective Adversarial Pareto Search

arXiv cs.AI

R-APS (Reflective Adversarial Pareto Search) is a novel method for constrained design tasks that addresses three structural failures in LLM-based agentic systems—error propagation, robustness evaluation, and knowledge invalidation—through reasoning-mode decomposition across three timescales, requiring no fine-tuning. Evaluated on planar mechanism synthesis, it achieves 3.5x tighter robustness certificates, 46% faster iterations-to-first-admission, and 2.1x Chamfer-distance reduction over baselines.