Tag
Researchers introduce Self-Guided Self-Play (SGS), a self-play algorithm for LLMs that prevents reward hacking by using a Guide role to score synthetic problems. Applied to theorem proving in Lean 4, SGS surpasses RL baselines and allows a 7B-parameter model to outperform a 671B-parameter model.
This paper introduces Formal Conjectures, an evolving benchmark of 2,615 mathematical statements formalized in Lean 4. It includes open research conjectures for proof discovery and solved problems for auto-formalization, and is designed to evaluate automated reasoning systems with zero contamination.
A technical blog post introduces a Lean4-to-TileLang tensor program superoptimizer that automatically generates optimized GPU/TPU kernels and hyperparameter scaling laws, demonstrating performance gains over torch.compile.
The author developed a Lean4-to-TileLang tensor program superoptimizer that automatically generates optimized accelerator kernels and derives hyperparameter scaling laws, achieving a 1.8x speedup on A100 GPUs.
FormalSLT is a Lean 4 library that formally proves finite-sample statistical learning theory results (ERM, VC bounds, Rademacher bounds, PAC-Bayes, etc.) with explicit assumptions and zero sorry statements, providing a machine-checked foundation for ML theory.