Google DeepMind's new paper introduces AlphaProof Nexus, an AI system that combines an LLM with the Lean proof checker to search for formal proofs in constrained mathematical domains. The system solves several previously open problems from the Erdős and OEIS sets, demonstrating a division of labor in which the AI proposes proof candidates and the verifier enforces correctness.
A developer uses LLMs and algebraic reformulation to formally verify, in the Lean proof assistant, a bug fix for the 2023 UK air traffic control meltdown, and finds that LLMs are good at grinding through proofs but poor at writing specifications.
Vitalik Buterin shares an optimistic take on AI-assisted formal verification as a path to secure, trustless code, linking to his blog post explaining the basics of formal verification using Lean.
The author argues that AI will not necessarily speed up delivery, because bottlenecks often lie upstream in unclear requirements rather than in development speed.
MathAtlas is a large-scale benchmark for autoformalization of graduate-level mathematics, containing ~52k theorems and definitions extracted from 103 textbooks, along with a mathematical dependency graph of ~178k relations. In experiments, state-of-the-art models achieve at most 9.8% correctness, underscoring how hard autoformalization at this level remains.
Signal Shot is a major formal verification initiative to verify the Signal protocol and its Rust implementation using Lean, combining advances in Rust-to-Lean translation (Aeneas), mathematical foundations (Mathlib/CSLib), automated tactics (grind/SymM), and AI-assisted formalization. This represents a significant test of whether Lean can scale from pure mathematics to deployed real-world software systems.
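For readers new to Lean, here is a minimal sketch of the specification-versus-proof split that several of the items above turn on. It is not taken from any of the linked projects: the function `clamp` and the theorem `clamp_le_hi` are hypothetical names invented for illustration, and the sketch assumes a recent Lean 4 toolchain in which the `omega` tactic is available without extra imports.

```lean
-- Hypothetical toy function, invented for illustration only:
-- restrict x to the range [lo, hi].
def clamp (lo hi x : Nat) : Nat :=
  if x < lo then lo else if hi < x then hi else x

-- The specification: a statement a human has to get right.
-- Provided lo ≤ hi, the result never exceeds the upper bound.
theorem clamp_le_hi (lo hi x : Nat) (h : lo ≤ hi) :
    clamp lo hi x ≤ hi := by
  -- The proof: the mechanical part, checked by Lean's kernel.
  unfold clamp
  split
  · omega          -- case x < lo: the goal is lo ≤ hi, which follows from h
  · split <;> omega -- remaining cases: hi ≤ hi, or x ≤ hi given ¬ (hi < x)
```

The theorem statement is where the judgment lies: if it says the wrong thing, a fully checked proof still verifies the wrong property. The tactic script below it is the kind of work that can be searched for or generated, because Lean rejects any candidate that does not actually close the goal.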