Tag
This paper introduces Discover and Prove (DAP), an open-source agentic framework for automated theorem proving in Lean 4 that tackles 'Hard Mode' problems where the answer must be discovered independently before formal proof construction. The work releases new Hard Mode benchmark variants and achieves state-of-the-art results while revealing a significant gap between LLM answer accuracy (>80%) and formal prover success (<10%).
OpenAI presents GPT-f, a transformer-based automated theorem prover for the Metamath formalization language, which discovered new short proofs accepted into the main Metamath library — marking the first time a deep-learning system contributed proofs adopted by a formal mathematics community.