Generative language modeling for automated theorem proving

OpenAI Blog Papers

Summary

OpenAI presents GPT-f, a transformer-based automated theorem prover for the Metamath formalization language, which discovered new short proofs accepted into the main Metamath library — marking the first time a deep-learning system contributed proofs adopted by a formal mathematics community.


Cached at: 04/20/26, 02:55 PM

# Generative language modeling for automated theorem proving

Source: [https://openai.com/index/generative-language-modeling-for-automated-theorem-proving/](https://openai.com/index/generative-language-modeling-for-automated-theorem-proving/)

OpenAI

## Abstract

We explore the application of transformer-based language models to automated theorem proving. This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans -- the generation of original mathematical terms -- might be addressable via generation from language models. We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance. GPT-f found new short proofs that were accepted into the main Metamath library, which is, to our knowledge, the first time a deep-learning-based system has contributed proofs that were adopted by a formal mathematics community.
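The abstract describes a prover that uses a language model to propose proof steps, which are then explored by a search procedure. The sketch below is a minimal, hypothetical illustration of that idea: a best-first search that expands goals in order of cumulative model log-probability. The `PROPOSALS` table stands in for the transformer (which, in GPT-f, generates candidate Metamath proof steps with associated scores); the goal strings and function names here are illustrative, not taken from the paper.

```python
import heapq

# Hypothetical stand-in for the language model: given a goal statement,
# return (proof_step, log_prob) candidates. GPT-f uses a transformer to
# generate these; a lookup table keeps this sketch self-contained.
PROPOSALS = {
    "|- ( 2 + 2 ) = 4": [("|- ( 2 + 2 ) = ( 2 + 2 )", -0.1),
                         ("|- 4 = 4", -0.3)],
    "|- ( 2 + 2 ) = ( 2 + 2 )": [("|- 4 = 4", -0.2)],
    "|- 4 = 4": [("QED", -0.05)],
}

def best_first_proof_search(goal, max_expansions=10):
    """Expand goals in order of cumulative model log-probability.

    Returns the list of goal statements on the successful path, or
    None if the expansion budget runs out before a proof is found.
    """
    # Priority queue of (negated cumulative log-prob, goal, path);
    # heapq pops the entry with the highest cumulative probability.
    frontier = [(0.0, goal, [])]
    expansions = 0
    while frontier and expansions < max_expansions:
        neg_lp, current, path = heapq.heappop(frontier)
        expansions += 1
        for step, lp in PROPOSALS.get(current, []):
            if step == "QED":
                return path + [current]  # proof closed at this goal
            heapq.heappush(frontier, (neg_lp - lp, step, path + [current]))
    return None  # search budget exhausted without a proof
```

Under these toy scores, the search prefers the direct two-step path over the longer detour, because the detour accumulates a lower total log-probability before reaching the same subgoal.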

Similar Articles

Better language models and their implications

OpenAI Blog

OpenAI introduces GPT-2, a 1.5 billion parameter transformer-based language model trained on 40GB of internet text that achieves state-of-the-art performance on language modeling benchmarks and demonstrates zero-shot capabilities in reading comprehension, translation, question answering, and summarization. Due to safety concerns, only a smaller model and technical paper are released publicly rather than the full trained model.

Advancing science and math with GPT-5.2

OpenAI Blog

OpenAI releases GPT-5.2, featuring GPT-5.2 Pro and GPT-5.2 Thinking variants optimized for scientific and mathematical work. The models achieve state-of-the-art performance on benchmarks like GPQA Diamond (93.2%) and FrontierMath (40.3%), demonstrating improved reasoning capabilities designed to accelerate scientific research across physics, chemistry, biology, and mathematics.

First look at GPT-5

OpenAI Blog

OpenAI provides a first look at GPT-5, representing a major advancement in large language models with potential paradigm-shifting capabilities.

Early experiments in accelerating science with GPT-5

OpenAI Blog

OpenAI releases a research paper documenting early experiments where GPT-5 accelerates scientific discovery across mathematics, physics, biology, and other fields, including cases where the model helped identify disease mechanisms, solve open problems, and improve optimization algorithms in collaboration with leading universities and national laboratories.

OpenAI’s technology explained

OpenAI Blog

OpenAI publishes an explainer on its core technology, detailing how language models like GPT-4 are developed through pre-training (learning from vast text data) and post-training (alignment with human values and safety practices). The article emphasizes OpenAI's nonprofit mission structure and explains the distinction between raw base models and refined, usable versions.