OpenAI o1-mini

OpenAI Blog Models

Summary

OpenAI releases o1-mini, a cost-efficient reasoning model that matches o1 performance on STEM tasks like math and coding while being 80% cheaper. The model is optimized for reasoning-heavy applications and is now available to API users and ChatGPT Plus/Team/Enterprise/Edu subscribers.

Advancing cost-efficient reasoning
Original Article
View Cached Full Text

Cached at: 04/20/26, 02:44 PM

# OpenAI o1-mini Source: [https://openai.com/index/openai-o1-mini-advancing-cost-efficient-reasoning/](https://openai.com/index/openai-o1-mini-advancing-cost-efficient-reasoning/) OpenAIAdvancing cost\-efficient reasoning\. We're releasing OpenAI o1‑mini, a cost\-efficient reasoning model\. o1‑mini excels at STEM, especially math and coding—nearly matching the performance of[OpenAI o1](https://openai.com/index/introducing-openai-o1-preview/)on evaluation benchmarks such as AIME and Codeforces\. We expect o1‑mini will be a faster, cost\-effective model for applications that require reasoning without broad world knowledge\. Today, we are launching o1‑mini to[tier 5 API users⁠\(opens in a new window\)](https://platform.openai.com/docs/guides/rate-limits/usage-tiers)at a cost that is 80% cheaper than OpenAI o1‑preview\. ChatGPT Plus, Team, Enterprise, and Edu users can use o1‑mini as an alternative to o1‑preview, with higher rate limits and lower latency \(see[Model Speed⁠](https://openai.com/index/openai-o1-mini-advancing-cost-efficient-reasoning/#model-speed)\)\. Large language models such as o1 are pre\-trained on vast text datasets\. While these high\-capacity models have broad world knowledge, they can be expensive and slow for real\-world applications\. In contrast, o1‑mini is a smaller model optimized for STEM reasoning during pretraining\. After training with the same high\-compute reinforcement learning \(RL\) pipeline as o1, o1‑mini achieves comparable performance on many useful reasoning tasks, while being significantly more cost efficient\. When evaluated on benchmarks requiring intelligence and reasoning, o1‑mini performs well compared to o1‑preview and o1\. However, o1‑mini performs worse on tasks requiring non\-STEM factual knowledge \(see[Limitations⁠](https://openai.com/index/openai-o1-mini-advancing-cost-efficient-reasoning/#limitations-and-whats-next)\)\. ##### Math Performance vs Inference Cost **Mathematics:**In the high school AIME math competition, o1‑mini \(70\.0%\) is competitive with o1 \(74\.4%\)–while being significantly cheaper–and outperforms o1‑preview \(44\.6%\)\. o1‑mini’s score \(about 11/15 questions\) places it in approximately the top 500 US high\-school students\. **Coding:**On the Codeforces competition website, o1‑mini achieves 1650 Elo, which is again competitive with o1 \(1673\) and higher than o1‑preview \(1258\)\. This Elo score puts the model at approximately the 86th percentile of programmers who compete on the Codeforces platform\. o1‑mini also performs well on the HumanEval coding benchmark and high\-school level cybersecurity capture the flag challenges \(CTFs\)\. ##### Codeforces ##### HumanEval ##### Cybersecurity CTFs 20\.0%43\.0%28\.7%Accuracy \(Pass@12\) **STEM:**On some academic benchmarks requiring reasoning, such as GPQA \(science\) and MATH\-500, o1‑mini outperforms GPT‑4o\. o1‑mini does not perform as well as GPT‑4o on tasks such as MMLU and lags behind o1‑preview on GPQA due to its lack of broad world knowledge\. ##### MMLU ###### 0\-shot CoT ##### GPQA ###### Diamond, 0\-shot CoT ##### MATH\-500 ###### 0\-shot CoT **Human preference evaluation:**We had human raters compare o1‑mini to GPT‑4o on challenging, open\-ended prompts in various domains, using the same methodology as our[o1‑preview vs GPT‑4o comparison](https://openai.com/index/learning-to-reason-with-llms/)\. Similar to o1‑preview, o1‑mini is preferred to GPT‑4o in reasoning\-heavy domains, but is not preferred to GPT‑4o in language\-focused domains\. ##### Human preference evaluation vs chatgpt\-4o\-latest DomainWin Rate vs GPT\-4o \(%\) As a concrete example, we compared responses from GPT‑4o, o1‑mini, and o1‑preview on a word reasoning question\. While GPT‑4o did not answer correctly, both o1‑mini and o1‑preview did, and o1‑mini reached the answer around 3\-5x faster\. o1‑mini is trained using the same alignment and safety techniques as o1‑preview\. The model has 59% higher jailbreak robustness on an internal version of the StrongREJECT dataset compared to GPT‑4o\. Before deployment, we carefully assessed the safety risks of o1‑mini using the same approach to preparedness, external red\-teaming, and safety evaluations as o1‑preview\. We are publishing the detailed results from these evaluations in the accompanying[system card](https://openai.com/index/openai-o1-system-card/)\. Due to its specialization on STEM reasoning capabilities, o1‑mini’s factual knowledge on non\-STEM topics such as dates, biographies, and trivia is comparable to small LLMs such as GPT‑4o mini\. We will improve these limitations in future versions, as well as experiment with extending the model to other modalities and specialities outside of STEM\.

Similar Articles

OpenAI o3-mini

OpenAI Blog

OpenAI releases o3-mini, a cost-efficient reasoning model with strong STEM capabilities, available in ChatGPT and API with support for function calling, structured outputs, and three reasoning effort levels. The model matches o1 performance in math and coding while being faster and cheaper, with free plan users gaining access to a reasoning model for the first time.

Introducing OpenAI o3 and o4-mini

OpenAI Blog

OpenAI releases o3 and o4-mini, its latest reasoning models that can agentically access and combine all ChatGPT tools (web search, code execution, image analysis, image generation). o3 achieves state-of-the-art performance on coding, math, and science benchmarks with 20% fewer major errors than o1, while o4-mini offers efficient reasoning optimized for cost and speed.

Introducing OpenAI o1

OpenAI Blog

OpenAI released o1, a new series of reasoning-focused AI models that outperform previous models on complex tasks in science, coding, and mathematics. The preview model solved 83% of IMO problems compared to GPT-4o's 13%, and reached the 89th percentile in competitive coding.

Economics and reasoning with OpenAI o1

OpenAI Blog

OpenAI released the o1 model series, designed with extended reasoning capabilities to tackle complex problems in science, coding, and math by spending more time thinking before responding.

OpenAI o3 and o4-mini System Card

OpenAI Blog

OpenAI released system cards for o3 and o4-mini models, which feature advanced reasoning capabilities combined with tool integration (web browsing, Python, image analysis, etc.) and are evaluated under OpenAI's Preparedness Framework v2 for safety in biological, cybersecurity, and AI self-improvement domains.