Fine-tune a cost-efficient model with the outputs of a large frontier model, all on the OpenAI platform
# Model Distillation in the API
Source: [https://openai.com/index/api-model-distillation/](https://openai.com/index/api-model-distillation/)
We’re introducing a new Model Distillation offering to provide developers with an integrated workflow to manage the entire distillation pipeline directly within the OpenAI platform. This lets developers easily use the outputs of frontier models like o1‑preview and GPT‑4o to fine-tune and improve the performance of more cost-efficient models like GPT‑4o mini.
Model distillation involves fine-tuning smaller, cost-efficient models using outputs from more capable models, allowing them to match the performance of advanced models on specific tasks at a much lower cost. Until now, distillation has been a multi-step, error-prone process that required developers to manually orchestrate multiple operations across disconnected tools, from generating datasets to fine-tuning models and measuring performance improvements. Since distillation is inherently iterative, developers needed to repeatedly run each step, adding significant effort and complexity.
Our new Model Distillation suite includes:
- [**Stored Completions**](http://platform.openai.com/docs/api-reference/chat/create#chat-create-store): Developers can now easily generate datasets for distillation by automatically capturing and storing the input-output pairs generated by one of our models, like GPT‑4o or o1‑preview, through our API. With Stored Completions, you can easily build datasets with your production data to evaluate and fine-tune models. Developers can review [this integration guide](http://platform.openai.com/docs/guides/distillation) to learn how to opt in to storing completions.
- [**Evals**](http://platform.openai.com/docs/guides/evals) (beta): Developers can now create and run custom evaluations on our platform to measure model performance on specific tasks. Instead of manually creating evaluation scripts and integrating disparate logging tools, Evals provides an integrated way to measure model performance. You can either use data from Stored Completions or upload existing datasets to set up your evaluations. Evals can also be used independently of fine-tuning to quantitatively evaluate model performance for your use cases.
- [**Fine-tuning**](https://platform.openai.com/docs/guides/fine-tuning): Stored Completions and Evals are fully integrated with our existing fine-tuning offering. This means that developers can use datasets created with Stored Completions in their fine-tuning jobs and run evaluations on fine-tuned models using Evals, all within our platform.