Fine-tune a cost-efficient model with the outputs of a large frontier model, all on the OpenAI platform
# Model Distillation in the API
Source: [https://openai.com/index/api-model-distillation/](https://openai.com/index/api-model-distillation/)
We’re introducing a new Model Distillation offering to provide developers with an integrated workflow to manage the entire distillation pipeline directly within the OpenAI platform. This lets developers easily use the outputs of frontier models like o1‑preview and GPT‑4o to fine-tune and improve the performance of more cost-efficient models like GPT‑4o mini.
Model distillation involves fine-tuning smaller, cost-efficient models using outputs from more capable models, allowing them to match the performance of advanced models on specific tasks at a much lower cost. Until now, distillation has been a multi-step, error-prone process that required developers to manually orchestrate multiple operations across disconnected tools, from generating datasets to fine-tuning models and measuring performance improvements. Since distillation is inherently iterative, developers needed to repeatedly run each step, adding significant effort and complexity.
Our new Model Distillation suite includes:
- [**Stored Completions**](http://platform.openai.com/docs/api-reference/chat/create#chat-create-store): Developers can now easily generate datasets for distillation by automatically capturing and storing the input-output pairs generated by one of our models, like GPT‑4o or o1‑preview, through our API. With Stored Completions, you can easily build datasets with your production data to evaluate and fine-tune models. Developers can review [this integration guide](http://platform.openai.com/docs/guides/distillation) to learn how to opt in to storing completions.
- [**Evals**](http://platform.openai.com/docs/guides/evals) (beta): Developers can now create and run custom evaluations on our platform to measure model performance on specific tasks. Instead of manually creating evaluation scripts and integrating disparate logging tools, Evals provides an integrated way to measure model performance. You can either use data from Stored Completions or upload existing datasets to set up your evaluations. Evals can also be used independently of fine-tuning to quantitatively evaluate model performance for your use cases.
- [**Fine-tuning**](https://platform.openai.com/docs/guides/fine-tuning): Stored Completions and Evals are fully integrated with our existing fine-tuning offering. This means that developers can use datasets created with Stored Completions in their fine-tuning jobs and run evaluations on fine-tuned models using Evals, all within our platform.