GPT-2: 1.5B release


Summary

OpenAI releases GPT-2 1.5B model with analysis of human perception of credibility, potential for misuse through fine-tuning on extremist ideologies, and challenges in detecting synthetic text. Detection models achieve ~95% accuracy but require complementary approaches for practical deployment.

As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models. While there have been larger language models released since August, we’ve continued with our original staged release plan in order to provide the community with a test case of a full staged release process. We hope that this test case will be useful to developers of future powerful models, and we’re actively continuing the conversation with the AI community on responsible publication.

Source: [https://openai.com/index/gpt-2-1-5b-release/](https://openai.com/index/gpt-2-1-5b-release/)

**1. Humans find GPT‑2 outputs convincing.** Our partners at Cornell University surveyed people to assign GPT‑2 text a credibility score across model sizes. People gave the 1.5B model a “credibility score” of 6.91 out of 10. This is marginally greater than outputs from the 774M model (6.72) and significantly above the medium 355M model (6.07). These results make us more inclined to release the 1.5B model, as the incremental increase in human-perceived credibility relative to 774M seems low.

**2. GPT‑2 can be fine-tuned for misuse.** Our partners at the Middlebury Institute of International Studies’ Center on Terrorism, Extremism, and Counterterrorism (CTEC) found that extremist groups can misuse GPT‑2, specifically by fine-tuning GPT‑2 models on four ideological positions: white supremacy, Marxism, jihadist Islamism, and anarchism. CTEC demonstrated that it’s possible to create models that generate synthetic propaganda for these ideologies. They also showed that, despite low detection accuracy on synthetic outputs, ML-based detection methods can give experts reasonable suspicion that an actor is generating synthetic text.

**3. Detection is challenging.** We expect that content-based detection of synthetic text is a long-term challenge. To test whether machine learning approaches may help today, we conducted in-house detection research and developed a [detection model](https://github.com/openai/gpt-2-output-dataset) that has detection rates of ~95% for detecting 1.5B GPT‑2-generated text.[A](https://openai.com/index/gpt-2-1-5b-release/#citation-bottom-A) We believe this accuracy is not high enough for standalone detection and needs to be paired with metadata-based approaches, human judgment, and public education to be more effective.
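To make the idea of content-based detection concrete, here is a minimal, stdlib-only sketch of one classic approach: score text by its average per-token log-probability under a language model and threshold that score, since model-generated text tends to look unusually likely to the model that produced it. This is a toy illustration under a unigram "model"; the function names, the smoothing, and the threshold are all assumptions for this sketch, and the actual released detector is a learned classifier, not this heuristic.

```python
from collections import Counter
import math

def mean_log_prob(text, unigram_probs, vocab_size=50257):
    """Average per-token log-probability under a toy unigram model.

    Synthetic text often scores higher (less surprising) under the
    generating model than human-written text does, which is the signal
    that likelihood-based detectors exploit.
    """
    tokens = text.lower().split()
    total = 0.0
    for tok in tokens:
        # Fall back to a uniform floor probability for unseen tokens.
        total += math.log(unigram_probs.get(tok, 1.0 / vocab_size))
    return total / max(len(tokens), 1)

def detect_synthetic(text, unigram_probs, threshold=-8.0):
    """Flag text whose average log-prob exceeds a tuned threshold.

    The threshold here is a placeholder; in practice it would be tuned
    on held-out human and model-generated samples.
    """
    return mean_log_prob(text, unigram_probs) > threshold
```

In practice the text would be tokenized with the model's own tokenizer and scored with the full model's conditional probabilities; the unigram stand-in just keeps the sketch self-contained.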
We are releasing this model to aid the study of synthetic-text detection, although this does let adversaries with access better evade detection. While we found that detection accuracy depends heavily on the sampling methods used in training and testing, we also found detection to be more reliable when training across a range of sampling techniques. As seen in the figure below, we observed that larger models’ outputs are more difficult to classify, but that training on larger models’ outputs makes detection results more accurate and robust. We expect this trend to continue, and that detection will become more challenging as model size increases.
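The sampling methods mentioned above (e.g. temperature scaling and top-k truncation) change the statistics of generated text, which is why a detector trained on one sampling regime can fail on another. The following is a self-contained sketch of those two common sampling knobs over a toy logit vector; the function name and defaults are illustrative, not code from the release.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None):
    """Sample an index from raw logits with temperature and optional
    top-k truncation -- the kinds of decoding choices that shift the
    statistics of generated text and thus affect detector accuracy."""
    if top_k is not None:
        # Keep only the k highest-scoring candidate tokens.
        kept = sorted(range(len(logits)),
                      key=lambda i: logits[i], reverse=True)[:top_k]
    else:
        kept = list(range(len(logits)))
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    scaled = [logits[i] / temperature for i in kept]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the truncated, renormalized distribution.
    r = random.random()
    acc = 0.0
    for idx, p in zip(kept, probs):
        acc += p
        if r < acc:
            return idx
    return kept[-1]
```

With `top_k=1` this reduces to greedy decoding, while large `top_k` and high temperature approach pure sampling; a detector trained only on one of these regimes sees systematically different text than the other produces.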
