GPT-2: 1.5B release


Summary

OpenAI releases GPT-2 1.5B model with analysis of human perception of credibility, potential for misuse through fine-tuning on extremist ideologies, and challenges in detecting synthetic text. Detection models achieve ~95% accuracy but require complementary approaches for practical deployment.

As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models. While there have been larger language models released since August, we’ve continued with our original staged release plan in order to provide the community with a test case of a full staged release process. We hope that this test case will be useful to developers of future powerful models, and we’re actively continuing the conversation with the AI community on responsible publication.

Source: [https://openai.com/index/gpt-2-1-5b-release/](https://openai.com/index/gpt-2-1-5b-release/)

**1. Humans find GPT‑2 outputs convincing.** Our partners at Cornell University surveyed people to assign GPT‑2 text a credibility score across model sizes. People gave the 1.5B model a “credibility score” of 6.91 out of 10. This is marginally greater than outputs from the 774M model (6.72) and significantly above the medium 355M model (6.07). These results make us more inclined to release the 1.5B model, as the incremental increase in human-perceived credibility relative to 774M seems low.

**2. GPT‑2 can be fine-tuned for misuse.** Our partners at the Middlebury Institute of International Studies’ Center on Terrorism, Extremism, and Counterterrorism (CTEC) found that extremist groups can misuse GPT‑2, specifically by fine-tuning GPT‑2 models on four ideological positions: white supremacy, Marxism, jihadist Islamism, and anarchism. CTEC demonstrated that it’s possible to create models that generate synthetic propaganda for these ideologies. They also showed that, despite low detection accuracy on synthetic outputs, ML-based detection methods can give experts reasonable suspicion that an actor is generating synthetic text.

**3. Detection is challenging.** We expect that content-based detection of synthetic text is a long-term challenge. To test whether machine learning approaches may help today, we conducted in-house detection research and developed a [detection model](https://github.com/openai/gpt-2-output-dataset) that has detection rates of ~95% for detecting 1.5B GPT‑2-generated text.[A](https://openai.com/index/gpt-2-1-5b-release/#citation-bottom-A) We believe this accuracy is not high enough for standalone detection and needs to be paired with metadata-based approaches, human judgment, and public education to be more effective.
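To make the idea of content-based detection concrete, here is a minimal, stdlib-only sketch of one classic approach: score text by its average per-token log-probability under a language model and threshold that score, since model-generated text tends to look unusually likely to the model that produced it. This is a toy illustration under a unigram "model"; the function names, the smoothing, and the threshold are all assumptions for this sketch, and the actual released detector is a learned classifier, not this heuristic.

```python
from collections import Counter
import math

def mean_log_prob(text, unigram_probs, vocab_size=50257):
    """Average per-token log-probability under a toy unigram model.

    Synthetic text often scores higher (less surprising) under the
    generating model than human-written text does, which is the signal
    that likelihood-based detectors exploit.
    """
    tokens = text.lower().split()
    total = 0.0
    for tok in tokens:
        # Fall back to a uniform floor probability for unseen tokens.
        total += math.log(unigram_probs.get(tok, 1.0 / vocab_size))
    return total / max(len(tokens), 1)

def detect_synthetic(text, unigram_probs, threshold=-8.0):
    """Flag text whose average log-prob exceeds a tuned threshold.

    The threshold here is a placeholder; in practice it would be tuned
    on held-out human and model-generated samples.
    """
    return mean_log_prob(text, unigram_probs) > threshold
```

In practice the text would be tokenized with the model's own tokenizer and scored with the full model's conditional probabilities; the unigram stand-in just keeps the sketch self-contained.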
We are releasing this model to aid the study of synthetic-text detection, although this does let adversaries with access better evade detection. While we found that detection accuracy depends heavily on the sampling methods used in training and testing, we also found detection to be more reliable when training across a range of sampling techniques. As seen in the figure below, we observed that larger models’ outputs are more difficult to classify, but that training on larger models’ outputs makes detection results more accurate and robust. We expect this trend to continue, and that detection will become more challenging as model size increases.
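The sampling methods mentioned above (e.g. temperature scaling and top-k truncation) change the statistics of generated text, which is why a detector trained on one sampling regime can fail on another. The following is a self-contained sketch of those two common sampling knobs over a toy logit vector; the function name and defaults are illustrative, not code from the release.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None):
    """Sample an index from raw logits with temperature and optional
    top-k truncation -- the kinds of decoding choices that shift the
    statistics of generated text and thus affect detector accuracy."""
    if top_k is not None:
        # Keep only the k highest-scoring candidate tokens.
        kept = sorted(range(len(logits)),
                      key=lambda i: logits[i], reverse=True)[:top_k]
    else:
        kept = list(range(len(logits)))
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    scaled = [logits[i] / temperature for i in kept]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the truncated, renormalized distribution.
    r = random.random()
    acc = 0.0
    for idx, p in zip(kept, probs):
        acc += p
        if r < acc:
            return idx
    return kept[-1]
```

With `top_k=1` this reduces to greedy decoding, while large `top_k` and high temperature approach pure sampling; a detector trained only on one of these regimes sees systematically different text than the other produces.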
