Upgrading the Moderation API with our new multimodal moderation model

OpenAI Blog

Summary

OpenAI is launching `omni-moderation-latest`, a new multimodal moderation model built on GPT-4o that supports both text and image inputs, adds new harm categories, and significantly improves accuracy across 40 languages. The updated model is free to use via the Moderation API for all developers.

We’re introducing a new model built on GPT-4o that is more accurate at detecting harmful text and images, enabling developers to build more robust moderation systems.

Source: [https://openai.com/index/upgrading-the-moderation-api-with-our-new-multimodal-moderation-model/](https://openai.com/index/upgrading-the-moderation-api-with-our-new-multimodal-moderation-model/)

Today we are introducing a new moderation model, `omni-moderation-latest`, in the [Moderation API](https://platform.openai.com/docs/guides/moderation). Based on [GPT-4o](https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/), the new model supports both text and image inputs and is more accurate than our previous model, especially in non-English languages. Like the previous version, this model uses OpenAI's GPT-based classifiers to assess whether content should be flagged across categories such as hate, violence, and self-harm, while also adding the ability to detect additional harm categories. It also provides more granular control over moderation decisions by calibrating probability scores to reflect the likelihood that content matches the detected category. The new moderation model is free to use for all developers through the Moderation API.

Since we first [launched](https://openai.com/index/new-and-improved-content-moderation-tooling/) the Moderation API in 2022, the volume and variety of content that automated moderation systems need to handle have increased, especially as more AI apps have reached massive scale in production. We hope today's upgrades help more developers benefit from the latest research and investments in our safety systems. Companies across various sectors, from social media platforms and productivity tools to generative AI platforms, are using the Moderation API to build safer products for their users.
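As a minimal sketch of how a developer might assemble a multimodal request for this endpoint: the model name and the `text`/`image_url` input shapes follow the Moderation API documentation, while the example strings and the image URL are placeholders. The HTTP call itself (a POST to `https://api.openai.com/v1/moderations` with an `Authorization: Bearer <API key>` header) is omitted here.

```python
import json


def build_moderation_request(text: str, image_url: str) -> str:
    """Build a JSON body for the Moderation API mixing text and image input.

    The input array mirrors the documented shape for omni-moderation-latest:
    text parts use {"type": "text"}; image parts use {"type": "image_url"}
    and may point at a public URL or a base64 data URL.
    """
    body = {
        "model": "omni-moderation-latest",
        "input": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
    return json.dumps(body)


# Placeholder content for illustration only.
payload = build_moderation_request(
    "is this image appropriate?",
    "https://example.com/check_image.png",
)
print(payload)
```

Because the body is plain JSON, the same payload works from any HTTP client; the official SDKs wrap this in a `moderations.create` call.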
For instance, Grammarly uses the Moderation API as part of the safety guardrails in its AI communications assistance to ensure its products' outputs are safe and fair. Similarly, ElevenLabs uses the Moderation API along with in-house solutions to scan content generated by its audio AI products, preventing and flagging outputs that violate its policies.

The updated moderation model includes a number of major improvements:

- **Multimodal harm classification across six categories:** the new model can evaluate the likelihood that an image, in isolation or in conjunction with text, contains harmful content. This is supported today for the following categories: violence (`violence` and `violence/graphic`), self-harm (`self-harm`, `self-harm/intent`, and `self-harm/instructions`), and sexual (`sexual` but not `sexual/minors`). The remaining categories are currently text-only, and we are working to expand multimodal support to more categories in the future.
- **Two new text-only harm categories:** the new model can detect harm in two additional categories compared to our previous models: `illicit`, which covers instructions or advice on how to commit wrongdoing (a phrase like "how to shoplift", for example), and `illicit/violent`, which covers the same for wrongdoing that also involves violence.
- **More accurate scores, especially for non-English content:** in a test across 40 languages, the new model improved 42% over the previous model on our internal multilingual evaluation and improved in 98% of the languages tested. For low-resource languages like Khmer or Swati, it improved 70%, and we saw the biggest improvements in Telugu (6.4x), Bengali (5.6x), and Marathi (4.6x). While the previous model had limited support for non-English languages, the new model's performance in Spanish, German, Italian, Polish, Vietnamese, Portuguese, French, Chinese, Indonesian, and English all exceeds even the previous model's English performance.
- **Calibrated scores:** the new model's scores now more accurately represent the probability that a piece of content violates the relevant policies, and will be significantly more consistent across future moderation models.

AI content moderation systems help enforce platform policies and ease the workload on human moderators, crucially sustaining the health of digital platforms. That's why, just like our [previous model](https://openai.com/index/new-and-improved-content-moderation-tooling/), we're making the new moderation model free to use for all developers through the Moderation API, with rate limits depending on usage tier. To get started, see our [Moderation API guide](https://platform.openai.com/docs/guides/moderation).
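Because the scores are calibrated probabilities, a platform can apply its own per-category thresholds rather than relying only on the boolean `flagged` field. A small sketch of that idea: the score values below are invented stand-ins for the `category_scores` object of a moderation response, and the thresholds are chosen purely for illustration.

```python
# Hypothetical category_scores from a moderation response (values invented).
scores = {
    "violence": 0.91,
    "self-harm": 0.02,
    "sexual": 0.05,
    "illicit": 0.40,
}

# Per-category thresholds a platform might tune to its own policies;
# stricter policies use lower thresholds (e.g. self-harm here).
thresholds = {
    "violence": 0.50,
    "self-harm": 0.20,
    "sexual": 0.50,
    "illicit": 0.70,
}


def violating_categories(scores: dict, thresholds: dict) -> set:
    """Return the categories whose calibrated score meets the threshold."""
    return {cat for cat, s in scores.items() if s >= thresholds.get(cat, 0.5)}


print(violating_categories(scores, thresholds))  # → {'violence'}
```

Calibration is what makes this pattern stable: a threshold tuned today should keep roughly the same meaning across future versions of the moderation model.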
