@sama: people are really starting to use voice to interact with AI, especially when they have a lot of context to dump. GPT-Re…

Summary

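Sam Altman announces that GPT-Realtime-2 is now available in the API and calls it a pretty big step forward. He notes that people increasingly use voice to interact with AI, especially when they have a lot of context to convey, and adds that improvements to voice in chat are in progress.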

people are really starting to use voice to interact with AI, especially when they have a lot of context to dump. GPT-Realtime-2 comes to the API today; it is a pretty big step forward. (we are working on improvements to voice in chat.)

Cached at: 05/08/26, 10:01 AM

Similar Articles

Introducing the Realtime API

OpenAI Blog

OpenAI introduces the Realtime API, enabling developers to build low-latency multimodal speech-to-speech conversational experiences with natural voice interactions powered by GPT-4o. The API supports six preset voices and simplifies development by eliminating the need to integrate multiple models.

Introducing gpt-realtime and Realtime API updates

OpenAI Blog

OpenAI is making the Realtime API generally available with a new advanced speech-to-speech model called gpt-realtime, featuring improved instruction following, tool calling, and natural speech quality. New capabilities include MCP server support, image inputs, SIP phone calling, and two new voices (Cedar and Marin).

Advancing voice intelligence with new models in the API

OpenAI Blog

OpenAI has announced three new voice models in its API: GPT-Realtime-2 with advanced reasoning, GPT-Realtime-Translate for live multilingual translation, and GPT-Realtime-Whisper for streaming transcription, aiming to enable more natural and action-oriented voice applications.

@seclink: OpenAI Launches GPT-Realtime-2, Its Most Intelligent Voice Model to Date. The model features GPT-5-level reasoning, a 128,000 token context window, and supports adjusting 'effort level' for more natural conversation. It can pair with GPT-R…

X AI KOLs Following

OpenAI released the GPT-Realtime-2 voice model, featuring GPT-5-level reasoning capabilities and a 128,000 token context window. It supports real-time translation from over 70 input languages to 13 output languages, achieving 96.6% accuracy on the Big Bench Audio Intelligence benchmark. Greg Brockman called it a milestone in voice translation.