@sama: people are really starting to use voice to interact with AI, especially when they have a lot of context to dump. GPT-Re…
Summary
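Sam Altman announces that GPT-Realtime-2 is available in the API as of today and calls it a pretty big step forward for voice interaction with AI, which he says people increasingly use when they have a lot of context to convey. He adds that improvements to voice in chat are in the works.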
Cached Full Text
Cached at: 05/08/26, 10:01 AM
people are really starting to use voice to interact with AI, especially when they have a lot of context to dump.
GPT-Realtime-2 comes to the API today; it is a pretty big step forward.
(we are working on improvements to voice in chat.)
Similar Articles
Introducing the Realtime API
OpenAI introduces the Realtime API, enabling developers to build low-latency multimodal speech-to-speech conversational experiences with natural voice interactions powered by GPT-4o. The API supports six preset voices and simplifies development by eliminating the need to integrate multiple models.
Introducing gpt-realtime and Realtime API updates
OpenAI is making the Realtime API generally available with a new advanced speech-to-speech model called gpt-realtime, featuring improved instruction following, tool calling, and natural speech quality. New capabilities include MCP server support, image inputs, SIP phone calling, and two new voices (Cedar and Marin).
Advancing voice intelligence with new models in the API
OpenAI has announced three new voice models in its API: GPT-Realtime-2 with advanced reasoning, GPT-Realtime-Translate for live multilingual translation, and GPT-Realtime-Whisper for streaming transcription, aiming to enable more natural and action-oriented voice applications.
@kwindla: OpenAI shipped a new speech-to-speech model today: gpt-realtime-2. This is the first speech-to-speech model good enough …
OpenAI has released gpt-realtime-2, a new speech-to-speech model optimized for real-time voice agent interactions with low-latency tool calling.
@seclink: OpenAI Launches GPT-Realtime-2, Its Most Intelligent Voice Model to Date. The model features GPT-5-level reasoning, a 128,000 token context window, and supports adjusting 'effort level' for more natural conversation. It can pair with GPT-R…
OpenAI released the GPT-Realtime-2 voice model, which features GPT-5-level reasoning and a 128,000-token context window. The launch also brings real-time translation from more than 70 input languages into 13 output languages, with a reported 96.6% accuracy on the Big Bench Audio Intelligence benchmark. Greg Brockman called it a milestone in voice translation.