A demonstration shows a computer being controlled entirely by voice using GPT-Realtime 2.0, showcasing a hands-free operating system interface.
We gave a Reachy Mini robot a real-time voice brain using GPT Realtime, allowing it to hear, see, talk, and physically react via motion tools. The project is open-source on GitHub.
OpenAI has released gpt-realtime-2, a new speech-to-speech model optimized for real-time voice agent interactions with low-latency tool calling.
GPT-Realtime-2 shows a 15-percentage-point improvement over version 1.5 on the Big Bench Audio benchmark, approaching saturation.
Sam Altman announces the release of GPT-Realtime-2 in the API, highlighting a significant advance in voice interaction with AI, particularly in handling complex context.
OpenAI has announced three new voice models in its API: GPT-Realtime-2 with advanced reasoning, GPT-Realtime-Translate for live multilingual translation, and GPT-Realtime-Whisper for streaming transcription, aiming to enable more natural and action-oriented voice applications.