ChatGPT voice mode is a weaker model
Summary
ChatGPT's voice mode runs on a weaker GPT-4o-era model with an April 2024 knowledge cutoff, significantly older than OpenAI's latest models. The article highlights a growing gap between OpenAI's consumer voice interface and its more advanced paid models, driven by differences in reward signal clarity and B2B market incentives.
Cached at: 04/20/26, 08:28 AM
Similar Articles
ChatGPT can now see, hear, and speak
OpenAI is rolling out new voice and image capabilities to ChatGPT Plus and Enterprise users, who can now hold voice conversations and share images in multimodal interactions powered by GPT-3.5/GPT-4 and custom text-to-speech models.
Introducing ChatGPT Pro
OpenAI launches ChatGPT Pro, a $200/month subscription plan offering unlimited access to advanced models including o1, o1-mini, GPT-4o, and Advanced Voice, plus o1 pro mode for compute-intensive reasoning tasks.
How the voices for ChatGPT were chosen
OpenAI explains its process for selecting five distinct voices for ChatGPT's Voice Mode feature, involving professional voice actors, casting directors, and a five-month selection process. The company addresses controversy over the 'Sky' voice, clarifying it is not an imitation of Scarlett Johansson and was cast before any outreach to her.
OpenAI prepares major ChatGPT voice upgrade with GPT-Bidi-1 (2 minute read)
OpenAI is preparing to release GPT-Bidi-1, a next-generation voice model for ChatGPT that supports bidirectional communication, interruptions, and mid-sentence adjustments, aiming to close the gap between voice and text capabilities.
Introducing ChatGPT
OpenAI introduces ChatGPT, a conversational AI model fine-tuned from GPT-3.5 using reinforcement learning from human feedback (RLHF). The model is designed to answer follow-up questions, admit mistakes, and reject inappropriate requests, with free access provided during the research preview.