OpenAI announces GPT-4o, a flagship multimodal model that processes audio, vision, text, and video in real time, responding to audio input in as little as 232 ms (320 ms on average). The model matches GPT-4 Turbo on English text and code while significantly improving multilingual, audio, and vision capabilities, at 50% lower API cost.
OpenAI releases GPT-4o, a new flagship model capable of real-time reasoning across audio, vision, and text modalities.