OpenAI launches GPT-4o and demonstrates real-time voice the day before Google I/O

A spring product update, deliberately timed against Google's annual developer keynote. The voice demo was months ahead of what most users would actually get.

MONDAY, 13 MAY 2024 By The AI Desk

OpenAI held a brief streamed event on the morning of 13 May 2024 to announce GPT-4o, where the o stood for omni. The model was multimodal at training time, not bolted on at inference, and could handle text, image and audio input within a single shared latent representation. Mira Murati introduced it from a couch in the company's San Francisco offices. The keynote ran twenty-five minutes.

The voice demo was the moment that travelled. Two researchers conducted a live conversation with the model, including interruptions, emotional inflection in the model's reply, and at one point a request that it sing happy birthday. As The Verge and Wired reported the same afternoon, the demo's most striking feature was the latency: roughly 320 milliseconds end-to-end, down from the two-to-three second round-trip that voice on GPT-4 had required.

Smartphone illuminated with a voice assistant interface. GPT-4o brought voice latency under the conversational threshold. Photo: Sten Ritterfeld / Unsplash

The Google I/O timing

Google I/O was scheduled for the following day. As reporting in The Information later that week made clear, the OpenAI event was scheduled with deliberate awareness of that calendar. The Google keynote, when it came on 14 May, opened with a Gemini 1.5 update and a series of multimodal features that arrived in a press environment already saturated with the OpenAI demo. The timing did not invalidate Google's announcements; it relocated the news cycle.

The free-tier announcement may have been the more durable strategic move. GPT-4-class capability arrived for the first time in the free ChatGPT product, with rate limits but no paywall. By the end of 2024, OpenAI reported weekly active users on ChatGPT at over two hundred million. The free GPT-4o tier was the proximate reason.

Three hundred milliseconds is what made it feel like a person. The rest was packaging.

The voice product itself, in the form demonstrated on the couch, took several months to reach most users at the latency and quality of the demo. The advanced voice rollout did not complete until late 2024. This pattern (demo first, capability second, deployment third) became a recurring shape for major OpenAI product launches in 2024 and 2025.

Originally reported by OpenAI on 13 May 2024.


Voice latency	~320 ms end-to-end
Context window	128k tokens
Pricing	50% lower than GPT-4 Turbo
Languages	50+ for voice
Free tier	yes, with rate limits
