OpenAI launches GPT-4o and demonstrates real-time voice the day before Google I/O
A spring product update, deliberately timed against Google's annual developer keynote. The voice demo was months ahead of what most users would actually get.

OpenAI held a brief streamed event on the morning of 13 May 2024 to announce GPT-4o; the "o" stood for "omni". The model was multimodal from training onward rather than bolted on at inference: a single network trained end to end across text, image and audio. Mira Murati introduced it from a couch in the company's San Francisco offices. The keynote ran twenty-five minutes.
The voice demo was the moment that travelled. Two researchers held a live conversation with the model, complete with interruptions, emotional inflection in the model's replies, and at one point a request that it sing Happy Birthday. As The Verge and Wired reported the same afternoon, the demo's most striking feature was the latency: roughly 320 milliseconds on average, down from the multi-second round trip (5.4 seconds on average, by OpenAI's own figures) that Voice Mode on GPT-4 had required.
The Google I/O timing
Google I/O was scheduled for the following day. As reporting in The Information later that week made clear, OpenAI had picked its date with deliberate awareness of that calendar. The Google keynote, when it came on 14 May, opened with a Gemini 1.5 update and a series of multimodal features, all of which arrived in a press environment already saturated with the OpenAI demo. The timing did not invalidate Google's announcements; it relocated the news cycle.
The free-tier announcement may have been the more durable strategic move. GPT-4-class capability arrived for the first time in the free tier of ChatGPT, with rate limits but no paywall. By the end of 2024, OpenAI was reporting more than two hundred million weekly active users on ChatGPT. The free GPT-4o tier was the proximate reason.
Three hundred milliseconds is what made it feel like a person. The rest was packaging.
The voice product itself, in the form demonstrated on the couch, took several months to reach most users at the latency and quality of the demo. The advanced voice rollout was not complete until late 2024. This pattern (demo first, capability second, deployment third) became a recurring shape for major OpenAI product launches in 2024 and 2025.