Quoting Luke Curley

Simon Willison's Blog News

Summary

Technical commentary from Luke Curley on how WebRTC's design prioritizes low latency by aggressively dropping audio packets, which conflicts with LLM voice applications, where prompt accuracy matters more than speed. He also recounts Discord's unsuccessful attempt to implement audio retransmission within the browser's constraints.



# A quote from Luke Curley

Source: [https://simonwillison.net/2026/May/9/luke-curley/](https://simonwillison.net/2026/May/9/luke-curley/)

9th May 2026

> WebRTC is designed to **degrade and drop my prompt** during poor network conditions. wtf my dude
>
> WebRTC aggressively drops audio packets to keep latency low. If you've ever heard distorted audio on a conference call, that's WebRTC baybee. The idea is that conference calls depend on rapid back-and-forth, so pausing to wait for audio is unacceptable.
>
> …but as a user, I would much rather wait an extra 200ms for my slow/expensive prompt to be accurate. After all, I'm paying good money to boil the ocean, and a garbage prompt means a garbage response. It's not like LLMs are particularly responsive anyway.
>
> **But I'm not allowed to wait.** It's *impossible* to even retransmit a WebRTC audio packet within a browser; we tried at Discord. The *implementation* is hard-coded for real-time latency **or else**.

— [Luke Curley](https://moq.dev/blog/webrtc-is-the-problem/), OpenAI's WebRTC Problem, in response to [How OpenAI delivers low-latency voice AI at scale](https://openai.com/index/delivering-low-latency-voice-ai-at-scale/)
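The tradeoff Curley describes can be sketched with two toy receivers fed the same lossy arrival order: a real-time one that conceals gaps and discards late packets (the WebRTC behavior), and a reliability-first one that holds playback until the missing packet arrives, as a retransmission would. This is a minimal illustration, not the WebRTC API; the `Packet` and receiver classes are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Packet:
    seq: int
    payload: str

@dataclass
class RealTimeReceiver:
    """WebRTC-style: never wait. Gaps are concealed, late packets are dropped."""
    next_seq: int = 0
    output: list = field(default_factory=list)

    def receive(self, packet: Packet) -> None:
        if packet.seq < self.next_seq:
            return  # arrived after playback moved on; drop it
        # Any gap before this packet is played as loss concealment.
        self.output.extend(["<lost>"] * (packet.seq - self.next_seq))
        self.output.append(packet.payload)
        self.next_seq = packet.seq + 1

@dataclass
class ReliableReceiver:
    """Accuracy-first: buffer out-of-order packets, wait for the missing one."""
    next_seq: int = 0
    output: list = field(default_factory=list)
    pending: dict = field(default_factory=dict)

    def receive(self, packet: Packet) -> None:
        self.pending[packet.seq] = packet.payload
        # Release only the contiguous prefix; hold everything after a gap.
        while self.next_seq in self.pending:
            self.output.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

# Same arrival order for both: packet 1 is delayed past packet 2,
# as if it were lost and then retransmitted.
arrivals = [Packet(0, "hi"), Packet(2, "GPT"), Packet(1, "there")]

rt, rel = RealTimeReceiver(), ReliableReceiver()
for p in arrivals:
    rt.receive(p)
    rel.receive(p)

print(rt.output)   # ['hi', '<lost>', 'GPT'] — gap concealed, late packet discarded
print(rel.output)  # ['hi', 'there', 'GPT'] — complete once the late packet arrives
```

The real-time path corrupts the transcript the instant the network hiccups, while the reliable path delivers it intact at the cost of a short stall; Curley's point is that browser WebRTC only exposes the first behavior.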

Similar Articles

OpenAI's WebRTC problem

Hacker News Top

Technical blog post from a self-described WebRTC expert criticizing OpenAI's use of WebRTC for voice AI, arguing the protocol is poorly suited because it is designed for real-time conferencing with aggressive packet dropping, which conflicts with voice-AI use cases where accuracy matters more than minimal latency.

How OpenAI delivers low-latency voice AI at scale

OpenAI Blog

OpenAI details its rearchitected WebRTC stack designed to deliver low-latency voice AI at scale for over 900 million users. The post explains how new split-relay and transceiver architectures optimize media routing and connection setup for real-time interactions like ChatGPT voice.

Lightricks/LTX-2.3

Hugging Face Models Trending

Lightricks released LTX-2.3, an open-weight diffusion-based audio-video foundation model with improved quality and prompt adherence, available in multiple checkpoints including distilled and LoRA variants for local execution.

EchoChain: A Full-Duplex Benchmark for State-Update Reasoning Under Interruptions

arXiv cs.CL

EchoChain is a new benchmark for evaluating AI models' ability to revise in-progress responses when users interrupt mid-generation. The benchmark identifies three failure patterns (contextual inertia, interruption amnesia, objective displacement) and finds that across evaluated real-time voice models, no system exceeds 50% pass rate.