Quoting Luke Curley
Summary
Technical commentary from Luke Curley discussing how WebRTC's design prioritizes low latency by aggressively dropping audio packets, which conflicts with LLM voice applications where prompt accuracy matters more than speed. He recounts challenges he faced at Discord when implementing retransmission within browser constraints.
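The tradeoff the commentary describes can be sketched with a toy jitter buffer that follows one of two policies: skip packets that miss their playout deadline (the conferencing behavior Curley attributes to WebRTC) or stall until the gap is filled (the accuracy-first behavior a voice-AI pipeline wants). All names, packet contents, and timing numbers below are invented for illustration; this is not how any of the systems mentioned are actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Toy jitter buffer contrasting two delivery policies.

    drop_late=True mimics a conferencing stack: a packet that misses its
    playout deadline is skipped so audio keeps flowing with low latency.
    drop_late=False mimics an accuracy-first voice-AI pipeline: playback
    stalls until the late packet arrives, keeping the transcript intact.
    """
    drop_late: bool
    deadline_ms: int = 60
    delivered: list = field(default_factory=list)
    stall_ms: int = 0  # total time spent waiting for late packets

    def receive(self, seq: int, payload: str, delay_ms: int) -> None:
        if delay_ms > self.deadline_ms:
            if self.drop_late:
                return  # conferencing: drop it, stay real-time
            self.stall_ms += delay_ms - self.deadline_ms  # voice AI: wait
        self.delivered.append((seq, payload))


# One packet ("off") arrives far too late for a 60 ms playout deadline.
packets = [(1, "turn", 10), (2, "off", 250), (3, "the oven", 20)]

rtc = Receiver(drop_late=True)
llm = Receiver(drop_late=False)
for seq, text, delay in packets:
    rtc.receive(seq, text, delay)
    llm.receive(seq, text, delay)

print([t for _, t in rtc.delivered])               # ['turn', 'the oven']
print([t for _, t in llm.delivered], llm.stall_ms)  # all three words, 190 ms stall
```

Dropping the late packet turns "turn off the oven" into "turn the oven", which is exactly the kind of prompt corruption that matters more to an LLM consumer than the 190 ms of extra latency the waiting receiver pays.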
Cached at: 05/09/26, 02:28 AM
Similar Articles
OpenAI's WebRTC problem
Technical blog post from a self-described WebRTC expert criticizing OpenAI's use of WebRTC for voice AI. The author argues the protocol is poorly suited: it is designed for real-time conferencing, where aggressive packet dropping is acceptable, while voice AI use cases value accuracy over minimal latency.
How OpenAI delivers low-latency voice AI at scale
OpenAI details its rearchitected WebRTC stack designed to deliver low-latency voice AI at scale for over 900 million users. The post explains how new split-relay and transceiver architectures optimize media routing and connection setup for real-time interactions like ChatGPT voice.
Lightricks/LTX-2.3
Lightricks released LTX-2.3, an open-weight diffusion-based audio-video foundation model with improved quality and prompt adherence, available in multiple checkpoints including distilled and LoRA variants for local execution.
EchoChain: A Full-Duplex Benchmark for State-Update Reasoning Under Interruptions
EchoChain is a new benchmark for evaluating AI models' ability to revise in-progress responses when users interrupt mid-generation. The benchmark identifies three failure patterns (contextual inertia, interruption amnesia, objective displacement) and finds that, across the evaluated real-time voice models, no system exceeds a 50% pass rate.
$R^2$-dLLM: Accelerating Diffusion Large Language Models via Spatio-Temporal Redundancy Reduction
R²-dLLM introduces spatio-temporal redundancy reduction techniques that cut diffusion LLM decoding steps by up to 75% while preserving generation quality, addressing a key deployment bottleneck.