SSE sucks for transporting LLM tokens

Summary

I'm just going to cut to the chase here. SSE as a transport mechanism for LLM tokens is naff. It's not that it can't work; obviously it can, because people are using it and SDKs are built around it. But it's not a great fit for the problem space.

The basic SSE flow goes something like this:

1. Client makes an HTTP POST request to the server with a prompt
2. Server responds with a 200 OK and keeps the connection open
3. Server streams tokens back to the client as they are generated, using the SSE format
4. Client processes the tokens as they arrive on the long-lived HTTP connection

Sure, the approach has some benefits, like simplicity and compatibility with existing HTTP infrastructure. But it still sucks.

When you're building an app that integrates LLM model responses, the most expensive part of any call is the model inference. The cost of generating the tokens dwarfs the cost of transporting them over the network. So the transport mechanism should be bulletproof. It would suck to have a transport where some network interruption meant that you had to re-run the model inference. But that's exactly what you get with SSE.

If the SSE connection drops halfway through the response, the client has to re-POST the prompt, the model has to re-run the generation, and the client has to start receiving tokens from scratch again. This is sucky.

SSE might be fine for server-to-server communication where network reliability is high, but for end-user client connections over the internet, where connections can be flaky, it's a poor choice. If your user goes into a tunnel, or switches networks, or their phone goes to sleep, or any number of other common scenarios, the SSE connection can drop. And then your user has to wait for the entire response to be re-generated. This leads to a poor user experience. And someone has to pay the model providers for the extra inference calls.

And don't even think about wanting to steer the generation (or AI agent) mid-response. Nope, not gonna happen with SSE.
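To make the flow concrete, here is a minimal sketch of what the wire format looks like: the server writes `data: <chunk>\n\n` frames onto the long-lived response, and the client splits the stream on blank lines to recover each token chunk. The `parse_sse` helper and the sample payload are hypothetical, for illustration only; real SDKs also handle `event:`, `id:`, and partial frames across network reads.

```python
def parse_sse(raw: str) -> list[str]:
    """Split an SSE stream into the data payloads of its events.

    Events are separated by a blank line; each "data:" field in a
    block contributes one line to that event's payload.
    """
    events = []
    for block in raw.split("\n\n"):
        # Keep only "data:" fields, dropping the field name and the
        # optional single space after the colon.
        lines = [
            ln[len("data:"):].lstrip(" ")
            for ln in block.split("\n")
            if ln.startswith("data:")
        ]
        if lines:
            events.append("\n".join(lines))
    return events

# What an LLM endpoint's response body might look like, token by token:
stream = "data: Hel\n\ndata: lo, \n\ndata: world\n\n"
print(parse_sse(stream))  # ['Hel', 'lo, ', 'world']
```

Note what is missing from this picture: there is no sequence number or resume offset anywhere in the framing, which is exactly why a dropped connection means starting the generation over.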
It's ...
