Websockets, SSE, and HTTP/3 from first principles

A primer on the real-time web protocols, derived from scratch: what problem each one solves, how it works on the wire, when to use which, and how HTTP/2 and HTTP/3 (QUIC) change the tradeoffs. With real-world analogies and connection-model diagrams.

I want to derive these from the problem, not list them as features. So start with the constraint the web was born with.

The web’s starting constraint: the client always speaks first

The web began as request-response. The browser asks (GET /page), the server answers, the connection closes. The whole model assumes the client always speaks first and the server only ever replies.

That is fine for documents. It breaks the moment you want the server to tell the client something the client did not just ask for:

  • a new chat message arrived
  • a stock price moved
  • a long job finished
  • another user moved their cursor

In pure request-response, the server has no way to reach a client that is not currently asking. The server knows something; the client is not on the line.

Real-world analogy: the original web is postal mail. You send a letter, you wait, a reply comes back, the exchange ends. If the post office learns something urgent after your letter is sealed, it cannot reach you. It has to wait for you to write again. Everything below is the web growing a phone line.

The first hack was polling: the client asks "anything new?" every few seconds. It works and it is wasteful. Most requests answer "no", and you trade latency (the gap between an event and the next poll) against load (poll more often and you hammer the server harder). Polling is calling the post office every five minutes to ask if mail came. The protocols below are all attempts to do better than that.
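
To put rough numbers on that trade, here is a minimal sketch. The function name and the simplifying assumptions (events arrive at random moments, and at most one poll per event carries news) are mine, not part of any library.

```python
def polling_cost(interval_s: float, events_per_hour: float) -> dict:
    """Rough cost of fixed-interval polling over one hour (illustrative model)."""
    polls_per_hour = 3600 / interval_s
    # Assumption: an event lands at a random point inside a poll interval,
    # so on average the client hears about it half an interval later.
    avg_delay_s = interval_s / 2
    # Assumption: at most one poll per event carries news; the rest answer "no".
    useful_polls = min(events_per_hour, polls_per_hour)
    empty_fraction = 1 - useful_polls / polls_per_hour
    return {
        "polls_per_hour": polls_per_hour,
        "avg_delay_s": avg_delay_s,
        "empty_fraction": empty_fraction,
    }
```

Under these assumptions, polling every 5 seconds for a feed that changes 12 times an hour costs 720 requests, about 98% of them empty, for an average delay of 2.5 seconds. Stretching the interval to 60 seconds drops the load to 60 requests but raises the average delay to 30 seconds.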

The transport underneath: TCP

Before the application protocols, the layer they ride on, because HTTP/3 is a story about replacing it.

TCP is the reliable, ordered byte stream that HTTP/1 and HTTP/2 run on. It guarantees two things:

  • Reliable. Lost packets are retransmitted; nothing is silently dropped.
  • Ordered. Bytes arrive in the order they were sent.

You get those guarantees through a handshake (the famous SYN, SYN-ACK, ACK three-step) that costs a network round-trip before any data moves. For https a TLS handshake runs on top and costs more: one extra round-trip with TLS 1.3, two with TLS 1.2. Hold the ordering guarantee in mind. It is a feature for one stream and, as we will see, a curse for many.

Real-world analogy: TCP is a single-file conveyor belt where items must come off in the exact order they went on. If the third item jams, everything behind it waits, even if items four through ten are ready. That jam has a name, and HTTP/3 exists largely to fix it.
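
The handshake cost is easy to estimate once you count round-trips. The sketch below assumes full handshakes with no session resumption; the constant and function names are hypothetical.

```python
# Round-trips spent before the first request byte can be sent.
TCP_HANDSHAKE_RTTS = 1  # SYN, SYN-ACK, ACK
# Assumption: full TLS handshakes, no session resumption.
TLS_HANDSHAKE_RTTS = {"none": 0, "1.3": 1, "1.2": 2}


def setup_delay_ms(rtt_ms: float, tls: str = "1.3") -> float:
    """Time spent on connection setup before any application data moves."""
    return rtt_ms * (TCP_HANDSHAKE_RTTS + TLS_HANDSHAKE_RTTS[tls])
```

On a 50 ms round-trip, that is 100 ms of setup with TLS 1.3 and 150 ms with TLS 1.2, before the server has seen a single byte of the request.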

SSE: the server gets a one-way megaphone

The smallest possible upgrade to request-response: the client makes one normal request, and the server never closes the response. It holds the connection open and keeps writing into it as events happen.

That is Server-Sent Events (SSE). It is plain HTTP with Content-Type: text/event-stream. The browser reads it with the built-in EventSource, or by reading a fetch response body as a stream.

On the wire it is text events separated by a blank line:

event: price
data: {"sym": "ACME", "px": 41.20}

event: price
data: {"sym": "ACME", "px": 41.18}
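
The format is simple enough to write and read by hand. The following is a simplified sketch, not a full implementation of the event-stream parsing rules: it assumes `\n` line endings and a complete stream held in memory, and the function names are my own.

```python
from typing import Dict, List, Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event in text/event-stream form."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A payload containing newlines is sent as several data: lines.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # The blank line is what terminates the event.
    return "\n".join(lines) + "\n\n"


def parse_sse(stream_text: str) -> List[Dict[str, str]]:
    """Parse a complete event-stream body into a list of events."""
    events = []
    for block in stream_text.split("\n\n"):
        fields = {"event": "message"}  # default event type
        data_lines = []
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # blank line or comment line
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space is stripped
            if name == "data":
                data_lines.append(value)
            elif name in ("event", "id"):
                fields[name] = value
        if data_lines:  # blocks without data are not dispatched
            fields["data"] = "\n".join(data_lines)
            events.append(fields)
    return events
```

Feeding the two `price` events above through `parse_sse` yields two dictionaries, each with `event` set to `price` and the JSON text in `data`. Decoding that JSON stays the application's job.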

What you get nearly for free:

  • One-directional, server to client. The server can push anytime once the stream is open. The client cannot send on it; to say something back, it makes a new ordinary request.
  • Automatic reconnection. If the connection drops, EventSource reconnects on its own and, provided the server tagged its events with id: fields, sends a Last-Event-ID header so the server can resume from where it left off. You did not write that loop; the browser did.
  • Plain HTTP all the way down. Proxies, CDNs, load balancers, and auth middleware already understand it because it is just a long HTTP response.

flowchart LR
  C[Client] -->|one GET, held open| S[Server]
  S -->|event, event, event...| C
  C -.->|new request to talk back| S

Real-world analogy: SSE is a radio broadcast you tuned into. The station transmits continuously and you receive; you cannot talk back on the same channel. If you want to call in, you pick up a different phone. That asymmetry is the whole point, and it is a perfect fit for anything where the server pushes and the client mostly listens: notifications, live feeds, progress bars, and streaming a model’s tokens into a chat UI.
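
The browser handles the reconnect, but resuming is still the server's job: it has to decide what to replay when a Last-Event-ID header arrives. A minimal sketch of that decision, assuming integer event ids and an in-memory log (both are illustrative choices, as is the function name):

```python
from typing import List, Optional, Tuple


def events_to_replay(log: List[Tuple[int, str]],
                     last_event_id: Optional[str]) -> List[Tuple[int, str]]:
    """Pick the events a reconnecting client has not seen yet.

    log holds (event_id, payload) pairs in increasing id order.
    last_event_id is the raw Last-Event-ID header value, or None.
    """
    if last_event_id is None:
        return list(log)  # first connection: this sketch sends everything
    try:
        last_seen = int(last_event_id)
    except ValueError:
        return list(log)  # unrecognized id: fall back to a full replay
    return [(eid, payload) for eid, payload in log if eid > last_seen]
```

Whether a first-time client should get the whole log or only new events is an application decision; this sketch picks the former for simplicity.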

SSE’s one historical weakness was a browser limit of about six concurrent connections per domain under HTTP/1.1. Open a few SSE streams in a few tabs and you starve the rest of the page. HTTP/2 removed that, which I will get to.

Websockets: both sides get a phone

SSE gives the server a megaphone. Sometimes you need both sides talking at once on the same line, with minimal overhead per message. That is websockets.

A websocket starts as an ordinary HTTP request carrying an Upgrade: websocket header and a Sec-WebSocket-Key. The server agrees, returns 101 Switching Protocols with a Sec-WebSocket-Accept header derived from that key, and from that point the same TCP connection stops being HTTP and becomes a full-duplex byte pipe. Both sides can send framed messages anytime, with 2 to 14 bytes of header per frame instead of a full set of HTTP headers.

sequenceDiagram
  participant C as Client
  participant S as Server
  C->>S: GET / (Upgrade: websocket, Sec-WebSocket-Key)
  S->>C: 101 Switching Protocols
  Note over C,S: connection is now a two-way byte pipe
  C->>S: message
  S->>C: message
  S->>C: message (unprompted)
  C->>S: message
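
The server proves it understood the upgrade by answering the client's Sec-WebSocket-Key with a Sec-WebSocket-Accept value, computed as defined in RFC 6455: append a fixed GUID to the key, take the SHA-1 digest, and base64-encode it. A small sketch using only the standard library (the function name is mine):

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

With the sample key from the RFC, `accept_key("dGhlIHNhbXBsZSBub25jZQ==")` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. This is not a security measure; it only shows that the peer really speaks the websocket protocol rather than being a confused HTTP server or cache.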

What it buys you:

  • Two-directional, low overhead. Either side sends whenever, and a small frame header beats re-sending HTTP headers per message.
  • Stateful and persistent. One long-lived connection per client, held open.
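
To see how small the frame header is, here is a sketch that encodes an unmasked text frame, the kind a server sends to a client, following the RFC 6455 layout. The function name is illustrative, and real libraries also handle masking, fragmentation, and control frames, which this omits.

```python
import struct


def encode_text_frame(message: str) -> bytes:
    """Encode one unmasked, unfragmented websocket text frame."""
    payload = message.encode("utf-8")
    first_byte = 0x80 | 0x1  # FIN bit set, opcode 0x1 (text)
    length = len(payload)
    if length < 126:
        header = bytes([first_byte, length])
    elif length < 65536:
        # 126 signals a 16-bit big-endian length follows.
        header = bytes([first_byte, 126]) + struct.pack("!H", length)
    else:
        # 127 signals a 64-bit big-endian length follows.
        header = bytes([first_byte, 127]) + struct.pack("!Q", length)
    return header + payload
```

A short message such as `"hi"` costs two bytes of header on top of its two bytes of payload. Frames sent by the client carry an extra four-byte masking key, which is still far less than a set of HTTP headers.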

What it costs you, and the reasons not to reach for it by default:

  • You own the connection lifecycle. No automatic reconnect like SSE. You write heartbeats (ping/pong), reconnection with backoff, and resync-after-reconnect yourself.
  • It is not plain HTTP after the upgrade. Some proxies, caches, and middleware that handle HTTP transparently need extra configuration for websockets.
  • Stateful connections are harder to scale. Sticky sessions, connection counts per node, and graceful drain on deploy all become your problem.
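
The reconnection loop you now own usually starts with a backoff schedule. One common shape is exponential backoff with full jitter, sketched below; the base, the cap, and the function name are assumptions for illustration, not recommendations from any particular library.

```python
import random
from typing import List, Optional


def backoff_delays(attempts: int, base_s: float = 1.0, cap_s: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Delays (in seconds) to wait before each reconnection attempt."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(attempts):
        # The ceiling doubles each attempt until it reaches the cap.
        ceiling = min(cap_s, base_s * 2 ** attempt)
        # Full jitter: pick a random delay below the ceiling so that
        # clients dropped at the same moment do not all return together.
        delays.append(rng.uniform(0, ceiling))
    return delays
```

The jitter matters most after a deploy or an outage, when every client loses its connection at once and would otherwise retry in lockstep.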

Real-world analogy: a websocket is an open phone call. Both people can talk and interrupt, latency is low, but someone has to keep the line up, notice when it drops, and call back. You would not hold an open call just to occasionally hear an announcement; that is what the radio is for.

The decision rule falls straight out of direction:

  • Only the server pushes to a mostly-listening client: SSE. It is simpler and reconnects itself.
  • Both sides push with low latency on one connection (live editing, multiplayer, voice, games): websockets.
  • Neither pushes unprompted: plain request-response. Most things.

This is exactly the call the open-source alfred-os codebase makes: its live transcript and token streams are one-directional, so it streams over SSE and never opens a websocket. Direction decided it.

HTTP/2: many streams down one connection

HTTP/1.1 had a real problem: one connection carried one request-response at a time. To load a page with 50 assets the browser opened multiple connections (capped around six per domain) and queued the rest. That cap is what starved SSE.

HTTP/2 changed the connection model. One TCP connection now carries many independent, interleaved streams at once. Each request-response is a stream; they share the wire; frames from different streams interleave.

What that fixes:

  • The six-connection cap is gone in practice. All those SSE streams and asset fetches share one multiplexed connection, so opening several SSE streams no longer starves the page. SSE got materially better under HTTP/2 without changing a line of SSE code.
  • Header compression and server push. Headers are compressed (HPACK); a server-push mechanism existed (now largely deprecated, but the multiplexing is the lasting win).
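
Multiplexing is easier to picture as code. The sketch below is a toy model, not the HTTP/2 framing layer: it cuts each stream into fixed-size chunks, sends them round-robin tagged with a stream id, and reassembles them on the other side. Real implementations schedule frames by priority and flow control; the round-robin order and the names here are my simplification.

```python
from typing import Dict, List, Tuple

Frame = Tuple[int, bytes]  # (stream id, chunk of that stream's data)


def interleave(streams: Dict[int, bytes], frame_size: int) -> List[Frame]:
    """Cut each stream into frames and emit them round-robin on one wire."""
    offsets = {stream_id: 0 for stream_id in streams}
    pending = list(streams)
    frames: List[Frame] = []
    while pending:
        for stream_id in list(pending):
            start = offsets[stream_id]
            chunk = streams[stream_id][start:start + frame_size]
            if chunk:
                frames.append((stream_id, chunk))
                offsets[stream_id] = start + len(chunk)
            if offsets[stream_id] >= len(streams[stream_id]):
                pending.remove(stream_id)  # this stream is fully sent
    return frames


def reassemble(frames: List[Frame]) -> Dict[int, bytes]:
    """Rebuild each stream from the interleaved frames using the stream id."""
    out: Dict[int, bytes] = {}
    for stream_id, chunk in frames:
        out[stream_id] = out.get(stream_id, b"") + chunk
    return out
```

Because every frame carries its stream id, the receiver can untangle the streams no matter how they were interleaved, which is why a long-lived SSE stream and a burst of asset fetches can share one connection.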

What it does not do is replace SSE or websockets. HTTP/2 is still a request-response-and-streams model, not a symmetric pipe. Websockets keep their two-way niche. The protocols are layered, not competing: you can run SSE over HTTP/2 and get the best of both.

Real-world analogy: HTTP/1.1 is a single-lane road, one car at a time, so you build six parallel roads to get throughput. HTTP/2 is a multi-lane highway on one roadbed: many cars side by side on one connection. Which sets up the catch.

Head-of-line blocking: the highway with one stuck lane

Here is the subtle problem HTTP/2 did not solve, and could not, because of what it runs on. HTTP/2 multiplexes many streams over one TCP connection. But TCP guarantees ordered delivery of all bytes on the connection. So if a single packet is lost, TCP holds back every byte that arrived after it, across all the multiplexed streams, until the lost packet is retransmitted.

That is head-of-line blocking. Ten independent streams sharing one TCP connection, one packet drops on stream three, and streams one, two, and four through ten all stall, even though their data already arrived intact. The very ordering guarantee that makes TCP reliable for one stream becomes a shared chokepoint for many.

Back to the conveyor belt: you put ten independent orders on one single-file belt that must come off in order. One order jams, all ten wait. Multiplexing onto one ordered belt means one jam stops everything.
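
The stall can be modeled in a few lines. This is a toy sketch of connection-wide ordering, not a TCP implementation: packets are numbered by their position on the connection, and nothing after a gap is handed to the application. The names are hypothetical.

```python
from typing import List, Set, Tuple

Packet = Tuple[int, str]  # (stream id, data)


def tcp_style_delivery(packets: List[Packet],
                       lost: Set[int]) -> Tuple[List[Packet], List[Packet]]:
    """Split received packets into delivered and held back.

    packets are in send order; a packet's index is its connection-wide
    sequence number. lost is the set of indexes that never arrived.
    """
    delivered: List[Packet] = []
    held_back: List[Packet] = []
    gap_seen = False
    for seq, packet in enumerate(packets):
        if seq in lost:
            gap_seen = True  # everything after this waits for the retransmit
            continue
        (held_back if gap_seen else delivered).append(packet)
    return delivered, held_back
```

For example, with packets `[(1, "a1"), (2, "b1"), (3, "c1"), (1, "a2"), (2, "b2")]` and `lost={2}`, the missing packet belongs to stream three, yet the held-back list contains the later packets of streams one and two, which arrived intact.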

You cannot fix this inside HTTP/2, because the blocking lives in TCP underneath it. To fix it you have to change the transport. That is HTTP/3.

HTTP/3 and QUIC: rebuild the transport on UDP

HTTP/3 keeps the HTTP/2 idea of many multiplexed streams but moves it onto a new transport called QUIC, which runs over UDP instead of TCP.

UDP is the opposite of TCP: it sends independent packets (datagrams) with no ordering and no reliability guarantees. On its own that is useless for the web. So QUIC rebuilds reliability and ordering on top of UDP, but with one decisive difference: it tracks order and loss per stream, not for the whole connection.

That one change is the whole point:

  • No more cross-stream head-of-line blocking. A lost packet on stream three stalls only stream three. Streams one, two, and four through ten keep flowing, because QUIC knows they are independent and does not make them wait on a packet that was not theirs. This is the problem TCP made unsolvable and QUIC makes solvable, because reliability now lives at the stream level, where the independence already is.
  • Faster handshakes. QUIC folds the TLS 1.3 handshake into its own connection setup, so a new secure connection is ready after one round-trip instead of the two or three that TCP plus TLS needs. A returning client can often resume in 0-RTT, sending data in the very first packet.
  • Connection migration. A QUIC connection is identified by a connection ID, not by the IP-and-port four-tuple TCP uses. So when your phone switches from wifi to cellular and your IP changes, the QUIC connection survives instead of dropping and re-handshaking. The call does not drop when you walk out the door.

flowchart TB
  subgraph H2["HTTP/2 over TCP: one ordered belt"]
    direction LR
    s1[stream 1] --> tcp[(single TCP order)]
    s2[stream 2 - lost packet] --> tcp
    s3[stream 3] --> tcp
    tcp -->|one drop stalls all| out2[blocked]
  end
  subgraph H3["HTTP/3 over QUIC over UDP: independent lanes"]
    direction LR
    q1[stream 1] --> ok1[flows]
    q2[stream 2 - lost packet] --> wait2[only this one waits]
    q3[stream 3] --> ok3[flows]
  end

Real-world analogy: HTTP/2 over TCP is the multi-lane highway whose lanes are secretly chained together, so one stalled lane stalls all of them. HTTP/3 over QUIC cuts the chains: the lanes are finally independent, and a wreck in one does not freeze the others. UDP is the bare road with no traffic rules, and QUIC is the rules HTTP/3 paints back on, but per-lane instead of across the whole highway.
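
Per-stream ordering can be sketched the same way as a toy model. This is not how QUIC is implemented (real QUIC tracks byte offsets within each stream and numbers packets separately); it only shows that the scope of the stall shrinks from the connection to the stream. The names are hypothetical.

```python
from typing import List, Set, Tuple

Packet = Tuple[int, str]  # (stream id, data)


def quic_style_delivery(packets: List[Packet],
                        lost: Set[int]) -> Tuple[List[Packet], List[Packet]]:
    """Split received packets into delivered and held back, per stream.

    packets are in send order; lost is the set of indexes that never
    arrived. Only the stream that owns a lost packet has to wait.
    """
    delivered: List[Packet] = []
    held_back: List[Packet] = []
    blocked_streams: Set[int] = set()
    for index, (stream_id, data) in enumerate(packets):
        if index in lost:
            blocked_streams.add(stream_id)  # this stream now has a gap
            continue
        if stream_id in blocked_streams:
            held_back.append((stream_id, data))
        else:
            delivered.append((stream_id, data))
    return delivered, held_back
```

With packets `[(1, "a1"), (2, "b1"), (3, "c1"), (1, "a2"), (3, "c2"), (2, "b2")]` and `lost={2}`, only `(3, "c2")` is held back. Streams one and two are delivered in full.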

A few honest caveats:

  • QUIC lives in user space, not the kernel. TCP is implemented in the operating system kernel; QUIC is typically implemented in the application or a library. That makes it easier to evolve, but it does not inherit the kernel’s decades of TCP tuning, and matching a finely tuned TCP stack’s throughput over UDP can take extra work.
  • Some networks throttle or block UDP. Corporate firewalls sometimes treat UDP with suspicion, so HTTP/3 clients keep HTTP/2 as a fallback. You do not bet the connection on UDP getting through.
  • SSE and websockets still apply. HTTP/3 improves the transport beneath them; it does not replace the application-level choice between one-way and two-way. SSE over HTTP/3 is SSE with a better belt.

Putting it together: how to choose

The choice is two questions, in order.

  1. Who needs to push?

    • Neither side pushes unprompted: plain HTTP request-response. Simplest, cacheable, stateless.
    • Only the server pushes: SSE. One-way, auto-reconnect, plain HTTP.
    • Both sides push, low latency, one connection: websockets.

  2. Which HTTP version carries it? This is mostly your infrastructure’s job, not your application’s, but it changes the tradeoffs:

    • HTTP/2 removes the old SSE connection cap by multiplexing, so SSE is a strong default again.
    • HTTP/3 removes cross-stream head-of-line blocking and speeds up setup and network switches. You usually get it by enabling it at the load balancer or CDN, not by rewriting application code.

| Need | Protocol | Direction | Reconnect | Carries well on |
| --- | --- | --- | --- | --- |
| Server pushes to a listener | SSE | server to client | built in | HTTP/2, HTTP/3 |
| Both sides push, low latency | websockets | full-duplex | you build it | HTTP/1.1+, HTTP/2 |
| Client asks, server answers | request-response | client to server | n/a | any |
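
The first question is mechanical enough to write down as a function. This is only a restatement of the rule above, with hypothetical names, not a substitute for judgment about a specific system.

```python
def choose_protocol(server_pushes: bool, client_pushes: bool) -> str:
    """Pick a protocol from who needs to send unprompted.

    client_pushes means the client needs low-latency sends on the same
    long-lived connection, not just ordinary requests.
    """
    if server_pushes and client_pushes:
        return "websockets"
    if server_pushes:
        return "SSE"
    # The client can always speak first, so client-only traffic
    # is plain request-response.
    return "request-response"
```

A token stream for a chat UI maps to `choose_protocol(True, False)`, which gives `"SSE"`; a multiplayer game maps to `choose_protocol(True, True)`, which gives `"websockets"`.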

The mental model that holds all of it: the web started as postal mail (request-response), grew a radio broadcast for one-way push (SSE), grew an open phone call for two-way push (websockets), widened the road from one lane to many (HTTP/2), and finally cut the chains between the lanes so one jam stops one lane instead of all of them (HTTP/3 over QUIC). Each step solved a specific failure of the step before it. None of them replaced the others; they stacked.

Key takeaways

  • The real-time web exists to solve one thing request-response cannot: letting the server send data the client did not just ask for. Polling is the wasteful first answer; SSE and websockets are the good ones.
  • SSE is the one-way option: server streams to client over plain HTTP, with automatic reconnection. Use it when only the server pushes, which covers notifications, live feeds, progress, and streaming model tokens.
  • Websockets are the two-way option: a full-duplex byte pipe over one upgraded connection. Use them only for genuine low-latency two-way traffic, and accept that you own reconnection and scaling.
  • HTTP/2 multiplexes many streams over one connection, which removed SSE’s old six-connection cap and made SSE a strong default. It does not replace websockets.
  • HTTP/2’s catch is head-of-line blocking: many streams share one ordered TCP connection, so one lost packet stalls all of them. You cannot fix it inside HTTP/2 because the blocking lives in TCP.
  • HTTP/3 runs over QUIC over UDP and tracks order and loss per stream, so a lost packet stalls only its own stream. It also cuts handshake round-trips and survives a wifi-to-cellular switch.
  • Choose by direction first (request-response, SSE, or websockets), then let HTTP/2 or HTTP/3 at the edge improve the transport beneath whichever you picked.

Common questions

What is the difference between SSE and websockets in one line?

SSE is a one-way stream from server to client over plain HTTP, with automatic reconnection built in. Websockets are a two-way byte pipe over a single upgraded connection, where either side can send anytime. Use SSE when only the server pushes; use websockets when both sides need to push with low latency.

Does HTTP/2 make SSE and websockets obsolete?

No. HTTP/2 multiplexes many streams over one connection, which removes the old browser limit of about six connections per domain that SSE streams used to exhaust, so SSE got much better under HTTP/2. But HTTP/2 is still a request-response-and-streams model, not a symmetric pipe, so websockets keep their niche for true two-way traffic. They solve different problems.

What does HTTP/3 (QUIC) actually change?

HTTP/3 runs over QUIC, which runs over UDP instead of TCP. The headline win is fixing head-of-line blocking: under TCP, one lost packet stalls every multiplexed stream behind it; under QUIC, streams are independent, so a lost packet only stalls its own stream. QUIC also folds the TLS handshake into the connection setup, cutting round-trips, and it survives a network change (wifi to cellular) without a new handshake.

When should I just use plain HTTP request-response?

Most of the time. If the client asks and the server answers, and there is no need for the server to push unprompted, a normal request-response is simpler, cacheable, and stateless. Reach for SSE or websockets only when the server needs to send data the client did not just ask for, which is a narrower case than it first looks.

Prasad Subrahmanya

Founder & CEO at Luminik. 3x technical founder. I turn expensive, repetitive work into products people pay for.
