The Life of a Packet — End to End
A single HTTP request from address-bar keypress to rendered page — ARP, DHCP, DNS, IP, TCP, HTTP and back, narrated layer by layer.
Context
“Walk me through what happens when you type example.com into your browser and hit Enter.” It is the most famous integrative question in networking interviews because it touches every layer of the stack — DHCP for the local IP lease, ARP for the gateway’s MAC, DNS for the name resolution, IP for routing, TCP for the byte stream, TLS for the encryption, HTTP for the request, and the link layer for every actual frame on the wire. A candidate who can narrate it end to end has demonstrated a working model of the whole protocol stack, not memorised facts about pieces of it.
This writeup is that narration. It is not a system to implement — it is the integration story for everything else in the workbook. Each layer is doing its small job, and the question is how the small jobs compose into “the page appears.” A good walk-through is dense enough to take ten minutes to deliver out loud.
Requirements (functional and non-functional)
What a “fetch the page” path must do:
Functional
- Resolve the human-readable hostname (example.com) to an IP address.
- Discover the host’s own IP and default gateway if not already configured.
- Find the MAC address of the next hop on the local link.
- Establish a reliable byte stream to the destination IP on TCP port 443.
- Negotiate a TLS session and verify the server’s certificate.
- Issue an HTTP GET request and receive the response body.
- Parse the response, fetch sub-resources, render the page.
- Tear the connection down cleanly (or recycle it for the next request).
Non-functional
- Latency target: a typical page in ~1 second over a healthy broadband link.
- Robust to partial failure: a single dropped packet must not require restarting from DNS.
- Caches at every layer (DNS resolver, OS DNS cache, TCP fast-open cookies, HTTP cache, TLS session resumption) to amortise setup cost across repeated visits.
- No leaked credentials: TLS must be established before any sensitive HTTP header travels on the wire.
Use case diagram
One actor, one use case, but with the entire stack mediating.
                    ┌──────────────────────────────┐
                    │      "Fetch a web page"      │
                    │                              │
  ┌─────────┐       │  resolve · connect · fetch   │
  │  User   │──────►│  · render · close            │
  └─────────┘       │                              │
       │            └──────────────────────────────┘
       │               │  │        │          │
       │         ┌─────┘  │        │          │
       │         ▼        ▼        ▼          ▼
       │      ┌──────┐ ┌──────┐ ┌──────┐ ┌──────────┐
       └─────►│ DHCP │ │ DNS  │ │ TCP  │ │   HTTP   │
              │ ARP  │ │      │ │ TLS  │ │  server  │
              └──────┘ └──────┘ └──────┘ └──────────┘

The user only invokes the “fetch” use case. Every other actor in the diagram is a participant: a service the use case depends on but does not name explicitly.
Class diagram
The protocol stack itself can be drawn as a composition. Each layer holds a reference to the one below it and exposes a narrower interface upward.
┌─────────────────────────────────────────┐
│     Browser (URL parser, renderer)      │
└──────────────────┬──────────────────────┘
                   │ uses
                   ▼
┌─────────────────────────────────────────┐
│               HTTP / TLS                │
│     request, response, certificate      │
└──────────────────┬──────────────────────┘
                   │ stream of bytes
                   ▼
┌─────────────────────────────────────────┐
│                   TCP                   │
│    ports, seq/ack, retransmit, cwnd     │
└──────────────────┬──────────────────────┘
                   │ segment
                   ▼
┌─────────────────────────────────────────┐
│              IP (v4 / v6)               │
│    addresses, routing, fragmentation    │
└──────────────────┬──────────────────────┘
                   │ packet
                   ▼
┌─────────────────────────────────────────┐
│      Link layer (Ethernet, Wi-Fi)       │
│ framing, MAC addrs, CRC, MAC arbitration│
└──────────────────┬──────────────────────┘
                   │ frame on wire / RF
                   ▼
┌─────────────────────────────────────────┐
│             Physical layer              │
│     bits on copper / fibre / radio      │
└─────────────────────────────────────────┘

Encapsulation is the directional arrow: each layer adds a header (and sometimes a trailer) when going down, strips it when going up. The total overhead for a typical HTTP-over-TLS-over-TCP-over-IP-over-Ethernet frame is around 80 bytes of headers wrapping however many bytes of payload, all bounded by the 1500-byte MTU on a default Ethernet link.
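The overhead figure can be tallied roughly as follows. This is a minimal sketch under assumed typical sizes (IPv4 without options, a TCP header without options, a TLS 1.3 record with a 16-byte AEAD tag); the dictionary and function names are illustrative, and real paths vary.

```python
# Rough per-frame overhead for HTTPS over IPv4 on Ethernet.
# The sizes are assumed typical values, not measurements.
HEADER_BYTES = {
    "ethernet_header": 14,   # destination MAC + source MAC + EtherType
    "ethernet_fcs": 4,       # trailing CRC
    "ipv4_header": 20,       # no IP options
    "tcp_header": 20,        # no TCP options
    "tls_record_header": 5,  # content type, version, length
    "tls_aead_tag": 16,      # authentication tag appended to the ciphertext
}

MTU = 1500  # largest IP packet a default Ethernet link carries


def total_overhead() -> int:
    """Sum of all header and trailer bytes wrapped around the payload."""
    return sum(HEADER_BYTES.values())


def tcp_payload_per_packet(mtu: int = MTU) -> int:
    """Bytes of TCP payload that fit in one MTU-sized IP packet (the MSS)."""
    return mtu - HEADER_BYTES["ipv4_header"] - HEADER_BYTES["tcp_header"]


print(total_overhead())          # 79
print(tcp_payload_per_packet())  # 1460
```

TCP options such as timestamps would add 12 bytes to the header and shrink the payload accordingly.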
Sequence diagram (key flows)
The full packet exchange for a cold fetch — first visit, empty caches, no prior connection. Lifelines: User, Browser, OS, Router (LAN gateway), Resolver (recursive DNS), Server.
User   Browser     OS                           Router   Resolver    Server
 │        │         │                              │         │          │
 │ type   │         │                              │         │          │
 ├───────►│ URL     │                              │         │          │
 │        ├────────►│ no IP yet?                   │         │          │
 │        │         ├──DHCP DISCOVER (broadcast)──►│         │          │
 │        │         │◄──────DHCP OFFER─────────────│         │          │
 │        │         ├──DHCP REQUEST───────────────►│         │          │
 │        │         │◄──────DHCP ACK───────────────│         │          │
 │        │         │ have IP + gateway + DNS srv  │         │          │
 │        │         │                              │         │          │
 │        │         ├─who has 192.168.1.1? (ARP)──►│         │          │
 │        │         │◄──── 192.168.1.1 is AA:BB ───│         │          │
 │        │         │ have gateway MAC             │         │          │
 │        │         │                              │         │          │
 │        ├────────►│ resolve example.com          │         │          │
 │        │         ├──DNS QUERY example.com──────►│         │ (via gateway)
 │        │         │◄────A 93.184.216.34 ─────────│         │          │
 │        │         │                              │         │          │
 │        │         ├──TCP SYN ────────────────────────────────────────►│
 │        │         │◄──TCP SYN/ACK ────────────────────────────────────│
 │        │         ├──TCP ACK ────────────────────────────────────────►│
 │        │         │                              │         │          │
 │        │         ├──TLS ClientHello ────────────────────────────────►│
 │        │         │◄──TLS ServerHello + cert + finished ──────────────│
 │        │         ├──TLS Finished ───────────────────────────────────►│
 │        │         │                              │         │          │
 │        │         ├──HTTP GET / ─────────────────────────────────────►│
 │        │         │◄──HTTP 200 + body ────────────────────────────────│
 │        │         │                              │         │          │
 │        │◄────────┤ bytes                        │         │          │
 │ render │ page    │                              │         │          │
 │◄───────┤         │                              │         │          │
 │        │         │ (later) TCP FIN, FIN/ACK, ACK                     │

Round-trip count for the cold path: DHCP (2 RTT, only first time on this network), ARP (1 RTT, only first time for the gateway), DNS (1 RTT to the recursive resolver, possibly more upstream), TCP (1 RTT for the handshake), TLS 1.3 (1 RTT, or 0 RTT with session resumption), HTTP (1 RTT for request + first byte). Leaving DHCP and ARP aside, a fresh connection to a never-seen host needs roughly 3 RTTs to the server (TCP, TLS 1.3, HTTP), on top of the DNS lookup, before the first byte of HTML arrives.
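The round-trip budget can be turned into a back-of-envelope estimate. The sketch below assumes one RTT per stage as counted in the text, ignores server processing time and packet loss, and uses hypothetical names (`COLD_PATH_RTTS`, `time_to_first_byte_ms`) and made-up latencies.

```python
# Round trips per stage, as counted in the text (DHCP and ARP left out).
COLD_PATH_RTTS = {"dns": 1, "tcp": 1, "tls13": 1, "http": 1}
WARM_PATH_RTTS = {"http": 1}  # cached DNS answer, reused TCP+TLS connection


def time_to_first_byte_ms(rtt_server_ms: float,
                          rtt_resolver_ms: float,
                          stages: dict = COLD_PATH_RTTS) -> float:
    """Estimate time to first byte from per-stage round-trip counts."""
    total = 0.0
    for stage, count in stages.items():
        # DNS round trips go to the resolver; everything else goes to the server.
        rtt = rtt_resolver_ms if stage == "dns" else rtt_server_ms
        total += count * rtt
    return total


# Assumed latencies: 40 ms to the server, 10 ms to the resolver.
print(time_to_first_byte_ms(40, 10))                  # 130.0
print(time_to_first_byte_ms(40, 10, WARM_PATH_RTTS))  # 40.0
```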
Activity diagram (for non-trivial state)
The browser’s view of “what step am I on?” — a state machine, because each step depends on the previous and any of them can fail.
┌──────────────────────┐
│ Start (URL typed)    │
└──────────┬───────────┘
           ▼
┌──────────────────────┐
│ Parse URL            │──malformed──►(error to user)
│ scheme, host, path   │
└──────────┬───────────┘
           ▼
┌──────────────────────┐
│ DNS lookup           │──NXDOMAIN──►(error)
│ cache → resolver     │──timeout──►(retry then error)
└──────────┬───────────┘
           ▼
┌──────────────────────┐
│ TCP connect          │──RST──►(connection refused)
│ SYN → SYN/ACK → ACK  │──timeout──►(retry with backoff)
└──────────┬───────────┘
           ▼
┌──────────────────────┐
│ TLS handshake        │──cert fail──►(warning / abort)
│ verify, derive keys  │
└──────────┬───────────┘
           ▼
┌──────────────────────┐
│ HTTP request         │──5xx──►(show error, maybe retry)
│ send GET, recv body  │──4xx──►(show error to user)
└──────────┬───────────┘
           ▼
┌──────────────────────┐
│ Render               │
│ parse HTML, fetch    │──sub-resources fail──►(degraded)
│ sub-resources        │
└──────────┬───────────┘
           ▼
┌──────────────────────┐
│ Connection reuse or  │
│ FIN tear-down        │
└──────────────────────┘

The states matter because each is also a place where caching short-circuits the flow. A warm DNS cache skips the resolver hop. A keep-alive connection skips TCP and TLS entirely on the second request. A TLS session ticket skips the certificate exchange. The cold path described above is the slowest possible path; every subsequent request is shorter.
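The state machine and its caching short-cuts can be sketched as a linear walk over the steps. This is an illustrative model only: the `Step` enum, `ORDER` list and `run` function are hypothetical names, and retries and backoff are left out.

```python
from enum import Enum, auto


class Step(Enum):
    PARSE_URL = auto()
    DNS_LOOKUP = auto()
    TCP_CONNECT = auto()
    TLS_HANDSHAKE = auto()
    HTTP_REQUEST = auto()
    RENDER = auto()
    DONE = auto()
    ERROR = auto()


ORDER = [Step.PARSE_URL, Step.DNS_LOOKUP, Step.TCP_CONNECT,
         Step.TLS_HANDSHAKE, Step.HTTP_REQUEST, Step.RENDER, Step.DONE]


def run(failing=frozenset(), cached=frozenset()):
    """Return the steps visited, in order.

    Steps in `cached` are skipped (a warm cache or a reused connection);
    the first step in `failing` ends the walk in ERROR.
    """
    visited = []
    for step in ORDER:
        if step in cached:
            continue
        visited.append(step)
        if step in failing:
            visited.append(Step.ERROR)
            break
    return visited


# Second request on a keep-alive connection: DNS, TCP and TLS are skipped.
warm = run(cached={Step.DNS_LOOKUP, Step.TCP_CONNECT, Step.TLS_HANDSHAKE})
print([s.name for s in warm])  # ['PARSE_URL', 'HTTP_REQUEST', 'RENDER', 'DONE']
```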
Implementation walkthrough
A numbered narration of the same flow, in the depth a senior interview expects.
1. Keypress and URL parsing. The user types example.com and presses Enter. The browser’s URL parser fills in the scheme (many modern browsers now try https:// first), the host (example.com), the port (443 for HTTPS), and the path (/). This is purely local, with no network traffic yet. The browser also checks the HSTS preload list, which makes it refuse plaintext HTTP entirely for sites that have opted in.
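The defaulting step can be sketched with the standard library. This assumes an HTTPS-first default and covers only the fill-in logic; real browsers also handle search queries, internationalised names and much more. `normalise` and `DEFAULT_PORTS` are illustrative names.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"https": 443, "http": 80}


def normalise(typed: str, default_scheme: str = "https"):
    """Return (scheme, host, port, path) for text typed into the address bar."""
    if "://" not in typed:
        typed = f"{default_scheme}://{typed}"  # assume HTTPS when no scheme is given
    parts = urlsplit(typed)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port, parts.path or "/"


print(normalise("example.com"))  # ('https', 'example.com', 443, '/')
```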
2. DHCP (only if no IP yet). If the laptop just woke from sleep or joined a new Wi-Fi network, it has no IP address. The host sends a DHCP DISCOVER as a broadcast to 255.255.255.255 from source 0.0.0.0. A DHCP server on the LAN replies with an OFFER containing a proposed IP, subnet mask, default gateway, and DNS server addresses. The host requests it, the server acks, and the lease is bound for some hours. Without DHCP (or a static configuration) nothing else can happen: the host has no way to send a unicast packet because it has no IP to put in the source field.
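For a feel of what goes on the wire, the DISCOVER payload can be assembled by hand. This is a minimal sketch following the BOOTP/DHCP message layout from RFC 2131, with only the message-type option set; a real client adds more options (parameter request list, client identifier, and so on). The function name and the sample MAC are made up.

```python
import struct

DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])


def build_dhcp_discover(mac: bytes, xid: int) -> bytes:
    """Build the UDP payload of a DHCPDISCOVER (sent from 0.0.0.0:68 to 255.255.255.255:67)."""
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,                        # op: BOOTREQUEST
        1,                        # htype: Ethernet
        6,                        # hlen: MAC address length
        0,                        # hops
        xid,                      # transaction id chosen by the client
        0,                        # secs elapsed
        0x8000,                   # flags: ask the server to reply by broadcast
        bytes(4),                 # ciaddr: 0.0.0.0, the client has no address yet
        bytes(4),                 # yiaddr: filled in by the server's OFFER
        bytes(4),                 # siaddr
        bytes(4),                 # giaddr
        mac.ljust(16, b"\x00"),   # chaddr: client MAC, zero padded to 16 bytes
        bytes(64),                # sname (unused)
        bytes(128),               # file (unused)
    )
    # Option 53 (message type), length 1, value 1 = DISCOVER; 255 ends the options.
    options = DHCP_MAGIC_COOKIE + bytes([53, 1, 1]) + bytes([255])
    return header + options


packet = build_dhcp_discover(bytes.fromhex("aabbccddeeff"), xid=0x12345678)
print(len(packet))  # 244
```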
3. ARP for the default gateway. The host now has an IP and knows the gateway’s IP (say 192.168.1.1), but doesn’t know the gateway’s MAC address. It broadcasts an ARP request: “who has 192.168.1.1, tell 192.168.1.42.” The gateway replies with its MAC. The host caches the mapping for a few minutes. Every packet leaving the LAN from this host will now have the gateway’s MAC as the Layer 2 destination, until the cache expires.
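The ARP request itself is a small fixed-size frame. The sketch below builds one with `struct`; the MAC address is a made-up example, and actually sending the frame would need a raw socket and elevated privileges, which is left out.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Ethernet frame (without the trailing FCS) carrying an ARP who-has request."""
    ethernet = struct.pack("!6s6sH", BROADCAST_MAC, src_mac, 0x0806)  # EtherType: ARP
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # opcode: request
        src_mac,                      # sender MAC
        socket.inet_aton(src_ip),     # sender IP ("tell 192.168.1.42")
        bytes(6),                     # target MAC: unknown, that is the question
        socket.inet_aton(target_ip),  # target IP ("who has 192.168.1.1")
    )
    return ethernet + arp


frame = build_arp_request(bytes.fromhex("aabbccddeeff"), "192.168.1.42", "192.168.1.1")
print(len(frame))  # 42
```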
4. DNS resolution. The browser asks the OS to resolve example.com. The OS checks its own cache (which also holds negative answers) and, on a miss or an expired entry, forwards the query to the recursive DNS resolver learned via DHCP. The resolver may have the answer cached; if not, it walks the hierarchy (root → .com TLD → authoritative server for example.com) and returns the A or AAAA record. The result here is 93.184.216.34, the well-known address long served for the IANA-reserved example.com. The exchange normally runs over UDP port 53, often one round trip from the host to the resolver but possibly several from the resolver outward.
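The query the host sends to the resolver is compact enough to build by hand. This sketch encodes a standard recursive query for an A record; it does not send anything or parse the reply, and `build_dns_query` is an illustrative name.

```python
import struct


def build_dns_query(name: str, query_id: int, qtype: int = 1) -> bytes:
    """Build a DNS query message (qtype 1 = A, 28 = AAAA)."""
    header = struct.pack(
        "!HHHHHH",
        query_id,   # echoed back by the resolver so the reply can be matched
        0x0100,     # flags: standard query, recursion desired
        1,          # one question
        0, 0, 0,    # no answer, authority or additional records
    )
    # Each label is length-prefixed; a zero byte terminates the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN
    return header + question


query = build_dns_query("example.com", query_id=0xBEEF)
print(len(query))  # 29
```

At 29 bytes the query fits easily in a single UDP datagram.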
5. TCP three-way handshake. The browser asks the OS to open a TCP connection to 93.184.216.34:443. The OS picks an ephemeral source port (say 54321), builds a TCP segment with SYN flag set and a random initial sequence number, wraps it in an IP packet, hands it to the Ethernet driver, which frames it with the source’s MAC and the gateway’s MAC and clocks it onto the wire. The server’s stack receives the SYN, allocates a TCB, replies with SYN/ACK and its own random sequence number. The host ACKs, and the connection is ESTABLISHED on both sides.
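The handshake is performed by the kernel, so application code only sees `connect()` return. A minimal sketch on the loopback interface (so it needs no external network) shows the ephemeral source port the OS picks; `handshake_on_loopback` is an illustrative name.

```python
import socket


def handshake_on_loopback():
    """Open a TCP connection to a local listener; return (client_port, server_port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
        listener.listen(1)
        listener.settimeout(5)
        server_port = listener.getsockname()[1]
        # create_connection() returns once SYN, SYN/ACK and ACK have been exchanged.
        with socket.create_connection(("127.0.0.1", server_port), timeout=5) as client:
            conn, peer = listener.accept()
            with conn:
                client_port = client.getsockname()[1]
                # The server sees the client's ephemeral port as the peer port.
                assert peer[1] == client_port
    return client_port, server_port


client_port, server_port = handshake_on_loopback()
print(client_port != server_port)  # True
```

Running `tcpdump -i lo` alongside (the interface name varies by OS) should show the three segments described in this step.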
6. TLS handshake. Over the new TCP connection the browser sends a TLS ClientHello listing supported cipher suites, the requested hostname in SNI, and (in TLS 1.3) its key share. The server replies with ServerHello, its certificate chain, and its key share — enough for the client to verify the certificate against the host’s trust store and derive a session key. With TLS 1.3 the encrypted application data can flow immediately after the client sends its Finished. One RTT total; session resumption with a pre-shared key can drop that to zero.
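On the client side, most of this step is configuration. The sketch below uses Python’s `ssl` module; `make_client_context` and `open_tls` are illustrative names, and the TLS 1.2 minimum is an assumption rather than something the text prescribes.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client context that verifies the certificate chain and the hostname."""
    ctx = ssl.create_default_context()            # loads the system trust store
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy: refuse older versions
    return ctx


def open_tls(host: str, port: int = 443, timeout: float = 5.0) -> ssl.SSLSocket:
    """Connect and complete the TLS handshake; raises ssl.SSLError on a bad certificate."""
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        # server_hostname sets SNI and is the name the certificate is checked against.
        return make_client_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise


ctx = make_client_context()
print(ctx.check_hostname, ctx.verify_mode == ssl.CERT_REQUIRED)  # True True
```

Calling `open_tls("example.com").version()` would report the negotiated protocol version, but that needs network access.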
7. HTTP request. The browser sends GET / HTTP/1.1 (or HTTP/2 framed equivalent) with headers — Host: example.com, User-Agent, Accept-Encoding: gzip, Cookie if any are stored for the host. The whole thing is encrypted inside the TLS session before it touches TCP. The server receives the bytes, decrypts inside its TLS stack, hands the HTTP request to the web server (nginx, Apache, Caddy, or whatever), which routes it to a handler.
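In HTTP/1.1 the request is plain text with CRLF line endings. A minimal sketch of the bytes handed to the TLS layer follows; the `User-Agent` value and the function name are made up for illustration.

```python
from typing import Dict, Optional


def build_get_request(host: str, path: str = "/",
                      extra_headers: Optional[Dict[str, str]] = None) -> bytes:
    """Serialise an HTTP/1.1 GET request; these bytes are what TLS then encrypts."""
    headers = {
        "Host": host,                             # mandatory in HTTP/1.1
        "User-Agent": "packet-walkthrough/0.1",   # hypothetical client name
        "Accept-Encoding": "gzip",
        "Connection": "keep-alive",
    }
    headers.update(extra_headers or {})
    lines = [f"GET {path} HTTP/1.1"] + [f"{k}: {v}" for k, v in headers.items()]
    # A blank line (CRLF CRLF) marks the end of the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


print(build_get_request("example.com").decode("ascii"))
```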
8. Server processes the request. The server’s HTTP handler reads the path, maybe consults a cache (Varnish, CDN edge), maybe runs application code, maybe queries a database, and produces a response. Status 200 OK, headers including Content-Type: text/html, Content-Length, Cache-Control, and the HTML body. The response is handed back through the encryption layer.
9. HTTP response and body delivery. The server writes the response into the TLS stream, which writes ciphertext into the TCP stream. TCP segments the response according to MSS (~1460 bytes per segment on a default Ethernet path), each segment becomes an IP packet, each packet becomes a frame. The frames make their way back to the host — possibly across many routers, each doing a Layer 3 forwarding decision based on its routing table. The host re-orders any out-of-order segments, ACKs them, and hands a contiguous byte stream up to TLS, which decrypts, which hands plaintext to HTTP, which hands the parsed response to the browser.
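Segmentation and in-order reassembly can be modelled in a few lines. This toy sketch uses byte offsets as relative sequence numbers and ignores ACKs, retransmission and windowing; `segment` and `reassemble` are illustrative names.

```python
def segment(stream: bytes, mss: int = 1460):
    """Split a byte stream into (relative sequence number, payload) segments."""
    return [(offset, stream[offset:offset + mss])
            for offset in range(0, len(stream), mss)]


def reassemble(segments) -> bytes:
    """Rebuild the stream from segments that may have arrived out of order."""
    return b"".join(payload for _, payload in sorted(segments))


body = bytes(range(256)) * 40             # 10240 bytes of stand-in response data
segments = segment(body)
print(len(segments))                      # 8 (seven full segments plus 20 bytes)
print(reassemble(segments[::-1]) == body) # True
```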
10. HTML parsing and sub-resource fetching. The browser’s parser builds the DOM as bytes arrive; modern browsers stream-parse so rendering can begin before the whole page is downloaded. It discovers <link rel="stylesheet">, <script src="...">, <img src="..."> and queues sub-resource fetches. Each sub-resource that lives on the same host reuses the already-open connection (HTTP/1.1 keep-alive or HTTP/2 multiplexed streams over TCP+TLS, HTTP/3 streams over QUIC). Cross-origin sub-resources may trigger new DNS lookups and new TLS handshakes.
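Sub-resource discovery can be approximated with the standard-library HTML parser. This sketch is far simpler than a browser’s preload scanner, and it compares hostnames only, whereas a real origin check also covers scheme and port. The class and function names and the CDN hostname are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class SubresourceFinder(HTMLParser):
    """Collect URLs from <link href>, <script src> and <img src>."""

    ATTR_FOR_TAG = {"link": "href", "script": "src", "img": "src"}

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.ATTR_FOR_TAG.get(tag)
        value = dict(attrs).get(wanted) if wanted else None
        if value:
            # Resolve relative references against the page URL.
            self.urls.append(urljoin(self.base_url, value))


def split_by_host(base_url: str, urls):
    """Separate URLs that can reuse the open connection from those that cannot."""
    base_host = urlsplit(base_url).hostname
    same = [u for u in urls if urlsplit(u).hostname == base_host]
    other = [u for u in urls if urlsplit(u).hostname != base_host]
    return same, other


page = ('<link rel="stylesheet" href="/a.css">'
        '<script src="https://cdn.example.net/x.js"></script>'
        '<img src="logo.png">')
finder = SubresourceFinder("https://example.com/")
finder.feed(page)
same, other = split_by_host("https://example.com/", finder.urls)
print(same)   # ['https://example.com/a.css', 'https://example.com/logo.png']
print(other)  # ['https://cdn.example.net/x.js']
```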
11. Render. Style rules apply to DOM nodes, the layout engine computes positions, the compositor paints. JavaScript executes — possibly mutating the DOM, possibly issuing more network requests (XHR, fetch, WebSocket). The page is “loaded” by some definition (DOMContentLoaded, load, largest contentful paint) but may continue to update for the lifetime of the tab.
12. Connection lifecycle. When the browser is done with the connection it sends a TCP FIN. The server replies with FIN/ACK, the browser ACKs, the connection is CLOSED. In practice, modern browsers leave HTTPS connections in keep-alive for many seconds in case the user clicks another link to the same host, because closing and reopening would cost another TCP + TLS setup (and a DNS lookup if the cached answer has expired). The side that closed first keeps the connection in TIME_WAIT for twice the maximum segment lifetime, typically one to a few minutes depending on the OS, to prevent stray packets from a recycled 5-tuple confusing a new connection.
13. Caches at every layer (the second visit is much faster). On the second visit to example.com, the DNS resolver answers from cache (zero new DNS work for the TTL window). The OS’s ARP cache still has the gateway entry. If the browser left a keep-alive connection open, even the TCP and TLS are skipped: the HTTP GET flies on an existing encrypted stream. The cold path’s 3 RTTs shrink to the single request/response round trip, with zero RTTs of setup. This is the single biggest reason “the second visit is always fast”: every layer caches its own state.
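The same pattern, a lookup that is answered locally until a timer expires, recurs at each layer. A minimal sketch for the DNS case, with an injectable clock so the expiry can be demonstrated without waiting; `TtlCache` and the fake resolver are illustrative, not any OS’s actual implementation.

```python
import time


class TtlCache:
    """Answer lookups from memory until the record's TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # name -> (expiry time, address)
        self.misses = 0      # how many lookups had to go to the resolver

    def get(self, name, resolve):
        """`resolve(name)` must return (address, ttl_seconds)."""
        now = self._clock()
        hit = self._entries.get(name)
        if hit and hit[0] > now:
            return hit[1]                      # warm path: no network traffic
        self.misses += 1
        address, ttl = resolve(name)           # cold path: ask the resolver
        self._entries[name] = (now + ttl, address)
        return address


# Fake clock and resolver for the demonstration.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
fake_resolve = lambda name: ("93.184.216.34", 60.0)

cache.get("example.com", fake_resolve)   # miss
cache.get("example.com", fake_resolve)   # hit
now[0] = 61.0                            # TTL has expired
cache.get("example.com", fake_resolve)   # miss again
print(cache.misses)  # 2
```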
Trade-offs and extensions
tcpdump + curl -v, and supported everywhere. Three RTTs to first byte on a cold connection, but proven robust across thirty years of deployment. Head-of-line blocking inside TCP: one dropped segment stalls the whole stream until retransmit.
Other extensions worth knowing:
- Connection reuse. HTTP/1.1 keep-alive and HTTP/2 multiplexing both reuse TCP connections across many requests. The amortised cost to first byte for subsequent requests is one RTT, or zero where the server has already pushed the resource (HTTP/2 server push, which browsers have since largely removed).
- DNS caching at every layer. Browser DNS cache, OS resolver cache, recursive resolver cache, authoritative server. TTLs determine how stale answers can be. A 60-second TTL is the practical floor for production services that need fast failover.
- TLS session resumption. Session tickets and PSKs let the client and server skip the certificate exchange on subsequent connections. TLS 1.3 0-RTT pushes early application data on the first packet — fast but replayable, so only safe for idempotent requests.
- CDN edge servers. Most popular hostnames resolve to a CDN edge close to the user, not the origin. The “end-to-end” path is really host → edge for most users, with edge → origin happening rarely and over a warm dedicated connection.
- IPv6 dual-stack. A modern host issues both A and AAAA DNS queries and races the two connections via Happy Eyeballs. Whichever connects first wins; the other is closed.
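The ordering half of Happy Eyeballs can be sketched briefly. As far as I recall, RFC 8305 interleaves the address families with IPv6 first and staggers the connection attempts by a short delay (250 ms is the recommended default); the sketch covers only the interleaving, not the staggered racing, and uses documentation-range addresses. `interleave_families` is an illustrative name.

```python
from itertools import zip_longest


def interleave_families(v6, v4):
    """Order candidate addresses IPv6 first, alternating between families."""
    ordered = []
    for a, b in zip_longest(v6, v4):
        if a is not None:
            ordered.append(a)
        if b is not None:
            ordered.append(b)
    return ordered


candidates = interleave_families(["2001:db8::1", "2001:db8::2"], ["192.0.2.1"])
print(candidates)  # ['2001:db8::1', '192.0.2.1', '2001:db8::2']
```

A client would then start a connection attempt to each candidate in turn, keep the first one that completes and close the rest.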
Mock interview follow-ups
- “What if DNS returns multiple A records?” The client picks one (round-robin from the resolver’s perspective, often pseudo-random) and connects. On failure, browsers will retry the next address. CDNs use this for load distribution and failover.
- “What does the gateway’s routing table look like for 93.184.216.34?” Longest-prefix match. The default route 0.0.0.0/0 points at the ISP. The ISP’s routers run BGP and have more specific routes that get the packet onto the right transit path toward Verizon/EdgeCast (where example.com lives).
- “Why is TCP setup one RTT but TLS 1.2 was two?” TCP needs SYN → SYN/ACK → ACK, with the client able to send data piggy-backed on the third packet. TLS 1.2 needed ClientHello → ServerHello + cert → ClientKeyExchange + Finished → ServerFinished, which is two full round trips. TLS 1.3 redesigned the handshake to fold the key exchange into the first round trip.
- “Where does the packet go if the gateway is down?” Nowhere helpful. The host’s ARP entry for the gateway expires in a few minutes; until then the host keeps trying to send to a dead MAC. Re-issuing DHCP may find a backup gateway if one exists. Multi-homed hosts can fail over via routing-table priority.
- “How does Wi-Fi change this story?” The link layer changes: framing is 802.11 not Ethernet, the MAC protocol is CSMA/CA not switched-Ethernet, and there’s an extra association step before DHCP. Everything above the link layer is identical — Layer 3 onward does not know whether the underlying medium is copper, fibre, or radio.
- “What’s the smallest meaningful failure that breaks this whole flow?” A wrong DNS answer. The host then opens a TCP connection to an unrelated server and either gets a TLS certificate mismatch (the best case, because the browser refuses) or, worse, is shown a valid certificate that an attacker obtained for the hostname by compromising a DNS path that DNSSEC does not protect.
- “Where would you add observability if you owned this whole stack?” Per-stage timing — DNS lookup time, TCP connect time, TLS handshake time, TTFB, content download time. Real-user monitoring (RUM) and the Navigation Timing API expose exactly these. The shape of the breakdown tells you whether a slow page is a DNS problem, a network problem, a server problem, or a render problem.
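The longest-prefix match from the routing-table question above can be sketched with the `ipaddress` module. The route table and next-hop labels are made up, and a real router uses a trie or hardware lookup rather than a linear scan.

```python
import ipaddress


def longest_prefix_match(dest: str, routes: dict) -> str:
    """Return the next hop of the most specific route that contains dest."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    if best is None:
        raise LookupError(f"no route to {dest}")
    return best[1]


# Hypothetical table: a default route, the local LAN, and one more specific prefix.
routes = {
    "0.0.0.0/0": "isp",
    "192.168.1.0/24": "lan",
    "93.184.216.0/24": "transit",
}
print(longest_prefix_match("93.184.216.34", routes))  # transit
print(longest_prefix_match("8.8.8.8", routes))        # isp
```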
Related concepts