HTTP — The Foundational Protocol for APIs

Methods, status codes, headers, persistent connections — the parts of HTTP every API designer must own.

Building Block Foundational
13 min read
http protocol headers status-codes caching

What it is

HTTP (Hypertext Transfer Protocol) is the application-layer protocol that carries almost every API call on the public Internet. Originally specified by Tim Berners-Lee in 1991 as a one-line request-response protocol for hypertext documents, it has grown into the universal substrate for distributed-system communication — RFC 9110 (HTTP semantics), RFC 9112 (HTTP/1.1 message syntax), RFC 9113 (HTTP/2), and RFC 9114 (HTTP/3) are the current normative specs.

For an API designer, HTTP is not just “the thing requests travel over”. It is a vocabulary of methods, status codes, and headers that already encodes most of the semantics your API needs. A REST API that respects HTTP gets caching, intermediaries, content negotiation, conditional updates, and partial responses for free. One that fights HTTP — tunnelling everything through POST with 200 OK and a body-encoded error code — throws all of that away.

The protocol is stateless: each request carries everything the server needs to process it. State that persists between calls (sessions, auth, cache validity) is carried explicitly in headers or cookies. Statelessness is what makes HTTP horizontally scalable and CDN-friendly; it is also what forces every API designer to think about idempotency, retries, and replay safety from day one.

When to use it

Reach for HTTP when:

  • The API is consumed by browsers, mobile apps, or polyglot backends. HTTP is the lowest-common-denominator protocol in every language and runtime.
  • You want intermediaries to be useful. CDNs, reverse proxies, API gateways, WAFs, load balancers — every one of them understands HTTP semantics. They can cache, route, retry, throttle, and observe traffic without your application’s cooperation.
  • You need a well-supported security story. HTTP over TLS (HTTPS), preferably TLS 1.3, is the default across the Internet. Browsers ship the cryptographic primitives; libraries handle the handshake.
  • You want documentation tooling for free. OpenAPI, Postman, curl, browser dev-tools, every HTTP debugger — all already know how to inspect HTTP traffic.

Avoid (or augment) HTTP when:

  • You need bidirectional streaming. HTTP/1.1 is half-duplex per request. Use WebSockets (see WebSockets — Bidirectional Streaming) or gRPC streaming over HTTP/2.
  • Each operation is tiny and the latency budget is extremely tight. A trading-engine RPC at sub-millisecond latency may not tolerate HTTP framing overhead; a custom binary protocol over TCP can beat it.
  • The transport must run over UDP for loss-tolerance. Video and game traffic typically pick QUIC (The Evolution of HTTP — 1.1, 2, 3 discusses HTTP/3 over QUIC) or raw UDP.

How it works

An HTTP exchange is one request followed by one response, both with the same four-part shape: a start line, a set of headers, an empty line, and an optional body.

A representative request (the client holds a stale ETag)
GET /v1/orders/ord_a3f9c2 HTTP/1.1
Host: api.example.com
Accept: application/json
Authorization: Bearer eyJhbGciOi...
If-None-Match: W/"a1b2"
User-Agent: example-cli/2.4.1

A representative response
HTTP/1.1 200 OK
Content-Type: application/json
ETag: W/"c8f3"
Cache-Control: private, max-age=60
Vary: Accept-Encoding
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 873
X-RateLimit-Reset: 1685440800

{ "id": "ord_a3f9c2", "status": "confirmed" }

Every word in those two messages is part of the contract. The verb, the path, the version, the headers, the body — and the same on the way back.
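
To make the message shape concrete, here is a minimal sketch that splits a raw HTTP/1.1 message into its start line, headers, and body. The function name `parse_http_message` is hypothetical, and the sketch ignores details a real parser must handle (repeated headers, obsolete line folding, binary bodies).

```python
def parse_http_message(raw: str) -> dict:
    """Split a raw HTTP/1.1 message into start line, headers, and body."""
    # Normalise line endings so the sketch accepts both CRLF and bare LF.
    text = raw.replace("\r\n", "\n")
    # The first empty line separates the header section from the body.
    head, _, body = text.partition("\n\n")
    start_line, *header_lines = head.split("\n")
    headers = {}
    for line in header_lines:
        # Split on the first colon only; values such as "localhost:8080" contain colons.
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lower-cased.
        headers[name.strip().lower()] = value.strip()
    return {"start_line": start_line, "headers": headers, "body": body}


request = (
    "GET /v1/orders/ord_a3f9c2 HTTP/1.1\r\n"
    "Host: api.example.com\r\n"
    "Accept: application/json\r\n"
    "\r\n"
)
parsed = parse_http_message(request)
method, path, version = parsed["start_line"].split(" ")
```

The same function works on a response, where the start line is the status line (for example `HTTP/1.1 200 OK`) instead of the request line.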

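The verbs (methods)

Eight methods, with the semantics every API designer must remember (CONNECT, which exists for proxy tunnelling, is omitted; PATCH is defined in RFC 5789 rather than in the core semantics spec):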

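| Method  | Purpose                                 | Safe | Idempotent | Body? |
|---------|-----------------------------------------|------|------------|-------|
| GET     | Read a resource                         | yes  | yes        | no    |
| HEAD    | Read response metadata only             | yes  | yes        | no    |
| OPTIONS | Discover allowed verbs / CORS preflight | yes  | yes        | no    |
| POST    | Create or trigger (server picks ID)     | no   | no         | yes   |
| PUT     | Replace a resource (client picks ID)    | no   | yes        | yes   |
| PATCH   | Partial update                          | no   | depends    | yes   |
| DELETE  | Remove a resource                       | no   | yes        | no    |
| TRACE   | Echo (debugging; usually disabled)      | yes  | yes        | no    |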

Safe means the call does not mutate server state. Idempotent means N identical calls have the same effect as 1. POST is the only common verb that is never guaranteed to be either (PATCH can be made idempotent, POST cannot), which is why every retry of a POST needs an idempotency key.
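
A client-side retry policy can encode these semantics directly. The sketch below is one possible rule, not a standard algorithm: the function name `is_safe_to_retry` is hypothetical, and it assumes the server deduplicates on an `Idempotency-Key` request header.

```python
# Methods whose semantics make N identical calls equivalent to one.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE", "TRACE"}


def is_safe_to_retry(method: str, headers: dict) -> bool:
    """Return True if re-sending the request cannot cause a duplicate side effect."""
    if method.upper() in IDEMPOTENT_METHODS:
        return True
    # POST (and PATCH in general) is only retryable when the server can
    # recognise the repeat, which this sketch assumes happens via a key header.
    return any(name.lower() == "idempotency-key" for name in headers)


can_retry_read = is_safe_to_retry("GET", {})
can_retry_bare_post = is_safe_to_retry("POST", {"Content-Type": "application/json"})
can_retry_keyed_post = is_safe_to_retry("POST", {"Idempotency-Key": "k-123"})
```

Treating PATCH like POST is the conservative choice here; an API that guarantees idempotent patches could add it to the set.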

Status codes

HTTP defines five families. The codes you actually need to handle for an API:

| Family | Meaning          | Codes that matter |
|--------|------------------|-------------------|
| 1xx    | Informational    | 100 Continue, 101 Switching Protocols (used by WebSocket upgrade) |
| 2xx    | Success          | 200 OK, 201 Created, 202 Accepted, 204 No Content, 206 Partial Content |
| 3xx    | Redirect / cache | 301 Moved Permanently, 302 Found, 304 Not Modified, 307 Temporary Redirect, 308 Permanent Redirect |
| 4xx    | Client error     | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 405 Method Not Allowed, 409 Conflict, 410 Gone, 412 Precondition Failed, 415 Unsupported Media Type, 422 Unprocessable Entity, 429 Too Many Requests |
| 5xx    | Server error     | 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout |

A well-designed API picks the right code, every time, consistently. 200 OK with a body that says { "error": "..." } is the single most common API-design rookie mistake — it lies to every cache, retry library, and observability tool downstream.
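
Because the family is encoded in the first digit, downstream tooling can classify a response without reading the body. The following sketch shows that idea using Python's standard `http.HTTPStatus`; the names `classify` and `should_retry`, and the choice of which codes count as retryable, are assumptions made for illustration.

```python
from http import HTTPStatus

FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client_error",
    5: "server_error",
}

# Hypothetical policy: only transient conditions are worth retrying.
RETRYABLE = {
    HTTPStatus.TOO_MANY_REQUESTS,    # 429
    HTTPStatus.BAD_GATEWAY,          # 502
    HTTPStatus.SERVICE_UNAVAILABLE,  # 503
    HTTPStatus.GATEWAY_TIMEOUT,      # 504
}


def classify(status: int) -> str:
    """Map a status code to its family using the leading digit."""
    return FAMILIES.get(status // 100, "unknown")


def should_retry(status: int) -> bool:
    """Decide from the status code alone whether a retry may help."""
    return status in RETRYABLE


ok_family = classify(200)
throttled_family = classify(429)
```

This is exactly the logic that breaks when an API returns 200 OK for failures: `classify` reports success and `should_retry` never fires.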

Headers that actually matter

Headers are name-value pairs that carry metadata. The HTTP spec defines hundreds; in API-design practice a small subset does most of the work.

Request headers

  • Host — required since HTTP/1.1; identifies the target server (one IP can host many virtual hosts).
  • Accept — what content types the client can parse (application/json, application/vnd.api+json).
  • Accept-Encoding — what compressions the client supports (gzip, br).
  • Authorization — auth credentials (Bearer <token>, Basic ..., custom schemes).
  • Content-Type — the type of the request body (application/json, multipart/form-data).
  • Content-Length — byte length of the body; absent when the body is sent with Transfer-Encoding: chunked.
  • If-Match / If-None-Match — conditional requests against an ETag.
  • If-Modified-Since — conditional GET against a Last-Modified timestamp.
  • User-Agent — the calling library; useful for analytics and debugging.
  • X-Request-Id — caller-supplied request ID for tracing.

Response headers

  • Content-Type — the type of the response body.
  • Content-Encoding — the compression used (gzip, br).
  • ETag — opaque version token of the resource (used with conditional updates).
  • Last-Modified — timestamp version of the resource.
  • Cache-Control — caching directives (public, private, max-age=60, no-store, must-revalidate).
  • Vary — list of request headers that affect the response (Vary: Accept-Encoding, Authorization); critical for CDN correctness.
  • Location — used by 201 Created and 3xx redirects.
  • Retry-After — how long the caller should wait before retrying, given as a number of seconds or as an HTTP date (paired with 429 and 503).
  • X-RateLimit-Limit / -Remaining / -Reset — rate-limit telemetry; widely-deployed convention, not a formal standard.
  • Strict-Transport-Security — pin clients to HTTPS for a duration.
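
As an example of consuming one of these headers, the sketch below turns a Retry-After value into a delay in seconds. Retry-After may carry either a number of seconds or an HTTP date, so a client has to handle both. The function name `retry_after_seconds` and its `now` parameter (there to make the example deterministic) are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> float:
    """Convert a Retry-After header value into a non-negative delay in seconds."""
    value = value.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "Retry-After: 30".
        return float(value)
    # HTTP-date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT".
    retry_at = parsedate_to_datetime(value)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the caller may retry immediately.
    return max(0.0, (retry_at - now).total_seconds())


fixed_now = datetime(2015, 10, 21, 7, 27, 0, tzinfo=timezone.utc)
delay_from_seconds = retry_after_seconds("30")
delay_from_date = retry_after_seconds("Wed, 21 Oct 2015 07:28:00 GMT", now=fixed_now)
```

A production client would usually also cap the delay and add jitter so that many callers do not retry at the same instant.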

Conditional requests and caching

The ETag / If-None-Match pair is one of the most elegant pieces of HTTP semantics. The server tags every response with an opaque version (ETag: "v17"); the client stores it; on the next call the client sends If-None-Match: "v17"; if the resource has not changed, the server returns 304 Not Modified with an empty body. Bytes saved: the entire payload. Latency saved: the body transfer and the parse; the round trip itself remains.

Conditional GET: initial request
GET /v1/users/u_42 HTTP/1.1
Host: api.example.com

Initial response
HTTP/1.1 200 OK
Content-Type: application/json
ETag: "v17"
Cache-Control: private, max-age=300

{ "id": "u_42", "name": "Ada", "role": "admin" }

Conditional GET: next request, cache still warm
GET /v1/users/u_42 HTTP/1.1
Host: api.example.com
If-None-Match: "v17"

304: body skipped entirely
HTTP/1.1 304 Not Modified
ETag: "v17"
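
On the server side, the exchange above comes down to comparing the client's token with the current one. The sketch below is one way to do that, assuming the ETag is derived from a hash of the serialised body; the function names `make_etag` and `respond` are hypothetical, and the wildcard `*` and weak-comparison rules are left out for brevity.

```python
import hashlib
import json
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the response bytes (quoted, as the header requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(resource: dict, if_none_match: Optional[str]):
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    if if_none_match is not None:
        # The header may list several comma-separated tags.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            # Client copy is current: send the validator, skip the body.
            return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Type": "application/json"}, body


user = {"id": "u_42", "name": "Ada", "role": "admin"}
first_status, first_headers, first_body = respond(user, None)
second_status, _, second_body = respond(user, first_headers["ETag"])
```

Hashing the body is only one option; a version counter or last-modified timestamp stored with the record avoids serialising the resource just to answer 304.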

The same ETag token is also used for optimistic concurrency. A PUT with If-Match: "v17" succeeds only if the server’s current version is still v17; if someone else has written first, the server returns 412 Precondition Failed and the client must re-read, re-merge, retry.
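
The If-Match flow can be illustrated with a small in-memory store. This is a sketch under stated assumptions: the class `VersionedStore` is hypothetical, versions are simple counters rendered as `"v1"`, `"v2"`, and a real service would perform the compare-and-write atomically in its database.

```python
class VersionedStore:
    """Toy resource store that enforces If-Match on writes."""

    def __init__(self):
        self._items = {}  # key -> (version number, value)

    @staticmethod
    def _etag(version: int) -> str:
        return f'"v{version}"'

    def get(self, key):
        version, value = self._items[key]
        return self._etag(version), value

    def put(self, key, value, if_match=None) -> int:
        """Write a value and return the HTTP status code the server would send."""
        current = self._items.get(key)
        if current is None:
            if if_match is not None:
                return 412  # caller expected an existing version
            self._items[key] = (1, value)
            return 201
        version, _ = current
        if if_match is not None and if_match != self._etag(version):
            return 412  # someone else wrote first
        # Without If-Match this sketch falls back to last-write-wins.
        self._items[key] = (version + 1, value)
        return 200


store = VersionedStore()
created = store.put("u_42", {"name": "Ada"})
etag, _ = store.get("u_42")
first_write = store.put("u_42", {"name": "Ada L."}, if_match=etag)
stale_write = store.put("u_42", {"name": "A. Lovelace"}, if_match=etag)
```

The second writer reuses the old token, receives 412, and has to re-read the resource before trying again.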

A representative conditional GET

The same call in Python: read a user, send an If-None-Match, and handle 304 cheaply. A client written in another language, such as Go or Node, sends the same request on the wire.

Conditional GET: Python
import requests

ETAG_CACHE = {}  # url -> (etag, body)


def get_user(user_id: str) -> dict:
    url = f"https://api.example.com/v1/users/{user_id}"
    headers = {
        "Accept": "application/json",
        "Authorization": "Bearer eyJhbGciOi...",
    }
    cached = ETAG_CACHE.get(url)
    if cached:
        # Ask the server to skip the body if our copy is still current.
        headers["If-None-Match"] = cached[0]
    resp = requests.get(url, headers=headers, timeout=5)
    if resp.status_code == 304 and cached:
        return cached[1]  # body unchanged; reuse the cached copy
    resp.raise_for_status()
    body = resp.json()
    etag = resp.headers.get("ETag")
    if etag:  # only cache responses that can be revalidated later
        ETAG_CACHE[url] = (etag, body)
    return body

The wire format is identical whichever language sends the request; the language is only style. That is HTTP’s superpower.

Persistent connections and connection pooling

In HTTP/1.0, every request opened a fresh TCP connection. Without connection reuse, each call pays the TCP three-way handshake plus TLS negotiation (1 RTT minimum on TLS 1.3, 2 RTTs on TLS 1.2), which can easily add 100 ms per call.

HTTP/1.1 introduced persistent connections by default — the same TCP connection carries many request-response pairs back-to-back. Connection: keep-alive is implicit; Connection: close is the explicit opt-out.

A modern HTTP client maintains a connection pool per host:

client_pool
├── api.example.com  ── [conn-1] (idle, last-used 0.2s ago)
│                    ── [conn-2] (in use, response streaming)
│                    ── [conn-3] (idle, last-used 1.8s ago)
└── auth.example.com ── [conn-4] (idle, last-used 0.5s ago)

When a new request arrives, the pool checks out an idle connection, sends the request, awaits the response, and returns the connection to the pool. The handshake amortises across many calls. The pool size, idle timeout, and keep-alive behaviour are all tunable; most production clients ship with sensible defaults (Go’s http.DefaultTransport keeps up to 100 idle connections in total but only 2 per host, with a 90s idle timeout; Node’s undici likewise pools connections per origin).
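
The checkout and check-in cycle can be sketched with the standard library alone. This is a toy, not how any particular client implements pooling: the class `ConnectionPool` and its `max_idle_per_host` parameter are hypothetical, and idle timeouts, thread safety, and dead-connection detection are omitted.

```python
import http.client
from collections import defaultdict, deque


class ConnectionPool:
    """Toy per-host pool: reuse idle connections, open new ones only when none are idle."""

    def __init__(self, max_idle_per_host: int = 2):
        self.max_idle_per_host = max_idle_per_host
        self._idle = defaultdict(deque)  # host -> idle connections

    def checkout(self, host: str) -> http.client.HTTPSConnection:
        idle = self._idle[host]
        if idle:
            return idle.pop()  # most recently used connection first
        # Creating the object opens no socket; the TCP and TLS handshakes
        # happen lazily on the first request sent over it.
        return http.client.HTTPSConnection(host, timeout=5)

    def checkin(self, host: str, conn: http.client.HTTPSConnection) -> None:
        idle = self._idle[host]
        if len(idle) < self.max_idle_per_host:
            idle.append(conn)
        else:
            conn.close()  # pool is full; drop the surplus connection


pool = ConnectionPool(max_idle_per_host=1)
conn_a = pool.checkout("api.example.com")
pool.checkin("api.example.com", conn_a)
conn_b = pool.checkout("api.example.com")  # same object as conn_a, handshake reused
```

Between checkout and check-in the caller would issue `conn.request(...)` and read the full response; a connection whose response has not been fully read cannot be reused safely.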

Chunked transfer encoding

When the server doesn’t know the response size ahead of time (streaming JSON, server-sent events, file downloads in progress), it uses Transfer-Encoding: chunked. Each chunk carries its own length prefix; the stream ends with a zero-length chunk:

HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

19
{"items": [{"id": "i_1"},
1e
{"id": "i_2"}, {"id": "i_3"}]}
0

Chunked encoding is what makes streaming responses possible over HTTP/1.1 without computing the full body up-front. HTTP/2 and HTTP/3 don’t need it — they have native framing.
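
The framing is simple enough to sketch in a few lines: each chunk is its length in hexadecimal, CRLF, the data, CRLF, and the stream ends with a zero-length chunk. The function names `encode_chunked` and `decode_chunked` are hypothetical, and the decoder ignores trailers, which a full implementation would parse after the final chunk.

```python
def encode_chunked(chunks) -> bytes:
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = b""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would end the stream early
            out += f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    return out + b"0\r\n\r\n"  # terminating chunk plus final empty line


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the payload from a chunked body (trailers are ignored)."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Drop optional chunk extensions after ';' before parsing the hex size.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF


parts = [b'{"items": [{"id": "i_1"},', b'{"id": "i_2"}, {"id": "i_3"}]}']
wire = encode_chunked(parts)
payload = decode_chunked(wire)
```

Note that the length prefix counts bytes of chunk data only; the CRLF pairs around each chunk are framing and are not included.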

Variants

| Variant  | Mechanism | When it fits |
|----------|-----------|--------------|
| HTTP/1.0 | One TCP connection per request | Legacy systems; almost never seen today |
| HTTP/1.1 | Persistent connections, pipelining, chunked encoding | Still the default fallback; what every client speaks |
| HTTP/2   | Multiplexed streams over one TCP connection, HPACK header compression, server push | Modern public APIs; TLS-only in practice |
| HTTP/3   | QUIC over UDP; eliminates TCP head-of-line blocking | Mobile-heavy / lossy networks; CDN edges; covered in The Evolution of HTTP — 1.1, 2, 3 |
| HTTPS    | HTTP over TLS: same protocol, encrypted transport | Mandatory for every public API today; see Transport Layer Security (TLS) |

Trade-offs

What HTTP gives you:

  • The largest tooling ecosystem on the planet. curl, Postman, browser dev-tools, every gateway and CDN, every language’s standard library.
  • Free caching, free routing, free observability. Intermediaries do useful work without your code.
  • A well-understood security envelope. TLS, CORS, HSTS, CSP — the browser ships the model.
  • Statelessness. Horizontal scaling is the default, not an afterthought.

What HTTP costs you:

  • Framing overhead. Headers can dwarf the body on small calls. HTTP/2 HPACK and HTTP/3 QPACK compress repeated headers but never eliminate the cost.
  • Latency on first call. TCP + TLS handshakes add 2-3 RTTs to a cold connection (one for TCP, then one for TLS 1.3 or two for TLS 1.2). Connection pooling, HTTP/2 multiplexing, and 0-RTT resumption help.
  • Half-duplex per request. Bidirectional streaming requires WebSockets, gRPC streams, or HTTP/3 datagrams.
  • Verbose for tiny RPCs. A 12-byte function call wrapped in 800 bytes of HTTP is wasteful — internal services sometimes pick gRPC or a custom binary protocol.

Common pitfalls

  • 200 OK with a body-encoded error. Use the right status code. Otherwise CDNs cache the error, retry libraries don’t retry, monitoring dashboards show 100% success while the API is on fire.
  • POST for reads. Reads should be GET so caches, proxies, and browsers behave correctly. The only reason to POST a read is when the query is too large for a URL (rare) or carries secrets (use a header instead).
  • No idempotency keys on POST. Every retryable write needs one. Stripe’s Idempotency-Key is the industry-standard pattern; see The Role of Idempotency in API Design.
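  • PUT and PATCH confused. PUT replaces the entire resource (idempotent). PATCH partially updates and is often not idempotent: a JSON Patch that appends to an array changes the result on every retry. Use an idempotent format such as JSON Merge Patch, or guard the request with an If-Match ETag.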
  • Cache-Control left to defaults. Browsers heuristically cache responses with no caching headers; CDNs make their own assumptions. Spell out Cache-Control (and Vary) on every response.
  • No Retry-After on 429 or 503. Without it, every caller backs off with their own arbitrary strategy. Setting Retry-After: 30 lets well-behaved clients coordinate.
  • Headers carrying auth secrets logged at the gateway. Strip Authorization, Cookie, and any X-Api-Key from access logs at ingest. Splunk indexing a year of bearer tokens is a breach waiting to happen.
  • Forgetting Vary. A cache that serves the wrong response to the wrong client is worse than no cache at all.
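
To see why a missing Vary is dangerous, consider how a cache might build its lookup key. The sketch below is a simplified model, not the algorithm of any specific CDN: the function name `cache_key` is hypothetical, and the special value `Vary: *` (which makes a response effectively uncacheable) is not handled.

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key that includes every request header named in Vary."""
    # Header names are case-insensitive, so compare them lower-cased.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied_names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    varied = tuple((name, lowered.get(name, "")) for name in varied_names)
    return (method.upper(), url, varied)


url = "https://api.example.com/v1/users/u_42"
gzip_request = {"Accept-Encoding": "gzip"}
brotli_request = {"Accept-Encoding": "br"}

# With Vary: Accept-Encoding the two clients get separate cache entries.
with_vary = (
    cache_key("GET", url, gzip_request, "Accept-Encoding"),
    cache_key("GET", url, brotli_request, "Accept-Encoding"),
)
# Without Vary both requests collapse onto one entry, so one client
# can be served a body encoded for the other.
without_vary = (
    cache_key("GET", url, gzip_request, ""),
    cache_key("GET", url, brotli_request, ""),
)
```

The same collision applies to any header that changes the response, such as Authorization or Accept.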