HTTP — The Foundational Protocol for APIs
Methods, status codes, headers, persistent connections — the parts of HTTP every API designer must own.
What it is
HTTP (Hypertext Transfer Protocol) is the application-layer protocol that carries almost every API call on the public Internet. Originally specified by Tim Berners-Lee in 1991 as a one-line request-response protocol for hypertext documents, it has grown into the universal substrate for distributed-system communication — RFC 9110 (HTTP semantics), RFC 9112 (HTTP/1.1 message syntax), RFC 9113 (HTTP/2), and RFC 9114 (HTTP/3) are the current normative specs.
For an API designer, HTTP is not just “the thing requests travel over”. It is a vocabulary of methods, status codes, and headers that already encodes most of the semantics your API needs. A REST API that respects HTTP gets caching, intermediaries, content negotiation, conditional updates, and partial responses for free. One that fights HTTP — tunnelling everything through POST with 200 OK and a body-encoded error code — throws all of that away.
The protocol is stateless: each request carries everything the server needs to process it. State that persists between calls (sessions, auth, cache validity) is carried explicitly in headers or cookies. Statelessness is what makes HTTP horizontally scalable and CDN-friendly; it is also what forces every API designer to think about idempotency, retries, and replay safety from day one.
When to use it
Reach for HTTP when:
- The API is consumed by browsers, mobile apps, or polyglot backends. HTTP is the lowest-common-denominator protocol in every language and runtime.
- You want intermediaries to be useful. CDNs, reverse proxies, API gateways, WAFs, load balancers — every one of them understands HTTP semantics. They can cache, route, retry, throttle, and observe traffic without your application’s cooperation.
- You need a well-supported security story. HTTP over TLS 1.3 (HTTPS) is the default for public traffic. Browsers ship the cryptographic primitives; libraries handle the handshake.
- You want documentation tooling for free. OpenAPI, Postman, curl, browser dev-tools, every HTTP debugger — all already know how to inspect HTTP traffic.
Avoid (or augment) HTTP when:
- You need bidirectional streaming. HTTP/1.1 is half-duplex per request. Use WebSockets — Bidirectional Streaming or gRPC streaming over HTTP/2.
- The payload per operation is tiny and the latency budget is fixed and tight. A trading-engine RPC at sub-millisecond latency may not tolerate HTTP framing overhead; a custom binary protocol over TCP can beat it.
- The transport must run over UDP for loss-tolerance. Video and game traffic typically pick QUIC (The Evolution of HTTP — 1.1, 2, 3 discusses HTTP/3 over QUIC) or raw UDP.
How it works
An HTTP exchange is one request followed by one response, both with the same four-part shape: a start line, a set of headers, an empty line, and an optional body.
```http
GET /v1/orders/ord_a3f9c2 HTTP/1.1
Host: api.example.com
Accept: application/json
Authorization: Bearer eyJhbGciOi...
If-None-Match: "W/b71e"
User-Agent: example-cli/2.4.1
```

The validator the client sent is stale, so the server answers with the full representation and a fresh ETag:

```http
HTTP/1.1 200 OK
Content-Type: application/json
ETag: "W/c8f3"
Cache-Control: private, max-age=60
Vary: Accept-Encoding
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 873
X-RateLimit-Reset: 1685440800

{ "id": "ord_a3f9c2", "status": "confirmed" }
```

Every word in those two messages is part of the contract: the verb, the path, the version, the headers, and the body on the way out, and the status line, headers, and body on the way back.
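The message shape described above is simple enough to split by hand. The following is a minimal sketch, not a production parser: `parse_http_message` is a hypothetical helper, it assumes CRLF line endings, and it keeps only the last value when a header is repeated.

```python
def parse_http_message(raw: str) -> dict:
    """Split a raw HTTP/1.1 message into start line, headers, and body."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return {"start_line": start_line, "headers": headers, "body": body}


raw_request = (
    "GET /v1/orders/ord_a3f9c2 HTTP/1.1\r\n"
    "Host: api.example.com\r\n"
    "Accept: application/json\r\n"
    "\r\n"
)
parsed = parse_http_message(raw_request)
print(parsed["start_line"])
print(parsed["headers"])
```

In real code a library does this work; the sketch only makes the start line, headers, empty line, and body visible as separate pieces.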
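The verbs (methods)
Eight methods cover almost all API work. Seven come from the core HTTP semantics spec (RFC 9110) and PATCH comes from RFC 5789; CONNECT, which exists for proxy tunnelling, is omitted here. The semantics every API designer must remember: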
| Method | Purpose | Safe | Idempotent | Body? |
|---|---|---|---|---|
| GET | Read a resource | yes | yes | no |
| HEAD | Read response metadata only | yes | yes | no |
| OPTIONS | Discover allowed verbs / CORS preflight | yes | yes | no |
| POST | Create or trigger (server picks ID) | no | no | yes |
| PUT | Replace a resource (client picks ID) | no | yes | yes |
| PATCH | Partial update | no | depends | yes |
| DELETE | Remove a resource | no | yes | no |
| TRACE | Echo (debugging; usually disabled) | yes | yes | no |
Safe means the call does not mutate server state. Idempotent means N identical calls have the same effect as 1. POST is neither, and PATCH is not guaranteed to be idempotent, which is why a retried POST (or a non-idempotent PATCH) needs an idempotency key.
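The retry rule that follows from these definitions can be sketched in a few lines. This is an illustrative policy, not a standard: `can_retry` is a hypothetical helper, and treating the presence of an `Idempotency-Key` header as sufficient assumes the server actually deduplicates on it.

```python
# Methods defined as idempotent: repeating them has the same effect as one call.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE", "TRACE"}


def can_retry(method: str, headers: dict) -> bool:
    """Return True if resending this request cannot duplicate a side effect."""
    if method.upper() in IDEMPOTENT_METHODS:
        return True
    # POST and PATCH are only safe to resend when the caller supplied a key
    # the server can use to recognise the duplicate (assumed server behaviour).
    return any(name.lower() == "idempotency-key" for name in headers)


print(can_retry("GET", {}))
print(can_retry("POST", {}))
print(can_retry("POST", {"Idempotency-Key": "a1b2c3"}))
```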
Status codes
HTTP defines five families. The codes you actually need to handle for an API:
| Family | Meaning | Codes that matter |
|---|---|---|
| 1xx | Informational | 100 Continue, 101 Switching Protocols (used by WebSocket upgrade) |
| 2xx | Success | 200 OK, 201 Created, 202 Accepted, 204 No Content, 206 Partial Content |
| 3xx | Redirect / cache | 301 Moved Permanently, 302 Found, 304 Not Modified, 307 Temporary Redirect, 308 Permanent Redirect |
| 4xx | Client error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 405 Method Not Allowed, 409 Conflict, 410 Gone, 412 Precondition Failed, 415 Unsupported Media Type, 422 Unprocessable Content (formerly Unprocessable Entity), 429 Too Many Requests |
| 5xx | Server error | 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout |
A well-designed API picks the right code, every time, consistently. 200 OK with a body that says { "error": "..." } is the single most common API-design rookie mistake — it lies to every cache, retry library, and observability tool downstream.
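Client libraries and monitoring tools typically branch on the status family first, which is why an honest status code matters. The sketch below shows that branching; `classify_status` and `should_retry` are hypothetical helpers, and the set of retryable codes is an assumed policy rather than anything the HTTP specs mandate.

```python
FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client_error",
    5: "server_error",
}

# Assumed policy: retry throttling and transient gateway or availability errors.
RETRYABLE_CODES = {429, 502, 503, 504}


def classify_status(code: int) -> str:
    """Map a status code to its family name using the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return FAMILIES[code // 100]


def should_retry(code: int) -> bool:
    """Return True for codes this sketch treats as worth retrying."""
    return code in RETRYABLE_CODES


for code in (200, 304, 404, 429, 503):
    print(code, classify_status(code), should_retry(code))
```

A 200 response carrying an error body would be classified as success here and never retried, which is exactly the failure mode described above.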
Headers that actually matter
Headers are name-value pairs that carry metadata. The HTTP spec defines hundreds; in API-design practice a small subset does most of the work.
Request headers
- Host: required since HTTP/1.1; identifies the target server (one IP can host many virtual hosts).
- Accept: what content types the client can parse (application/json, application/vnd.api+json).
- Accept-Encoding: what compressions the client supports (gzip, br).
- Authorization: auth credentials (Bearer <token>, Basic ..., custom schemes).
- Content-Type: the type of the request body (application/json, multipart/form-data).
- Content-Length: byte length of the body; matters for chunked vs framed transfer.
- If-Match / If-None-Match: conditional requests against an ETag.
- If-Modified-Since: conditional GET against a Last-Modified timestamp.
- User-Agent: the calling library; useful for analytics and debugging.
- X-Request-Id: caller-supplied request ID for tracing.
Response headers
- Content-Type: the type of the response body.
- Content-Encoding: the compression used (gzip, br).
- ETag: opaque version token of the resource (used with conditional updates).
- Last-Modified: timestamp version of the resource.
- Cache-Control: caching directives (public, private, max-age=60, no-store, must-revalidate).
- Vary: list of request headers that affect the response (Vary: Accept-Encoding, Authorization); critical for CDN correctness.
- Location: used by 201 Created and 3xx redirects.
- Retry-After: how long the caller should wait before retrying, as a number of seconds or an HTTP date (paired with 429 and 503).
- X-RateLimit-Limit / -Remaining / -Reset: rate-limit telemetry; widely-deployed convention, not a formal standard.
- Strict-Transport-Security: pin clients to HTTPS for a duration.
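To see why Vary is critical for cache correctness, consider how a cache might build its lookup key. The sketch below is a simplified illustration: `cache_key` is a hypothetical helper, it ignores the special value `Vary: *`, and it compares header values as plain strings without normalisation.

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the method, the URL, and the headers named in Vary."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied_names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    # Only the request headers listed in Vary take part in the key.
    varied = tuple((name, lowered.get(name, "")) for name in varied_names)
    return (method.upper(), url, varied)


url = "https://api.example.com/v1/orders/ord_a3f9c2"
gzip_key = cache_key("GET", url, {"Accept-Encoding": "gzip"}, "Accept-Encoding")
br_key = cache_key("GET", url, {"Accept-Encoding": "br"}, "Accept-Encoding")
print(gzip_key == br_key)  # the two clients get separate cache entries
```

If the server omitted Vary here, both clients would share one entry and one of them could receive a body in an encoding it did not ask for.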
Conditional requests and caching
The ETag / If-None-Match pair is one of the most elegant pieces of HTTP semantics. The server tags every response with an opaque version (ETag: "W/c8f3"); the client stores it; on the next call the client sends If-None-Match: "W/c8f3"; if the resource has not changed, the server returns 304 Not Modified with an empty body. Bytes saved: the entire payload. Latency saved: one parse.
```http
GET /v1/users/u_42 HTTP/1.1
Host: api.example.com
```

```http
HTTP/1.1 200 OK
Content-Type: application/json
ETag: "v17"
Cache-Control: private, max-age=300

{ "id": "u_42", "name": "Ada", "role": "admin" }
```

```http
GET /v1/users/u_42 HTTP/1.1
Host: api.example.com
If-None-Match: "v17"
```

```http
HTTP/1.1 304 Not Modified
ETag: "v17"
```

The same ETag token is also used for optimistic concurrency. A PUT with If-Match: "v17" succeeds only if the server’s current version is still v17; if someone else has written first, the server returns 412 Precondition Failed and the client must re-read, re-merge, retry.
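The optimistic-concurrency check can be sketched from the server's side with an in-memory store. Everything here is hypothetical and simplified: `VersionedStore` is an illustrative class, ETags are assumed to be a quoted version counter such as "v1", and the method returns bare status codes instead of full responses.

```python
class VersionedStore:
    """In-memory resource store that enforces If-Match on writes."""

    def __init__(self):
        self._items = {}  # key -> (version number, value)

    def get(self, key):
        """Return (etag, value) for an existing resource."""
        version, value = self._items[key]
        return f'"v{version}"', value

    def put(self, key, value, if_match=None):
        """Apply a PUT and return an HTTP-style status code: 201, 200, or 412."""
        current = self._items.get(key)
        if current is None:
            if if_match is not None:
                return 412  # caller expected a version, but nothing exists
            self._items[key] = (1, value)
            return 201
        version, _ = current
        if if_match is not None and if_match != f'"v{version}"':
            return 412  # someone else wrote first; caller must re-read
        self._items[key] = (version + 1, value)
        return 200


store = VersionedStore()
store.put("u_42", {"name": "Ada", "role": "admin"})
etag, user = store.get("u_42")
print(store.put("u_42", {**user, "role": "owner"}, if_match=etag))   # first writer
print(store.put("u_42", {**user, "role": "viewer"}, if_match=etag))  # stale writer
```

The second writer still holds the old ETag, so its update is rejected instead of silently overwriting the first one.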
A representative conditional GET in three languages
The same call — read a user, send an If-None-Match, handle 304 cheaply — in Python, Go, and Node.
```python
import requests

ETAG_CACHE = {}  # url -> (etag, body)


def get_user(user_id: str) -> dict:
    url = f"https://api.example.com/v1/users/{user_id}"
    headers = {
        "Accept": "application/json",
        "Authorization": "Bearer eyJhbGciOi...",
    }

    # Revalidate with the stored ETag if this URL has been fetched before.
    cached = ETAG_CACHE.get(url)
    if cached:
        headers["If-None-Match"] = cached[0]

    resp = requests.get(url, headers=headers, timeout=5)
    if resp.status_code == 304 and cached:
        return cached[1]  # body unchanged
    resp.raise_for_status()
    body = resp.json()
    etag = resp.headers.get("ETag")
    if etag:  # only cache responses that can be revalidated later
        ETAG_CACHE[url] = (etag, body)
    return body
```

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"sync"
)

type entry struct {
	etag string
	body map[string]any
}

var (
	mu    sync.Mutex
	cache = map[string]entry{} // url -> cached ETag and body
)

func getUser(userID string) (map[string]any, error) {
	url := "https://api.example.com/v1/users/" + userID
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/json")
	req.Header.Set("Authorization", "Bearer eyJhbGciOi...")

	// Revalidate with the stored ETag if this URL has been fetched before.
	mu.Lock()
	cached, ok := cache[url]
	mu.Unlock()
	if ok {
		req.Header.Set("If-None-Match", cached.etag)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotModified && ok {
		return cached.body, nil // body unchanged
	}
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var body map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	// Only cache responses that can be revalidated later.
	if etag := resp.Header.Get("ETag"); etag != "" {
		mu.Lock()
		cache[url] = entry{etag: etag, body: body}
		mu.Unlock()
	}
	return body, nil
}

func main() {
	user, err := getUser("u_42")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(user)
}
```

```javascript
const cache = new Map(); // url -> { etag, body }

async function getUser(userId) {
  const url = `https://api.example.com/v1/users/${userId}`;
  const headers = {
    Accept: "application/json",
    Authorization: "Bearer eyJhbGciOi...",
  };

  // Revalidate with the stored ETag if this URL has been fetched before.
  const cached = cache.get(url);
  if (cached) headers["If-None-Match"] = cached.etag;

  const resp = await fetch(url, { headers });
  if (resp.status === 304 && cached) return cached.body; // body unchanged
  if (!resp.ok) throw new Error(`HTTP ${resp.status}`);

  const body = await resp.json();
  const etag = resp.headers.get("etag");
  if (etag) cache.set(url, { etag, body }); // only cache revalidatable responses
  return body;
}
```

The wire is identical across all three; the language is style. That is HTTP’s superpower.
Persistent connections and connection pooling
In HTTP/1.0, every request opened a fresh TCP connection. On a cold connection the TCP three-way handshake (1 RTT) plus TLS negotiation (1 RTT on TLS 1.3, 2 RTTs on TLS 1.2) can easily add 100 ms per call.
HTTP/1.1 introduced persistent connections by default — the same TCP connection carries many request-response pairs back-to-back. Connection: keep-alive is implicit; Connection: close is the explicit opt-out.
A modern HTTP client maintains a connection pool per host:
```text
client_pool
├── api.example.com  ── [conn-1] (idle, last-used 0.2s ago)
│                    ── [conn-2] (in use, response streaming)
│                    ── [conn-3] (idle, last-used 1.8s ago)
└── auth.example.com ── [conn-4] (idle, last-used 0.5s ago)
```

When a new request arrives, the pool checks out an idle connection, sends the request, awaits the response, and returns the connection to the pool. The handshake amortises across many calls. The pool size, idle timeout, and keep-alive behaviour are all tunable, and defaults vary by client: Go’s http.DefaultTransport keeps up to 100 idle connections in total but only 2 per host, with a 90s idle timeout, while Node’s undici ships its own, different defaults. Check them before a high-throughput service depends on them.
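The checkout and return cycle can be sketched without any real sockets. This is a toy model under stated assumptions: `ConnectionPool` is a hypothetical class, connections are plain strings, and the `handshakes` counter stands in for the cost of opening a new TCP and TLS session.

```python
import itertools
from collections import defaultdict, deque


class ConnectionPool:
    """Toy per-host pool that counts how many new connections it had to open."""

    def __init__(self, max_idle_per_host: int = 2):
        self.max_idle_per_host = max_idle_per_host
        self.handshakes = 0
        self._idle = defaultdict(deque)  # host -> idle connections
        self._ids = itertools.count(1)

    def checkout(self, host: str) -> str:
        """Reuse an idle connection if one exists, otherwise open a new one."""
        if self._idle[host]:
            return self._idle[host].pop()  # most recently used first
        self.handshakes += 1  # stands in for TCP + TLS setup
        return f"conn-{next(self._ids)}@{host}"

    def release(self, host: str, conn: str) -> None:
        """Return a connection to the pool, dropping it if the pool is full."""
        if len(self._idle[host]) < self.max_idle_per_host:
            self._idle[host].append(conn)
        # A real pool would close the socket here instead of just dropping it.


pool = ConnectionPool()
for _ in range(5):
    conn = pool.checkout("api.example.com")
    pool.release("api.example.com", conn)
print(pool.handshakes)  # five sequential requests shared one connection
```

In practice the same effect comes from reusing one client object (for example a `requests.Session`) rather than creating a new client per call.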
Chunked transfer encoding
When the server doesn’t know the response size ahead of time (streaming JSON, server-sent events, file downloads in progress), it uses Transfer-Encoding: chunked. Each chunk carries its own length prefix; the stream ends with a zero-length chunk:
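```http
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

19
{"items": [{"id": "i_1"},
1e
{"id": "i_2"}, {"id": "i_3"}]}
0

```

Each length prefix is the chunk's byte count in hexadecimal (0x19 is 25 bytes, 0x1e is 30 bytes), and every size line and chunk ends with CRLF. Chunked encoding is what makes streaming responses possible over HTTP/1.1 without computing the full body up-front. HTTP/2 and HTTP/3 don’t need it because they have native framing.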
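The framing is easy to reproduce by hand. The sketch below encodes and decodes a chunked body under simplifying assumptions: `encode_chunked` and `decode_chunked` are hypothetical helpers, the whole message is already in memory, and trailers after the final chunk are not handled.

```python
def encode_chunked(parts: list[bytes]) -> bytes:
    """Frame each part as '<hex length>CRLF<data>CRLF' and end with a zero chunk."""
    out = b""
    for part in parts:
        if not part:
            continue  # a zero-length chunk would terminate the stream early
        out += f"{len(part):x}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from a chunked byte string (trailers not supported)."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore optional chunk extensions after ';' on the size line.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


parts = [b'{"items": [{"id": "i_1"},', b'{"id": "i_2"}, {"id": "i_3"}]}']
wire = encode_chunked(parts)
print(wire)
print(decode_chunked(wire) == b"".join(parts))
```

Because the receiver learns each chunk's size just before the chunk arrives, the sender never needs to know the total length in advance.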
Variants
| Variant | Mechanism | When it fits |
|---|---|---|
| HTTP/1.0 | One TCP connection per request | Legacy systems; almost never seen today |
| HTTP/1.1 | Persistent connections, pipelining, chunked encoding | Still the default fallback; what every client speaks |
| HTTP/2 | Multiplexed streams over one TCP connection, HPACK header compression, server push (now disabled in major browsers) | Modern public APIs; TLS-only in practice |
| HTTP/3 | QUIC over UDP; eliminates TCP head-of-line blocking | Mobile-heavy / lossy networks; CDN edges; covered in The Evolution of HTTP — 1.1, 2, 3 |
| HTTPS | HTTP over TLS — same protocol, encrypted transport | Mandatory for every public API today; see Transport Layer Security (TLS) |
Trade-offs
What HTTP gives you:
- The largest tooling ecosystem on the planet. curl, Postman, browser dev-tools, every gateway and CDN, every language’s standard library.
- Free caching, free routing, free observability. Intermediaries do useful work without your code.
- A well-understood security envelope. TLS, CORS, HSTS, CSP — the browser ships the model.
- Statelessness. Horizontal scaling is the default, not an afterthought.
What HTTP costs you:
- Framing overhead. Headers can dwarf the body on small calls. HTTP/2 HPACK and HTTP/3 QPACK compress repeated headers but never eliminate the cost.
- Latency on first call. TCP plus TLS handshakes add 2 to 3 RTTs to a cold connection (1 for TCP, then 1 for TLS 1.3 or 2 for TLS 1.2). Connection pooling, HTTP/2 multiplexing, and 0-RTT resumption help.
- Half-duplex per request. Bidirectional streaming requires WebSockets, gRPC streams, or HTTP/3 datagrams.
- Verbose for tiny RPCs. A 12-byte function call wrapped in 800 bytes of HTTP is wasteful — internal services sometimes pick gRPC or a custom binary protocol.
Common pitfalls
- 200 OK with a body-encoded error. Use the right status code. Otherwise CDNs cache the error, retry libraries don’t retry, and monitoring dashboards show 100% success while the API is on fire.
- POST for reads. Reads should be GET so caches, proxies, and browsers behave correctly. The only reason to POST a read is when the query is too large for a URL (rare) or carries secrets (use a header instead).
- No idempotency keys on POST. Every retryable write needs one. Stripe’s Idempotency-Key is the industry-standard pattern; see The Role of Idempotency in API Design.
- PUT and PATCH confused. PUT replaces the entire resource (idempotent). PATCH partially updates and is often not idempotent unless you use an idempotent patch format such as JSON Merge Patch or guard the request with an If-Match ETag.
- Cache-Control left to defaults. Browsers heuristically cache responses with no caching headers; CDNs make their own assumptions. Spell out Cache-Control (and Vary) on every response.
- No Retry-After on 429 or 503. Without it, every caller backs off with their own arbitrary strategy. Setting Retry-After: 30 lets well-behaved clients coordinate.
- Headers carrying auth secrets logged at the gateway. Strip Authorization, Cookie, and any X-Api-Key from access logs at ingest. Splunk indexing a year of bearer tokens is a breach waiting to happen.
- Forgetting Vary. A cache that serves the wrong response to the wrong client is worse than no cache at all.
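The log-hygiene pitfall is one of the cheapest to fix. A minimal sketch of header redaction before logging follows; `redact_headers` is a hypothetical helper, and the set of sensitive names is an assumption that a real deployment would extend with its own credential headers.

```python
# Assumed list of credential-bearing headers; extend it for your own API.
SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key"}


def redact_headers(headers: dict) -> dict:
    """Return a copy of the headers that is safe to write to an access log."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


request_headers = {
    "Host": "api.example.com",
    "Authorization": "Bearer eyJhbGciOi...",
    "X-Request-Id": "req_123",
}
print(redact_headers(request_headers))
```

Redacting at the point where the log line is built keeps the secret out of every downstream system, rather than relying on each one to scrub it later.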
Related building blocks
- The Evolution of HTTP — 1.1, 2, 3 — how HTTP got faster: 1.1 → 2 → 3, and what each version buys API designers.
- The Narrow Waist of the Internet — why HTTP-over-TCP-over-IP is the default substrate for every API on the public Internet.
- REST — The Architectural Style — the architectural style that consciously builds on HTTP semantics.
- Caching at Different Layers — how Cache-Control, ETag, and Vary cascade through browser, CDN, gateway, and origin.
- Transport Layer Security (TLS) — TLS, which every modern API runs HTTP over.