
Operational Concerns

Versioning, rate limiting, caching, idempotency, retries, circuit breakers, monitoring — the production-grade concerns of a real API.

17 items: 2 Foundational, 15 Intermediate

The first 90% of API design is the contract. The second 90% is everything around it: how it changes over time, how it survives a partial outage, how the on-call engineer diagnoses it at 3 a.m. Operational Concerns covers that long tail: versioning strategy, rate limiting, the retry/circuit-breaker pair, idempotency keys, caching at every layer, and monitoring built on the four golden signals.

This is where junior and senior API designers separate. Both can produce a v1 spec; only the senior designer knows what v3 looks like once the original assumptions change.

Key concepts

  • Versioning is a contract with future-you — pick a strategy and stick to it
  • Rate limiting protects the API from its own clients; the 429 response is part of the contract
  • Idempotency is the prerequisite for safe retries; payments APIs care most
  • Caching at every layer (browser, CDN, gateway, app) — the cache-invalidation problem is real
  • The four golden signals (latency, traffic, errors, saturation) define what monitoring shows
  • The walk-through (requirements → endpoints → data → constraints → auth → evolution → latency) is a repeatable recipe
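
To make the "429 is part of the contract" point concrete, here is a minimal token-bucket sketch. The class name `TokenBucket`, its parameters, and the choice to pass the clock value in explicitly are all assumptions made for illustration; a production limiter would also need per-client buckets and shared state across servers.

```python
import math


class TokenBucket:
    """Hypothetical in-memory token bucket for a single client."""

    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity        # maximum burst size, in tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # start full so a burst is allowed
        self.updated_at = now

    def check(self, now):
        """Return (status_code, headers) for one request arriving at `now`."""
        elapsed = max(0.0, now - self.updated_at)
        # Refill for the elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 200, {}
        # Tell the client how long to wait until one full token is available.
        wait_seconds = (1.0 - self.tokens) / self.refill_rate
        return 429, {"Retry-After": str(math.ceil(wait_seconds))}


bucket = TokenBucket(capacity=2, refill_rate=1.0, now=0.0)
first = bucket.check(now=0.0)
second = bucket.check(now=0.0)
third = bucket.check(now=0.0)   # burst exhausted
later = bucket.check(now=1.0)   # one token refilled after a second
```

Returning `Retry-After` alongside the 429 is one way to give well-behaved clients a usable backoff signal instead of leaving them to guess.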

Reference template

// The API-design walk-through (seven steps)
1. Requirements           (functional + non-functional, in scope + out)
2. Endpoints              (verbs, paths, status codes, pagination)
3. Data                   (request/response schemas, OpenAPI)
4. Constraints            (rate limits, payload sizes, idempotency)
5. Auth                   (AuthN + AuthZ + scopes + tokens)
6. Evolution              (versioning, deprecation, backward-compat)
7. Latency budget         (back-of-envelope, with breakdown)

Adapt to your problem; the structure is the load-bearing part.
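
Step 7 of the template can be sketched as simple arithmetic. Every figure below is an illustrative assumption, not a measurement, and the component names and the helper `latency_breakdown` are hypothetical; the point is only the shape of the breakdown (processing, network, queueing) against a target.

```python
# Illustrative numbers only: replace with figures measured for your own system.
BUDGET_MS = 200.0

components_ms = {
    "network round trip (client to edge)": 60.0,
    "gateway (auth, rate-limit check)": 10.0,
    "application processing": 40.0,
    "database query": 30.0,
    "queueing under load": 20.0,
}


def latency_breakdown(components, budget_ms):
    """Return (total estimated latency, remaining headroom), both in ms."""
    total = sum(components.values())
    return total, budget_ms - total


total_ms, headroom_ms = latency_breakdown(components_ms, BUDGET_MS)

for name, cost in components_ms.items():
    print(f"{name:40s} {cost:6.1f} ms")
print(f"{'total':40s} {total_ms:6.1f} ms (headroom {headroom_ms:.1f} ms)")
```

A negative headroom is the signal to revisit an earlier step, for example by adding a cache layer or trimming the payload.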

Common pitfalls

  • Versioning in the URL, then shipping breaking changes inside a version because nobody enforces semver
  • Retries without idempotency keys — payment double-charges follow
  • Caching at the wrong layer (for example, the response is cached but its ETag is wrong), so clients keep receiving stale data
  • Monitoring latency only at p50; the p99 is where the user pain is
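
The double-charge pitfall above can be illustrated with a small in-memory sketch. `PaymentService`, `charge_with_retries`, and the simulated "lost response" are hypothetical names and behavior chosen for the example; a real implementation would also need durable storage, key expiry, and handling for concurrent requests that share a key.

```python
import uuid


class PaymentService:
    """Hypothetical server that deduplicates charges by idempotency key."""

    def __init__(self):
        self._responses = {}  # idempotency key -> response already returned
        self.charges = []     # record of side effects actually performed

    def charge(self, idempotency_key, amount_cents):
        # A repeated key replays the stored response without charging again.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        self.charges.append(amount_cents)
        response = {"charge_id": len(self.charges), "amount_cents": amount_cents}
        self._responses[idempotency_key] = response
        return response


def charge_with_retries(service, amount_cents, lose_first_responses=1, max_attempts=3):
    """Retry one logical charge, reusing the same key on every attempt."""
    key = str(uuid.uuid4())  # generated once per operation, not per attempt
    for attempt in range(max_attempts):
        response = service.charge(key, amount_cents)
        if attempt < lose_first_responses:
            # Simulate the server succeeding while the response is lost in transit.
            continue
        return response
    raise RuntimeError("gave up after max_attempts")


service = PaymentService()
result = charge_with_retries(service, 1999, lose_first_responses=2)
```

Here the server processes three requests but records a single charge, because every retry carries the same key.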

Items (17)

  • API Versioning

    URI versioning, header versioning, semantic versioning. The choice that ages well vs the one that bites every quarter.

    Concept Intermediate
  • Evolving an API Design

    Backward-compatible additions, breaking-change taxonomy, deprecation timelines. How successful APIs survive a decade.

    Concept Intermediate
  • Rate Limiting

    Token bucket, leaky bucket, fixed window, sliding window. Per-user vs per-IP, the 429 contract, and the burst question.

    Building Block Intermediate
  • Client-Adapting APIs

    When the server shapes its response to the client (mobile vs web vs partner). The BFF pattern in API form.

    Concept Intermediate
  • Data Fetching Patterns

    Eager vs lazy, batch vs single, paginated vs streamed. The four levers every API designer pulls.

    Concept Intermediate
  • Event-Driven Architecture Protocols

    Webhooks, server-sent events, Kafka, message queues. The push-shaped alternative to request/response.

    Building Block Intermediate
  • Cookies and Sessions for APIs

    Stateful sessions over stateless HTTP, the SameSite / Secure / HttpOnly trio, when JWTs replace cookies and when they shouldn't.

    Building Block Foundational
  • The Role of Idempotency in API Design

    Idempotency keys, safe retries, the difference between idempotent and safe verbs. Why payments APIs care most.

    Concept Intermediate
  • Server-Side Rendering vs Client-Side Rendering

    The render seam shapes the API contract. SSR's full-payload vs CSR's many-small-calls, and the hybrid in between.

    Concept Intermediate
  • Speeding Up Web Page Loading

    Critical render path, third-party blockers, what an API designer can give the front-end to win the LCP and CLS scores.

    Concept Intermediate
  • Resource Hints and Debouncing

    preload, prefetch, dns-prefetch, preconnect. Debouncing user input to API calls; the trade-off between fresh and floody.

    Concept Intermediate
  • The Circuit Breaker Pattern

    Closed → Open → Half-Open. Failing fast when a dependency is sick; the cascade-prevention pattern Netflix made famous.

    Building Block Intermediate
  • Managing Retries

    Exponential backoff, jitter, retry budgets, the retry-storm that takes down a recovering service. Idempotency is mandatory.

    Building Block Intermediate
  • Caching at Different Layers

    Browser, CDN, gateway, app, database. Where to cache, what to cache, the cache-invalidation problem the joke is about.

    Building Block Intermediate
  • API Monitoring

    Logs, metrics, traces, the four golden signals (latency, traffic, errors, saturation), what the on-call must see in 5 seconds.

    Building Block Intermediate
  • Estimating API Latency — Back-of-Envelope

    Processing time + network time + queueing. The numbers every engineer should know (memory, SSD, datacenter RTT, cross-continent RTT).

    Concept Intermediate
  • The API-Design Walk-through

    A repeatable seven-step recipe for an API-design interview: requirements, endpoints, data, constraints, auth, evolution, latency.

    Concept Foundational
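
The Closed → Open → Half-Open cycle named under The Circuit Breaker Pattern can be sketched as a small state machine. This is a minimal single-threaded illustration; `CircuitBreaker`, `CircuitOpenError`, the default thresholds, and the injectable `clock` are assumptions made for the example, not any particular library's API.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to stay open before a trial call
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("dependency marked unhealthy; failing fast")
            self.state = "half_open"  # timeout elapsed: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed trial call, or too many consecutive failures, opens the circuit.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def _on_success(self):
        self.state = "closed"
        self.failures = 0
```

Injecting the clock keeps the sketch testable without sleeping; callers would typically wrap each outbound dependency call in `breaker.call(...)` and treat `CircuitOpenError` as a signal to serve a fallback.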