Caching — the cheapest way to lose a weekend

Caching is fast on the happy path and ruinous on the unhappy ones. The unhappy ones are: stale data, stampedes, key collisions, invalidation races, and the long tail of “the cache lied.” Each section ends with what I pick and what failure mode it dodges.

The four layers, ranked by where the bug lives when it goes wrong

CDN edge — invalidation is async and global; TTL is the truth.
Application memory (per-process) — invalidation is local; survives only as long as the process.
Shared cache (Redis/Memcached) — invalidation is explicit; coordination is yours.
Database (materialised views, query cache) — invalidation is automatic but transactional and expensive.

A single read often touches three of these. When prod is wrong, the question “which layer lied?” is the first one to answer, and you can only answer it if every layer logs what it served and why.
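One way to make “which layer lied?” answerable is to have every read record which layer served the value and why the earlier layers did not. The sketch below is a minimal illustration under that assumption; `Layer`, `Lookup`, `ServedLog`, and `readThrough` are hypothetical names, not taken from any library.

```typescript
// Hypothetical sketch: each layer reports a hit or a miss with a reason,
// and the read logs which layer finally served the value.
type Lookup<T> = { hit: true; value: T } | { hit: false; reason: string };

interface Layer<T> {
  name: string;
  get(key: string): Lookup<T>;
}

interface ServedLog {
  key: string;
  servedBy: string;
  misses: { layer: string; reason: string }[];
}

function readThrough<T>(
  key: string,
  layers: Layer<T>[],
  origin: (key: string) => T,
  log: (entry: ServedLog) => void,
): T {
  const misses: { layer: string; reason: string }[] = [];
  for (const layer of layers) {
    const result = layer.get(key);
    if (result.hit) {
      log({ key, servedBy: layer.name, misses });
      return result.value;
    }
    misses.push({ layer: layer.name, reason: result.reason });
  }
  // Every layer missed: the source of truth serves the read.
  log({ key, servedBy: 'origin', misses });
  return origin(key);
}
```

With a log line like this per read, a wrong value in prod comes with the name of the layer that produced it.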

What I pick: edge for static + signed URLs, shared cache for hot reads, app memory for hot lookups bound to request lifetime. No DB-level cache — it hides cost behind the database that’s already the bottleneck.

TTL vs invalidation: pick TTL, default short

There are two valid mental models for cache freshness:

TTL: the value is fresh for $t$ seconds, then re-fetch.

Invalidation: the value is fresh until I tell you otherwise.

TTL is forgiving: with a wrong key, a missed event, or a network blip, the cache repairs itself within $t$ seconds. Invalidation is unforgiving: miss one event and the cache stays wrong until something else evicts the entry.
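The self-repair property is easy to see in code. Below is a minimal in-process sketch, assuming a synchronous source of truth; `TtlCache` and its injectable `now` clock are illustrative names, not a real library's API.

```typescript
// Minimal TTL cache sketch. The clock is injectable so expiry can be
// exercised without sleeping; it defaults to Date.now.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string, load: () => V): V {
    const entry = this.entries.get(key);
    if (entry && entry.expiresAt > this.now()) {
      return entry.value; // still inside the freshness window
    }
    // Missing or expired: re-fetch from the source of truth.
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

Nothing here listens for change events. If the source changes and nobody tells the cache, the worst case is one TTL of staleness, which is the whole argument for defaulting to TTL.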

What this prevents: the silent class of bugs where a webhook handler crashed three weeks ago and your homepage has been showing three-week-old data ever since.

Stampede protection: jitter, singleflight, stale-while-revalidate

When a hot key expires, every concurrent request misses simultaneously and slams the origin. At even modest scale this is enough to take down a database. The shape of the problem:

[Figure: sequence diagram of three clients all missing on the same key at expiry and forwarding the load to the origin database. Caption: Concurrent misses on a hot key fan out to the origin.]

[Figure: a mental map of the layers and where the lock should live.]

The three mitigations:

Jitter the TTL. Set TTL to base_ttl + rand(0, base_ttl * 0.1). Spreads expiry across a window so requests don’t synchronise.
Singleflight on miss. Only the first request that misses a hot key fetches the value; concurrent misses wait for it. Implementations: golang.org/x/sync/singleflight, a short-lived Redis lock taken with SET and the NX option, or an in-process mutex keyed by cache key.
Serve stale while revalidating. When a value expires, return the stale value to the requester immediately and kick off a refresh in the background. The next request gets the fresh value. This is the SWR pattern; CDNs implement it natively.
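The first two mitigations fit in a few lines. This is a sketch under assumptions rather than a drop-in implementation: `jitteredTtlMs` and `SingleFlight` are hypothetical names, and the singleflight here only deduplicates within one process, so a fleet of N processes can still send N fetches to the origin.

```typescript
// Jitter: base_ttl + rand(0, base_ttl * 0.1). `rand` is injectable so the
// spread can be tested; it defaults to Math.random.
function jitteredTtlMs(baseTtlMs: number, rand: () => number = Math.random): number {
  return baseTtlMs + rand() * baseTtlMs * 0.1;
}

// In-process singleflight: concurrent callers for the same key share one
// in-flight promise instead of each hitting the origin.
class SingleFlight<V> {
  private inFlight = new Map<string, Promise<V>>();

  run(key: string, load: () => Promise<V>): Promise<V> {
    const pending = this.inFlight.get(key);
    if (pending) return pending;
    // Forget the key once the load settles so the next miss fetches again.
    const p = load().finally(() => {
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, p);
    return p;
  }
}
```

A caller would wrap its origin read as `flights.run(cacheKey, () => loadFromOrigin(cacheKey))`, where `loadFromOrigin` stands in for whatever the real fetch is, and store the result with `jitteredTtlMs(baseTtlMs)` as its TTL.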

What this prevents: the post-deploy thundering herd. The classic “we restarted the cache cluster at 3pm and the database fell over” incident.
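Serving stale while revalidating, the third mitigation in this section, can be sketched in-process as below. `SwrCache` is a hypothetical name, and swallowing refresh errors so the stale value keeps being served is my assumption about the desired behaviour, not something a CDN's native SWR is guaranteed to match.

```typescript
// Hypothetical stale-while-revalidate cache. An expired entry is returned
// immediately and refreshed in the background; only a cold miss waits.
class SwrCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private refreshing = new Map<string, Promise<V>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(key: string, load: () => Promise<V>): Promise<V> {
    const entry = this.entries.get(key);
    if (!entry) {
      return this.refresh(key, load); // cold miss: nothing stale to serve
    }
    if (entry.expiresAt <= this.now()) {
      // Expired: start a refresh but do not wait for it. A failed refresh
      // is swallowed here, so the stale value keeps being served.
      this.refresh(key, load).catch(() => undefined);
    }
    return entry.value;
  }

  // At most one refresh per key is in flight at a time.
  private refresh(key: string, load: () => Promise<V>): Promise<V> {
    const pending = this.refreshing.get(key);
    if (pending) return pending;
    const p = load()
      .then((value) => {
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
        return value;
      })
      .finally(() => {
        this.refreshing.delete(key);
      });
    this.refreshing.set(key, p);
    return p;
  }
}
```

The trade is explicit: the request that arrives right after expiry gets a value that is older than the TTL. If the staleness budget cannot absorb that, cap how long past expiry an entry may still be served.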

Cache key hygiene

Keys are a contract. Bad keys cause two failure modes:

Collisions — two distinct values share a key. Subtle, dangerous; usually only caught when a customer reports seeing another customer’s data.
Fragmentation — semantically-identical values get different keys (user:123 vs user:0123), so the cache is effectively useless.

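// Example inputs (illustrative values).
const userId = 123;
const locale = 'en-GB';

// Bad: implicit, fragile. No version, no normalisation, and a ':' inside
// any part would silently add a segment.
const fragileKey = `user:${userId}:${locale}`;

// Better: explicit, versioned, namespaced, normalised.
const key = makeKey('v3', 'user', userId, locale.toLowerCase()); // "v3:user:123:en-gb"

// Joins parts with ':' after replacing ':' and whitespace inside each part,
// so a part can never be mistaken for a separator.
function makeKey(...parts: (string | number)[]): string {
  return parts.map(String).map((s) => s.replace(/[:\s]/g, '_')).join(':');
}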

What this prevents: the cache-poisoning class of bugs, plus the migration headache when a value’s schema changes.

When not to cache

The data is cheap to compute. Every cache layer is a place bugs can live. If the source-of-truth read is sub-millisecond, the cache is pure surface area.
The data is per-user and rarely re-read. A cache hit rate of 5% is worse than no cache: you’ve added latency on the miss path and memory pressure.
You can’t articulate the staleness budget. If a stakeholder can’t tell you “this data may be N seconds out of date,” don’t cache it. Every cache trades correctness for speed; if you don’t know the budget, you don’t know the trade.
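The low-hit-rate case can be checked with back-of-envelope arithmetic. The model below is an assumption, not a measurement: every read pays the cache lookup, a miss additionally pays the origin read, and memory pressure is ignored. The function names are illustrative.

```typescript
// Expected read latency with a cache in front of the origin:
//   cacheMs + (1 - hitRate) * originMs
function expectedLatencyMs(hitRate: number, cacheMs: number, originMs: number): number {
  return cacheMs + (1 - hitRate) * originMs;
}

// Under this model the cache only wins when hitRate * originMs > cacheMs.
function cacheIsWorthIt(hitRate: number, cacheMs: number, originMs: number): boolean {
  return expectedLatencyMs(hitRate, cacheMs, originMs) < originMs;
}
```

With made-up numbers of a 1 ms cache lookup and a 10 ms origin read, a 5% hit rate gives roughly 10.5 ms per read, slower than the 10 ms of going straight to the origin. The break-even hit rate in that example is 10%.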

The recurring lesson

Every cache costs you correctness for speed. Make the trade explicit, write down the budget, log what you served, and default to TTL because TTL repairs itself. The systems that age well are the ones where the cache layer is small, observable, and never load-bearing for correctness.