URL Shortener — System Design · Engineering Playbook

Step 1 — Clarify Requirements#

Functional

Given a long URL, produce a short URL (~7-8 characters past the host).
Given a short URL, redirect to the original long URL.
Optional custom alias (/sm/my-talk) — first-come-first-served.
Optional expiration (default: never).
Clicks per shortened URL — basic analytics (count, last-clicked, geo).

Non-functional

99.99% availability for redirects (3 nines is too low — every broken redirect is a lost click).
p99 redirect latency: under 50 ms end-to-end.
Read:write ratio is ~100:1 — heavy redirect traffic, light shortening traffic.
5-year retention. Short codes never get reused (renaming breaks links forever).

Out of scope: editing existing shortened URLs after creation, fine-grained per-click analytics dashboards, programmable redirects.

Step 2 — Capacity Estimation#

Assume 500 M new short URLs per month (canonical exam number).

Writes: 500M / 30 / 86400 ≈ 190 writes/sec, peaks at ~500/sec.
Reads: 100× writes ≈ 19 000 reads/sec, peaks at ~50 000/sec.
Storage per record: ~500 bytes (long URL 200B, short code 8B, metadata 250B, indexes 50B).
Storage growth: 500M × 500 B = 250 GB/month, 15 TB over 5 years.
Bandwidth: redirects are tiny (302 + headers ≈ 300 B). 19 000 × 300 B ≈ 6 MB/sec read traffic; effectively nothing.
Cache: top 20% of URLs serve 80% of clicks (Pareto). With 1 B URLs over 5 years, hot set ≈ 200 M × 500 B = 100 GB of cache — sharded Redis at 4 × 32 GB.

These numbers reshape the design: shortening is a slow path; redirects need to be cache-served. The whole interesting design is around the read path.

Step 3 — System Interface#

POST  /shorten
      Body: { url: string, alias?: string, ttl_days?: number }
      Returns: { short_url: string }     (idempotent on (url, user))

GET   /:short_code
      Returns: HTTP 301 with Location: <original_url>
      Returns: HTTP 404 if missing or expired

GET   /:short_code/stats     (optional analytics endpoint)
      Returns: { clicks, last_clicked, top_geos }

POST /shorten should be idempotent on (url, user) — same input from same user returns the same short code (no need to re-encode). Idempotency-Key header for the API-call retry case.

Step 4 — High-Level Design#

                       ┌─→ analytics queue ──→ ClickHouse
                       │
client → CDN → LB → API gateway ──┬──→ Redis (short → long)  ─cache miss─┐
                       │          │                                       │
                       └─ /shorten│                                       ▼
                                  └──→ ID generator ─→ KV store (DynamoDB / Cassandra)

Two paths through the system:

Redirect path (hot): LB → API gateway → Redis lookup → 301. Misses fall through to the KV store and backfill the cache.
Shorten path (cold): API gateway → ID generator → write to KV → invalidate / write-through cache.

The CDN handles DDoS protection at the edge and can short-circuit popular redirects via cache-control headers (clients then redirect from their browser cache, never hitting our infrastructure).

Step 5 — Data Model#

The whole system is one wide-column or KV table:

table url_map
  short_code   string  PK    // 7-8 chars base62
  long_url     string
  user_id      string?       // null for anonymous
  created_at   timestamp
  expires_at   timestamp?    // null = never
  click_count  bigint        // updated asynchronously

Partition key: short_code. Reads are point lookups. No range queries needed; no secondary indexes needed unless we add user-facing dashboards.

A separate user_urls table (PK (user_id, created_at)) handles “show me my history” queries — kept out of the hot redirect path.

Generating the short code#

Three viable approaches:

Hash + base62 — base62(md5(long_url + salt))[:8]. Deterministic (same input → same code) but collision-prone at scale, requires a retry loop on conflict.

Counter + base62 — sequential monotonic ID via a sequencer service, encoded in base62. No collisions ever. Predictable URLs (information leakage about volume) — mitigate by encrypting the counter before encoding.

A third option is to pre-allocate ranges: each shortener instance reserves a block of 10,000 IDs from a central allocator and serves them locally. No coordination per-request; bounded waste on instance crash. This is what TinyURL-style systems usually run.

8 base62 characters = 62^8 = 218 trillion codes. We won’t run out.

Step 6 — Detailed Design#

The redirect path (the hot loop)#

GET /:code
  → check Redis (LRU, ~100 GB sharded)
  → on hit:  return 301
  → on miss: read KV, write-back to Redis, return 301
  → on KV miss: return 404

Target p99 of 50 ms. Realistic budget:

LB + TLS handshake: 10 ms
Edge auth / abuse check: 2 ms
Redis lookup: 1–2 ms
KV lookup (on cache miss): 5–10 ms
Network back to client: 5 ms

Redis hit rate at 95% means 5% of redirects pay an extra 5–10 ms — still well within budget. The CDN can short-circuit the top 0.1% of URLs entirely.

The shorten path#

POST /shorten
  → idempotency check (user, url) in Redis (5-minute TTL)
  → allocate short code (counter-based, encoded)
  → write to KV (consistency: write to majority, replicate async to others)
  → write-through Redis with TTL = expires_at - now
  → return short_url

The write must be durable (we just promised the user a URL). The cache write is best-effort — if it fails we’ll backfill on the first read.

Analytics#

Asynchronous. Every redirect emits an event to a queue (Kafka), which feeds a column store (ClickHouse) for aggregation. The hot path never blocks on analytics. Worst case: a few seconds of click delay during a Kafka hiccup — acceptable for our spec.

click_count in the main table is updated in batches (every 30 sec) from the column store. Live dashboards read directly from ClickHouse.

Custom aliases#

Custom aliases go through a CHECK-IF-EXISTS → CLAIM Lua script in Redis (atomic) before the KV write. Race conditions are bounded by the script’s atomicity. Reserved alias-prefix list (/admin, /api, /login, etc.) lives in config.

Expiration#

A daily cron scans expires_at and tombstones expired records. Reads on tombstones return 404 immediately. We don’t physically delete — preserves the no-reuse guarantee.

Step 7 — Evaluation & Trade-offs#

Bottleneck #1: the KV store at peak write. 500 writes/sec is well within DynamoDB / Cassandra capacity, but the analytics write amplification (one event per redirect = 50k writes/sec to Kafka) needs careful capacity planning.

Bottleneck #2: cache size. 100 GB hot set is large enough that we shard Redis. With consistent-hash sharding and N=3 replicas, single-shard failures lose 1/N of the hot set — recoverable in seconds from KV.

Bottleneck #3: thundering herds on viral URLs. A tweet linking to /abc123 gets 1 M clicks in 5 minutes. If Redis evicted it, the first 100 requests stampede the KV. Mitigations:

Single-flight at the cache (lock the key while one request fetches).
Probabilistic early expiration so we don’t synchronously expire popular keys.
CDN with Cache-Control: max-age=300 on the 301 response — the client redirects from browser cache after the first hit.

Alternative I’d push back on: storing per-click analytics in the main KV. Cheaper-looking but kills write IOPS budget on day one. Always offload analytics to a column store, even at small scale.

What breaks first at 10× scale (5 B URLs/month): the analytics pipeline. The hot redirect path scales linearly with more Redis shards; the analytics fan-out grows superlinearly with cardinality (top-N queries, per-user breakdowns). Plan for that re-architecture before it bites.

Companies this resembles#

bit.ly, TinyURL, t.co, goo.gl (RIP), Firebase Dynamic Links.

Twitter Newsfeed — same read-heavy / heavy-cache shape at a different magnitude.
Web Crawler — same base62 encoding trick for crawl URLs.
Rate Limiter — POST /shorten must be rate-limited or it becomes a free spam pipe.