Capacity Estimation Cheatsheet — System Design

Summary

Capacity estimation is the math you’ll do reflexively in step 2 of every system-design interview. The numbers below aren’t trivia — they’re the constants you need cached in your head so the arithmetic flows without stalling the interview. The goal is order-of-magnitude correct in 30 seconds, not precise in 5 minutes.

Why it matters

Interviewers grade two things during estimation: the answer and the path. A wrong number with the right reasoning chain (“100M DAU × 10 posts/day × 100 KB/post ≈ 100 TB/day”) beats a memorized correct answer with no shown work. The path is what reveals whether you could do the same reasoning in a real design review.
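The reasoning chain quoted above can be written out as a unit-cancelling multiplication. This is a minimal sketch; the variable names are illustrative, and the inputs are the figures from the text in decimal units (1 KB = 1000 bytes).

```python
# Inputs taken from the quoted example; names are illustrative only.
DAU = 100_000_000              # 100M daily active users
POSTS_PER_USER_PER_DAY = 10
BYTES_PER_POST = 100_000       # 100 KB in decimal units

# users/day x posts/user x bytes/post = bytes/day
bytes_per_day = DAU * POSTS_PER_USER_PER_DAY * BYTES_PER_POST
tb_per_day = bytes_per_day / 10**12

print(f"{tb_per_day:.0f} TB/day")  # prints "100 TB/day"
```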

The math also tells you which problems are real. “Do we need to shard?” becomes a 30-second calculation, not an opinion.

How it works

Memorize three categories of constants. Then the math is just multiplication and unit-cancellation.

Latency numbers (every senior engineer should know these cold)

Operation	Latency
L1 cache reference	~1 ns
L2 cache reference	~4 ns
Main memory reference	~100 ns
Send 1 KB over 10 Gbps LAN	~1 µs
Compress 1 KB with Snappy	~2 µs
SSD random read	~150 µs
Round trip within same datacenter	~0.5 ms
Read 1 MB sequentially from SSD	~1 ms
Round trip same region (cross-AZ)	~1–2 ms
Disk seek	~10 ms
Read 1 MB sequentially from spinning disk	~20 ms
Round trip cross-region (US east ↔ west)	~70 ms
Round trip cross-continent (US ↔ EU)	~100–150 ms
Round trip US ↔ Asia	~150–200 ms

The shape that matters: a main-memory reference is ~1000× faster than an SSD random read, an SSD random read is ~100× faster than a disk seek, and a cross-region round trip is ~100× slower than one inside a datacenter. Most “is this slow?” questions answer themselves once you locate the operation on this ladder.
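To see where the ladder's round-number ratios come from, the table entries can be divided directly. The sketch below copies five values from the table (in nanoseconds); the dictionary keys and the `ratio` helper are illustrative names, and the cross-region figure is compared against the same-datacenter round trip.

```python
# Latencies in nanoseconds, copied from the table above.
LATENCY_NS = {
    "main_memory_ref": 100,
    "ssd_random_read": 150_000,
    "same_datacenter_rtt": 500_000,
    "disk_seek": 10_000_000,
    "cross_region_rtt": 70_000_000,
}

def ratio(slow: str, fast: str) -> float:
    """How many times slower `slow` is than `fast`."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]

for slow, fast in [
    ("ssd_random_read", "main_memory_ref"),
    ("disk_seek", "ssd_random_read"),
    ("cross_region_rtt", "same_datacenter_rtt"),
]:
    print(f"{slow} vs {fast}: ~{ratio(slow, fast):.0f}x")
```

The exact quotients (1500, about 67, and 140) are all within a small factor of the memorized "1000× / 100× / 100×" shape, which is as much precision as the ladder is meant to carry.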

Throughput and size constants

Quantity	Value
Modern server NIC	10–100 Gbps (≈ 1–12 GB/s)
Modern NVMe SSD throughput	3–7 GB/s sequential
Modern NVMe SSD IOPS	500k–1M random reads/sec
Single MySQL / Postgres node	~10k QPS reads, ~1k QPS writes (varies wildly)
Single Redis node	~100k ops/sec
Single Kafka partition	~10 MB/sec sustained
One cache miss vs one cache hit	~100–1000× cost difference
Bytes in 1 KB / 1 MB / 1 GB / 1 TB	`10^3` / `10^6` / `10^9` / `10^12` (powers of 10 for capacity math)
Seconds in a day	86,400 ≈ `10^5`
Seconds in a year	~3.15 × `10^7`

Per-object size baselines

Object	Size (rough)
ASCII char	1 byte
UTF-8 emoji	4 bytes
Tweet / short message	100–300 bytes
Web page (HTML+text)	100 KB
JPEG photo	200 KB–2 MB
1080p video, 1 minute	~10 MB
4K video, 1 minute	~50 MB
One row in a typical OLTP table	~1 KB
UUID	16 bytes (36 chars as string)
`int64`	8 bytes
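A few of these baselines can be checked with the Python standard library alone. This is a small sketch under one assumption: the emoji is a single code point (GRINNING FACE); emoji built from several code points take more than 4 bytes.

```python
import struct
import uuid

u = uuid.uuid4()

sizes = {
    "ascii_char": len("a".encode("utf-8")),
    # Single-code-point emoji; multi-code-point sequences are larger.
    "emoji": len("\N{GRINNING FACE}".encode("utf-8")),
    "uuid_binary": len(u.bytes),      # raw 128-bit value
    "uuid_string": len(str(u)),       # hex digits plus 4 hyphens
    "int64": struct.calcsize("<q"),   # signed 64-bit integer
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```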

Variants and trade-offs

Powers of 10 (1 KB = 1000 bytes): friendlier for QPS / bandwidth / cost math. Used by storage vendors and networking. Most interview math should use this.

Powers of 2 (1 KiB = 1024 bytes): friendlier for memory addressing and disk-block math. The gap is 2.4% at the kilo level and grows to about 7% at giga, which is usually noise at interview altitude, so don’t get stuck switching back and forth.
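The size of the decimal-versus-binary gap at each prefix follows from compounding 1024/1000. A minimal sketch, with `binary_vs_decimal_gap_pct` as an illustrative helper name:

```python
def binary_vs_decimal_gap_pct(power: int) -> float:
    """Percent by which 1024**power exceeds 1000**power."""
    return (1024**power / 1000**power - 1) * 100

# power 1 = kilo, 2 = mega, 3 = giga, 4 = tera
for power, prefix in enumerate(["K", "M", "G", "T"], start=1):
    gap = binary_vs_decimal_gap_pct(power)
    print(f"1 {prefix}iB is {gap:.1f}% larger than 1 {prefix}B")
```

Even at the tera level the gap stays just under 10%, so it does not change an order-of-magnitude answer.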

The standard interview routine:

Start with DAU (daily active users). Quote the number with a source — “let’s say 100M, similar to early Twitter”.
Convert DAU → QPS via “X actions per day per user × DAU ÷ 86,400 seconds/day”. Use 100,000 sec/day for round-number math.
Apply a peak-to-average ratio of 2–3× for QPS. Diurnal traffic peaks at roughly 2× the average; bursts (notifications, sporting events) push higher.
Compute storage/year as “QPS of writes × bytes/write × 86,400 × 365 × replication factor”.
Compute bandwidth as “QPS of reads × bytes/read”, convert to Gbps with × 8.
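The routine above can be sketched as four small functions, one per formula. The function names, the default peak ratio of 2.5, and the default replication factor of 3 are assumptions made for illustration; only the formulas themselves come from the steps.

```python
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365

def average_qps(dau, actions_per_user_per_day, seconds_per_day=SECONDS_PER_DAY):
    """DAU -> average QPS. Pass 100_000 for round-number math."""
    return dau * actions_per_user_per_day / seconds_per_day

def peak_qps(avg_qps, peak_to_average=2.5):
    """Scale average QPS by an assumed peak-to-average ratio (2-3x)."""
    return avg_qps * peak_to_average

def storage_bytes_per_year(write_qps, bytes_per_write, replication_factor=3):
    """Write QPS x bytes/write x seconds/year x replication factor."""
    return (write_qps * bytes_per_write
            * SECONDS_PER_DAY * DAYS_PER_YEAR * replication_factor)

def bandwidth_gbps(read_qps, bytes_per_read):
    """Read QPS x bytes/read, converted from bytes/s to Gbps (x 8)."""
    return read_qps * bytes_per_read * 8 / 10**9

# Hypothetical inputs: 100M DAU, 1 write/user/day, 1 KB per write.
writes = average_qps(100_000_000, 1)
print(f"{storage_bytes_per_year(writes, 1_000) / 10**12:.1f} TB/year")
```

With those hypothetical inputs the storage estimate lands a little above 100 TB/year after 3× replication; the point is the chain of units, not the specific figure.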

Worked example: 100M DAU, 10 reads/day per user.

100M × 10 = 10^9 reads/day.
10^9 / 10^5 = 10,000 QPS average, ~25,000 QPS peak.
At 100 KB per read → 25,000 × 100 KB = 2.5 GB/s = 20 Gbps egress at peak, roughly one 25 Gbps NIC’s worth of cache traffic.
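The worked example can be replayed step by step. In this sketch the 2.5× peak ratio is the one implied by "10,000 average, ~25,000 peak", and the NIC count assumes the link can be driven at full line rate, which is optimistic; variable names are illustrative.

```python
import math

dau = 100_000_000
reads_per_user_per_day = 10
bytes_per_read = 100_000       # 100 KB, decimal
peak_to_average = 2.5          # implied by 10,000 -> ~25,000 QPS
nic_gbps = 25

reads_per_day = dau * reads_per_user_per_day          # 10^9
avg_qps = reads_per_day / 10**5                        # round-number seconds/day
peak = avg_qps * peak_to_average
egress_gb_per_s = peak * bytes_per_read / 10**9        # bytes/s -> GB/s
egress_gbps = egress_gb_per_s * 8                      # GB/s -> Gbps
nics_needed = math.ceil(egress_gbps / nic_gbps)        # no headroom assumed

print(avg_qps, peak, egress_gb_per_s, egress_gbps, nics_needed)
```

A real deployment would leave headroom below line rate, so the NIC count here is a lower bound rather than a sizing recommendation.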

Reasonable rounding shortcuts

Treat 86,400 ≈ 10^5 (QPS comes out about 14% low). Treat 1 year ≈ 3 × 10^7 seconds (about 5% low). Treat 1 GB = 10^9 bytes (about 7% smaller than 1 GiB). All three rounding errors are below the threshold an interviewer will notice; trying to be precise about them is what slows candidates down.
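The size of each shortcut's error can be checked with a one-line relative-error helper. This is a sketch; `relative_error_pct` is an illustrative name, and the seconds-per-day error is measured on the derived QPS (a quotient), since that is where the shortcut is actually used.

```python
def relative_error_pct(approx: float, exact: float) -> float:
    """Signed error of `approx` relative to `exact`, in percent."""
    return (approx - exact) / exact * 100

# QPS = actions / seconds_per_day, so compare the two quotients.
qps_error = relative_error_pct(1 / 10**5, 1 / 86_400)
year_error = relative_error_pct(3 * 10**7, 365 * 86_400)
gb_error = relative_error_pct(10**9, 2**30)

print(f"QPS using 10^5 s/day:     {qps_error:+.1f}%")
print(f"3e7 s vs one 365-day year: {year_error:+.1f}%")
print(f"10^9 bytes vs 1 GiB:       {gb_error:+.1f}%")
```

All three errors are negative, meaning the shortcuts slightly underestimate; a 2–3× peak factor applied afterwards dwarfs them.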

When this is asked in interviews

Always, in step 2 of the walk-through. Skipping estimation is one of the most common red flags: interviewers wait for it, and when it’s omitted, they ask for it directly. A candidate who reaches for these numbers without prompting reads as “has actually worked at scale”.

More common (and more demanding) at infrastructure-leaning shops — anywhere a hardware budget is part of the design conversation (AWS, GCP, Cloudflare, Akamai, Fastly, edge compute, GPU-cloud, telco). Product teams care about the shape of the math more than the exact numbers.

Common follow-ups:

“How much hot data fits in memory across the fleet?” — DAU × per-user state × hit-rate × replication.
“Can a single shard handle this?” — peak QPS / shard QPS limit; if > 1, you’re sharding.
“What’s the egress bill at this scale?” — bandwidth × $/GB; useful for “build vs CDN” framing.
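The three follow-ups above reduce to one formula each. The sketch below uses illustrative function names; `hot_fraction` stands in for the hit-rate term, and the per-user state, replication factor, and the $0.05/GB price in the usage lines are hypothetical placeholders, not real quotes.

```python
import math

def shards_needed(peak_qps: float, shard_qps_limit: float) -> int:
    """Peak QPS / per-shard limit, rounded up. A result > 1 means sharding."""
    return math.ceil(peak_qps / shard_qps_limit)

def hot_data_bytes(dau, bytes_per_user, hot_fraction, replication_factor):
    """DAU x per-user state x hot fraction x replication."""
    return dau * bytes_per_user * hot_fraction * replication_factor

def monthly_egress_cost_usd(egress_bytes_per_sec, usd_per_gb,
                            seconds_per_month=30 * 86_400):
    """Sustained bandwidth x seconds/month, in GB, x price per GB."""
    return egress_bytes_per_sec * seconds_per_month / 10**9 * usd_per_gb

# 25k peak QPS against the ~10k QPS read figure for a single SQL node.
print(shards_needed(25_000, 10_000))
# Hypothetical: 1 KB of state per user, 20% hot, 2 replicas.
print(hot_data_bytes(100_000_000, 1_000, 0.2, 2) / 10**9, "GB")
# Hypothetical price; 2.5 GB/s sustained all month overstates a peak-only load.
print(monthly_egress_cost_usd(2.5e9, 0.05))
```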
