Twitter Newsfeed
Generate a personalized timeline at read time, write time, or both. The pull / push / hybrid trade-off and the celebrity-fanout problem.
Step 1 — Clarify Requirements#
Functional
- A user opens the app → sees a reverse-chronological feed of posts from people they follow.
- A user posts → followers’ feeds reflect the new post within seconds.
- Show a fixed page of ~20 posts; allow infinite scroll backwards.
- Out of scope for this design: search, DMs, the algorithmic ranking model, ads, notifications.
Non-functional
- 99.99% availability for the feed read path.
- p99 feed-load latency: under 200 ms.
- 300 M DAU, 100 posts/sec aggregate write rate, 100 K feed-loads/sec aggregate read rate.
- Eventual consistency is fine — followers seeing a new post 1–5 seconds late is acceptable.
Step 2 — Capacity Estimation#
- Reads (feed loads): 100 K/sec peak. The whole design is about making this cheap.
- Writes (posts): 100/sec average, ~1 K/sec peak.
- Storage: 100 posts/sec × 1 KB × 86 400 × 365 ≈ 3 TB/year of posts. Modest.
- Fan-out blow-up: average user has ~200 followers. Average post fans out to 200 follower-timelines. 100 posts/sec × 200 = 20 K timeline writes/sec baseline.
- Celebrities: top-1000 accounts have 1M–500M followers each. A single post from a celebrity could write 500M timeline entries — that’s the dominant bottleneck.
Step 3 — System Interface#
POST /posts Body: { text: string, media?: [url] } Returns: { post_id, created_at }
GET /feed?cursor=<opaque>&limit=20 Returns: { posts: [...], next_cursor: opaque }
POST /follow/:user_idDELETE /follow/:user_idThe cursor is opaque to clients: encode (timestamp, post_id) server-side for stable pagination as posts arrive.
Step 4 — High-Level Design#
┌─→ celebrity store (Redis Sorted Set, pull path) │client → CDN → LB → API gateway ─┬─ /feed ──→ feed assembler ──→ user timeline cache (Redis) │ ▲ │ │ writes (push path) │ │ └─ /posts ──→ post service ──→ post store (Cassandra) │ └──→ fan-out worker → user timeline cacheTwo paths:
- Post path (write): write the post once to durable storage; then fan out to followers’ timelines. Slow / async.
- Feed path (read): assemble a page of the user’s timeline from the cache (and the celebrity-pull store). Fast / sync.
The fan-out worker is the load-bearing component. Its strategy decides everything else.
Step 5 — Data Model#
Posts (Cassandra, partition by user_id):
table posts user_id uuid PK post_id timeuuid CK text string media list<url> created_at timestampTimelines (Redis, one sorted set per user, score = timestamp):
user:{follower_id}:timeline → ZSET of post_ids, capped at most recent 800Follow graph (Cassandra or a graph store):
table follows follower_id uuid PK followee_id uuid CKtable followers (inverse index) user_id uuid PK follower_id uuid CKThe inverse index followers is the workhorse — it’s read for every post (to fan out) and is hot.
Step 6 — Detailed Design#
The three fan-out strategies#
ZREVRANGE user:{me}:timeline 0 20. Push is the right default — most users have a few hundred followers; writing a row to a sorted set is microseconds. Pull breaks down at scale: a user following 500 accounts triggers 500 small queries per feed load.
But push breaks for celebrities. If Taylor Swift posts and we push to 100M followers, that’s 100M cache writes — minutes of work, hot keys, contention. Two options:
- Cap the fan-out at K: push only to active followers (logged in within 7 days). Lurkers fall back to the pull path.
- Hybrid (the chosen approach): pull for celebrities, push for everyone else.
The hybrid path#
Define a celebrity as a user with > 1 M followers (~1,000 accounts globally). When the feed assembler builds Bob’s timeline:
Bob's feed = MERGE( ZREVRANGE user:bob:timeline 0 20, // push: posts from non-celebrities Bob follows for each celebrity Bob follows: latest 20 posts from celebrity:{id}:posts // pull)The merge picks the top-20 by timestamp from the union. Pull cost is bounded by the number of celebrities Bob follows (almost always under 50), each query is a single sorted-set range, total under 50 ms.
Fan-out worker#
loop: pop post from Kafka topic 'new-posts' resolve followers: SELECT follower_id FROM followers WHERE user_id = post.user_id for each follower in batches of 1000: ZADD user:{follower_id}:timeline post.timestamp post.id ZREMRANGEBYRANK user:{follower_id}:timeline 0 -801 // cap at 800For a celebrity post, this worker would loop for hours; instead the post service tags the user as a celebrity (lookup table in Redis) and skips fan-out — the post goes into a dedicated celebrity:{id}:posts sorted set that pull queries hit directly.
Timeline cache eviction#
Bounded at 800 most recent posts per user. Older posts are reconstructed on demand from posts + follows when a user scrolls beyond — costly but rare. A typical user only ever sees the most recent 20–100 posts.
Read path latency budget (target p99 200 ms)#
LB + TLS: 10 msAuth + edge: 5 msFollow-graph lookup (am I following celebrities?): 5 msZREVRANGE on cached timeline: 2 msPull queries for celebrities (parallel): 20–40 msHydrate post bodies from Cassandra (batched): 30 msMerge + serialize: 5 msNetwork back: 20 ms total: ~100–120 ms p99Write path#
Post fan-out is async. The post itself is durable within 50 ms (write to Cassandra, acknowledge). The timeline write arrives at followers’ caches within 1–5 seconds. That’s the eventual-consistency window we acknowledged in step 1.
Step 7 — Evaluation & Trade-offs#
Bottleneck #1: the followers inverse index. Reading “who follows user X” for fan-out is the most frequent expensive query in the system. Aggressive caching of the follower list (with bounded TTL) absorbs most of it. A celebrity’s follower list won’t fit in a cache entry — keep it on disk and accept the read cost when posting (rare).
Bottleneck #2: hot keys on celebrity pull. celebrity:taylor-swift:posts is hammered by every Swift follower on every feed load. Use:
- Local cache layer (each feed assembler caches the top-20 for ~1 sec).
- Read replicas of that key on multiple Redis shards via consistent-hash with replication.
- CDN: GET
/api/feeditself can be cached per-user for 5 seconds, since users rarely refresh more often than that.
Bottleneck #3: the fan-out queue. A celebrity defection from a “push, capped” model to a “pull” model needs to drain the in-flight fan-out queue gracefully. The implementation flag is a per-user is_celebrity cell: when it flips, in-flight fan-outs check and skip.
Alternative I’d push back on: pure-push for everyone. Tempting because reads are O(1), but the cost of celebrity fan-out is non-linear in follower count and bursty — a single Eurovision-tier event would page on-call. The hybrid is the right trade.
What breaks first at 10× scale (3 B DAU): the follow-graph store. At that point we’d shard the follow graph by user_id and replicate hot followee rows; the timeline cache and fan-out worker would scale linearly with more shards.
Companies this resembles#
Twitter / X (the original), Threads, Bluesky (with AT Protocol’s pull-by-default twist), Mastodon (federated; each instance handles its own fan-out).
Related systems#
- Instagram — same hybrid pattern, with media bandwidth as a second axis.
- Newsfeed System — generic newsfeed pluggable rather than Twitter-specific.
- Distributed Cache — the substrate for every timeline cache decision here.