Twitter Newsfeed

Generate a personalized timeline at read time, write time, or both. The pull / push / hybrid trade-off and the celebrity-fanout problem.

System Intermediate
6 min read
newsfeed fan-out social-graph caching
Companies this resembles: Twitter / X · Threads · Mastodon · Bluesky

Step 1 — Clarify Requirements#

Functional

  • A user opens the app → sees a reverse-chronological feed of posts from people they follow.
  • A user posts → followers’ feeds reflect the new post within seconds.
  • Show a fixed page of ~20 posts; allow infinite scroll backwards.
  • Out of scope for this design: search, DMs, the algorithmic ranking model, ads, notifications.

Non-functional

  • 99.99% availability for the feed read path.
  • p99 feed-load latency: under 200 ms.
  • 300 M DAU, 100 posts/sec aggregate write rate, 100 K feed-loads/sec aggregate read rate.
  • Eventual consistency is fine — followers seeing a new post 1–5 seconds late is acceptable.

Step 2 — Capacity Estimation#

  • Reads (feed loads): 100 K/sec peak. The whole design is about making this cheap.
  • Writes (posts): 100/sec average, ~1 K/sec peak.
  • Storage: 100 posts/sec × 1 KB × 86 400 × 365 ≈ 3 TB/year of posts. Modest.
  • Fan-out blow-up: average user has ~200 followers. Average post fans out to 200 follower-timelines. 100 posts/sec × 200 = 20 K timeline writes/sec baseline.
  • Celebrities: top-1000 accounts have 1M–500M followers each. A single post from a celebrity could write 500M timeline entries — that’s the dominant bottleneck.

Step 3 — System Interface#

POST /posts
Body: { text: string, media?: [url] }
Returns: { post_id, created_at }
GET /feed?cursor=<opaque>&limit=20
Returns: { posts: [...], next_cursor: opaque }
POST /follow/:user_id
DELETE /follow/:user_id

The cursor is opaque to clients: encode (timestamp, post_id) server-side for stable pagination as posts arrive.

Step 4 — High-Level Design#

┌─→ celebrity store (Redis Sorted Set, pull path)
client → CDN → LB → API gateway ─┬─ /feed ──→ feed assembler ──→ user timeline cache (Redis)
│ ▲
│ │ writes (push path)
│ │
└─ /posts ──→ post service ──→ post store (Cassandra)
└──→ fan-out worker → user timeline cache

Two paths:

  • Post path (write): write the post once to durable storage; then fan out to followers’ timelines. Slow / async.
  • Feed path (read): assemble a page of the user’s timeline from the cache (and the celebrity-pull store). Fast / sync.

The fan-out worker is the load-bearing component. Its strategy decides everything else.

Step 5 — Data Model#

Posts (Cassandra, partition by user_id):

table posts
user_id uuid PK
post_id timeuuid CK
text string
media list<url>
created_at timestamp

Timelines (Redis, one sorted set per user, score = timestamp):

user:{follower_id}:timeline → ZSET of post_ids, capped at most recent 800

Follow graph (Cassandra or a graph store):

table follows
follower_id uuid PK
followee_id uuid CK
table followers (inverse index)
user_id uuid PK
follower_id uuid CK

The inverse index followers is the workhorse — it’s read for every post (to fan out) and is hot.

Step 6 — Detailed Design#

The three fan-out strategies#

Push (write-time fan-out) — when Alice posts, write the post_id into every follower’s timeline cache. Reads are O(1): just ZREVRANGE user:{me}:timeline 0 20.
Pull (read-time fan-out) — when Bob reads his feed, fetch the latest posts from each user he follows and merge them. No write fan-out; reads are O(N) in followee count.

Push is the right default — most users have a few hundred followers; writing a row to a sorted set is microseconds. Pull breaks down at scale: a user following 500 accounts triggers 500 small queries per feed load.

But push breaks for celebrities. If Taylor Swift posts and we push to 100M followers, that’s 100M cache writes — minutes of work, hot keys, contention. Two options:

  1. Cap the fan-out at K: push only to active followers (logged in within 7 days). Lurkers fall back to the pull path.
  2. Hybrid (the chosen approach): pull for celebrities, push for everyone else.

The hybrid path#

Define a celebrity as a user with > 1 M followers (~1,000 accounts globally). When the feed assembler builds Bob’s timeline:

Bob's feed = MERGE(
ZREVRANGE user:bob:timeline 0 20, // push: posts from non-celebrities Bob follows
for each celebrity Bob follows:
latest 20 posts from celebrity:{id}:posts // pull
)

The merge picks the top-20 by timestamp from the union. Pull cost is bounded by the number of celebrities Bob follows (almost always under 50), each query is a single sorted-set range, total under 50 ms.

Fan-out worker#

loop:
pop post from Kafka topic 'new-posts'
resolve followers: SELECT follower_id FROM followers WHERE user_id = post.user_id
for each follower in batches of 1000:
ZADD user:{follower_id}:timeline post.timestamp post.id
ZREMRANGEBYRANK user:{follower_id}:timeline 0 -801 // cap at 800

For a celebrity post, this worker would loop for hours; instead the post service tags the user as a celebrity (lookup table in Redis) and skips fan-out — the post goes into a dedicated celebrity:{id}:posts sorted set that pull queries hit directly.

Timeline cache eviction#

Bounded at 800 most recent posts per user. Older posts are reconstructed on demand from posts + follows when a user scrolls beyond — costly but rare. A typical user only ever sees the most recent 20–100 posts.

Read path latency budget (target p99 200 ms)#

LB + TLS: 10 ms
Auth + edge: 5 ms
Follow-graph lookup (am I following celebrities?): 5 ms
ZREVRANGE on cached timeline: 2 ms
Pull queries for celebrities (parallel): 20–40 ms
Hydrate post bodies from Cassandra (batched): 30 ms
Merge + serialize: 5 ms
Network back: 20 ms
total: ~100–120 ms p99

Write path#

Post fan-out is async. The post itself is durable within 50 ms (write to Cassandra, acknowledge). The timeline write arrives at followers’ caches within 1–5 seconds. That’s the eventual-consistency window we acknowledged in step 1.

Step 7 — Evaluation & Trade-offs#

Bottleneck #1: the followers inverse index. Reading “who follows user X” for fan-out is the most frequent expensive query in the system. Aggressive caching of the follower list (with bounded TTL) absorbs most of it. A celebrity’s follower list won’t fit in a cache entry — keep it on disk and accept the read cost when posting (rare).

Bottleneck #2: hot keys on celebrity pull. celebrity:taylor-swift:posts is hammered by every Swift follower on every feed load. Use:

  • Local cache layer (each feed assembler caches the top-20 for ~1 sec).
  • Read replicas of that key on multiple Redis shards via consistent-hash with replication.
  • CDN: GET /api/feed itself can be cached per-user for 5 seconds, since users rarely refresh more often than that.

Bottleneck #3: the fan-out queue. A celebrity defection from a “push, capped” model to a “pull” model needs to drain the in-flight fan-out queue gracefully. The implementation flag is a per-user is_celebrity cell: when it flips, in-flight fan-outs check and skip.

Alternative I’d push back on: pure-push for everyone. Tempting because reads are O(1), but the cost of celebrity fan-out is non-linear in follower count and bursty — a single Eurovision-tier event would page on-call. The hybrid is the right trade.

What breaks first at 10× scale (3 B DAU): the follow-graph store. At that point we’d shard the follow graph by user_id and replicate hot followee rows; the timeline cache and fan-out worker would scale linearly with more shards.

Companies this resembles#

Twitter / X (the original), Threads, Bluesky (with AT Protocol’s pull-by-default twist), Mastodon (federated; each instance handles its own fan-out).

  • Instagram — same hybrid pattern, with media bandwidth as a second axis.
  • Newsfeed System — generic newsfeed pluggable rather than Twitter-specific.
  • Distributed Cache — the substrate for every timeline cache decision here.
Search ESC

Keyboard shortcuts

Shortcuts are disabled while typing in inputs.