Generic Newsfeed System — System Design

Step 1 — Clarify Requirements#

A “generic” newsfeed is the abstract version of every social timeline: Twitter, Instagram, Facebook, LinkedIn, Reddit. The point of designing it generically is to expose the moving parts that apply regardless of content type.

Functional

Given a user, return an ordered list of items (posts, articles, videos) drawn from sources the user follows or that an algorithm has chosen.
Items must be deduplicated (don’t show the same post twice across reloads), fresh (something new every visit), and diverse (don’t show 10 posts from the same author).
Infinite scroll: stable cursor pagination as new items arrive at the top.
Out of scope: the ML ranker training pipeline (see /system-design/ml-data-infrastructure).

Non-functional

99.99% availability for feed reads.
p99 first-page latency under 300 ms.
500 M DAU; 200 K feed-loads/sec peak.
Eventual consistency is fine (1-10 second propagation delay for new content).

Step 2 — Capacity Estimation#

Feed-loads: 500 M DAU × 15 opens/day = 7.5 B/day ≈ 90 K/sec average, 250 K/sec peak.
Content writes: assume 100 M posts/day → ~1.2 K writes/sec average, ~5 K/sec peak.
Fan-out: average user has 200 followers → 1.2 K posts/sec × 200 = 240 K timeline writes/sec if naive push.
Per-user timeline cache: 500 last items × 50 B = 25 KB per user × 500 M users = 12 TB of Redis.
Candidate pool per user: ~1,000 items per ranking call (followed-source recent items + interest-graph candidates + trending).

These numbers force the same conclusion as Twitter newsfeed: hybrid push / pull for fan-out, precomputed candidate pools for ranking, aggressive caching at the timeline level.

Step 3 — System Interface#

GET /feed?cursor=<opaque>&limit=20
    → { items: [...], next_cursor }

POST /posts                    (any content type)
POST /follow/:user_id
POST /interact                 { item_id, kind: 'view'|'click'|'like'|'hide' }

The cursor is opaque, server-encoded (rank_score_floor, item_id) for stable pagination as the ranking score range shifts.

Step 4 — High-Level Design#

                                                       ┌── ranker (model service)
                                                       │
client → LB → API ─┬── /feed ──→ feed assembler ──┬── candidate pool (Redis ZSET per user)
                   │                              │
                   │                              ├── celebrity / public pool (Redis ZSET per source)
                   │                              │
                   │                              └── ranker → final order
                   │
                   ├── /posts ─→ content service ─→ content store (Cassandra) ─→ Kafka
                   │                                                              │
                   │                                                              ▼
                   │                                                  fan-out worker (push) ──→ candidate pool
                   │                                                  signal pipeline ──→ ranking features
                   │
                   └── /interact ─→ Kafka ─→ feature store + dedup cache

The candidate pool is the abstraction that makes this generic. A user’s pool is the union of:

Followed-source pool — push fan-out from sources the user follows (non-celebrity).
Public pool — high-fanout sources pulled on read.
Algorithmic candidates — items the recommendation system suggests, regardless of follows.

Ranking picks the top-N from the pool. Dedup removes already-seen items. Diversity rules trim consecutive same-source items.

Step 5 — Data Model#

Content (Cassandra or any wide-column, partitioned by source_id):

table items
  source_id   uuid    PK
  item_id     timeuuid CK
  kind        enum(post, article, video, ...)
  payload     bytes   // type-specific serialized
  created_at  timestamp

Candidate pool (Redis ZSET per user, score = freshness × source_affinity):

key: pool:{user_id} → ZSET of item_ids, capped at 500

Public / source pools (Redis ZSET per source, score = timestamp):

key: source:{source_id}:items → ZSET, last 100 items

Seen index (for dedup; Bloom filter per user, refreshed daily):

key: seen:{user_id} → Bloom filter, ~50 KB, holds last ~7 days of viewed item_ids

Ranking features (online feature store):

keyed: user_id × item_id → cached features (recency, source_affinity, similarity, ...)

Step 6 — Detailed Design#

Candidate pool construction (hybrid fan-out)#

when new item I posted by source S:
   if followers(S) > THRESHOLD:    // celebrity source
       ZADD source:{S}:items score=ts item=I
       (no push)
   else:
       for each follower F in followers(S):
           ZADD pool:{F} score=freshness_score(I) item=I
           ZREMRANGEBYRANK pool:{F} 0 -501   // cap at 500

A high-fanout source’s items are pulled on read, not pushed. The feed assembler queries the source pool at request time and merges.

The feed assembly pipeline#

1. Load candidate pool: union of
     a. ZREVRANGE pool:{user} 0 200
     b. for each celebrity source the user follows: pull top-20
     c. algorithmic candidates: query recommender service
   → up to ~500 unique items
2. Filter:
     - dedup against seen Bloom filter
     - hide blocked sources / muted topics
     - safety / quality filter
3. Rank:
     - fetch features per item (precomputed where possible)
     - score via model (gradient-boosted trees or DNN)
     - sort descending
4. Diversify:
     - cap consecutive same-source items
     - cap consecutive same-topic items
5. Return top-20; record `last_rank_score` for cursor.

The model output is the “ranked feed score” — fed into the cursor for stable pagination.

Dedup with Bloom filters#

A per-user Bloom filter holds item_ids viewed in the last 7 days. ~50 KB at 1% false-positive rate for 50 K items. Each feed-load:

Test each candidate against the filter; skip on hit.
Add returned items to the filter (after the user sees them).

False positives mean we occasionally hide an item the user has not seen. Acceptable trade-off; better than re-showing duplicates which annoys users intensely.

Freshness vs ranking#

Pure-ranking can leave fresh content at the bottom. Two mitigations:

Freshness boost in the ranker (a sigmoid that decays with item age).
Reserved fresh slots: positions 3 and 7 always show items younger than 1 hour, even if their ranker score is lower.

The exact slot policy is product-specific. The infrastructure must allow per-product configuration of these rules.

Pagination#

Stable infinite scroll requires that newer items don’t shift older items’ positions. Options:

Snapshot the feed on first load — render top-N once, paginate within that snapshot. Subsequent reloads pull a fresh snapshot. Simple, but new items don’t appear unless the user explicitly refreshes.

Score-based cursor — cursor = (score_floor, item_id). Next page returns items with score < score_floor. New items inserted “above” the cursor are visible only on top-of-feed reload. This is the modern standard.

The score-based cursor handles “load more” cleanly. A separate “new items above” indicator at the top tells the user that newer items are available without disrupting their scroll.

Read path latency budget (target 300 ms)#

LB + TLS:                       15 ms
Candidate pool fetch (Redis):    5 ms
Source pulls (parallel, ~5):    20 ms
Algorithmic candidates:         20 ms
Feature lookup:                 30 ms
Ranker inference (top-500):     60 ms
Diversity pass:                  5 ms
Hydrate top-20 payloads:        30 ms
Network back:                   50 ms
                       total:  ~200 ms p99 server

Step 7 — Evaluation & Trade-offs#

Bottleneck #1: ranker inference latency. A DNN scoring 500 candidates is the largest single cost. Cache features aggressively per (user, item) for 1-5 minutes. Use a cheap first-stage ranker over the candidate pool to cut down to 200 before the heavy ranker runs.

Bottleneck #2: the candidate pool cache. 12 TB of Redis for per-user pools is expensive. Tradeoffs: cap pools more aggressively (200 instead of 500 entries) and rely more on pull-on-read; tier hot users to fast Redis, cold users to a slower KV.

Bottleneck #3: cold-start users. A user with no follows has an empty push pool. Default to topic-based and trending pools as the algorithmic candidate source. The model’s predictions are weaker but the feed isn’t empty.

Alternative I’d push back on: building this without separation of candidates and ranking. A monolith that does both is impossible to evolve — every ranking change touches fan-out and vice versa. Insist on the two-stage architecture from day one.

What breaks first at 10× scale: feature store IOPS. Per-feed-load feature lookup grows linearly with candidate count and user count. Pre-aggregating features per source rather than per (user, source) cuts the cardinality dramatically.

Companies this resembles#

Facebook News Feed, LinkedIn Feed, Reddit’s home feed, Pinterest, TikTok For You. The abstraction generalizes most consumer feeds built since ~2015.

Twitter Newsfeed — a concrete instance of this design.
Instagram — same shape with media-heavy items.
Quora — Q&A specialization with question-page ranking instead of feed.
AI / ML Data Infrastructure — where the ranker and feature store live.