Generic Newsfeed System

Pluggable feed engine: ranking, freshness, deduplication, infinite scroll, caching strategy.

System Intermediate
7 min read
feed ranking fan-out caching
Companies this resembles: Facebook · LinkedIn · Reddit · Pinterest

Step 1 — Clarify Requirements#

A “generic” newsfeed is the abstract version of every social timeline: Twitter, Instagram, Facebook, LinkedIn, Reddit. The point of designing it generically is to expose the moving parts that apply regardless of content type.

Functional

  • Given a user, return an ordered list of items (posts, articles, videos) drawn from sources the user follows or that an algorithm has chosen.
  • Items must be deduplicated (don’t show the same post twice across reloads), fresh (something new every visit), and diverse (don’t show 10 posts from the same author).
  • Infinite scroll: stable cursor pagination as new items arrive at the top.
  • Out of scope: the ML ranker training pipeline (see /system-design/ml-data-infrastructure).

Non-functional

  • 99.99% availability for feed reads.
  • p99 first-page latency under 300 ms.
  • 500 M DAU; 200 K feed-loads/sec peak.
  • Eventual consistency is fine (1-10 second propagation delay for new content).

Step 2 — Capacity Estimation#

  • Feed-loads: 500 M DAU × 15 opens/day = 7.5 B/day ≈ 90 K/sec average, 250 K/sec peak.
  • Content writes: assume 100 M posts/day → ~1.2 K writes/sec average, ~5 K/sec peak.
  • Fan-out: average user has 200 followers → 1.2 K posts/sec × 200 = 240 K timeline writes/sec if naive push.
  • Per-user timeline cache: 500 last items × 50 B = 25 KB per user × 500 M users = 12 TB of Redis.
  • Candidate pool per user: ~1,000 items per ranking call (followed-source recent items + interest-graph candidates + trending).

These numbers force the same conclusion as Twitter newsfeed: hybrid push / pull for fan-out, precomputed candidate pools for ranking, aggressive caching at the timeline level.

Step 3 — System Interface#

GET /feed?cursor=<opaque>&limit=20
→ { items: [...], next_cursor }
POST /posts (any content type)
POST /follow/:user_id
POST /interact { item_id, kind: 'view'|'click'|'like'|'hide' }

The cursor is opaque, server-encoded (rank_score_floor, item_id) for stable pagination as the ranking score range shifts.

Step 4 — High-Level Design#

┌── ranker (model service)
client → LB → API ─┬── /feed ──→ feed assembler ──┬── candidate pool (Redis ZSET per user)
│ │
│ ├── celebrity / public pool (Redis ZSET per source)
│ │
│ └── ranker → final order
├── /posts ─→ content service ─→ content store (Cassandra) ─→ Kafka
│ │
│ ▼
│ fan-out worker (push) ──→ candidate pool
│ signal pipeline ──→ ranking features
└── /interact ─→ Kafka ─→ feature store + dedup cache

The candidate pool is the abstraction that makes this generic. A user’s pool is the union of:

  1. Followed-source pool — push fan-out from sources the user follows (non-celebrity).
  2. Public pool — high-fanout sources pulled on read.
  3. Algorithmic candidates — items the recommendation system suggests, regardless of follows.

Ranking picks the top-N from the pool. Dedup removes already-seen items. Diversity rules trim consecutive same-source items.

Step 5 — Data Model#

Content (Cassandra or any wide-column, partitioned by source_id):

table items
source_id uuid PK
item_id timeuuid CK
kind enum(post, article, video, ...)
payload bytes // type-specific serialized
created_at timestamp

Candidate pool (Redis ZSET per user, score = freshness × source_affinity):

key: pool:{user_id} → ZSET of item_ids, capped at 500

Public / source pools (Redis ZSET per source, score = timestamp):

key: source:{source_id}:items → ZSET, last 100 items

Seen index (for dedup; Bloom filter per user, refreshed daily):

key: seen:{user_id} → Bloom filter, ~50 KB, holds last ~7 days of viewed item_ids

Ranking features (online feature store):

keyed: user_id × item_id → cached features (recency, source_affinity, similarity, ...)

Step 6 — Detailed Design#

Candidate pool construction (hybrid fan-out)#

when new item I posted by source S:
if followers(S) > THRESHOLD: // celebrity source
ZADD source:{S}:items score=ts item=I
(no push)
else:
for each follower F in followers(S):
ZADD pool:{F} score=freshness_score(I) item=I
ZREMRANGEBYRANK pool:{F} 0 -501 // cap at 500

A high-fanout source’s items are pulled on read, not pushed. The feed assembler queries the source pool at request time and merges.

The feed assembly pipeline#

1. Load candidate pool: union of
a. ZREVRANGE pool:{user} 0 200
b. for each celebrity source the user follows: pull top-20
c. algorithmic candidates: query recommender service
→ up to ~500 unique items
2. Filter:
- dedup against seen Bloom filter
- hide blocked sources / muted topics
- safety / quality filter
3. Rank:
- fetch features per item (precomputed where possible)
- score via model (gradient-boosted trees or DNN)
- sort descending
4. Diversify:
- cap consecutive same-source items
- cap consecutive same-topic items
5. Return top-20; record `last_rank_score` for cursor.

The model output is the “ranked feed score” — fed into the cursor for stable pagination.

Dedup with Bloom filters#

A per-user Bloom filter holds item_ids viewed in the last 7 days. ~50 KB at 1% false-positive rate for 50 K items. Each feed-load:

  • Test each candidate against the filter; skip on hit.
  • Add returned items to the filter (after the user sees them).

False positives mean we occasionally hide an item the user has not seen. Acceptable trade-off; better than re-showing duplicates which annoys users intensely.

Freshness vs ranking#

Pure-ranking can leave fresh content at the bottom. Two mitigations:

  • Freshness boost in the ranker (a sigmoid that decays with item age).
  • Reserved fresh slots: positions 3 and 7 always show items younger than 1 hour, even if their ranker score is lower.

The exact slot policy is product-specific. The infrastructure must allow per-product configuration of these rules.

Pagination#

Stable infinite scroll requires that newer items don’t shift older items’ positions. Options:

Snapshot the feed on first load — render top-N once, paginate within that snapshot. Subsequent reloads pull a fresh snapshot. Simple, but new items don’t appear unless the user explicitly refreshes.
Score-based cursorcursor = (score_floor, item_id). Next page returns items with score < score_floor. New items inserted “above” the cursor are visible only on top-of-feed reload. This is the modern standard.

The score-based cursor handles “load more” cleanly. A separate “new items above” indicator at the top tells the user that newer items are available without disrupting their scroll.

Read path latency budget (target 300 ms)#

LB + TLS: 15 ms
Candidate pool fetch (Redis): 5 ms
Source pulls (parallel, ~5): 20 ms
Algorithmic candidates: 20 ms
Feature lookup: 30 ms
Ranker inference (top-500): 60 ms
Diversity pass: 5 ms
Hydrate top-20 payloads: 30 ms
Network back: 50 ms
total: ~200 ms p99 server

Step 7 — Evaluation & Trade-offs#

Bottleneck #1: ranker inference latency. A DNN scoring 500 candidates is the largest single cost. Cache features aggressively per (user, item) for 1-5 minutes. Use a cheap first-stage ranker over the candidate pool to cut down to 200 before the heavy ranker runs.

Bottleneck #2: the candidate pool cache. 12 TB of Redis for per-user pools is expensive. Tradeoffs: cap pools more aggressively (200 instead of 500 entries) and rely more on pull-on-read; tier hot users to fast Redis, cold users to a slower KV.

Bottleneck #3: cold-start users. A user with no follows has an empty push pool. Default to topic-based and trending pools as the algorithmic candidate source. The model’s predictions are weaker but the feed isn’t empty.

Alternative I’d push back on: building this without separation of candidates and ranking. A monolith that does both is impossible to evolve — every ranking change touches fan-out and vice versa. Insist on the two-stage architecture from day one.

What breaks first at 10× scale: feature store IOPS. Per-feed-load feature lookup grows linearly with candidate count and user count. Pre-aggregating features per source rather than per (user, source) cuts the cardinality dramatically.

Companies this resembles#

Facebook News Feed, LinkedIn Feed, Reddit’s home feed, Pinterest, TikTok For You. The abstraction generalizes most consumer feeds built since ~2015.

Search ESC

Keyboard shortcuts

Shortcuts are disabled while typing in inputs.