Quora

Q&A platform: ranking answers, follow graph, notification fan-out, search.

System Intermediate
7 min read
qa ranking feed search
Companies this resembles: Quora · Stack Overflow · Reddit · Hacker News

Step 1 — Clarify Requirements#

Functional

  • A user asks a question; others answer; readers upvote / downvote answers.
  • A question page shows answers ranked by quality.
  • Users follow topics and other users; a personalized feed surfaces relevant new questions and answers.
  • Search across questions and answers.
  • Notifications when followed-content gets new answers, when a user is mentioned, when an answer gets significant upvotes.
  • Out of scope: monetization, moderation tooling, anonymous answers (consider variant).

Non-functional

  • 99.95% availability.
  • p99 question-page load under 400 ms.
  • 300 M MAU; ~50 M questions; ~200 M answers; ~50 K writes/sec peak (votes are the dominant write).
  • Strong consistency on answer authorship and vote ownership; eventual consistency on aggregate vote counts and feed rankings.

Step 2 — Capacity Estimation#

  • Question reads: 300 M MAU × ~20 question pages/day = 6 B reads/day ≈ 70 K reads/sec average, 300 K/sec peak.
  • Writes: votes (50 K/sec peak), new answers (~50/sec), new questions (~5/sec).
  • Storage: 50 M questions × 1 KB + 200 M answers × 3 KB = 50 + 600 GB = 650 GB of text content. Tiny. Indexes, vote logs, and edit history dominate the total — ~10 TB.
  • Search index: full-text inverted index over 250 M docs, ~50 GB on disk per shard.
  • Notifications: ~1 M notifications/sec at peak (mostly aggregated into digests, not delivered individually).

The system is read-heavy and search-heavy. The interesting parts are ranking, search, and fan-out.
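
The read and storage figures above can be re-derived in a few lines. This is only a back-of-envelope sketch: it treats every monthly user as a daily reader (which makes the read rate an upper bound) and uses decimal units (1 KB = 1,000 bytes); both choices are assumptions made for the arithmetic.

```python
# Back-of-envelope check of the capacity numbers quoted above.
SECONDS_PER_DAY = 86_400
KB = 1_000
GB = 1_000_000_000

users = 300_000_000            # MAU, treated as daily readers (assumption)
pages_per_user_per_day = 20

reads_per_day = users * pages_per_user_per_day
avg_reads_per_sec = reads_per_day / SECONDS_PER_DAY

question_bytes = 50_000_000 * 1 * KB     # 50 M questions at ~1 KB
answer_bytes = 200_000_000 * 3 * KB      # 200 M answers at ~3 KB
text_gb = (question_bytes + answer_bytes) / GB

print(f"reads/day:     {reads_per_day:,}")
print(f"avg reads/sec: {avg_reads_per_sec:,.0f}")
print(f"text storage:  {text_gb:.0f} GB")
```

The average comes out just under 70 K reads/sec, so the quoted 300 K/sec peak corresponds to a peak-to-average factor of a little over 4.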

Step 3 — System Interface#

POST /questions { title, body, topics: [...] }
POST /answers { question_id, body }
POST /votes { answer_id, value: +1|-1 }
POST /follow { type: 'user'|'topic'|'question', target_id }
GET /questions/:id (title + answers, paginated by ranking)
GET /search?q=... (across questions and answers)
GET /feed?cursor=... (personalized)
GET /notifications (paginated, with read/unread)

Posting and voting endpoints are idempotent on (user_id, target_id) — a stale retry must not double-vote.
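
A minimal sketch of the idempotency rule for votes, using an in-memory dict as a stand-in for the votes table. The class and method names are hypothetical, and treating a changed vote as an overwrite of the earlier one is an assumption; the text only requires that a retry must not double-count.

```python
from dataclasses import dataclass, field


@dataclass
class VoteStore:
    """In-memory stand-in for the votes table, keyed by (user_id, answer_id)."""
    votes: dict = field(default_factory=dict)

    def cast(self, user_id: str, answer_id: str, value: int) -> int:
        """Record a vote and return the delta to apply to the net count."""
        if value not in (1, -1):
            raise ValueError("value must be +1 or -1")
        key = (user_id, answer_id)
        previous = self.votes.get(key, 0)
        self.votes[key] = value
        # 0 for a retried request, +1/-1 for a first vote, +2/-2 for a flip.
        return value - previous


store = VoteStore()
first = store.cast("u1", "a1", +1)    # first vote counts
retry = store.cast("u1", "a1", +1)    # stale retry changes nothing
flip = store.cast("u1", "a1", -1)     # changing the vote swings by two
```

Returning the delta rather than the raw value lets the caller pass the result straight to whatever counter tracks net votes.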

Step 4 — High-Level Design#

                    ┌── search index (sharded)
client → LB → API ──┬── /questions/:id ──→ question service ──→ relational store (Postgres, sharded)
                    │                             │                     │
                    │                             ▼                     ▼
                    │                       ranking cache         vote counters (sharded)
                    │                       (Redis ZSET)
                    ├── /votes ─→ vote service ─→ Postgres + sharded counters
                    │                  │
                    │                  └→ async: rerank affected answer
                    ├── /feed ─→ feed service ─→ feed cache (Redis ZSET per user)
                    │                                 ▲
                    │                                 │
                    └── /search ─→ search service (Elasticsearch / Vespa)
                                        └─ async: questions / answers indexed via Kafka
Writes also fan out to: search index, notifications, feed personalization model.

Step 5 — Data Model#

Questions (Postgres, sharded by question_id):

table questions
  question_id   uuid PK
  title         string
  body          text
  topic_ids     array<uuid>
  asker_id      uuid
  created_at    timestamp
  view_count    bigint       // async
  answer_count  int          // async

Answers:

table answers
  answer_id    uuid PK
  question_id  uuid
  author_id    uuid
  body         text
  created_at   timestamp
  net_votes    int           // sharded counter; periodically rolled up
  rank_score   float         // precomputed for fast question-page rendering

Votes (immutable log; idempotent):

table votes
  user_id    uuid
  answer_id  uuid
  value      int             // +1 or -1
  ts         timestamp
  PK (user_id, answer_id)

Follow graph:

table follows
  follower_id  uuid
  target_type  enum(user, topic, question)
  target_id    uuid
  PK (follower_id, target_type, target_id)

Ranking cache (Redis ZSET per question, score = rank_score):

key: q:{question_id}:answers → ZSET of answer_ids

Step 6 — Detailed Design#

Answer ranking#

The question page shows the “best” answer first. The classic naive metric (raw vote count) is dominated by old answers that have accumulated votes over years. Better:

score(answer) = (net_votes + α) / (hours_since_post + β)^γ   // Hacker-News-like
              + author_credibility_bonus
              + asker_acceptance_bonus
              + personalization_term(viewer, answer)

The first three terms are global and can be precomputed; the personalization term is computed per viewer at read time. Quora-style ranking adds a heavy ML ranker on top (a BERT-class model scoring question × answer pairs for topic relevance, hedged so that simple, high-quality answers still surface).
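
The global part of the score translates directly into code. The sketch below is illustrative only: the default values for α, β, and γ are placeholders chosen for the example (not tuned constants from any real system), and the function names are hypothetical.

```python
def global_score(net_votes: float, hours_since_post: float,
                 credibility_bonus: float = 0.0,
                 acceptance_bonus: float = 0.0,
                 alpha: float = 1.0, beta: float = 2.0,
                 gamma: float = 1.8) -> float:
    """Viewer-independent part of the score; safe to precompute and cache."""
    decayed_votes = (net_votes + alpha) / (hours_since_post + beta) ** gamma
    return decayed_votes + credibility_bonus + acceptance_bonus


def score_for_viewer(cached_global: float, personalization_term: float) -> float:
    """Read-time step: add the per-viewer term to the cached global score."""
    return cached_global + personalization_term


# A year-old answer with 500 net votes versus a two-hour-old one with 20.
old_answer = global_score(net_votes=500, hours_since_post=24 * 365)
new_answer = global_score(net_votes=20, hours_since_post=2)
print(new_answer > old_answer)
```

With a decay this steep the fresh answer outranks the old one, which shows why the bonus terms matter: without them, a long-standing high-quality answer would sink purely because of its age.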

The pre-ranked ZSET is invalidated on:

  • Vote change on any of the question’s answers (debounced; recompute at most once every ~30 s).
  • New answer posted on the question.
  • Periodic decay refresh (every ~1 hour, because the score depends on age).
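
One way to implement the debounce on vote-triggered recomputes is sketched below. It assumes a single process that tracks the last recompute time per question; the class name and the explicit `now` argument (passed in to keep the logic testable) are choices made for this example.

```python
class RerankDebouncer:
    """Allow at most one re-rank per question per interval."""

    def __init__(self, min_interval_s: float = 30.0):
        self.min_interval_s = min_interval_s
        self.last_run = {}    # question_id -> time of last recompute
        self.dirty = set()    # questions with votes since the last recompute

    def on_vote(self, question_id: str, now: float) -> bool:
        """Return True if the caller should recompute immediately."""
        last = self.last_run.get(question_id)
        if last is None or now - last >= self.min_interval_s:
            self.last_run[question_id] = now
            self.dirty.discard(question_id)
            return True
        self.dirty.add(question_id)
        return False

    def due(self, now: float) -> list:
        """Questions with pending votes whose interval has elapsed."""
        ready = [q for q in self.dirty
                 if now - self.last_run[q] >= self.min_interval_s]
        for q in ready:
            self.last_run[q] = now
            self.dirty.discard(q)
        return ready
```

A background task would call `due()` every few seconds so that votes arriving inside the window are still picked up once it closes.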

Votes at scale#

A viral answer can get 10 K upvotes/minute. The net_votes field can’t be a single row, because that row becomes a hot key. Implementation:

  • Each upvote writes an immutable row to votes (idempotent on (user, answer)).
  • A sharded counter (net_votes:answer:{id}:shard:{N}) is incremented.
  • Periodically, the counter rolls up to the answer’s net_votes summary and triggers a re-rank.

See /system-design/sharded-counters for the pattern.
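
A minimal in-memory sketch of the sharded-counter idea. In a real deployment each shard would be a separate key or row so that concurrent writers rarely collide; here a plain list stands in for those shards, and the shard count of 16 is an arbitrary assumption.

```python
import random


class ShardedCounter:
    """Spread increments across N shards; sum the shards on roll-up."""

    def __init__(self, num_shards: int = 16):
        self.shards = [0] * num_shards

    def add(self, delta: int) -> None:
        # Pick a random shard so no single slot absorbs every write.
        idx = random.randrange(len(self.shards))
        self.shards[idx] += delta

    def rollup(self) -> int:
        # Periodic job: read every shard once and store the total
        # as the answer's net_votes summary.
        return sum(self.shards)


counter = ShardedCounter(num_shards=8)
for _ in range(1000):
    counter.add(+1)
for _ in range(100):
    counter.add(-1)
net_votes = counter.rollup()
```

Writes touch one shard each, while the comparatively rare roll-up pays the cost of reading all of them.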

Personalized feed#

The feed surfaces:

  • Recent questions in topics the user follows.
  • Recent answers from users the user follows.
  • Trending in topics with high user affinity.
  • Editorial picks (“Best of Quora today”).

Implementation is a per-user Redis ZSET seeded by a personalization model (offline-batched) and topped up by streaming fan-out from new content. Same hybrid push/pull pattern as /system-design/twitter-newsfeed:

when new question created in topic T:
  for each follower of T:
    ZADD feed:{follower} score new_question_id
    trim to most recent ~500

For high-volume topics (e.g., “Programming” with millions of followers), pushing to every follower is unaffordable, so those topics are pulled at read time instead.
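
The hybrid push/pull rule can be sketched with plain dicts standing in for the Redis structures. The follower threshold, the class name, and the per-topic "recent items" list used for the pull path are all assumptions made for illustration.

```python
from collections import defaultdict


class FeedService:
    def __init__(self, push_limit: int = 10_000, feed_max: int = 500):
        self.push_limit = push_limit            # above this, do not push
        self.feed_max = feed_max
        self.followers = defaultdict(set)       # topic -> follower ids
        self.following = defaultdict(set)       # user -> topics
        self.pushed = defaultdict(list)         # user -> [(score, item_id)]
        self.topic_recent = defaultdict(list)   # topic -> [(score, item_id)]

    def follow(self, user: str, topic: str) -> None:
        self.followers[topic].add(user)
        self.following[user].add(topic)

    def _is_high_fanout(self, topic: str) -> bool:
        return len(self.followers[topic]) > self.push_limit

    def publish(self, topic: str, item_id: str, score: float) -> None:
        recent = self.topic_recent[topic]
        recent.append((score, item_id))
        del recent[:-self.feed_max]             # keep the newest entries
        if self._is_high_fanout(topic):
            return                              # readers pull this topic
        for user in self.followers[topic]:
            feed = self.pushed[user]
            feed.append((score, item_id))
            feed.sort(reverse=True)
            del feed[self.feed_max:]            # trim to the top ~500

    def read(self, user: str, limit: int = 20) -> list:
        merged = list(self.pushed[user])
        for topic in self.following[user]:
            if self._is_high_fanout(topic):
                merged.extend(self.topic_recent[topic])
        merged.sort(reverse=True)
        return [item_id for _, item_id in merged[:limit]]
```

A topic that crosses the threshold over time would need its already-pushed items reconciled with the pull path; this sketch ignores that transition.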

Search#

Inverted index over questions and answers, sharded by document hash. Query path:

search "how to learn rust"
  → tokenize, expand synonyms, disambiguate senses (Rust language vs metal rust)
  → query each shard, fetch top-K from each
  → global merge by score (BM25 + recency + popularity + ML reranker)
  → return top 20

See /system-design/distributed-search for the substrate. Quora’s twist is the ML reranker on top, which often re-orders based on question-question similarity.
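
The scatter-gather merge step can be sketched as below. It assumes every shard returns (score, doc_id) pairs whose scores are comparable across shards; the function name is hypothetical, and the ML reranker is left out.

```python
import heapq


def merge_shard_results(shard_results, k=20):
    """Merge per-shard top-K lists into a global top-k, best score first.

    shard_results: iterable of lists of (score, doc_id) tuples.
    """
    all_hits = (hit for shard in shard_results for hit in shard)
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])


shards = [
    [(0.90, "a"), (0.20, "b")],
    [(0.70, "c")],
    [(0.95, "d"), (0.10, "e")],
]
top = merge_shard_results(shards, k=3)
print(top)
```

Fetching only the top-K from each shard is sufficient because any document in the global top-K must also be in the top-K of the shard that holds it.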

Notifications#

When something interesting happens, push a record to the recipient’s notification queue:

event: "Alice answered a question you follow"
event: "Your answer reached 100 upvotes"
event: "Bob mentioned you in an answer"

Notifications are aggregated into digests (per-hour, per-day) for users who don’t want every ping. The notification service is a fan-out engine with rate limits per user.
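
A sketch of the per-user rate limit combined with digest aggregation. The one-hour sliding window, the default cap of five immediate pushes, and the class name are assumptions; the text only says that rate limits and digests exist.

```python
from collections import defaultdict


class NotificationBatcher:
    """Deliver a few notifications immediately; queue the rest for a digest."""

    WINDOW_S = 3600

    def __init__(self, max_immediate_per_hour: int = 5):
        self.max_immediate = max_immediate_per_hour
        self.sent_times = defaultdict(list)   # user -> immediate send times
        self.digest = defaultdict(list)       # user -> queued events

    def notify(self, user: str, event: str, now: float,
               digest_only: bool = False) -> str:
        # Drop send times that have fallen out of the sliding window.
        recent = [t for t in self.sent_times[user] if t > now - self.WINDOW_S]
        self.sent_times[user] = recent
        if digest_only or len(recent) >= self.max_immediate:
            self.digest[user].append(event)
            return "digest"
        recent.append(now)
        return "immediate"

    def flush_digest(self, user: str) -> list:
        """Called by the hourly or daily digest job; empties the queue."""
        return self.digest.pop(user, [])
```

Users who opt out of individual pings would be handled by passing `digest_only=True` on every call.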

Question page latency budget (target 400 ms p99)#

LB + TLS:                         15 ms
Auth:                              5 ms
Question fetch (Postgres):        20 ms
Answer list (Redis ZSET):          3 ms
Hydrate top-5 answer bodies:      30 ms
Personalization rerank (top-5):   20 ms
Vote-state for viewer (Redis):     5 ms
Comments first page:              20 ms
Serialize + network back:         80 ms
total:                          ~200 ms p99 (server)
+ image/font load happens in parallel
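
Summing the budget confirms the total and shows the headroom against the 400 ms target. The sketch adds the stages as if they ran strictly one after another, which is a simplifying assumption (per-stage p99 values do not add exactly, and some stages may overlap).

```python
# Per-stage budget in milliseconds, copied from the table above.
budget_ms = {
    "lb_tls": 15,
    "auth": 5,
    "question_fetch": 20,
    "answer_list": 3,
    "hydrate_top5": 30,
    "personalization_rerank": 20,
    "vote_state": 5,
    "comments_first_page": 20,
    "serialize_network": 80,
}

TARGET_MS = 400
total_ms = sum(budget_ms.values())
headroom_ms = TARGET_MS - total_ms
print(f"server total: {total_ms} ms, headroom: {headroom_ms} ms")
```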

Step 7 — Evaluation & Trade-offs#

Bottleneck #1: search index update lag. A new answer should be findable within a minute. Async indexing via Kafka means the index can lag during spikes. A fallback for fresh content: query the relational store directly for very recent answers and merge into search results.
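
The fresh-content fallback amounts to merging two result lists and de-duplicating by document id. The sketch assumes the rows read from the relational store have already been given scores comparable to the index scores; how to score them is left open, and the function name is hypothetical.

```python
def merge_with_fresh(index_hits, fresh_rows, limit=20):
    """Combine search-index hits with very recent rows from the primary store.

    Both inputs are lists of (score, doc_id). A document present in both
    keeps its higher score.
    """
    best = {}
    for score, doc_id in list(index_hits) + list(fresh_rows):
        if doc_id not in best or score > best[doc_id]:
            best[doc_id] = score
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:limit]]


indexed = [(0.8, "a1"), (0.5, "a2")]
fresh = [(0.9, "a3"), (0.6, "a2")]    # a2 was indexed moments ago
results = merge_with_fresh(indexed, fresh)
```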

Bottleneck #2: ranking recompute storms. A high-traffic question with hundreds of answers would otherwise be re-ranked on every vote. Debounce per question (recompute at most once every 30 s), and have the recompute read each answer’s rolled-up net_votes value rather than summing its counter shards.

Bottleneck #3: feed personalization cost. Online ranking per feed-load is expensive (200+ candidates × model inference). Precompute the per-user candidate pool offline; only the final rescoring is online. A user with no recent activity gets a generic feed.

Alternative I’d push back on: storing the vote count as a column on the answer row and updating it on every vote. A celebrity Q&A would serialize every vote on that one row and stall on lock contention. Always use a sharded counter for write-heavy aggregates.

What breaks first at 10× scale (3 B MAU): the search index. Already large at present scale; at 10× we’d need shard counts in the hundreds, with cross-shard merge becoming the dominant query cost. Pre-partition the index by topic so queries scope to relevant shards by default.

Companies this resembles#

Quora, Stack Overflow (heavier moderation, lighter personalization), Reddit (community-scoped, voting-first), Hacker News (single global feed, simpler ranking).
