Quora
Q&A platform: ranking answers, follow graph, notification fan-out, search.
Step 1 — Clarify Requirements
Functional
- A user asks a question; others answer; readers upvote / downvote answers.
- A question page shows answers ranked by quality.
- Users follow topics and other users; a personalized feed surfaces relevant new questions and answers.
- Search across questions and answers.
- Notifications when followed content gets new answers, when a user is mentioned, or when an answer gets significant upvotes.
- Out of scope: monetization, moderation tooling, anonymous answers (worth considering as a variant).
Non-functional
- 99.95% availability.
- p99 question-page load under 400 ms.
- 300 M MAU; ~50 M questions; ~200 M answers; ~50 K writes/sec peak (votes are the dominant write).
- Strong consistency on answer authorship and vote ownership; eventual consistency on aggregate vote counts and feed rankings.
Step 2 — Capacity Estimation
- Question reads: 300 M users × ~20 question pages/day = 6 B reads/day ≈ 70 K reads/sec average, 300 K/sec peak. This treats every monthly active user as active daily, so it is an upper bound.
- Writes: votes (50 K/sec peak), new answers (~50/sec), new questions (~5/sec).
- Storage: 50 M questions × 1 KB + 200 M answers × 3 KB = 50 + 600 GB = 650 GB of text content. Tiny. Indexes, vote logs, and edit history dominate the total — ~10 TB.
- Search index: full-text inverted index over 250 M docs, ~50 GB on disk per shard.
- Notifications: ~1 M notifications/sec at peak (mostly aggregated into digests, not delivered individually).
The system is read-heavy and search-heavy. The interesting parts are ranking, search, and fan-out.
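The arithmetic above can be checked with a few lines of Python. This is only a back-of-envelope sketch: every input is taken from the estimates in this section, the 86,400 seconds/day divisor is the only added figure, and KB/GB are treated as decimal units (an assumption).

```python
# Back-of-envelope check of the capacity figures; inputs come from the text.
SECONDS_PER_DAY = 86_400

mau = 300_000_000
pages_per_user_per_day = 20  # treats every MAU as active daily (upper bound)

reads_per_day = mau * pages_per_user_per_day
avg_reads_per_sec = reads_per_day / SECONDS_PER_DAY

questions, answers = 50_000_000, 200_000_000
question_bytes, answer_bytes = 1_000, 3_000  # 1 KB and 3 KB, decimal units

text_storage_gb = (questions * question_bytes + answers * answer_bytes) / 1e9

print(f"reads/day:     {reads_per_day:,}")
print(f"avg reads/sec: {avg_reads_per_sec:,.0f}")
print(f"text storage:  {text_storage_gb:,.0f} GB")
```

The average works out to roughly 69 K reads/sec, which the text rounds to 70 K; the 300 K/sec peak is a separate assumption of about 4× the average.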
Step 3 — System Interface
    POST /questions        { title, body, topics: [...] }
    POST /answers          { question_id, body }
    POST /votes            { answer_id, value: +1|-1 }
    POST /follow           { type: 'user'|'topic'|'question', target_id }
    GET  /questions/:id    (title + answers, paginated by ranking)
    GET  /search?q=...     (across questions and answers)
    GET  /feed?cursor=...  (personalized)
    GET  /notifications    (paginated, with read/unread)
Posting and voting endpoints are idempotent on (user_id, target_id): a stale retry must not double-vote.
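One way to get the idempotency described above is to key the vote row on (user_id, answer_id) and derive the counter delta from the previous value. The sketch below uses an in-memory SQLite table purely for illustration; `open_store` and `cast_vote` are hypothetical helpers, and overwriting the row when a user changes their vote is an assumption, not something the design specifies.

```python
import sqlite3

def open_store() -> sqlite3.Connection:
    """Create an in-memory votes table keyed on (user_id, answer_id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE votes (
               user_id   TEXT NOT NULL,
               answer_id TEXT NOT NULL,
               value     INTEGER NOT NULL CHECK (value IN (-1, 1)),
               PRIMARY KEY (user_id, answer_id))"""
    )
    return conn

def cast_vote(conn: sqlite3.Connection, user_id: str, answer_id: str, value: int) -> int:
    """Record a vote and return the delta to apply to the aggregate counter.

    A retry of the same vote returns 0, so the counter is never bumped twice.
    """
    with conn:  # single transaction: read the old value, then write the new one
        row = conn.execute(
            "SELECT value FROM votes WHERE user_id = ? AND answer_id = ?",
            (user_id, answer_id),
        ).fetchone()
        previous = row[0] if row else 0
        if previous == value:
            return 0  # duplicate request or stale retry
        conn.execute(
            "INSERT OR REPLACE INTO votes (user_id, answer_id, value) VALUES (?, ?, ?)",
            (user_id, answer_id, value),
        )
    return value - previous
```

Returning the delta rather than the new value lets the caller feed the result straight into a counter increment: +1 for a first upvote, 0 for a retry, -2 when an upvote flips to a downvote.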
Step 4 — High-Level Design
                                                  ┌── search index (sharded)
                                                  │
client → LB → API ──┬── /questions/:id ──→ question service ──→ relational store (Postgres, sharded)
                    │                             │                     │
                    │                             ▼                     ▼
                    │                       ranking cache       vote counters (sharded)
                    │                       (Redis ZSET)
                    │
                    ├── /votes ─→ vote service ─→ Postgres + sharded counters
                    │                  │
                    │                  └→ async: rerank affected answer
                    │
                    ├── /feed ─→ feed service ─→ feed cache (Redis ZSET per user)
                    │                 ▲
                    │                 │
                    └── /search ─→ search service (Elasticsearch / Vespa)
                                      │
                                      └─ async: questions / answers indexed via Kafka
Writes also fan out to: search index, notifications, feed personalization model.
Step 5 — Data Model
Questions (Postgres, sharded by question_id):
    table questions
      question_id   uuid PK
      title         string
      body          text
      topic_ids     array<uuid>
      asker_id      uuid
      created_at    timestamp
      view_count    bigint     // async
      answer_count  int        // async
Answers:
    table answers
      answer_id     uuid PK
      question_id   uuid
      author_id     uuid
      body          text
      created_at    timestamp
      net_votes     int        // sharded counter; periodically rolled up
      rank_score    float      // precomputed for fast question-page rendering
Votes (immutable log; idempotent):
    table votes
      user_id       uuid
      answer_id     uuid
      value         int        // +1 or -1
      ts            timestamp
      PK (user_id, answer_id)
Follow graph:
    table follows
      follower_id   uuid
      target_type   enum(user, topic, question)
      target_id     uuid
      PK (follower_id, target_type, target_id)
Ranking cache (Redis ZSET per question, score = rank_score):
    key: q:{question_id}:answers → ZSET of answer_ids
Step 6 — Detailed Design
Answer ranking
The question page shows the “best” answer first. The classic naive metric (raw vote count) is dominated by old answers that have accumulated votes over years. Better:
    score(answer) = (net_votes + α) / (hours_since_post + β)^γ   // Hacker-News-like
                  + author_credibility_bonus
                  + asker_acceptance_bonus
                  + personalization_term(viewer, answer)
The first three terms are global; personalization is computed per-viewer at read time. Quora-style ranking adds a heavy ML ranker on top (a BERT-class model scoring each question × answer pair for topic relevance, hedged so that simple high-quality answers still surface).
The pre-ranked ZSET is invalidated on:
- Vote change on any of the question’s answers (debounced; recompute every ~30 s).
- New answer posted on the question.
- Periodic decay refresh (every ~1 hour, score depends on age).
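The global part of the score formula can be sketched as a small function. The constants `ALPHA`, `BETA`, and `GAMMA` below are placeholder values chosen only for illustration, and the `Answer` dataclass is a hypothetical container; the personalization term and the ML ranker are left out because they are computed per viewer.

```python
from dataclasses import dataclass

# Placeholder constants; the text does not give values for α, β, γ.
ALPHA, BETA, GAMMA = 1.0, 2.0, 1.5

@dataclass
class Answer:
    net_votes: int
    hours_since_post: float
    author_credibility_bonus: float = 0.0
    asker_acceptance_bonus: float = 0.0

def global_score(a: Answer) -> float:
    """Time-decayed vote term plus the two global bonuses (no personalization)."""
    decayed = (a.net_votes + ALPHA) / (a.hours_since_post + BETA) ** GAMMA
    return decayed + a.author_credibility_bonus + a.asker_acceptance_bonus

# A five-year-old answer with 500 votes versus a six-hour-old one with 20.
old = Answer(net_votes=500, hours_since_post=5 * 365 * 24)
new = Answer(net_votes=20, hours_since_post=6)
print(global_score(new) > global_score(old))
```

With these placeholder constants the recent answer outranks the old one, which is the behaviour the decay term is meant to produce. It also shows why the hourly refresh is needed: the score changes with age even when no votes arrive.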
Votes at scale
A viral answer gets 10 K upvotes/minute. The net_votes field can’t be a single row — it’s a hot key. Implementation:
- Each upvote writes an immutable row to votes (idempotent on (user, answer)).
- A sharded counter (net_votes:answer:{id}:shard:{N}) is incremented.
- Periodically, the counter rolls up to the answer’s net_votes summary and triggers a re-rank.
See /system-design/sharded-counters for the pattern.
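A minimal in-memory sketch of the sharded-counter idea follows. The key format mirrors the one in the list above; the `ShardedCounter` class, the shard count of 16, and the plain dict standing in for the key-value store are all assumptions made for illustration.

```python
import random

class ShardedCounter:
    """Spread increments over N keys so no single key takes every write."""

    def __init__(self, answer_id: str, num_shards: int = 16):
        self.keys = [
            f"net_votes:answer:{answer_id}:shard:{n}" for n in range(num_shards)
        ]
        self.store = {key: 0 for key in self.keys}  # stand-in for a key-value store
        self.net_votes = 0                          # rolled-up summary value

    def incr(self, delta: int = 1) -> None:
        # Pick a random shard so concurrent writers rarely collide on one key.
        self.store[random.choice(self.keys)] += delta

    def rollup(self) -> int:
        # The periodic job: sum all shards into the summary used for ranking.
        self.net_votes = sum(self.store.values())
        return self.net_votes
```

Readers use the rolled-up `net_votes` value and tolerate it being slightly stale, which matches the eventual-consistency requirement on aggregate vote counts.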
Personalized feed
The feed surfaces:
- Recent questions in topics the user follows.
- Recent answers from users the user follows.
- Trending content in topics with high user affinity.
- Editorial picks (“Best of Quora today”).
Implementation is a per-user Redis ZSET seeded by a personalization model (offline-batched) and topped up by streaming fan-out from new content. Same hybrid push/pull pattern as /system-design/twitter-newsfeed:
    when new question created in topic T:
      for each follower of T:
        ZADD feed:{follower} score new_question_id
        trim to most recent ~500
For high-volume topics (e.g., “Programming” with millions of followers), pushing to every follower is unaffordable, so high-fanout topics are pulled on read instead.
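The hybrid push/pull behaviour can be sketched with plain dictionaries standing in for the per-user ZSETs. `FeedStore` is a hypothetical helper, and the `push_limit` cutoff between pushed and pulled topics is an assumed parameter; only the ~500-item trim comes from the text.

```python
from collections import defaultdict

class FeedStore:
    """In-memory stand-in for per-user feed ZSETs with hybrid push/pull."""

    def __init__(self, push_limit: int = 10_000, feed_limit: int = 500):
        self.push_limit = push_limit   # assumed cutoff between push and pull
        self.feed_limit = feed_limit   # "trim to most recent ~500"
        self.followers = defaultdict(set)         # topic -> follower ids
        self.following = defaultdict(set)         # user -> topics
        self.pushed = defaultdict(dict)           # user -> {item_id: score}
        self.recent_by_topic = defaultdict(list)  # topic -> [(score, item_id)]

    def follow(self, user: str, topic: str) -> None:
        self.followers[topic].add(user)
        self.following[user].add(topic)

    def _is_high_fanout(self, topic: str) -> bool:
        return len(self.followers[topic]) > self.push_limit

    def publish(self, topic: str, item_id: str, score: float) -> None:
        recent = self.recent_by_topic[topic]
        recent.append((score, item_id))
        del recent[:-self.feed_limit]             # keep only the newest entries
        if self._is_high_fanout(topic):
            return                                # readers pull this topic instead
        for user in self.followers[topic]:
            feed = self.pushed[user]
            feed[item_id] = score                 # ZADD equivalent
            if len(feed) > self.feed_limit:       # drop the lowest-scored entry
                del feed[min(feed, key=feed.get)]

    def read(self, user: str, n: int = 20) -> list[str]:
        candidates = dict(self.pushed[user])
        for topic in self.following[user]:
            if self._is_high_fanout(topic):       # pull-on-read path
                for score, item_id in self.recent_by_topic[topic]:
                    candidates[item_id] = score
        ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        return [item_id for item_id, _ in ranked[:n]]
```

The write cost for a high-fanout topic is one append regardless of follower count; the price is a small merge on every read by a follower of that topic.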
Search
Inverted index over questions and answers, sharded by document hash. Query path:
    search "how to learn rust"
      → tokenize, expand synonyms (Rust language vs metal rust)
      → query each shard, fetch top-K from each
      → global merge by score (BM25 + recency + popularity + ML reranker)
      → return top 20
See /system-design/distributed-search for the substrate. Quora’s twist is the ML reranker on top, which often re-orders based on question-question similarity.
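The scatter-gather step in the query path can be sketched with the standard library's `heapq`. Each shard returns its own top-K, and the coordinator merges those partial lists into a global top-K. The `(score, doc_id)` tuples and both function names are assumptions for illustration; the scores would come from the BM25-plus-signals blend described above.

```python
import heapq
from itertools import chain
from typing import Iterable

Hit = tuple[float, str]  # (score, doc_id); hypothetical shape for a search hit

def shard_top_k(hits: Iterable[Hit], k: int) -> list[Hit]:
    """What each shard does locally: keep only its k best-scoring hits."""
    return heapq.nlargest(k, hits)

def global_top_k(per_shard: Iterable[list[Hit]], k: int) -> list[Hit]:
    """Coordinator step: merge the per-shard lists and keep the k best overall."""
    return heapq.nlargest(k, chain.from_iterable(per_shard))

shards = [
    [(0.90, "a"), (0.20, "b")],
    [(0.70, "c"), (0.95, "d")],
    [(0.10, "e")],
]
partial = [shard_top_k(hits, 2) for hits in shards]
print(global_top_k(partial, 3))
```

Fetching K from every shard is sufficient because any document in the global top-K must also be in the top-K of its own shard.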
Notifications
When something interesting happens, push a record to the recipient’s notification queue:
    event: "Alice answered a question you follow"
    event: "Your answer reached 100 upvotes"
    event: "Bob mentioned you in an answer"
Notifications are aggregated into digests (per-hour, per-day) for users who don’t want every ping. The notification service is a fan-out engine with rate limits per user.
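The combination of per-user rate limits and digests can be sketched as follows. `NotificationFanout`, its `immediate_limit` parameter, and the event-type strings are hypothetical; the sketch assumes that anything over the limit within a window is held back and summarized when the window's digest is flushed.

```python
from collections import Counter, defaultdict

class NotificationFanout:
    """Deliver a few notifications immediately, fold the rest into a digest."""

    def __init__(self, immediate_limit: int = 3):
        self.immediate_limit = immediate_limit  # assumed per-window cap per user
        self.sent_this_window = Counter()       # user -> immediate sends so far
        self.pending = defaultdict(Counter)     # user -> {event_type: count}

    def push(self, user_id: str, event_type: str) -> bool:
        """Return True if delivered now, False if held for the digest."""
        if self.sent_this_window[user_id] < self.immediate_limit:
            self.sent_this_window[user_id] += 1
            return True
        self.pending[user_id][event_type] += 1
        return False

    def flush_digest(self, user_id: str) -> dict[str, int]:
        """End of window: emit aggregated counts and reset the user's budget."""
        digest = dict(self.pending.pop(user_id, Counter()))
        self.sent_this_window.pop(user_id, None)
        return digest
```

A digest such as `{"new_answer": 3, "mention": 1}` can then be rendered as a single message, which is how a very high event rate collapses into far fewer deliveries.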
Question page latency budget (target 400 ms p99)
    LB + TLS:                         15 ms
    Auth:                              5 ms
    Question fetch (Postgres):        20 ms
    Answer list (Redis ZSET):          3 ms
    Hydrate top-5 answer bodies:      30 ms
    Personalization rerank (top-5):   20 ms
    Vote-state for viewer (Redis):     5 ms
    Comments first page:              20 ms
    Serialize + network back:         80 ms
    total: ~200 ms p99 (server)
    + image/font load happens in parallel
Step 7 — Evaluation & Trade-offs
Bottleneck #1: search index update lag. A new answer should be findable within a minute. Async indexing via Kafka means the index can lag during spikes. A fallback for fresh content: query the relational store directly for very recent answers and merge into search results.
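The fresh-content fallback amounts to merging two result lists and de-duplicating by document id. In the sketch below, `merge_fresh` is a hypothetical helper, hits are `(doc_id, score)` pairs, and how a score is assigned to rows read straight from the relational store is left as an assumption; when a document appears in both lists, the index score is kept.

```python
def merge_fresh(
    index_hits: list[tuple[str, float]],
    fresh_hits: list[tuple[str, float]],
    k: int,
) -> list[tuple[str, float]]:
    """Merge search-index hits with very recent rows from the relational store."""
    best = dict(fresh_hits)
    best.update(index_hits)  # once a doc is indexed, trust the index score
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]

index_hits = [("a", 0.9), ("b", 0.4)]
fresh_hits = [("b", 0.5), ("c", 0.6)]  # "c" has not reached the index yet
print(merge_fresh(index_hits, fresh_hits, 3))
```

The relational query only needs to cover the indexing lag window (for example, answers created in the last few minutes), so it stays cheap.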
Bottleneck #2: ranking recompute storms. A high-traffic question with hundreds of answers gets re-ranked on every vote. Debounce per question (recompute at most once every 30 s); use sharded counters so the recompute reads a single aggregated value, not 200 sub-counters.
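Per-question debouncing can be sketched as a small state machine: the first vote triggers a recompute, later votes inside the interval only mark the question dirty, and a periodic sweep picks up dirty questions once the interval has elapsed. `RerankDebouncer` and its method names are hypothetical; timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class RerankDebouncer:
    """Allow at most one re-rank per question per interval."""

    def __init__(self, min_interval_s: float = 30.0):
        self.min_interval_s = min_interval_s
        self.last_run: dict[str, float] = {}  # question_id -> last recompute time
        self.dirty: set[str] = set()          # questions with unprocessed votes

    def _can_run(self, question_id: str, now: float) -> bool:
        last = self.last_run.get(question_id)
        return last is None or now - last >= self.min_interval_s

    def on_vote(self, question_id: str, now: float) -> bool:
        """Return True if the caller should recompute the ranking right now."""
        if self._can_run(question_id, now):
            self.last_run[question_id] = now
            self.dirty.discard(question_id)
            return True
        self.dirty.add(question_id)           # remember it for the next sweep
        return False

    def due(self, now: float) -> list[str]:
        """Periodic sweep: dirty questions whose interval has elapsed."""
        ready = sorted(q for q in self.dirty if self._can_run(q, now))
        for q in ready:
            self.last_run[q] = now
            self.dirty.discard(q)
        return ready
```

The sweep matters: without it, a burst of votes followed by silence would leave the last few votes unranked until the next vote or the hourly decay refresh.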
Bottleneck #3: feed personalization cost. Online ranking per feed-load is expensive (200+ candidates × model inference). Offline precompute the per-user candidate pool; only the final rescoring is online. A user with no recent activity gets a generic feed.
Alternative I’d push back on: storing vote counts as a row in the answer record updated on every vote. A celebrity Q&A would turn that row into a hot spot under write contention. Always use a sharded counter for write-heavy aggregates.
What breaks first at 10× scale (3 B MAU): the search index. Already large at present scale; at 10× we’d need shard counts in the hundreds, with cross-shard merge becoming the dominant query cost. Pre-partition the index by topic so queries scope to relevant shards by default.
Companies this resembles
Quora, Stack Overflow (heavier moderation, lighter personalization), Reddit (community-scoped, voting-first), Hacker News (single global feed, simpler ranking).
Related systems
- Generic Newsfeed System — abstraction of the personalized-feed component.
- Twitter Newsfeed — same hybrid fan-out pattern for follows.
- Distributed Search — substrate for question / answer search.
- Typeahead Suggestion — autocomplete as the user types a question.