Design a Gaming API
Matchmaking, lobby, real-time game state, leaderboards, anti-cheat hooks. Latency-sensitive APIs where 50 ms is everything.
Context
A gaming API for a competitive online title is not a single API — it is four loosely-coupled surfaces stitched together by a common identity:
- Matchmaking — find me an opponent (or a team) at my skill level.
- Lobby — once matched, hold us in a room until everyone is ready.
- Real-time game state — once we start, sync the world between server and clients on a tick budget measured in milliseconds.
- Leaderboards and persistence — once we end, store the outcome, update rankings, surface them to the public.
Plus a fifth, cutting across all four: anti-cheat hooks. A signal stream from the client (and from the server’s view of the client’s behaviour) that flows to a separate offline analysis pipeline.
The reason this is an Advanced-tier system: each of the five surfaces has a fundamentally different latency budget, consistency model, and protocol. A candidate who reaches for “let’s put it all behind REST” has lost. A candidate who tries to make all of it real-time-WebSocket has also lost — the leaderboard read path is cacheable and should not be on a stateful connection.
Real platforms to reference: Valve’s Steamworks (game services for Steam titles), Riot’s RPC backbone (League of Legends, Valorant), Epic Online Services (cross-platform infrastructure offered to third-party game studios). All three broadly separate these concerns into distinct services rather than exposing one unified API.
Hidden objectives an interviewer is probing:
- Can you separate the four surfaces without trying to unify their protocols?
- Can you defend a <= 50 ms server round-trip for state sync at the API layer (the engine-side render budget is then independent)?
- Can you pick the right protocol per surface: WebSocket for lobby, QUIC (which itself runs over UDP) or custom UDP for game state, REST for leaderboards, request-reply for matchmaking?
- Can you handle partial connectivity — a client dropping mid-match must reconnect and resync state without the server believing they’ve cheated?
- Can you talk about anti-cheat as an out-of-band signal, not an in-band gate?
Requirements (functional and non-functional)
Functional — in scope:
- Player queues for a match in a specific game mode at their current skill rating.
- Server matches players within a configurable rating window; expands the window over time.
- Players enter a lobby; each must ready-up; lobby host or quorum starts the match.
- Game server allocates an authoritative session; clients connect; state syncs at the engine’s tick rate (e.g. 64 Hz).
- Per-tick state messages from server to client; per-tick input messages from client to server.
- On match end, server reports the outcome to the leaderboard service.
- Public leaderboards by game mode, time window (daily / weekly / season), region.
- Anti-cheat signals (input patterns, view angles, hardware fingerprints) streamed from client to a separate ingestion endpoint.
Functional — out of scope:
- The game-engine integration itself (how clients render, how physics simulate). The API ends at the wire.
- Anti-cheat detection logic (heuristics, ML models). We design the signal-collection surface; the analysis pipeline is downstream.
- Cosmetics / payment economy. Separate API behind the same identity.
- Voice chat. Separate transport, separate API, often a third-party SDK.
Non-functional:
- State-sync round-trip: <= 50 ms p95 between client and authoritative game server within a region. This is the headline number; everything else is in service of it.
- Matchmaking time-to-match: <= 30 s p95 at high traffic; if waits run longer, widen the rating window more aggressively.
- Leaderboard read: <= 100 ms p95, CDN-cached, eventually consistent within 60 s of a match ending.
- Throughput: 100k concurrent players per region at peak; ~1M concurrent globally.
- Availability: 99.95% on matchmaking and lobby; 99.99% on the game-state plane (a disconnect is more painful than a queue wait).
Use case diagram
```
                ┌──────────────┐
                │    Player    │
                └──────┬───────┘
                       │
   ┌─────────┬─────────┼─────────┬────────────┐
   ▼         ▼         ▼         ▼            ▼
[queue] [ready up]  [play]  [view rank] [report cheat]
   │         │         │         │            │
   ▼         ▼         ▼         ▼            ▼
┌──────┐ ┌──────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐
│  MM  │ │Lobby │ │   Game   │ │ Leader-  │ │  Anti-cheat  │
│ API  │ │ API  │ │  Server  │ │ board API│ │    ingest    │
└──────┘ └──────┘ └────┬─────┘ └──────────┘ └──────────────┘
                       │
                       │ outcome on match end
                       ▼
                  ┌──────────┐
                  │ Leader-  │
                  │ board DB │
                  └──────────┘
```
Five surfaces, one player. The player’s auth token gets them through all five, but the protocol shapes are very different.
Class diagram
```
┌────────────────────────┐         ┌──────────────────────┐
│         Player         │         │     MatchTicket      │
├────────────────────────┤         ├──────────────────────┤
│ id : uuid              │ ──────► │ id : uuid            │
│ display_name : string  │         │ player_id : uuid     │
│ region : enum          │         │ mode : enum          │
│ ratings : map<mode,int>│         │ rating_at_queue : int│
│ ban_state : enum       │         │ window_lower : int   │
└────────────────────────┘         │ window_upper : int   │
                                   │ queued_at : ts       │
                                   │ state : enum         │
                                   └──────────────────────┘
                                              │
                                              ▼ match found
                                   ┌──────────────────────┐
                                   │        Lobby         │
                                   ├──────────────────────┤
                                   │ id : uuid            │
                                   │ mode : enum          │
                                   │ players : Player[]   │
                                   │ ready : map<id,bool> │
                                   │ created_at : ts      │
                                   │ state : enum         │
                                   └──────────────────────┘
                                              │
                                              ▼ all ready
                                   ┌──────────────────────┐
                                   │     GameSession      │
                                   ├──────────────────────┤
                                   │ id : uuid            │
                                   │ server_addr : addr   │
                                   │ session_token : str  │
                                   │ tick_rate : int      │
                                   │ players : Player[]   │
                                   │ state : enum         │
                                   │ started_at : ts      │
                                   │ ended_at? : ts       │
                                   └──────────────────────┘
                                              │
                                              ▼ on end
                                   ┌──────────────────────┐
                                   │     MatchResult      │
                                   ├──────────────────────┤
                                   │ session_id : uuid    │
                                   │ winner : team/player │
                                   │ duration_s : int     │
                                   │ rating_deltas : map  │
                                   │ events : event[]     │
                                   └──────────────────────┘
```
MatchTicket.window_lower/upper is the dynamic rating window: it starts narrow and widens over time so the queue resolves. GameSession.server_addr is the address of the authoritative server the client connects to next; session_token authorises that connection and is separate from the player’s account token, so a leaked session token doesn’t compromise the account.
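The widening window can be sketched in a few lines. This is a minimal illustration, not the matchmaker's actual policy: the constants `BASE_WINDOW`, `GROWTH_PER_SECOND`, and `MAX_WINDOW`, the linear growth rule, and the `compatible` helper are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical tuning constants; the article does not specify them.
BASE_WINDOW = 50         # rating points either side at queue time
GROWTH_PER_SECOND = 10   # how fast the window widens while waiting
MAX_WINDOW = 500         # cap so matches never become too lopsided


@dataclass
class MatchTicket:
    player_id: str
    rating_at_queue: int
    queued_at: float  # seconds on any monotonic clock

    def window(self, now: float) -> tuple:
        """Return (window_lower, window_upper) for the current wait time."""
        waited = max(0.0, now - self.queued_at)
        half = min(MAX_WINDOW, BASE_WINDOW + int(waited * GROWTH_PER_SECOND))
        return self.rating_at_queue - half, self.rating_at_queue + half


def compatible(a: MatchTicket, b: MatchTicket, now: float) -> bool:
    """Two tickets match only if each rating sits inside the other's window."""
    a_lo, a_hi = a.window(now)
    b_lo, b_hi = b.window(now)
    return a_lo <= b.rating_at_queue <= a_hi and b_lo <= a.rating_at_queue <= b_hi


if __name__ == "__main__":
    a = MatchTicket("p1", 1500, queued_at=0.0)
    b = MatchTicket("p2", 1650, queued_at=0.0)
    print(compatible(a, b, now=0.0))   # False: windows are +/-50
    print(compatible(a, b, now=10.0))  # True: windows have grown to +/-150
```

Requiring the match to be mutual means a player who has just queued is not pulled into a wide match by someone who has waited much longer.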
Sequence diagram (key flows)
Matchmaking through to game start
```
Client     MM API    MatchMaker     Lobby API   GameServerAlloc
│ POST /queue │           │             │              │
│────────────►│           │             │              │
│             │ enqueue   │             │              │
│             │──────────►│             │              │
│             │ ticket_id │             │              │
│ 201 + WS URL│           │             │              │
│◄────────────│           │             │              │
│ WS subscribe ticket_id  │             │              │
│────────────────────────►│             │              │
│             │           │ window expand loop         │
│             │           │ match found │              │
│             │──────────►│             │              │
│             │           │ create lobby│              │
│             │           │────────────►│              │
│             │           │ lobby_id    │              │
│             │           │◄────────────│              │
│ WS push: lobby_id, lobby URL          │              │
│◄────────────────────────│             │              │
│ POST /lobbies/{id}/ready              │              │
│──────────────────────────────────────►│              │
│             │           │             │ all ready    │
│             │           │             │─────────────►│
│             │           │             │              │ allocate session
│             │           │             │ server_addr  │
│             │           │             │◄─────────────│
│ WS push: server_addr, session_token   │              │
│◄──────────────────────────────────────│              │
│ UDP/QUIC connect server_addr + token  │              │
│──────────────────────────────────────────────────────────►
```
Three protocol switches inside a single user flow: REST for the initial queue request, WebSocket for the matchmaking-progress stream, REST for ready-up, UDP/QUIC for the actual game state. Each is the correct choice for its phase.
Tick loop (the inner hot loop)
```
Client                               GameServer
│ input frame (UDP)                       │
│────────────────────────────────────────►│
│                                         │ simulate tick
│                                         │ compute deltas
│ state delta (UDP)                       │
│◄────────────────────────────────────────│
│ input frame (UDP)                       │
│────────────────────────────────────────►│
│                                         │ simulate tick
│ state delta (UDP)                       │
│◄────────────────────────────────────────│
```
The protocol detail that matters: state messages are deltas (what changed since the last tick), not snapshots. A periodic snapshot anchors the delta stream (every ~64 ticks) so a packet loss is recoverable. UDP because retransmitting a stale tick is worse than skipping it; TCP head-of-line blocking is fatal here.
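The delta-plus-anchor idea can be illustrated with plain dictionaries. This is a simplified sketch under stated assumptions: the message shape (`kind`, `base`, `set`, `del`) and the `ClientView` class are hypothetical, and each delta is taken against the previous tick, whereas real engines often delta against the last state the client acknowledged.

```python
ANCHOR_INTERVAL = 64  # ticks between full snapshots (about 1 s at 64 Hz)
_MISSING = object()   # sentinel so a stored None is not mistaken for "absent"


def make_delta(prev: dict, curr: dict) -> dict:
    """Fields that changed or appeared, plus keys that were removed."""
    changed = {k: v for k, v in curr.items() if prev.get(k, _MISSING) != v}
    removed = [k for k in prev if k not in curr]
    return {"set": changed, "del": removed}


def apply_delta(state: dict, delta: dict) -> dict:
    """Return a new state with the delta applied."""
    out = dict(state)
    out.update(delta["set"])
    for key in delta["del"]:
        out.pop(key, None)
    return out


def encode_tick(tick: int, prev: dict, curr: dict) -> dict:
    """Server side: full snapshot on anchor ticks, delta otherwise."""
    if tick % ANCHOR_INTERVAL == 0:
        return {"tick": tick, "kind": "snapshot", "state": dict(curr)}
    return {"tick": tick, "kind": "delta", "base": tick - 1,
            **make_delta(prev, curr)}


class ClientView:
    """Client side: skips deltas it cannot apply until the next anchor."""

    def __init__(self) -> None:
        self.tick = None
        self.state = {}

    def receive(self, msg: dict) -> bool:
        if msg["kind"] == "snapshot":
            self.tick, self.state = msg["tick"], dict(msg["state"])
            return True
        if self.tick != msg["base"]:
            return False  # a packet was lost; wait for the next snapshot
        self.state = apply_delta(self.state, msg)
        self.tick = msg["tick"]
        return True
```

With this scheme a single lost packet costs the client at most one anchor interval of stale state, and nothing is ever retransmitted.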
Activity diagram (for non-trivial state)
The GameSession lifecycle:
```
[lobby ready quorum]
        │
        ▼
┌───────────────┐
│  Allocating   │── alloc failed → ┌──────────┐
└───────┬───────┘                  │ Aborted  │
        │                          └──────────┘
        ▼
┌───────────────┐
│   Starting    │── client conn timeout → Aborted
└───────┬───────┘
        │ all connected
        ▼
┌───────────────┐
│    Active     │── player drops ─► reconnect window
└───────┬───────┘
        │ end condition
        ▼
┌───────────────┐
│    Ending     │── grace, persist
└───────┬───────┘
        │
        ▼
┌───────────────┐
│   Persisted   │── MatchResult written to leaderboard
└───────────────┘
```
The reconnect window in Active is the non-obvious bit. When a player drops, the server holds their slot open for 90 seconds. If they reconnect inside the window with the same session_token, they resume; the engine pauses or AFK-flags them depending on game mode. Outside the window, they take a desertion penalty in matchmaking rating. The API surface for reconnect is just a re-issue of the UDP/QUIC handshake: same session_token, same server.
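The reconnect rule reduces to a small piece of per-slot bookkeeping. The sketch below is illustrative only: the `Slot` record, the method names, and the periodic `sweep` that marks deserters are assumptions, and the class covers just the reconnect logic of a session rather than its full lifecycle.

```python
import hmac
from dataclasses import dataclass
from typing import Optional

RECONNECT_WINDOW_S = 90.0  # the 90-second window described above


@dataclass
class Slot:
    session_token: str
    connected: bool = True
    dropped_at: Optional[float] = None
    deserted: bool = False


class GameSession:
    def __init__(self) -> None:
        self.slots = {}

    def add_player(self, player_id: str, session_token: str) -> None:
        self.slots[player_id] = Slot(session_token)

    def on_drop(self, player_id: str, now: float) -> None:
        """Mark the player disconnected and start their reconnect window."""
        slot = self.slots[player_id]
        slot.connected, slot.dropped_at = False, now

    def on_reconnect(self, player_id: str, token: str, now: float) -> bool:
        """Resume only with the same token and inside the window."""
        slot = self.slots.get(player_id)
        if slot is None or slot.connected or slot.deserted:
            return False
        # Constant-time comparison avoids leaking token bytes via timing.
        if not hmac.compare_digest(slot.session_token.encode(), token.encode()):
            return False
        if now - slot.dropped_at > RECONNECT_WINDOW_S:
            return False
        slot.connected, slot.dropped_at = True, None
        return True

    def sweep(self, now: float) -> list:
        """Return players whose window just expired (desertion penalty due)."""
        expired = []
        for player_id, slot in self.slots.items():
            if slot.connected or slot.deserted:
                continue
            if now - slot.dropped_at > RECONNECT_WINDOW_S:
                slot.deserted = True
                expired.append(player_id)
        return expired
```

A dropped connection is treated as an ordinary state here, not as evidence of cheating; the only penalty path is the window expiring.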
API implementation
Endpoint catalogue
Across the five surfaces:
Matchmaking (REST + WebSocket)
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/matchmaking/queue | Join a queue; returns ticket + WS URL |
| DELETE | /v1/matchmaking/queue/{ticket} | Leave queue |
| WS | /v1/matchmaking/stream?ticket={t} | Progress updates |
Lobby (REST + WebSocket)
| Method | Path | Purpose |
|---|---|---|
| GET | /v1/lobbies/{id} | Lobby state |
| POST | /v1/lobbies/{id}/ready | Ready-up |
| POST | /v1/lobbies/{id}/leave | Leave |
| WS | /v1/lobbies/{id}/stream | Member updates |
Game state (UDP / QUIC, not REST)
| “Endpoint” | Direction | Cadence | Purpose |
|---|---|---|---|
| Input frames | C → S | 60-128 Hz | Player inputs |
| State deltas | S → C | tick rate | World updates |
| Anchor snapshots | S → C | every ~1 s | Full state, for catch-up |
| Ping / heartbeat | both | 1 Hz | Liveness, RTT |
Leaderboards (REST, CDN-cacheable)
| Method | Path | Purpose |
|---|---|---|
| GET | /v1/leaderboards/{game_id}/{mode} | Top-N by mode |
| GET | /v1/leaderboards/{game_id}/{mode}/around/{player_id} | Around a specific player |
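The "around a specific player" read can be pictured as a slice of the ranked list centred on that player. A minimal in-memory sketch, assuming a hypothetical `around` helper, a `radius` parameter, and a tie-break on player id, none of which the endpoint table defines; a production service would more likely read from a sorted-set store than sort on every request.

```python
def around(entries, player_id, radius=2):
    """Return ranked rows within `radius` places of `player_id`.

    entries: iterable of (player_id, rating) pairs.
    Ranks are 1-based; ties are broken by player id for a stable order.
    """
    ranked = sorted(entries, key=lambda e: (-e[1], e[0]))
    index = next(
        (i for i, (pid, _) in enumerate(ranked) if pid == player_id), None
    )
    if index is None:
        return []  # unknown or unranked player
    lo = max(0, index - radius)
    hi = min(len(ranked), index + radius + 1)
    return [
        {"rank": i + 1, "player_id": ranked[i][0], "rating": ranked[i][1]}
        for i in range(lo, hi)
    ]


if __name__ == "__main__":
    board = [("a", 1500), ("b", 1700), ("c", 1600), ("d", 1400), ("e", 1800)]
    for row in around(board, "c", radius=1):
        print(row)
```

Unlike the top-N list, this response differs for every player, which makes it a weaker fit for CDN caching than the top-N read.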
Anti-cheat ingest (REST, internal)
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/anticheat/signals | Batched client telemetry |
OpenAPI schema (excerpt)
```yaml
paths:
  /v1/matchmaking/queue:
    post:
      operationId: enqueue
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [mode, region]
              properties:
                mode: { type: string, enum: [ranked-1v1, ranked-3v3, casual-5v5] }
                region: { type: string, enum: [na-east, na-west, eu-west, ap-se] }
                party_id: { type: string, nullable: true }
      responses:
        '201':
          description: Queued
          content:
            application/json:
              schema:
                type: object
                properties:
                  ticket_id: { type: string }
                  stream_url: { type: string, format: uri }
                  estimated_wait_seconds: { type: integer }
  /v1/lobbies/{id}/ready:
    post:
      operationId: readyUp
      parameters:
        - { name: id, in: path, required: true, schema: { type: string } }
      responses:
        '200':
          description: Ready state acknowledged
          content:
            application/json:
              schema:
                type: object
                properties:
                  lobby_state: { type: string, enum: [waiting, all_ready, starting, aborted] }
                  ready_count: { type: integer }
                  required_count: { type: integer }
```
A representative WS push during lobby:
```json
{
  "type": "lobby.ready.changed",
  "lobby_id": "lob_01HXYZ...",
  "ready_count": 4,
  "required_count": 5,
  "next_state": "waiting"
}
```
And once the session is allocated:
```json
{
  "type": "lobby.session.ready",
  "session": {
    "id": "ses_01HABC...",
    "server_addr": "udp://gs-eu-west-42.api.example:7777",
    "session_token": "eyJhbGciOi...short-lived...",
    "tick_rate": 64
  }
}
```
Client samples: three languages
Matchmaking enqueue in three languages. The state-sync loop uses UDP/QUIC and is engine-side, so the public API client code stops here.
```python
import requests


def enqueue(mode, region, token):
    """Join a matchmaking queue; returns the ticket and stream URL."""
    resp = requests.post(
        "https://api.gaming.example/v1/matchmaking/queue",
        json={"mode": mode, "region": region},
        headers={"Authorization": f"Bearer {token}"},
        timeout=3,
    )
    resp.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
    return resp.json()


result = enqueue("ranked-3v3", "eu-west", token="eyJhbGciOi...")
print("ticket:", result["ticket_id"])
print("stream:", result["stream_url"])
```
```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"time"
)

type EnqueueReq struct {
	Mode   string `json:"mode"`
	Region string `json:"region"`
}

type EnqueueResp struct {
	TicketID   string `json:"ticket_id"`
	StreamURL  string `json:"stream_url"`
	EstWaitSec int    `json:"estimated_wait_seconds"`
}

// http.DefaultClient has no timeout, so use a client that does.
var client = &http.Client{Timeout: 3 * time.Second}

func enqueue(mode, region, token string) (*EnqueueResp, error) {
	body, err := json.Marshal(EnqueueReq{Mode: mode, Region: region})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost,
		"https://api.gaming.example/v1/matchmaking/queue", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// Reject non-2xx responses before decoding, as the other samples do.
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return nil, fmt.Errorf("enqueue: unexpected status %s", resp.Status)
	}

	var r EnqueueResp
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	r, err := enqueue("ranked-3v3", "eu-west", "eyJhbGciOi...")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("ticket:", r.TicketID, "stream:", r.StreamURL)
}
```
```javascript
async function enqueue(mode, region, token) {
  const resp = await fetch("https://api.gaming.example/v1/matchmaking/queue", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${token}`,
    },
    body: JSON.stringify({ mode, region }),
  });
  if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
  return resp.json();
}

// Top-level await requires an ES module context.
const r = await enqueue("ranked-3v3", "eu-west", "eyJhbGciOi...");
console.log("ticket:", r.ticket_id);

// Follow matchmaking progress on the stream returned with the ticket.
const ws = new WebSocket(r.stream_url);
ws.onmessage = (e) => console.log("update:", JSON.parse(e.data));
```
Latency budget: the 50 ms round-trip
The headline number for state sync, broken down for a player in the same region as their game server:
| Phase | Budget | Notes |
|---|---|---|
| Client encode + send | 1 ms | Input frame, packed binary. |
| Last-mile + ISP | 10-20 ms | Floor. Cannot improve from API side. |
| Datacenter ingress | 2 ms | Anycast UDP load balancer. |
| Game server dispatch | 1 ms | Already in user-space, lockless ring buffer. |
| Tick simulation | 5-15 ms | Engine cost; bounded by tick rate. |
| Delta encode + send | 1 ms | Per-player delta from authoritative state. |
| Return path | 10-20 ms | Same as outbound. |
| Total | 30-60 ms | Competitive titles sit at the lower end. |
The API-design implication: anything that adds even 5 ms to this path is unacceptable. No JSON parsing on the hot loop (binary protocol), no TLS handshake per message (long-lived QUIC connection), no auth check per message (session_token validated once at connection, signed thereafter).
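To make "binary protocol, signed thereafter" concrete, here is a hedged sketch of a fixed-layout input frame carrying a truncated HMAC tag. The field layout, the 8-byte tag length, and the idea of a per-session key derived at connection time are all assumptions for illustration; the article does not define a wire format, and over QUIC the transport already authenticates packets, so a tag like this matters mainly for a custom UDP protocol.

```python
import hashlib
import hmac
import struct

# Hypothetical layout: tick (u32), seq (u16), buttons bitmask (u16),
# yaw (f32), pitch (f32). Little-endian, no padding: 16 bytes.
FRAME = struct.Struct("<IHHff")
TAG_LEN = 8  # truncated HMAC-SHA256 tag; the length is an assumption


def pack_input(key: bytes, tick: int, seq: int, buttons: int,
               yaw: float, pitch: float) -> bytes:
    """Encode one input frame and append its authentication tag."""
    body = FRAME.pack(tick, seq, buttons, yaw, pitch)
    tag = hmac.new(key, body, hashlib.sha256).digest()[:TAG_LEN]
    return body + tag


def unpack_input(key: bytes, packet: bytes):
    """Return the decoded fields, or None if the packet fails verification."""
    if len(packet) != FRAME.size + TAG_LEN:
        return None
    body, tag = packet[:FRAME.size], packet[FRAME.size:]
    expected = hmac.new(key, body, hashlib.sha256).digest()[:TAG_LEN]
    if not hmac.compare_digest(tag, expected):
        return None
    return FRAME.unpack(body)


if __name__ == "__main__":
    session_key = b"k" * 32  # placeholder; would come from the handshake
    packet = pack_input(session_key, tick=1000, seq=7, buttons=0b101,
                        yaw=90.0, pitch=-12.5)
    print(len(packet), unpack_input(session_key, packet))
```

Decoding a frame like this is one fixed-size unpack and one hash, with no text parsing and no per-message token lookup on the hot loop.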
Trade-offs and extensions
| Decision | Why | Cost if requirements change |
|---|---|---|
| Four separate protocols across surfaces | Each surface gets the right tool | Higher operational complexity; more SDKs |
| UDP/QUIC for state sync | Packet loss doesn’t head-of-line-block | Harder to debug; firewalls less friendly |
| 90 s reconnect window | Matches mobile-handoff timescale | A 10-minute window would invite griefing |
| Anti-cheat as out-of-band signal | Doesn’t add latency to the hot loop | Cheaters get away with more inside a single match |
| Leaderboard CDN-cached | Cheap and fast at scale | 60 s lag after match end |
| Session token separate from account token | Token leak from game server doesn’t compromise account | Token rotation complexity |
| Regional sharding of matchmakers | Latency-aware matching | Cross-region matches need an explicit federation |
Likely follow-up extensions:
- Cross-region matchmaking during low-traffic hours. Federated matchmaker queries peer regions if local pool is too small; selects the lower-RTT one for the chosen server location.
- Parties / pre-formed groups. Add party_id (already in the schema). Matchmaker accepts parties as atomic units; widens rating window faster for parties to avoid starvation.
- Spectator mode. Read-only join to an active GameSession. Separate “spectator” QUIC connection that receives state deltas only, no input. Latency budget relaxed.
- Replay capture. Server records state deltas to object storage on session end. A separate API serves the replays.
- Cross-platform play. Identity layer abstracts the platform (Steam, PSN, Xbox Live). Matchmaking is platform-agnostic; the game server enforces input-method fairness (controller vs keyboard/mouse) as a separate dimension.
Mock interview follow-ups
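- “Why not put game state behind WebSocket?” WebSocket runs over TCP, where head-of-line blocking makes packet loss expensive: one lost segment stalls every message behind it. Raw UDP, or QUIC with its unreliable datagram extension (RFC 9221), lets the server drop a stale tick and keep going.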
- “How do you prevent cheating?” Server is authoritative for everything game-affecting (positions, hits, hitboxes). Clients send inputs only. Anti-cheat signals (input timing, view-angle micro-patterns) stream to a separate offline pipeline that decides bans asynchronously. Per-match cheat detection at the API layer would have to be perfect not to false-positive, and it can’t be.
- “How do you scale matchmakers?” Per-region, per-mode shards. Each shard is a bounded queue plus a window-expansion loop. Cross-shard match-finding is opt-in for low-traffic hours.
- “What if the game server crashes mid-match?” Players see disconnect. Reconnect window opens. If session is unrecoverable, match is voided and players’ MMR is restored. The leaderboard never gets a MatchResult so the rating doesn’t move.
- “How does the leaderboard not become a hot spot?” CDN with 60 s TTL. Writes (match results) are aggregated upstream and flushed to the underlying store at most every 30 s. Top-100 lists are pre-computed.
- “Why isn’t your game state API a documented public REST API?” Because the engine talks the wire protocol directly. The public surface is the SDK; the wire is an implementation detail and changes with the engine version. REST is the wrong granularity for 64 Hz ticks.
- “How do you handle a DDoS on the matchmaking API?” Rate-limit per IP at the gateway; token-bucket per authenticated player. Drop unauthenticated traffic at the edge. The state-plane (UDP) is protected by the fact that session_token is required and short-lived, so flooding random ports gets nowhere.
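The per-player token bucket mentioned in the DDoS answer can be sketched as follows. The `rate` and `capacity` values and the `PlayerRateLimiter` wrapper are hypothetical; a real gateway would usually keep this state in a shared store, not in process memory.

```python
class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class PlayerRateLimiter:
    """One bucket per authenticated player id."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, player_id: str, now: float) -> bool:
        bucket = self.buckets.get(player_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[player_id] = bucket
        return bucket.allow(now)


if __name__ == "__main__":
    limiter = PlayerRateLimiter(rate=1.0, capacity=3.0)
    print([limiter.allow("p1", now=0.0) for _ in range(4)])
    print(limiter.allow("p1", now=2.0))
```

Keying the bucket on the authenticated player id, separately from the per-IP limit at the gateway, keeps one abusive account from exhausting the allowance of other players behind the same NAT.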
Related
- Design the Zoom API — the other latency-sensitive system in this workbook; reuses SFU + jitter-buffer ideas.
- Design a Pub-Sub Service API — how match-end events fan out to leaderboards, anti-cheat, party-systems.
- Design the Facebook Messenger API — long-lived connection model and reconnect semantics, applied differently.
- WebSockets — Bidirectional Streaming — when WS is right, when it’s not — see the call-out above.
- Latency and Throughput — the underlying numbers behind the 50 ms budget.