Design a Gaming API

Matchmaking, lobby, real-time game state, leaderboards, anti-cheat hooks. Latency-sensitive APIs where 50 ms is everything.

System Advanced
15 min read
gaming real-time websockets matchmaking leaderboards latency

Context

A gaming API for a competitive online title is not a single API — it is four loosely-coupled surfaces stitched together by a common identity:

  1. Matchmaking — find me an opponent (or a team) at my skill level.
  2. Lobby — once matched, hold us in a room until everyone is ready.
  3. Real-time game state — once we start, sync the world between server and clients on a tick budget measured in milliseconds.
  4. Leaderboards and persistence — once we end, store the outcome, update rankings, surface them to the public.

Plus a fifth, cutting across all four: anti-cheat hooks. A signal stream from the client (and from the server’s view of the client’s behaviour) that flows to a separate offline analysis pipeline.

The reason this is an Advanced-tier system: each of the five surfaces has a fundamentally different latency budget, consistency model, and protocol. A candidate who reaches for “let’s put it all behind REST” has lost. A candidate who tries to make all of it real-time-WebSocket has also lost — the leaderboard read path is cacheable and should not be on a stateful connection.

Real platforms to reference: Valve’s Steamworks (game services for Steam titles), Riot’s RPC backbone (League of Legends, Valorant), Epic Online Services (cross-platform infrastructure offered to third-party game studios). All three separate these concerns along broadly the same four-surface boundary.

Hidden objectives an interviewer is probing:

  • Can you separate the four surfaces without trying to unify their protocols?
  • Can you defend a <= 50 ms server round-trip for state sync at the API layer (the engine-side render budget is then independent)?
  • Can you pick the right protocol per surface: WebSocket for lobby, QUIC datagrams or custom UDP for game state, REST for leaderboards, request-reply for matchmaking?
  • Can you handle partial connectivity — a client dropping mid-match must reconnect and resync state without the server believing they’ve cheated?
  • Can you talk about anti-cheat as an out-of-band signal, not an in-band gate?

Requirements (functional and non-functional)

Functional — in scope:

  • Player queues for a match in a specific game mode at their current skill rating.
  • Server matches players within a configurable rating window; expands the window over time.
  • Players enter a lobby; each must ready-up; lobby host or quorum starts the match.
  • Game server allocates an authoritative session; clients connect; state syncs at the engine’s tick rate (e.g. 64 Hz).
  • Per-tick state messages from server to client; per-tick input messages from client to server.
  • On match end, server reports the outcome to the leaderboard service.
  • Public leaderboards by game mode, time window (daily / weekly / season), region.
  • Anti-cheat signals (input patterns, view angles, hardware fingerprints) streamed from client to a separate ingestion endpoint.

Functional — out of scope:

  • The game-engine integration itself (how clients render, how physics simulate). The API ends at the wire.
  • Anti-cheat detection logic (heuristics, ML models). We design the signal-collection surface; the analysis pipeline is downstream.
  • Cosmetics / payment economy. Separate API behind the same identity.
  • Voice chat. Separate transport, separate API, often a third-party SDK.

Non-functional:

  • State-sync round-trip: <= 50 ms p95 between client and authoritative game server within a region. This is the headline number; everything else is in service of it.
  • Matchmaking time-to-match: <= 30 s p95 at high traffic; if waits run longer, widen the rating window faster and accept lower match quality.
  • Leaderboard read: <= 100 ms p95, CDN-cached, eventually consistent within 60 s of a match ending.
  • Throughput: 100k concurrent players per region at peak; ~1M concurrent globally.
  • Availability: 99.95% on matchmaking and lobby; 99.99% on the game-state plane (a disconnect is more painful than a queue wait).
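
To make the throughput and availability targets concrete, here is a rough sketch of the arithmetic. It assumes one input packet per player per tick at the 64 Hz example rate and a 30-day month; both are illustrative assumptions, not figures from the requirements.

```python
def packets_per_second(concurrent_players: int, tick_rate_hz: int) -> int:
    """Inbound input packets per second if every player sends one per tick."""
    return concurrent_players * tick_rate_hz


def downtime_minutes_per_month(availability_pct: float, days: int = 30) -> float:
    """Allowed downtime in minutes for a given availability target."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


regional_pps = packets_per_second(100_000, 64)
print(regional_pps)                                   # 6400000
print(round(downtime_minutes_per_month(99.95), 1))    # 21.6
print(round(downtime_minutes_per_month(99.99), 2))    # 4.32
```

Under these assumptions a single region ingests about 6.4 million input packets per second at peak, and the game-state plane is allowed roughly four minutes of downtime a month.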

Use case diagram

                    ┌──────────────┐
                    │    Player    │
                    └──────┬───────┘
   ┌──────────┬────────────┼──────────────┬────────────────┐
   ▼          ▼            ▼              ▼                ▼
[queue]   [ready up]    [play]       [view rank]    [report cheat]
   │          │            │              │                │
   ▼          ▼            ▼              ▼                ▼
┌──────┐   ┌──────┐   ┌──────────┐   ┌──────────┐   ┌──────────────┐
│  MM  │   │Lobby │   │   Game   │   │ Leader-  │   │  Anti-cheat  │
│ API  │   │ API  │   │  Server  │   │ board API│   │    ingest    │
└──────┘   └──────┘   └────┬─────┘   └──────────┘   └──────────────┘
                           │ outcome on match end
                      ┌──────────┐
                      │ Leader-  │
                      │ board DB │
                      └──────────┘

Five surfaces, one player. The player’s auth token gets them through all five — but the protocol shapes are very different.

Class diagram

┌────────────────────────┐         ┌──────────────────────┐
│ Player                 │         │ MatchTicket          │
├────────────────────────┤         ├──────────────────────┤
│ id : uuid              │ ──────► │ id : uuid            │
│ display_name : string  │         │ player_id : uuid     │
│ region : enum          │         │ mode : enum          │
│ ratings : map<mode,int>│         │ rating_at_queue : int│
│ ban_state : enum       │         │ window_lower : int   │
└────────────────────────┘         │ window_upper : int   │
                                   │ queued_at : ts       │
                                   │ state : enum         │
                                   └──────────────────────┘
                                              ▼ match found
                                   ┌──────────────────────┐
                                   │ Lobby                │
                                   ├──────────────────────┤
                                   │ id : uuid            │
                                   │ mode : enum          │
                                   │ players : Player[]   │
                                   │ ready : map<id,bool> │
                                   │ created_at : ts      │
                                   │ state : enum         │
                                   └──────────────────────┘
                                              ▼ all ready
                                   ┌──────────────────────┐
                                   │ GameSession          │
                                   ├──────────────────────┤
                                   │ id : uuid            │
                                   │ server_addr : addr   │
                                   │ session_token : str  │
                                   │ tick_rate : int      │
                                   │ players : Player[]   │
                                   │ state : enum         │
                                   │ started_at : ts      │
                                   │ ended_at? : ts       │
                                   └──────────────────────┘
                                              ▼ on end
                                   ┌──────────────────────┐
                                   │ MatchResult          │
                                   ├──────────────────────┤
                                   │ session_id : uuid    │
                                   │ winner : team/player │
                                   │ duration_s : int     │
                                   │ rating_deltas : map  │
                                   │ events : event[]     │
                                   └──────────────────────┘

MatchTicket.window_lower/upper is the dynamic rating window — starts narrow, widens over time so the queue resolves. GameSession.server_addr is the address of the authoritative server the client connects to next; session_token authorises that connection (separate from the player’s account token so a token leak doesn’t compromise their account).
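
The widening window can be sketched in a few lines. This is a minimal illustration, not the matchmaker's actual policy: the starting half-width (50), the step (25 points every 5 s), the cap (400), and the `compatible` helper are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class MatchTicket:
    player_id: str
    rating_at_queue: int
    queued_at: float  # seconds since epoch

    def window(self, now: float, base: int = 50, step: int = 25,
               every_s: float = 5.0, cap: int = 400) -> tuple[int, int]:
        """Return (window_lower, window_upper) after waiting now - queued_at."""
        waited = max(0.0, now - self.queued_at)
        # Start narrow, widen by `step` every `every_s` seconds, never past `cap`.
        half_width = min(cap, base + step * int(waited // every_s))
        return (self.rating_at_queue - half_width,
                self.rating_at_queue + half_width)


def compatible(a: MatchTicket, b: MatchTicket, now: float) -> bool:
    """Two tickets match only if each rating falls inside the other's window."""
    a_lo, a_hi = a.window(now)
    b_lo, b_hi = b.window(now)
    return a_lo <= b.rating_at_queue <= a_hi and b_lo <= a.rating_at_queue <= b_hi


a = MatchTicket("p1", 1500, queued_at=0.0)
b = MatchTicket("p2", 1620, queued_at=0.0)
print(compatible(a, b, now=0.0))   # False: both windows are +/-50
print(compatible(a, b, now=15.0))  # True: both windows have widened to +/-125
```

Requiring the match to be mutual means a player who has just queued is not pulled into a lopsided game merely because someone else has been waiting a long time.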

Sequence diagram (key flows)

Matchmaking through to game start

Client     MM API    MatchMaker     Lobby API   GameServerAlloc
│ POST /queue │           │             │              │
│────────────►│           │             │              │
│             │ enqueue   │             │              │
│             │──────────►│             │              │
│             │ ticket_id │             │              │
│ 201 + WS URL│           │             │              │
│◄────────────│           │             │              │
│ WS subscribe ticket_id  │             │              │
│────────────────────────►│             │              │
│             │ window expand loop      │              │
│             │ match found             │              │
│             │──────────►│             │              │
│             │           │ create lobby│              │
│             │           │────────────►│              │
│             │           │             │ lobby_id     │
│             │           │◄────────────│              │
│ WS push: lobby_id, lobby URL          │              │
│◄────────────────────────│             │              │
│ POST /lobbies/{id}/ready              │              │
│──────────────────────────────────────►│              │
│             │           │             │ all ready    │
│             │           │             │─────────────►│
│             │           │             │ allocate     │
│             │           │             │              │ session
│             │           │             │ server_addr  │
│             │           │             │◄─────────────│
│ WS push: server_addr, session_token   │              │
│◄──────────────────────────────────────│              │
│ UDP/QUIC connect server_addr + token  │              │
│─────────────────────────────────────────────────────►│

Three protocol switches inside a single user flow: REST for the initial queue request, WebSocket for the matchmaking-progress stream, REST for ready-up, UDP/QUIC for the actual game state. Each is the correct choice for its phase.

Tick loop (the inner hot loop)

Client                               GameServer
│ input frame (UDP)                       │
│────────────────────────────────────────►│
│                                         │ simulate tick
│                                         │ compute deltas
│ state delta (UDP)                       │
│◄────────────────────────────────────────│
│ input frame (UDP)                       │
│────────────────────────────────────────►│
│                                         │ simulate tick
│ state delta (UDP)                       │
│◄────────────────────────────────────────│

The protocol detail that matters: state messages are deltas (what changed since the last tick), not snapshots. A periodic snapshot anchors the delta stream (every ~64 ticks) so a packet loss is recoverable. UDP because retransmitting a stale tick is worse than skipping it — TCP head-of-line blocking is fatal here.
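
A minimal sketch of the delta-plus-anchor idea follows. It uses plain dicts for readability, whereas a real wire format would be packed binary. It also deltas against the previous tick, which is a simplification: engines often delta against the last state the client acknowledged. `make_message` and `ClientState` are hypothetical names for this example.

```python
SNAPSHOT_EVERY = 64  # ticks; matches the "~64 ticks" anchor cadence above


def make_message(tick: int, prev: dict, curr: dict) -> dict:
    """Build a full snapshot on anchor ticks, otherwise a delta against prev."""
    if tick % SNAPSHOT_EVERY == 0:
        return {"tick": tick, "kind": "snapshot", "state": dict(curr)}
    changed = {k: v for k, v in curr.items() if prev.get(k) != v}
    removed = [k for k in prev if k not in curr]
    return {"tick": tick, "kind": "delta", "base": tick - 1,
            "changed": changed, "removed": removed}


class ClientState:
    """Client-side view: applies a delta only when its base tick matches."""

    def __init__(self):
        self.tick = None
        self.state = {}

    def apply(self, msg: dict) -> bool:
        if msg["kind"] == "snapshot":
            self.tick, self.state = msg["tick"], dict(msg["state"])
            return True
        if self.tick != msg["base"]:
            return False  # a packet was lost; wait for the next snapshot
        self.state.update(msg["changed"])
        for key in msg["removed"]:
            self.state.pop(key, None)
        self.tick = msg["tick"]
        return True


s0 = {"p1": (0, 0), "p2": (5, 5)}
s1 = {"p1": (1, 0), "p2": (5, 5)}
s2 = {"p1": (2, 0), "p2": (5, 5)}

client = ClientState()
client.apply(make_message(0, {}, s0))         # anchor snapshot at tick 0
print(make_message(1, s0, s1)["changed"])     # {'p1': (1, 0)}
print(client.apply(make_message(2, s1, s2)))  # False: tick 1 never arrived
```

The client that missed tick 1 simply refuses later deltas until the next anchor arrives, which is the recovery path the snapshot cadence exists for.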

Activity diagram (for non-trivial state)

The GameSession lifecycle:

[lobby ready quorum]
┌───────────────┐
│ Allocating    │── alloc failed → ┌──────────┐
└───────┬───────┘                  │ Aborted  │
        │                          └──────────┘
┌───────────────┐
│ Starting      │── client conn timeout → Aborted
└───────┬───────┘
        │ all connected
┌───────────────┐
│ Active        │── player drops ─► reconnect window
└───────┬───────┘
        │ end condition
┌───────────────┐
│ Ending        │── grace, persist
└───────┬───────┘
┌───────────────┐
│ Persisted     │── MatchResult written to leaderboard
└───────────────┘

The reconnect window in Active is the non-obvious bit. When a player drops, the server holds their slot open for 90 seconds. If they reconnect inside the window with the same session_token, they resume; the engine pauses or AFK-flags them depending on game mode. Outside the window, they take a desertion penalty in matchmaking rating. The API surface for reconnect is just a re-issue of the UDP/QUIC handshake — same session_token, same server.
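
The reconnect rule can be sketched as a small per-player slot object. This is an illustration under stated assumptions: `PlayerSlot` and its method names are hypothetical, timestamps are passed in explicitly so the logic is testable, and applying the actual rating penalty is left to the caller.

```python
import hmac
from dataclasses import dataclass
from typing import Optional

RECONNECT_WINDOW_S = 90.0  # the 90-second window described above


@dataclass
class PlayerSlot:
    session_token: str
    connected: bool = True
    dropped_at: Optional[float] = None
    deserted: bool = False

    def on_drop(self, now: float) -> None:
        """Mark the player as disconnected and start the reconnect window."""
        self.connected = False
        self.dropped_at = now

    def on_reconnect(self, token: str, now: float) -> bool:
        """Resume only with the same token and inside the window."""
        if self.deserted or self.dropped_at is None:
            return False
        # Constant-time comparison; assumes ASCII tokens.
        if not hmac.compare_digest(token, self.session_token):
            return False
        if now - self.dropped_at > RECONNECT_WINDOW_S:
            self.deserted = True  # caller applies the desertion penalty
            return False
        self.connected, self.dropped_at = True, None
        return True


slot = PlayerSlot(session_token="tok-abc")
slot.on_drop(now=100.0)
print(slot.on_reconnect("tok-abc", now=150.0))  # True: 50 s after the drop
slot.on_drop(now=200.0)
print(slot.on_reconnect("tok-abc", now=300.0))  # False: 100 s after the drop
print(slot.deserted)                            # True
```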

API implementation

Endpoint catalogue

Across the five surfaces:

Matchmaking (REST + WebSocket)

| Method | Path | Purpose |
|--------|------|---------|
| POST | /v1/matchmaking/queue | Join a queue; returns ticket + WS URL |
| DELETE | /v1/matchmaking/queue/{ticket} | Leave queue |
| WS | /v1/matchmaking/stream?ticket={t} | Progress updates |

Lobby (REST + WebSocket)

| Method | Path | Purpose |
|--------|------|---------|
| GET | /v1/lobbies/{id} | Lobby state |
| POST | /v1/lobbies/{id}/ready | Ready-up |
| POST | /v1/lobbies/{id}/leave | Leave |
| WS | /v1/lobbies/{id}/stream | Member updates |

Game state (UDP / QUIC, not REST)

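| “Endpoint” | Direction | Cadence | Purpose |
|------------|-----------|---------|---------|
| Input frames | C → S | 60-128 Hz | Player inputs |
| State deltas | S → C | tick rate | World updates |
| Anchor snapshots | S → C | every ~1 s | Full state, for catch-up |
| Ping / heartbeat | both | 1 Hz | Liveness, RTT |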

Leaderboards (REST, CDN-cacheable)

| Method | Path | Purpose |
|--------|------|---------|
| GET | /v1/leaderboards/{game_id}/{mode} | Top-N by mode |
| GET | /v1/leaderboards/{game_id}/{mode}/around/{player_id} | Around a specific player |
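
The two leaderboard reads reduce to a top-N slice and a slice centred on one player's rank. The sketch below keeps everything in a sorted in-memory list; the `Leaderboard` class and the `radius` parameter are hypothetical. Updates here are linear-time, so a production service would more plausibly sit on a sorted-set store with pre-computed pages.

```python
from bisect import insort


class Leaderboard:
    """In-memory ranking, kept sorted by (-rating, player_id)."""

    def __init__(self):
        self._rows = []    # sorted list of (-rating, player_id)
        self._rating = {}  # player_id -> current rating

    def upsert(self, player_id: str, rating: int) -> None:
        old = self._rating.get(player_id)
        if old is not None:
            self._rows.remove((-old, player_id))
        insort(self._rows, (-rating, player_id))
        self._rating[player_id] = rating

    def top(self, n: int) -> list:
        """Rows for GET /v1/leaderboards/{game_id}/{mode}."""
        return [(i + 1, pid, -neg) for i, (neg, pid) in enumerate(self._rows[:n])]

    def around(self, player_id: str, radius: int = 2) -> list:
        """Rows for GET .../around/{player_id}: neighbours on both sides."""
        idx = self._rows.index((-self._rating[player_id], player_id))
        lo = max(0, idx - radius)
        window = self._rows[lo:idx + radius + 1]
        return [(lo + i + 1, pid, -neg) for i, (neg, pid) in enumerate(window)]


lb = Leaderboard()
for pid, rating in [("ana", 1800), ("bo", 1750), ("cy", 1900), ("di", 1600)]:
    lb.upsert(pid, rating)
print(lb.top(2))           # [(1, 'cy', 1900), (2, 'ana', 1800)]
print(lb.around("bo", 1))  # [(2, 'ana', 1800), (3, 'bo', 1750), (4, 'di', 1600)]
```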

Anti-cheat ingest (REST, internal)

| Method | Path | Purpose |
|--------|------|---------|
| POST | /v1/anticheat/signals | Batched client telemetry |

OpenAPI schema (excerpt)

OpenAPI 3.1 — Gaming API (matchmaking + lobby slice)
paths:
  /v1/matchmaking/queue:
    post:
      operationId: enqueue
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [mode, region]
              properties:
                mode: { type: string, enum: [ranked-1v1, ranked-3v3, casual-5v5] }
                region: { type: string, enum: [na-east, na-west, eu-west, ap-se] }
                party_id: { type: [string, "null"] }
      responses:
        '201':
          description: Queued
          content:
            application/json:
              schema:
                type: object
                properties:
                  ticket_id: { type: string }
                  stream_url: { type: string, format: uri }
                  estimated_wait_seconds: { type: integer }
  /v1/lobbies/{id}/ready:
    post:
      operationId: readyUp
      parameters:
        - { name: id, in: path, required: true, schema: { type: string } }
      responses:
        '200':
          description: Ready state acknowledged
          content:
            application/json:
              schema:
                type: object
                properties:
                  lobby_state: { type: string, enum: [waiting, all_ready, starting, aborted] }
                  ready_count: { type: integer }
                  required_count: { type: integer }

A representative WS push during lobby:

WS frame — lobby state update
{
  "type": "lobby.ready.changed",
  "lobby_id": "lob_01HXYZ...",
  "ready_count": 4,
  "required_count": 5,
  "next_state": "waiting"
}

And once the session is allocated:

WS frame — game session ready
{
  "type": "lobby.session.ready",
  "session": {
    "id": "ses_01HABC...",
    "server_addr": "udp://gs-eu-west-42.api.example:7777",
    "session_token": "eyJhbGciOi...short-lived...",
    "tick_rate": 64
  }
}
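
One way a session-scoped token could be minted and checked is sketched below. The `eyJ...` value in the frame suggests a JWT-style token; this example swaps in a simpler HMAC-signed payload using only the standard library. The secret, the claim names (`sid`, `pid`, `exp`) and the default lifetime are all hypothetical. Because reconnects reuse the same token, the lifetime would need to cover the match plus the reconnect window.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-only-secret"  # hypothetical; real deployments use a managed key


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_session_token(session_id: str, player_id: str, ttl_s: int = 3600,
                       now: Optional[float] = None) -> str:
    """Token scoped to one session and one player, expiring after ttl_s."""
    now = time.time() if now is None else now
    body = json.dumps({"sid": session_id, "pid": player_id,
                       "exp": int(now) + ttl_s}, sort_keys=True).encode()
    sig = hmac.new(SECRET, body, hashlib.sha256).digest()
    return f"{_b64(body)}.{_b64(sig)}"


def verify_session_token(token: str, session_id: str,
                         now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the token is valid for this session, else None."""
    now = time.time() if now is None else now
    try:
        body_b64, sig_b64 = token.split(".")
        body, sig = _unb64(body_b64), _unb64(sig_b64)
    except ValueError:  # wrong shape or bad base64
        return None
    expected = hmac.new(SECRET, body, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(body)
    if claims["sid"] != session_id or claims["exp"] < now:
        return None
    return claims


token = mint_session_token("ses_1", "p1", now=1000)
print(verify_session_token(token, "ses_1", now=1100))  # claims dict
print(verify_session_token(token, "ses_2", now=1100))  # None: wrong session
print(verify_session_token(token, "ses_1", now=9999))  # None: expired
```

Since the token carries no account credentials, a leak from the game server exposes at most one seat in one match.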

Client samples

Matchmaking enqueue in Python. The state-sync loop uses UDP/QUIC and is engine-side, so the public API client code stops here.

Matchmaking enqueue — Python
import requests


def enqueue(mode, region, token):
    """Join the matchmaking queue; returns the ticket and stream URL."""
    resp = requests.post(
        "https://api.gaming.example/v1/matchmaking/queue",
        json={"mode": mode, "region": region},
        headers={"Authorization": f"Bearer {token}"},
        timeout=3,  # seconds; fail fast rather than hang the client UI
    )
    resp.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return resp.json()


result = enqueue("ranked-3v3", "eu-west", token="eyJhbGciOi...")
print("ticket:", result["ticket_id"])
print("stream:", result["stream_url"])

Latency budget: the 50 ms round-trip

The headline number for state sync, broken down for a player in the same region as their game server:

| Phase | Budget | Notes |
|-------|--------|-------|
| Client encode + send | 1 ms | Input frame, packed binary. |
| Last-mile + ISP | 10-20 ms | Floor. Cannot improve from API side. |
| Datacenter ingress | 2 ms | Anycast UDP load balancer. |
| Game server dispatch | 1 ms | Already in user-space, lockless ring buffer. |
| Tick simulation | 5-15 ms | Engine cost; bounded by tick rate. |
| Delta encode + send | 1 ms | Per-player delta from authoritative state. |
| Return path | 10-20 ms | Same as outbound. |
| Total | 30-60 ms | Competitive titles sit at the lower end. |
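
The totals can be checked by summing the per-phase ranges. This short sketch also compares the worst case against the 50 ms target and shows the tick interval at the 64 Hz example rate, which is where the upper bound on tick simulation comes from.

```python
# (best, worst) per phase in milliseconds, copied from the table above.
BUDGET_MS = {
    "client encode + send": (1, 1),
    "last-mile + ISP": (10, 20),
    "datacenter ingress": (2, 2),
    "game server dispatch": (1, 1),
    "tick simulation": (5, 15),
    "delta encode + send": (1, 1),
    "return path": (10, 20),
}

best = sum(lo for lo, _ in BUDGET_MS.values())
worst = sum(hi for _, hi in BUDGET_MS.values())
tick_interval_ms = 1000 / 64

print(best, worst)       # 30 60
print(worst - 50)        # 10
print(tick_interval_ms)  # 15.625
```

The worst case is 10 ms over the 50 ms target, so meeting the p95 depends on the network legs and tick simulation staying away from the top of their ranges together.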

The API-design implication: anything that adds even 5 ms to this path is unacceptable. No JSON parsing on the hot loop (binary protocol), no TLS handshake per message (long-lived QUIC connection), no auth check per message (session_token validated once at connection, signed thereafter).
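
To illustrate what "binary protocol" means on the hot path, here is a sketch of a fixed-size input frame packed with Python's `struct` module. The field layout (tick, sequence number, button bitmask, two movement axes, two view angles) is a hypothetical example, not the engine's real wire format.

```python
import struct

# Hypothetical layout: tick (u32), sequence (u16), buttons bitmask (u16),
# then move_x, move_y, yaw, pitch as float32. Network byte order, no padding.
INPUT_FRAME = struct.Struct("!IHHffff")


def encode_input(tick, seq, buttons, move_x, move_y, yaw, pitch) -> bytes:
    """Pack one input frame into a fixed 24-byte payload."""
    return INPUT_FRAME.pack(tick, seq, buttons, move_x, move_y, yaw, pitch)


def decode_input(payload: bytes) -> tuple:
    """Unpack a frame, rejecting anything that is not exactly one frame long."""
    if len(payload) != INPUT_FRAME.size:
        raise ValueError("unexpected input frame length")
    return INPUT_FRAME.unpack(payload)


frame = encode_input(1024, 7, 0b0101, 1.0, 0.0, 90.0, -12.5)
print(len(frame))           # 24
print(decode_input(frame))  # (1024, 7, 5, 1.0, 0.0, 90.0, -12.5)
```

A fixed 24-byte frame decodes with a single unpack call and no allocation-heavy parsing, which is the property the latency budget relies on.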

Trade-offs and extensions

| Decision | Why | Cost if requirements change |
|----------|-----|-----------------------------|
| Four separate protocols across surfaces | Each surface gets the right tool | Higher operational complexity; more SDKs |
| UDP/QUIC for state sync | Packet loss doesn’t head-of-line-block | Harder to debug; firewalls less friendly |
| 90 s reconnect window | Matches mobile-handoff timescale | A 10-minute window would invite griefing |
| Anti-cheat as out-of-band signal | Doesn’t add latency to the hot loop | Cheaters get away with more inside a single match |
| Leaderboard CDN-cached | Cheap and fast at scale | 60 s lag after match end |
| Session token separate from account token | Token leak from game server doesn’t compromise account | Token rotation complexity |
| Regional sharding of matchmakers | Latency-aware matching | Cross-region matches need an explicit federation |

Likely follow-up extensions:

  • Cross-region matchmaking during low-traffic hours. Federated matchmaker queries peer regions if local pool is too small; selects the lower-RTT one for the chosen server location.
  • Parties / pre-formed groups. Add party_id (already in the schema). Matchmaker accepts parties as atomic units; widens rating window faster for parties to avoid starvation.
  • Spectator mode. Read-only join to an active GameSession. Separate “spectator” QUIC connection that receives state deltas only, no input. Latency budget relaxed.
  • Replay capture. Server records state deltas to object storage on session end. A separate API serves the replays.
  • Cross-platform play. Identity layer abstracts the platform (Steam, PSN, Xbox Live). Matchmaking is platform-agnostic; the game server enforces input-method fairness (controller vs keyboard/mouse) as a separate dimension.

Mock interview follow-ups

  • “Why not put game state behind WebSocket?” WebSocket runs over TCP, where one lost packet stalls everything queued behind it (head-of-line blocking). Raw UDP, or QUIC with its unreliable datagram extension, lets the server drop a stale tick and keep going.
  • “How do you prevent cheating?” Server is authoritative for everything game-affecting (positions, hits, hitboxes). Clients send inputs only. Anti-cheat signals (input timing, view-angle micro-patterns) stream to a separate offline pipeline that decides bans asynchronously. Per-match cheat detection at the API layer would have to be perfect not to false-positive, and it can’t be.
  • “How do you scale matchmakers?” Per-region, per-mode shards. Each shard is a bounded queue plus a window-expansion loop. Cross-shard match-finding is opt-in for low-traffic hours.
  • “What if the game server crashes mid-match?” Players see disconnect. Reconnect window opens. If session is unrecoverable, match is voided and players’ MMR is restored. The leaderboard never gets a MatchResult so the rating doesn’t move.
  • “How does the leaderboard not become a hot spot?” CDN with 60 s TTL. Writes (match results) are aggregated upstream and flushed to the underlying store at most every 30 s. Top-100 lists are pre-computed.
  • “Why isn’t your game state API a documented public REST API?” Because the engine talks the wire protocol directly. The public surface is the SDK; the wire is an implementation detail and changes with the engine version. REST is the wrong granularity for 64 Hz ticks.
  • “How do you handle a DDoS on the matchmaking API?” Rate-limit per IP at the gateway; token-bucket per authenticated player. Drop unauthenticated traffic at the edge. The state-plane (UDP) is protected by the fact that session_token is required and short-lived — flooding random ports gets nowhere.
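
The per-player token bucket mentioned in the last answer can be sketched as follows. The rate (1 request per second) and burst size (3) are hypothetical values for illustration, and the in-process `buckets` dict stands in for whatever shared store a real gateway would use.

```python
class TokenBucket:
    """Refills at rate_per_s up to burst; each request spends one token."""

    def __init__(self, rate_per_s: float, burst: int, now: float = 0.0):
        self.rate = rate_per_s
        self.capacity = burst
        self.tokens = float(burst)
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


buckets = {}  # player_id -> TokenBucket


def allow_request(player_id: str, now: float) -> bool:
    """Gate one matchmaking request for an authenticated player."""
    if player_id not in buckets:
        buckets[player_id] = TokenBucket(rate_per_s=1.0, burst=3, now=now)
    return buckets[player_id].allow(now)


print([allow_request("p1", 0.0) for _ in range(4)])  # [True, True, True, False]
print(allow_request("p1", 1.0))                      # True: one token refilled
```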