Design the Google Maps API

Tiles, geocoding, routing, places, distance matrix. The geospatial endpoints behind a billion daily queries.

System Advanced
19 min read
api-design geospatial cdn
Companies this resembles: Google

Context#

Maps is the canonical “multi-tenant geospatial API” question and an Advanced-tier prompt because the surface is enormous. A candidate who tries to design every endpoint loses the round, and the usual cause is failing to bound the scope at the start.

This writeup is an API-design round, not an HLD round. That means:

  • The actual map data pipeline — ingestion of satellite imagery, OSM-style edits, Street View, road graph extraction — is a black box. We design endpoints over the rendered result.
  • Traffic ingestion (probes from Android, Waze, etc.) is a black box. The routing endpoint queries a real-time-traffic service that we treat as a dependency.
  • Street View, Indoor Maps, AR navigation — out of scope for one round. They are sibling APIs that share infrastructure but not contract.
  • The actual routing algorithm (contraction hierarchies, A* variants, hub labelling) is a black box. The API contract is origin + destination -> route; the engine is a tuned C++ service.

What remains is still a six-sub-API design:

  • Tiles — raster and vector, by z/x/y. Heavily cached at the CDN; the API server barely sees a tile request that isn’t a cache miss.
  • Geocoding — address string in, lat/lng + structured address out.
  • Reverse geocoding — lat/lng in, address out.
  • Routing (Directions) — origin + destination + travel mode + waypoints, returns the path + ETA.
  • Distance Matrix — N origins × M destinations, returns the cost matrix.
  • Places — text search, autocomplete, photos, details.

Each sub-API has its own latency profile, its own cache strategy, and its own quota. The art of the round is laying this out clearly without descending into the algorithm for any of them.

The interviewer’s hidden objectives, roughly in order:

  • Can you enumerate the sub-APIs and not panic-merge them?
  • Can you set per-category latency budgets with the right numbers? Tiles are CDN-fast (sub-50 ms); routing is computation-heavy (sub-300 ms).
  • Can you treat quota and billing as a first-class API concern, not an afterthought?
  • Can you handle the autocomplete-vs-search distinction the way you’d handle suggest-vs-query in a search API?
  • Can you decide what is CDN-cacheable (immutable versioned tiles, slow-changing place details) vs stale-tolerant (route ETAs) vs per-request (distance matrix)?

Requirements (functional and non-functional)#

Functional — in scope:

  • Tiles: serve raster (PNG / JPEG) and vector (Protobuf MVT) tiles by z/x/y, with style variants (roadmap, satellite, hybrid, terrain).
  • Geocoding: free-text address → latitude/longitude + structured address components.
  • Reverse geocoding: latitude/longitude → human-readable address.
  • Directions: route between an origin, destination, and up to 25 waypoints. Travel modes: driving, walking, bicycling, transit.
  • Distance Matrix: pairwise costs (distance and ETA) between up to 25 origins and 25 destinations (625 cells).
  • Places — text search: free-text “coffee near me” with optional bias.
  • Places — autocomplete: typeahead for partial place strings.
  • Places — details: fetch full place record by place_id (hours, photos, phone, etc.).

Functional — out of scope:

  • Street View imagery API (separate surface).
  • Indoor Maps and Air Quality data.
  • Roads API (snap-to-road) — a thin variant of routing, out for this round.
  • Static Maps and Maps Embed (the iframe). These are convenience surfaces over the Tiles API.
  • The map data pipeline itself.
  • Traffic-data ingestion sources.

Non-functional:

  • Tile latency: <= 50 ms p95 from the edge (CDN). Server fallback <= 200 ms p95 on cache miss.
  • Geocoding latency: <= 150 ms p95.
  • Routing latency: <= 300 ms p95 for a 25-waypoint route; <= 100 ms p95 for a simple A → B.
  • Distance Matrix latency: <= 500 ms p95 for a 25 × 25 matrix.
  • Places autocomplete: <= 100 ms p95 (typeahead bound).
  • Throughput: 1B requests / day globally → ~12k QPS sustained, 150k QPS peak. Tile traffic dominates; 70% of all requests.
  • Availability: 99.95% per category; tile failures can degrade to “blue water” placeholder tiles client-side.
  • Quota: per-API-key, per-API-category, with monthly billing aggregation.
  • Freshness: tiles refresh on a weekly cadence; routing ETAs update every 5 minutes from the traffic backend; place details every 24 hours.
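
The throughput line can be sanity-checked with quick arithmetic. A minimal sketch: the inputs are the figures from the list above, and the peak-to-average ratio and the tile/non-tile split are derived from them here rather than stated in the source.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

requests_per_day = 1_000_000_000  # 1B requests / day
peak_qps = 150_000                # stated peak
tile_share = 0.70                 # tiles are 70% of all requests

sustained_qps = requests_per_day / SECONDS_PER_DAY
peak_factor = peak_qps / sustained_qps        # how spiky the traffic is
tile_qps = sustained_qps * tile_share         # mostly absorbed by the CDN
non_tile_qps = sustained_qps - tile_qps       # what the API servers compute

print(f"sustained : {sustained_qps:,.0f} QPS")
print(f"peak      : {peak_factor:.1f}x sustained")
print(f"tiles     : {tile_qps:,.0f} QPS")
print(f"non-tile  : {non_tile_qps:,.0f} QPS")
```

The sustained figure rounds to the ~12k QPS quoted above, and the stated peak works out to roughly 13 times the average.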

Use case diagram#

                 ┌─────────────────────┐
                 │ Developer (API key) │
                 └──────────┬──────────┘
     ┌────────────┬─────────┼─────────┬──────────────┐
     ▼            ▼         ▼         ▼              ▼
[tile fetch]  [geocode]  [route]  [distance   [places search]
                                   matrix]    [places auto-c]
     │            │         │         │              │
     └────────────┴─────────┼─────────┴──────────────┘
                            ▼
                 ┌─────────────────────┐
                 │    Maps Platform    │
                 └──────────┬──────────┘
                            ▼
                 ┌─────────────────────┐
                 │   Quota / Billing   │ ── meter every request
                 └─────────────────────┘

One actor (a developer or end-user proxy holding an API key). Six sub-APIs. The quota/billing seam is non-optional — every API has to meter, every API has to attribute back to a key.

Class diagram#

┌──────────────────────────┐
│ TileService              │
├──────────────────────────┤
│ getTile(z,x,y,style,fmt) │  immutable per (z,x,y,style,version)
└──────────────────────────┘
┌──────────────────────────┐          ┌────────────────────┐
│ GeocodeService           │          │ GeocodeResult      │
├──────────────────────────┤ returns  ├────────────────────┤
│ geocode(address, bias?)  │─────────►│ lat, lng           │
│ reverseGeocode(lat,lng)  │─────────►│ formatted_address  │
└──────────────────────────┘          │ components[]       │
                                      │ place_id           │
                                      └────────────────────┘
┌──────────────────────────┐          ┌────────────────────┐
│ DirectionsService        │          │ Route              │
├──────────────────────────┤ returns  ├────────────────────┤
│ route(req)               │─────────►│ legs[]             │
└──────────────────────────┘          │ polyline           │
                                      │ duration_sec       │
                                      │ duration_traffic   │
                                      │ distance_meters    │
                                      │ steps[]            │
                                      └────────────────────┘
┌──────────────────────────┐          ┌────────────────────┐
│ DistanceMatrixService    │          │ MatrixResult       │
├──────────────────────────┤ returns  ├────────────────────┤
│ matrix(origins,          │─────────►│ rows[]             │
│   destinations, mode)    │          │   elements[]       │
└──────────────────────────┘          │     duration, dist │
                                      └────────────────────┘
┌──────────────────────────┐          ┌────────────────────┐
│ PlacesService            │          │ Place              │
├──────────────────────────┤ returns  ├────────────────────┤
│ textSearch(q, bias?)     │─────────►│ place_id           │
│ autocomplete(q, bias?)   │─────────►│ name, address      │
│ details(place_id)        │─────────►│ rating, opening    │
│ photo(photo_ref)         │          │ photos[], etc.     │
└──────────────────────────┘          └────────────────────┘

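Five services covering the six sub-APIs: GeocodeService owns both forward and reverse geocoding, and every other service owns one sub-API. Notice that no service stores state; every endpoint is read-only or read-mostly. The place_id is the stable cross-API identifier that links geocoding, places, and directions.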

Sequence diagram (key flows)#

Flow 1: tile fetch.

Browser       CDN        TileOrigin         TileStore
│ GET /v1/tiles/roadmap/{version}/{z}/{x}/{y}.png
│─────────────►│              │                 │
│              │ cache hit?   │                 │
│              │ yes ───► return cached PNG     │
│ PNG bytes    │              │                 │
│◄─────────────│              │                 │
│              │ no, fetch from origin          │
│              │─────────────►│                 │
│              │              │ key = z/x/y/v   │
│              │              │────────────────►│
│              │              │ PNG bytes       │
│              │              │◄────────────────│
│              │ PNG + Cache-Control: 7d        │
│              │◄─────────────│                 │
│ PNG bytes (now cached)      │                 │
│◄─────────────│              │                 │

The CDN does almost all the work. Tile responses have Cache-Control: public, max-age=604800, immutable and the URL includes a version segment so cache busts are URL changes, not invalidations.
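
How a client arrives at a z/x/y address can be sketched as follows. This assumes the common Web Mercator "XYZ" tiling scheme, which the writeup does not state explicitly; `latlng_to_tile` and `tile_url` are hypothetical helper names, and the URL shape follows the versioned path quoted in the follow-up section.

```python
import math

MAX_MERCATOR_LAT = 85.05112878  # Web Mercator is undefined beyond this latitude

def latlng_to_tile(lat, lng, zoom):
    """Return the (x, y) tile containing a point (assumes the XYZ scheme)."""
    n = 2 ** zoom  # tiles per axis at this zoom level
    lat = max(min(lat, MAX_MERCATOR_LAT), -MAX_MERCATOR_LAT)
    x = int((lng + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so lng == 180 or the polar edge still maps to a valid tile.
    return max(0, min(x, n - 1)), max(0, min(y, n - 1))

def tile_url(style, version, zoom, x, y, fmt="png"):
    """Versioned tile path: a new tile-set release changes the URL, not the cache."""
    return f"/v1/tiles/{style}/{version}/{zoom}/{x}/{y}.{fmt}"

# Header the origin attaches; 604800 seconds is exactly 7 days.
TILE_CACHE_CONTROL = "public, max-age=604800, immutable"

x, y = latlng_to_tile(37.7749, -122.4194, 10)  # San Francisco at zoom 10
print(tile_url("roadmap", "v_2026_w22", 10, x, y))
```

Because the version is part of the path, two releases of the same tile never share a cache key, which is what makes the `immutable` directive safe.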

Flow 2: routing with traffic.

Client        DirectionsAPI     TrafficService    RouteEngine
│ GET /v1/directions?origin=A&destination=B&mode=driving
│──────────────────►│                  │               │
│                   │ traffic snapshot │               │
│                   │ for region(A,B)  │               │
│                   │─────────────────►│               │
│                   │ edge weights     │               │
│                   │◄─────────────────│               │
│                   │ compute route    │               │
│                   │─────────────────────────────────►│
│                   │ polyline + ETA   │               │
│                   │◄─────────────────────────────────│
│ 200 + route       │                  │               │
│◄──────────────────│                  │               │

The traffic snapshot is cached at 5-minute granularity per regional tile; the route engine is the per-request compute. The polyline is encoded with the standard Google polyline algorithm — a textual compression that knocks a 10 KB lat/lng array down to ~1 KB.

Flow 3: places autocomplete (typeahead).

Client          PlacesAPI       AutocompleteIndex
│ GET /v1/places/autocomplete?q=star    │
│──────────────────►│                   │
│                   │ prefix lookup     │
│                   │ weighted by bias  │
│                   │──────────────────►│
│                   │ top-5 candidates  │
│                   │◄──────────────────│
│ 200 + 5 results   │                   │
│◄──────────────────│                   │
│                   │                   │
│ GET /v1/places/autocomplete?q=starbu  │
│──────────────────►│                   │
│ ...refine...      │                   │

Each keystroke is one request, and the API keeps no server-side session state between them. The client does pass a session_token, but only so billing can group the keystrokes of one search together for the autocomplete-then-details pattern. (This is one of the few cases where the API exposes a billing primitive directly.)
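
One way the billing side could group calls is sketched below. The rule used here (one unit per session, a session ends at the details fetch or after the 3-minute token lifetime mentioned in the follow-ups, un-tokened calls bill individually) is an assumption for illustration, not the platform's actual pricing logic; `Call` and `billable_units` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

SESSION_TTL_SEC = 180  # 3-minute token lifetime (assumed to start at first use)

@dataclass(frozen=True)
class Call:
    ts: float                            # request time, unix seconds
    kind: str                            # "autocomplete" or "details"
    session_token: Optional[str] = None  # None means the client sent no token

def billable_units(calls):
    """Count billable units: one per session, one per call without a token."""
    units = 0
    open_sessions = {}  # token -> timestamp of the session's first call
    for call in sorted(calls, key=lambda c: c.ts):
        token = call.session_token
        if token is None:
            units += 1  # no grouping possible: every call bills on its own
            continue
        started = open_sessions.get(token)
        if started is None or call.ts - started > SESSION_TTL_SEC:
            units += 1  # new or expired session: bill once
            open_sessions[token] = call.ts
        if call.kind == "details":
            open_sessions.pop(token, None)  # the details fetch closes the session
    return units

keystrokes = [Call(ts=i, kind="autocomplete", session_token="s1") for i in range(6)]
flow = keystrokes + [Call(ts=7, kind="details", session_token="s1")]
print(billable_units(flow))  # 1
```

The same seven calls without a token would bill as seven units, which is the cost difference the token exists to remove.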

Activity diagram (for non-trivial state)#

Most endpoints are stateless. The one piece of structured logic is the quota / billing state machine every request flows through:

 [request arrives]
         │
         ▼
┌─────────────────┐
│ resolve API key │── missing / invalid ──► 401
└────────┬────────┘
         ▼
┌─────────────────┐
│ key enabled?    │── disabled ──► 403
└────────┬────────┘
         ▼
┌─────────────────┐
│ per-API quota   │── exceeded ──► 429 + Retry-After
│ bucket OK?      │
└────────┬────────┘
         ▼
┌─────────────────┐
│ per-min rate    │── exceeded ──► 429
│ limit OK?       │
└────────┬────────┘
         ▼
┌─────────────────┐
│ serve request   │
└────────┬────────┘
         ▼
┌─────────────────┐
│ meter usage     │── async to billing pipeline
│ (1 unit per req)│
└────────┬────────┘
         ▼
      respond

Invariants:

  • Quota and rate limits are per-API category, not global. A maxed-out Routing quota does not block Tile requests on the same key.
  • Metering is synchronous in the response (the X-Quota-Used header reports counts post-increment) but asynchronous to billing (the billing pipeline aggregates over a 24-hour window and reconciles).
  • A request that returns 200 but is partially served (e.g. routing returned only 3 of 5 requested alternative routes) is still billed as one unit. Partial success is not partial billing.
  • Tile requests served entirely from the CDN edge don’t hit the origin and therefore aren’t billed against the API key until the SDK reports edge-hit telemetry (best-effort, sampling-based). The contract: tiles are billed by approximate usage, not exact.
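
The gate can be sketched as a single admission function. Everything here is an in-memory stand-in under stated assumptions: `KeyState`, `KEYS`, and `admit` are hypothetical names, the per-minute limit uses a simple fixed window, the `Retry-After` value is a placeholder, and metering is folded into admission for brevity instead of running after the request is served.

```python
import time
from dataclasses import dataclass, field

@dataclass
class KeyState:
    enabled: bool = True
    quota: dict = field(default_factory=dict)          # category -> allowed units
    used: dict = field(default_factory=dict)           # category -> units consumed
    per_min_limit: dict = field(default_factory=dict)  # category -> requests/minute
    window: dict = field(default_factory=dict)         # category -> (minute, count)
    quota_reset_in_sec: int = 3600                     # placeholder Retry-After value

KEYS = {}  # api_key -> KeyState; stands in for a real key store

def admit(api_key, category, now=None):
    """Return (status, headers); 200 means the request may be served."""
    now = time.time() if now is None else now
    state = KEYS.get(api_key)
    if state is None:
        return 401, {}                      # missing or invalid key
    if not state.enabled:
        return 403, {}                      # key disabled
    quota = state.quota.get(category, 0)
    used = state.used.get(category, 0)
    if used >= quota:                       # per-category quota bucket
        return 429, {"Retry-After": str(state.quota_reset_in_sec)}
    minute = int(now // 60)
    win_minute, count = state.window.get(category, (minute, 0))
    if win_minute != minute:
        count = 0                           # a new minute starts a new window
    if count >= state.per_min_limit.get(category, 0):
        return 429, {}                      # per-minute rate limit
    state.window[category] = (minute, count + 1)
    state.used[category] = used + 1         # headers report post-increment counts
    return 200, {
        "X-Quota-Used": str(used + 1),
        "X-Quota-Remaining": str(quota - used - 1),
    }

KEYS["demo"] = KeyState(quota={"routing": 2, "tiles": 5},
                        per_min_limit={"routing": 10, "tiles": 10})
print(admit("demo", "routing", now=0))
```

Keeping every counter keyed by category is what delivers the first invariant: exhausting the routing bucket leaves the tiles bucket on the same key untouched.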

API implementation#

Endpoint catalogue#

Method | Path | Sub-API
------ | ---- | -------
GET | /v1/tiles/{style}/{version}/{z}/{x}/{y}.{fmt} | Tiles
GET | /v1/geocode?address=... | Geocoding
GET | /v1/geocode/reverse?latlng=... | Reverse geocoding
GET | /v1/directions | Directions
POST | /v1/distancematrix | Distance Matrix (POST since payloads can be large)
GET | /v1/places/textsearch | Places: text search
GET | /v1/places/autocomplete | Places: autocomplete
GET | /v1/places/{place_id} | Places: details
GET | /v1/places/photo/{photo_ref} | Places: photo bytes

Nine endpoints across six sub-APIs. Eight of them (Tiles, Geocode, Reverse geocode, Directions, Places text search, Places autocomplete, Place details, and Places photo) are GET, which keeps them idempotent, debuggable, and cacheable wherever a cache applies. Distance Matrix is POST because a 25×25 origins/destinations payload can run past URL-length limits.

OpenAPI schema (excerpt)#

OpenAPI 3.1 — Maps API (core endpoints)
paths:
  /v1/directions:
    get:
      operationId: directions
      security: [{ apiKeyAuth: [] }]
      parameters:
        - name: origin
          in: query
          required: true
          schema:
            type: string
            description: 'lat,lng or place_id:... or free-text address'
        - name: destination
          in: query
          required: true
          schema: { type: string }
        - name: mode
          in: query
          schema:
            type: string
            enum: [driving, walking, bicycling, transit]
            default: driving
        - name: waypoints
          in: query
          schema:
            type: string
            description: pipe-separated, up to 25
        - name: departure_time
          in: query
          schema: { type: string, description: 'unix-seconds or "now"' }
        - name: alternatives
          in: query
          schema: { type: boolean, default: false }
      responses:
        '200':
          description: One or more route alternatives
          headers:
            X-Quota-Used: { schema: { type: integer } }
            X-Quota-Remaining: { schema: { type: integer } }
          content:
            application/json:
              schema: { $ref: '#/components/schemas/DirectionsResponse' }
        '400': { description: Malformed origin or destination }
        '429': { description: Quota or rate limit exceeded }
  /v1/distancematrix:
    post:
      operationId: distanceMatrix
      security: [{ apiKeyAuth: [] }]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [origins, destinations]
              properties:
                origins:
                  type: array
                  items: { type: string }
                  maxItems: 25
                destinations:
                  type: array
                  items: { type: string }
                  maxItems: 25
                mode:
                  type: string
                  enum: [driving, walking, bicycling, transit]
                  default: driving
                # OpenAPI 3.1 expresses nullability as a type union, not `nullable`
                departure_time: { type: [string, 'null'] }
      responses:
        '200':
          description: Matrix of pairwise costs
          content:
            application/json:
              schema: { $ref: '#/components/schemas/MatrixResponse' }
  /v1/places/autocomplete:
    get:
      operationId: placesAutocomplete
      security: [{ apiKeyAuth: [] }]
      parameters:
        - { name: q, in: query, required: true, schema: { type: string, minLength: 1, maxLength: 100 } }
        - name: session_token
          in: query
          schema: { type: string, description: 'UUID; groups autocomplete+details billing' }
        - name: location
          in: query
          schema: { type: string, description: 'lat,lng for bias' }
        - name: radius
          in: query
          schema: { type: integer, description: meters }
      responses:
        '200':
          description: Up to 5 predictions
          content:
            application/json:
              schema:
                type: object
                properties:
                  predictions:
                    type: array
                    items: { $ref: '#/components/schemas/Prediction' }
components:
  schemas:
    DirectionsResponse:
      type: object
      required: [routes]
      properties:
        routes:
          type: array
          items: { $ref: '#/components/schemas/Route' }
    Route:
      type: object
      required: [legs, overview_polyline, duration_seconds, distance_meters]
      properties:
        legs:
          type: array
          items: { $ref: '#/components/schemas/Leg' }
        overview_polyline: { type: string, description: encoded polyline }
        duration_seconds: { type: integer }
        duration_in_traffic_seconds: { type: [integer, 'null'] }
        distance_meters: { type: integer }
        warnings: { type: array, items: { type: string } }
        fare:
          type: [object, 'null']
          properties:
            currency: { type: string }
            amount: { type: number }
    Leg:
      type: object
      properties:
        start_address: { type: string }
        end_address: { type: string }
        duration_seconds: { type: integer }
        distance_meters: { type: integer }
        steps:
          type: array
          items: { $ref: '#/components/schemas/Step' }
    Step:
      type: object
      properties:
        html_instructions: { type: string }
        distance_meters: { type: integer }
        duration_seconds: { type: integer }
        polyline: { type: string }
        travel_mode: { type: string }
    Prediction:
      type: object
      properties:
        place_id: { type: string }
        description: { type: string }
        matched_substrings:
          type: array
          items:
            type: object
            properties:
              offset: { type: integer }
              length: { type: integer }

Client samples#

A directions request in Python.

Directions client — Python
import requests

API = "https://maps.example/v1"
KEY = "AIzaSy..."


def directions(origin, destination, mode="driving", waypoints=None, alternatives=False):
    """Call GET /v1/directions and return the decoded JSON body."""
    params = {
        "origin": origin,
        "destination": destination,
        "mode": mode,
        "alternatives": "true" if alternatives else "false",
        "key": KEY,
    }
    if waypoints:
        # Waypoints travel as one pipe-separated query value.
        params["waypoints"] = "|".join(waypoints)
    resp = requests.get(f"{API}/directions", params=params, timeout=2)
    resp.raise_for_status()  # surfaces 400 / 429 as exceptions
    return resp.json()


if __name__ == "__main__":
    route = directions("37.7749,-122.4194", "37.3382,-121.8863", waypoints=["Palo Alto, CA"])
    for r in route["routes"]:
        print(r["duration_seconds"], "s ;", r["distance_meters"], "m")

Latency budget — per sub-API#

Each sub-API has its own budget. The discriminating insight is that they do not share a budget; you cannot trade routing latency for cheaper tiles.

Sub-API | p95 target | Dominant cost | Cache strategy
------- | ---------- | ------------- | --------------
Tiles | 50 ms (edge) | CDN egress | CDN, 7-day TTL, URL versioned
Geocoding | 150 ms | Parsing + index lookup | App-level cache, 1-hour TTL keyed on canonical form
Reverse geocoding | 150 ms | Spatial index lookup | App-level, 1-hour TTL keyed on (lat, lng) rounded to ~10m
Directions | 300 ms | Route computation + traffic fetch | None for routes (traffic moves); 5-min traffic snapshot
Distance Matrix | 500 ms | N×M route computations | None (combinatorial; never repeat exactly)
Places: text search | 250 ms | Ranking pass | App cache 5 min on (query, bias)
Places: autocomplete | 100 ms | Prefix index | App cache, very short TTL (typeahead)
Places: details | 200 ms | DB fetch + hydration | App cache 24 h keyed on place_id
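
The reverse-geocoding cache key from the table can be sketched as coordinate rounding. Rounding to 4 decimal places gives cells of about 11 m in latitude, close to the ~10m figure; the `revgeo:` key prefix and the function name are assumptions for illustration.

```python
METERS_PER_DEGREE_LAT = 111_320  # approximate length of one degree of latitude

def reverse_geocode_cache_key(lat, lng, decimals=4):
    """Bucket nearby coordinates into one cache key by rounding."""
    # "+ 0.0" turns -0.0 into 0.0 so points just either side of the
    # equator or prime meridian do not split into "-0.0000" and "0.0000".
    lat_r = round(lat, decimals) + 0.0
    lng_r = round(lng, decimals) + 0.0
    return f"revgeo:{lat_r:.{decimals}f},{lng_r:.{decimals}f}"

cell_meters = METERS_PER_DEGREE_LAT * 10 ** -4
print(f"cell height is about {cell_meters:.1f} m")
print(reverse_geocode_cache_key(37.77493, -122.41942))
print(reverse_geocode_cache_key(37.77490, -122.41938))  # same key as above
```

Cells narrow in longitude toward the poles, so the ~10 m figure is an upper bound on east-west cell width, not a constant.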

The routing budget breaks down further:

Phase | Budget
----- | ------
Auth + quota check | 5 ms
Resolve origin/dest (geocode if free-text) | 30 ms
Traffic snapshot fetch | 25 ms
Route compute | 180 ms
Polyline encode + serialize | 30 ms
Margin | 30 ms
Total | 300 ms

If origin and dest are already lat/lng, the geocode step drops to zero, a meaningful 30 ms saved. A place_id likewise avoids the free-text parse, which is why SDKs encourage clients to pass place_ids when they have them.

Trade-offs and extensions#

Decision | Why | Cost if requirements change
-------- | --- | ---------------------------
Per-sub-API budget + quota | Each has a different cost profile | Shared budgets would let one sub-API cross-subsidise another, a mistake at scale
Tiles via CDN at the edge | 70% of traffic; cacheability is enormous | CDN versioning is a coordination burden on tile-data refresh
URL-versioned tile paths | Trivial cache busting | URL changes propagate slowly through embeds
GET for everything except Distance Matrix | Cacheable, idempotent, debuggable | Distance Matrix payloads can be very large
Per-key quotas, billable per request | Honest cost attribution | Heavy users need carve-outs; sales engineering territory
session_token for autocomplete | Bills one search instead of N keystrokes | Adds a client-side bookkeeping requirement
Polyline encoding (lossy) | Saves 90% of route payload size | Coordinates are rounded to 5 decimal places (about 1 m); a trade-off for any consumer doing offline routing
Geocoding returns top match by default | Faster, simpler | Ambiguous addresses need a second result_components walkthrough
25 × 25 (625-cell) limit on matrix | Bounds compute cost per request | Larger problems force client-side tiling
No long-poll / streaming | Stateless, CDN-friendly | Live traffic updates require client polling

Likely follow-up extensions and how the API absorbs them:

  • Snap-to-roads. A new endpoint POST /v1/roads/snap taking a GPS trail and returning the road-graph-projected coordinates. New sub-API; same quota / auth shape.
  • Elevation. GET /v1/elevation?path=... returning altitude samples along a path. Pure read endpoint; cacheable.
  • Time zones. GET /v1/timezone?location=lat,lng&timestamp=... — pure transform; CDN-friendly.
  • Geofencing webhooks. Subscribe to a polygon and receive a webhook when an authenticated user crosses it. New write endpoint (POST /v1/geofences) + event-channel; significant scope creep for a 45-min round.
  • Real-time location sharing. WebSocket channel for live location streams between participants. Different architecture; not a fit for the existing read-only Maps surface.

Mock interview follow-ups#

  • “Why is Distance Matrix POST when everything else is GET?” URL length. Twenty-five origins plus twenty-five destinations plus the API key can exceed the URL-length limits that some gateways and proxies impose, especially once free-text addresses are URL-encoded. A JSON body also leaves room for richer parameters later without squeezing them into the query string.
  • “How do tiles handle map-data updates?” Each tile-set release ships with a new version segment in the URL (/v1/tiles/roadmap/v_2026_w22/{z}/{x}/{y}.png). Clients fetch the active version from a tiny manifest endpoint, and from then on every tile URL names exactly one immutable tile. Old tiles age out of the CDN; no purge needed.
  • “How does autocomplete bias by user location?” — Optional location=lat,lng&radius=... parameters. The ranker weights candidates inside the radius higher; without bias, it falls back to global popularity. The biasing is a hint, not a filter — a query for “London” near San Francisco still returns London, UK.
  • “What’s session_token actually for?” — Billing. A typical “find a place” flow is autocomplete (N keystrokes) → details (1 fetch). Without a session token, that’s N + 1 billable calls. With a session token, the API charges as one session, capping autocomplete cost. The token’s lifetime is 3 minutes.
  • “How does the routing API handle a traffic-data outage?” — Falls back to static (free-flow) ETAs and sets duration_in_traffic_seconds: null. The client renders a warning. Better degraded data than no route.
  • “How would you support a 100×100 distance matrix?” — You wouldn’t from one endpoint. The client tiles into 16 sub-matrices of 25×25 and stitches client-side. The API limit exists because of the per-request compute budget; lifting it without per-cell billing would invite abuse.
  • “How do you prevent API-key leakage from a JavaScript map embed?” — Two answers: (1) HTTP-Referer restrictions baked into the key configuration; (2) per-key URL signing for sensitive endpoints. Client-side keys are inherently semi-public; we limit the blast radius.
  • “At 10x scale, what breaks first?” — The Directions service’s traffic-snapshot fetch. We’d pre-shard the traffic state by H3 hexagons and locate the route engine in the same region as the relevant hex’s owner. Tiles already scale at CDN economics. Geocoding scales by sharding the index.
  • “How do you handle places that disappear (closed restaurants)?” — Details endpoint returns business_status: CLOSED_PERMANENTLY. Autocomplete and text search down-rank but don’t suppress, since some users search by name to confirm closure. Place_id is stable across status changes.
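
The client-side tiling from the 100×100 answer can be sketched as follows. `fetch_matrix` is a hypothetical stand-in for one POST /v1/distancematrix call that returns a list of rows; error handling, retries, and parallel dispatch are left out.

```python
def tiled_matrix(origins, destinations, fetch_matrix, limit=25):
    """Assemble a full cost matrix from sub-requests of at most limit x limit cells.

    fetch_matrix(sub_origins, sub_destinations) must return one row per
    origin, each row holding one cell per destination.
    """
    result = [[None] * len(destinations) for _ in origins]
    calls = 0
    for r0 in range(0, len(origins), limit):
        for c0 in range(0, len(destinations), limit):
            rows = fetch_matrix(origins[r0:r0 + limit], destinations[c0:c0 + limit])
            calls += 1
            # Stitch the sub-matrix into its place in the full matrix.
            for i, row in enumerate(rows):
                result[r0 + i][c0:c0 + len(row)] = row
    return result, calls

def fake_fetch(sub_origins, sub_destinations):
    """Test double: each cell is just the (origin, destination) pair."""
    return [[(o, d) for d in sub_destinations] for o in sub_origins]

matrix, calls = tiled_matrix(list(range(100)), list(range(100)), fake_fetch)
print(calls)  # 16
```

Each sub-request is billed and rate-limited on its own, so a client doing this at volume still pays per cell block, which is the point of keeping the limit.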

One unified /v1/maps endpoint. Sounds clean, ages terribly. Tiles want CDN caching, geocoding wants short app-cache TTLs, routing has no cache at all. One endpoint forces one cache policy, which means one of the three is wrong.

Six sub-APIs with per-category contracts. Each has the cache, quota, latency budget, and shape that fits its workload. Developers pay attention to the one or two they use and the others stay invisible. Maps has shipped this shape for over a decade; the contract is the architecture.
