Design the Google Maps API
Tiles, geocoding, routing, places, distance matrix. The geospatial endpoints behind a billion daily queries.
Context#
Maps is the canonical “multi-tenant geospatial API” question and an Advanced-tier prompt because the surface is enormous. A candidate who tries to design every endpoint loses the round; a candidate who refuses to bound the scope from the start has the same problem.
This writeup is an API-design round, not an HLD round. That means:
- The actual map data pipeline — ingestion of satellite imagery, OSM-style edits, Street View, road graph extraction — is a black box. We design endpoints over the rendered result.
- Traffic ingestion (probes from Android, Waze, etc.) is a black box. The routing endpoint queries a real-time-traffic service that we treat as a dependency.
- Street View, Indoor Maps, AR navigation — out of scope for one round. They are sibling APIs that share infrastructure but not contract.
- The actual routing algorithm (contraction hierarchies, A* variants, hub labelling) is a black box. The API contract is `origin + destination -> route`; the engine is a tuned C++ service.
What remains is still a six-sub-API design:
- Tiles — raster and vector, by `z/x/y`. Heavily cached at the CDN; the API server barely sees a tile request that isn’t a cache miss.
- Geocoding — address string in, lat/lng + structured address out.
- Reverse geocoding — lat/lng in, address out.
- Routing (Directions) — origin + destination + travel mode + waypoints, returns the path + ETA.
- Distance Matrix — N origins × M destinations, returns the cost matrix.
- Places — text search, autocomplete, photos, details.
Each sub-API has its own latency profile, its own cache strategy, and its own quota. The art of the round is laying this out clearly without descending into the algorithm for any of them.
The interviewer’s hidden objectives, roughly in order:
- Can you enumerate the sub-APIs and not panic-merge them?
- Can you set per-category latency budgets with the right numbers? Tiles are CDN-fast (sub-50 ms); routing is computation-heavy (sub-300 ms).
- Can you treat quota and billing as a first-class API concern, not an afterthought?
- Can you handle the autocomplete-vs-search distinction the way you’d handle suggest-vs-query in a search API?
- Can you decide what is immutable and CDN-cacheable (tiles, place details) vs stale-tolerant (route ETAs) vs per-request (distance matrix)?
Requirements (functional and non-functional)#
Functional — in scope:
- Tiles: serve raster (PNG / JPEG) and vector (Protobuf MVT) tiles by `z/x/y`, with style variants (roadmap, satellite, hybrid, terrain).
- Geocoding: free-text address → latitude/longitude + structured address components.
- Reverse geocoding: latitude/longitude → human-readable address.
- Directions: route between an origin, destination, and up to 25 waypoints. Travel modes: driving, walking, bicycling, transit.
- Distance Matrix: pairwise costs (distance and ETA) between up to 25 origins and 25 destinations (625 cells).
- Places — text search: free-text “coffee near me” with optional bias.
- Places — autocomplete: typeahead for partial place strings.
- Places — details: fetch full place record by `place_id` (hours, photos, phone, etc.).
Functional — out of scope:
- Street View imagery API (separate surface).
- Indoor Maps and Air Quality data.
- Roads API (snap-to-road) — a thin variant of routing, out for this round.
- Static Maps and Maps Embed (the iframe). These are convenience surfaces over the Tiles API.
- The map data pipeline itself.
- Traffic-data ingestion sources.
Non-functional:
- Tile latency: `<= 50 ms p95` from the edge (CDN). Server fallback `<= 200 ms p95` on cache miss.
- Geocoding latency: `<= 150 ms p95`.
- Routing latency: `<= 300 ms p95` for a 25-waypoint route; `<= 100 ms p95` for a simple A → B.
- Distance Matrix latency: `<= 500 ms p95` for a 25 × 25 matrix.
- Places autocomplete: `<= 100 ms p95` (typeahead bound).
- Throughput: 1B requests / day globally → ~12k QPS sustained, 150k QPS peak. Tile traffic dominates; 70% of all requests.
- Availability: 99.95% per category; tile failures can degrade to “blue water” placeholder tiles client-side.
- Quota: per-API-key, per-API-category, with monthly billing aggregation.
- Freshness: tiles refresh on a weekly cadence; routing ETAs update every 5 minutes from the traffic backend; place details every 24 hours.
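The throughput figures above can be checked with a few lines of arithmetic. This is a minimal sketch: the constant and function names are illustrative, and the 150k peak is taken from the requirement as stated rather than derived.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def sustained_qps(requests_per_day: int) -> float:
    """Average request rate if traffic were spread evenly over the day."""
    return requests_per_day / SECONDS_PER_DAY


total_qps = sustained_qps(1_000_000_000)   # 1B requests / day
tile_qps = total_qps * 0.70                # tiles are 70% of all requests
peak_factor = 150_000 / total_qps          # stated peak vs. sustained average

print(round(total_qps), round(tile_qps), round(peak_factor, 1))
# prints: 11574 8102 13.0
```

The "~12k QPS" in the requirement is this 11,574 rounded up, and the stated peak is roughly 13 times the daily average, which is the number to keep in mind when sizing anything that is not behind the CDN.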
Use case diagram#
                       ┌─────────────────────┐
                       │ Developer (API key) │
                       └──────────┬──────────┘
                                  │
          ┌──────────┬────────────┼──────────┬──────────────┐
          ▼          ▼            ▼          ▼              ▼
    [tile fetch] [geocode]     [route]   [distance   [places search]
                                          matrix]    [places auto-c]
          │          │            │          │              │
          └──────────┴────────────┼──────────┴──────────────┘
                                  │
                                  ▼
                       ┌─────────────────────┐
                       │    Maps Platform    │
                       └──────────┬──────────┘
                                  │
                                  ▼
                       ┌─────────────────────┐
                       │   Quota / Billing   │ ── meter every request
                       └─────────────────────┘

One actor (a developer or end-user proxy holding an API key). Six sub-APIs. The quota/billing seam is non-optional — every API has to meter, every API has to attribute back to a key.
Class diagram#
    ┌──────────────────────────┐
    │ TileService              │
    ├──────────────────────────┤
    │ getTile(z,x,y,style,fmt) │  immutable per (z,x,y,style,version)
    └──────────────────────────┘

    ┌──────────────────────────┐         ┌────────────────────┐
    │ GeocodeService           │         │ GeocodeResult      │
    ├──────────────────────────┤ returns ├────────────────────┤
    │ geocode(address, bias?)  │────────►│ lat, lng           │
    │ reverseGeocode(lat,lng)  │────────►│ formatted_address  │
    └──────────────────────────┘         │ components[]       │
                                         │ place_id           │
                                         └────────────────────┘

    ┌──────────────────────────┐         ┌────────────────────┐
    │ DirectionsService        │         │ Route              │
    ├──────────────────────────┤ returns ├────────────────────┤
    │ route(req)               │────────►│ legs[]             │
    └──────────────────────────┘         │ polyline           │
                                         │ duration_sec       │
                                         │ duration_traffic   │
                                         │ distance_meters    │
                                         │ steps[]            │
                                         └────────────────────┘

    ┌──────────────────────────┐         ┌────────────────────┐
    │ DistanceMatrixService    │         │ MatrixResult       │
    ├──────────────────────────┤ returns ├────────────────────┤
    │ matrix(origins,          │────────►│ rows[]             │
    │   destinations, mode)    │         │ elements[]         │
    └──────────────────────────┘         │ duration, dist     │
                                         └────────────────────┘

    ┌──────────────────────────┐         ┌────────────────────┐
    │ PlacesService            │         │ Place              │
    ├──────────────────────────┤ returns ├────────────────────┤
    │ textSearch(q, bias?)     │────────►│ place_id           │
    │ autocomplete(q, bias?)   │────────►│ name, address      │
    │ details(place_id)        │────────►│ rating, opening    │
    │ photo(photo_ref)         │         │ photos[], etc.     │
    └──────────────────────────┘         └────────────────────┘

Five services cover the six sub-APIs; GeocodeService owns both forward and reverse geocoding. Notice no service stores state: every endpoint is read-only or read-mostly. The `place_id` is the cross-API stable id that links geocoding, places, and directions.
Sequence diagram (key flows)#
Flow 1: tile fetch.
    Browser          CDN          TileOrigin        TileStore
       │ GET /v1/tiles/roadmap/{version}/{z}/{x}/{y}.png
       │─────────────►│              │                 │
       │              │ cache hit?   │                 │
       │              │ yes ───► return cached PNG     │
       │  PNG bytes   │              │                 │
       │◄─────────────│              │                 │
       │              │ no, fetch from origin          │
       │              │─────────────►│                 │
       │              │              │ key = z/x/y/v   │
       │              │              │────────────────►│
       │              │              │   PNG bytes     │
       │              │              │◄────────────────│
       │              │ PNG + Cache-Control: 7d        │
       │              │◄─────────────│                 │
       │ PNG bytes (now cached)      │                 │
       │◄─────────────│              │                 │

The CDN does almost all the work. Tile responses carry `Cache-Control: public, max-age=604800, immutable` (604,800 seconds is 7 days), and the URL includes a version segment, so cache busts are URL changes, not invalidations.
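A minimal sketch of the two pieces that make this flow cacheable: a versioned tile URL and the long-lived cache header. The helper names are hypothetical, the host is the same placeholder the client samples use, and the bounds check assumes the common XYZ scheme in which zoom level z has 2^z tiles per axis (the writeup does not state the tiling scheme).

```python
BASE = "https://maps.example/v1"  # placeholder host, not a real service


def tile_url(style: str, version: str, z: int, x: int, y: int, fmt: str = "png") -> str:
    """Build a versioned tile URL; a new tile release changes `version`, not the cache."""
    n = 1 << z  # assumption: XYZ scheme, 2^z tiles per axis at zoom z
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError(f"tile ({x}, {y}) is outside the {n}x{n} grid at zoom {z}")
    return f"{BASE}/tiles/{style}/{version}/{z}/{x}/{y}.{fmt}"


def tile_cache_headers(max_age_days: int = 7) -> dict[str, str]:
    """Response headers for an immutable, publicly cacheable tile."""
    return {"Cache-Control": f"public, max-age={max_age_days * 86400}, immutable"}


print(tile_url("roadmap", "v_2026_w22", 12, 655, 1583))
print(tile_cache_headers())
```

Because the version is part of the path, an old and a new release of the same tile are different cache keys, so the CDN never has to be told that anything changed.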
Flow 2: routing with traffic.
    Client          DirectionsAPI      TrafficService     RouteEngine
      │ GET /v1/directions?origin=A&destination=B&mode=driving
      │──────────────────►│                  │                  │
      │                   │ traffic snapshot │                  │
      │                   │ for region(A,B)  │                  │
      │                   │─────────────────►│                  │
      │                   │ edge weights     │                  │
      │                   │◄─────────────────│                  │
      │                   │ compute route    │                  │
      │                   │────────────────────────────────────►│
      │                   │ polyline + ETA   │                  │
      │                   │◄────────────────────────────────────│
      │ 200 + route       │                  │                  │
      │◄──────────────────│                  │                  │

The traffic snapshot is cached at 5-minute granularity per regional tile; the route engine is the per-request compute. The polyline is encoded with the standard Google polyline algorithm — a textual compression that knocks a 10 KB lat/lng array down to ~1 KB.
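For readers who have not seen it, the encoded polyline format can be sketched in a few lines. This follows the publicly documented algorithm (coordinates scaled by 1e5, delta-encoded, zig-zag signed, emitted in 5-bit chunks offset by 63); the function names are illustrative and this is not presented as the service's actual encoder.

```python
def _encode_value(value: int) -> str:
    """Encode one signed integer delta as a run of printable ASCII characters."""
    # Shift left one bit; invert negatives so the sign ends up in the lowest bit.
    value = ~(value << 1) if value < 0 else (value << 1)
    chars = []
    while value >= 0x20:
        # Low 5 bits, with the continuation bit (0x20) set, offset into printable range.
        chars.append(chr((0x20 | (value & 0x1F)) + 63))
        value >>= 5
    chars.append(chr(value + 63))  # final chunk, continuation bit clear
    return "".join(chars)


def encode_polyline(points, precision: int = 5) -> str:
    """Encode (lat, lng) pairs; each point is stored as a delta from the previous one."""
    factor = 10 ** precision
    out = []
    prev_lat = prev_lng = 0
    for lat, lng in points:
        lat_i, lng_i = round(lat * factor), round(lng * factor)
        out.append(_encode_value(lat_i - prev_lat))
        out.append(_encode_value(lng_i - prev_lng))
        prev_lat, prev_lng = lat_i, lng_i
    return "".join(out)


path = [(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)]
print(encode_polyline(path))  # prints: _p~iF~ps|U_ulLnnqC_mqNvxq`@
```

The savings come from the delta step: consecutive points on a road are close together, so most deltas fit in two or three characters instead of a full decimal coordinate.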
Flow 3: places autocomplete (typeahead).
    Client            PlacesAPI        AutocompleteIndex
      │ GET /v1/places/autocomplete?q=star
      │──────────────────►│                   │
      │                   │ prefix lookup     │
      │                   │ weighted by bias  │
      │                   │──────────────────►│
      │                   │ top-5 candidates  │
      │                   │◄──────────────────│
      │ 200 + 5 results   │                   │
      │◄──────────────────│                   │
      │                   │                   │
      │ GET /v1/places/autocomplete?q=starbu
      │──────────────────►│                   │
      │   ...refine...    │                   │

Each keystroke is one request, and the API keeps no server-side session state. The client does pass a `session_token`, though, so billing can group a run of keystrokes into one search for the autocomplete-then-details pattern. (This is one of the few cases where the API exposes a billing primitive directly.)
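On the client side, the session token amounts to a little bookkeeping. The sketch below is a hypothetical helper, not part of any SDK: it assumes the details endpoint accepts the same `session_token` query parameter (the schema excerpt only shows it on autocomplete) and that the key travels as `key`, as in the client samples. It only builds URLs, so it can be run without a network.

```python
import uuid
from urllib.parse import urlencode

API = "https://maps.example/v1"  # placeholder host


class AutocompleteSession:
    """Keeps one session_token alive across keystrokes, then rotates it after details."""

    def __init__(self, key, location=None, radius_m=None):
        self.key = key
        self.location = location      # optional (lat, lng) bias
        self.radius_m = radius_m      # optional bias radius in meters
        self.token = str(uuid.uuid4())

    def autocomplete_url(self, q):
        params = {"q": q, "session_token": self.token, "key": self.key}
        if self.location is not None:
            params["location"] = f"{self.location[0]},{self.location[1]}"
        if self.radius_m is not None:
            params["radius"] = self.radius_m
        return f"{API}/places/autocomplete?{urlencode(params)}"

    def details_url(self, place_id):
        # Assumption: the details call carries the token and closes the session.
        query = urlencode({"session_token": self.token, "key": self.key})
        url = f"{API}/places/{place_id}?{query}"
        self.token = str(uuid.uuid4())  # the next search starts a fresh session
        return url


session = AutocompleteSession("AIzaSy...", location=(37.7749, -122.4194), radius_m=5000)
print(session.autocomplete_url("star"))
print(session.autocomplete_url("starbu"))
print(session.details_url("some_place_id"))
```

Rotating the token after the details fetch is the important part: reusing one token across unrelated searches would defeat the grouping the billing side relies on.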
Activity diagram (for non-trivial state)#
Most endpoints are stateless. The one piece of structured logic is the quota / billing state machine every request flows through:
    [request arrives]
             │
             ▼
    ┌─────────────────┐
    │ resolve API key │── missing / invalid ─► 401
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ key enabled?    │── disabled ──► 403
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ per-API quota   │── exceeded ──► 429 + Retry-After
    │ bucket OK?      │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ per-min rate    │── exceeded ──► 429
    │ limit OK?       │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ serve request   │
    └────────┬────────┘
             │
             ▼
    ┌─────────────────┐
    │ meter usage     │── async to billing pipeline
    │ (1 unit per req)│
    └────────┬────────┘
             │
             ▼
          respond

Invariants:
- Quota and rate limits are per-API category, not global. A maxed-out Routing quota does not block Tile requests on the same key.
- Metering is synchronous in the response (the `X-Quota-Used` header reports counts post-increment) but asynchronous to billing (the billing pipeline aggregates over a 24-hour window and reconciles).
- A request that returns `200` but is partially served (e.g. routing returned only 3 of 5 requested alternative routes) is still billed as one unit. Partial success is not partial billing.
- Tile requests served entirely from the CDN edge don’t hit the origin and therefore aren’t billed against the API key until the SDK reports edge-hit telemetry (best-effort, sampling-based). The contract: tiles are billed by approximate usage, not exact.
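The check order in the diagram, and the per-category isolation in the first invariant, can be sketched as a single admission function. Everything here is illustrative: the `ApiKey` shape, the in-memory counters, and the `Retry-After` value are assumptions, and the sketch decrements counters at admission for brevity, whereas the diagram meters after the request is served.

```python
from dataclasses import dataclass, field


@dataclass
class ApiKey:
    enabled: bool = True
    quota_remaining: dict = field(default_factory=dict)    # category -> units left this period
    minute_remaining: dict = field(default_factory=dict)   # category -> requests left this minute


def admit(keys: dict, key_id: str, category: str):
    """Return (status, headers), checking in the same order as the activity diagram."""
    key = keys.get(key_id)
    if key is None:
        return 401, {}                                  # missing / invalid key
    if not key.enabled:
        return 403, {}                                  # key disabled
    if key.quota_remaining.get(category, 0) <= 0:
        return 429, {"Retry-After": "3600"}             # illustrative retry hint
    if key.minute_remaining.get(category, 0) <= 0:
        return 429, {}                                  # per-minute rate limit
    key.quota_remaining[category] -= 1
    key.minute_remaining[category] -= 1
    return 200, {"X-Quota-Remaining": str(key.quota_remaining[category])}


keys = {"k1": ApiKey(quota_remaining={"directions": 1, "tiles": 5},
                     minute_remaining={"directions": 10, "tiles": 10})}
print(admit(keys, "k1", "directions"))  # admitted, quota now exhausted
print(admit(keys, "k1", "directions"))  # 429 for this category only
print(admit(keys, "k1", "tiles"))       # tiles on the same key are unaffected
```

Keeping the counters keyed by category is what makes the isolation invariant hold; a single global counter per key would let one noisy sub-API starve the others.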
API implementation#
Endpoint catalogue#
| Method | Path | Sub-API |
|---|---|---|
| GET | `/v1/tiles/{style}/{version}/{z}/{x}/{y}.{fmt}` | Tiles |
| GET | `/v1/geocode?address=...` | Geocoding |
| GET | `/v1/geocode/reverse?latlng=...` | Reverse geocoding |
| GET | `/v1/directions` | Directions |
| POST | `/v1/distancematrix` | Distance Matrix (POST since payloads can be large) |
| GET | `/v1/places/textsearch` | Places — text search |
| GET | `/v1/places/autocomplete` | Places — autocomplete |
| GET | `/v1/places/{place_id}` | Places — details |
| GET | `/v1/places/photo/{photo_ref}` | Places — photo bytes |
Nine endpoints across six sub-APIs. Note that Tiles, Geocode, Directions, Places-text, Places-auto, Place-details, and Places-photo are all GET to maximise CDN cacheability. Distance Matrix is POST because a 25×25 origins/destinations payload can run past URL-length limits.
OpenAPI schema (excerpt)#
    paths:
      /v1/directions:
        get:
          operationId: directions
          security: [{ apiKeyAuth: [] }]
          parameters:
            - name: origin
              in: query
              required: true
              schema:
                type: string
                description: 'lat,lng or place_id:... or free-text address'
            - name: destination
              in: query
              required: true
              schema: { type: string }
            - name: mode
              in: query
              schema:
                type: string
                enum: [driving, walking, bicycling, transit]
                default: driving
            - name: waypoints
              in: query
              schema:
                type: string
                description: pipe-separated, up to 25
            - name: departure_time
              in: query
              schema: { type: string, description: 'unix-seconds or "now"' }
            - name: alternatives
              in: query
              schema: { type: boolean, default: false }
          responses:
            '200':
              description: One or more route alternatives
              headers:
                X-Quota-Used: { schema: { type: integer } }
                X-Quota-Remaining: { schema: { type: integer } }
              content:
                application/json:
                  schema: { $ref: '#/components/schemas/DirectionsResponse' }
            '400': { description: Malformed origin or destination }
            '429': { description: Quota or rate limit exceeded }

      /v1/distancematrix:
        post:
          operationId: distanceMatrix
          security: [{ apiKeyAuth: [] }]
          requestBody:
            required: true
            content:
              application/json:
                schema:
                  type: object
                  required: [origins, destinations]
                  properties:
                    origins:
                      type: array
                      items: { type: string }
                      maxItems: 25
                    destinations:
                      type: array
                      items: { type: string }
                      maxItems: 25
                    mode:
                      type: string
                      enum: [driving, walking, bicycling, transit]
                      default: driving
                    departure_time: { type: string, nullable: true }
          responses:
            '200':
              description: Matrix of pairwise costs
              content:
                application/json:
                  schema: { $ref: '#/components/schemas/MatrixResponse' }

      /v1/places/autocomplete:
        get:
          operationId: placesAutocomplete
          security: [{ apiKeyAuth: [] }]
          parameters:
            - { name: q, in: query, required: true, schema: { type: string, minLength: 1, maxLength: 100 } }
            - name: session_token
              in: query
              schema: { type: string, description: 'UUID; groups autocomplete+details billing' }
            - name: location
              in: query
              schema: { type: string, description: 'lat,lng for bias' }
            - name: radius
              in: query
              schema: { type: integer, description: meters }
          responses:
            '200':
              description: Up to 5 predictions
              content:
                application/json:
                  schema:
                    type: object
                    properties:
                      predictions:
                        type: array
                        items: { $ref: '#/components/schemas/Prediction' }

    components:
      schemas:
        DirectionsResponse:
          type: object
          required: [routes]
          properties:
            routes:
              type: array
              items: { $ref: '#/components/schemas/Route' }
        Route:
          type: object
          required: [legs, overview_polyline, duration_seconds, distance_meters]
          properties:
            legs:
              type: array
              items: { $ref: '#/components/schemas/Leg' }
            overview_polyline: { type: string, description: encoded polyline }
            duration_seconds: { type: integer }
            duration_in_traffic_seconds: { type: integer, nullable: true }
            distance_meters: { type: integer }
            warnings: { type: array, items: { type: string } }
            fare:
              type: object
              nullable: true
              properties:
                currency: { type: string }
                amount: { type: number }
        Leg:
          type: object
          properties:
            start_address: { type: string }
            end_address: { type: string }
            duration_seconds: { type: integer }
            distance_meters: { type: integer }
            steps:
              type: array
              items: { $ref: '#/components/schemas/Step' }
        Step:
          type: object
          properties:
            html_instructions: { type: string }
            distance_meters: { type: integer }
            duration_seconds: { type: integer }
            polyline: { type: string }
            travel_mode: { type: string }
        Prediction:
          type: object
          properties:
            place_id: { type: string }
            description: { type: string }
            matched_substrings:
              type: array
              items:
                type: object
                properties:
                  offset: { type: integer }
                  length: { type: integer }

Client samples — three languages#
A directions request in Python, Go, and Node.
Python:

    import requests

    API = "https://maps.example/v1"
    KEY = "AIzaSy..."

    def directions(origin, destination, mode="driving", waypoints=None, alternatives=False):
        """Call the directions endpoint and return the parsed JSON body."""
        params = {
            "origin": origin,
            "destination": destination,
            "mode": mode,
            "alternatives": "true" if alternatives else "false",
            "key": KEY,
        }
        if waypoints:
            params["waypoints"] = "|".join(waypoints)  # pipe-separated, per the schema
        resp = requests.get(f"{API}/directions", params=params, timeout=2)
        resp.raise_for_status()  # surface 4xx/5xx (including 429) as exceptions
        return resp.json()

    route = directions("37.7749,-122.4194", "37.3382,-121.8863", waypoints=["Palo Alto, CA"])
    for r in route["routes"]:
        print(r["duration_seconds"], "s ;", r["distance_meters"], "m")

Go:

    package main

    import (
        "encoding/json"
        "fmt"
        "log"
        "net/http"
        "net/url"
        "strings"
    )

    const API = "https://maps.example/v1"
    const Key = "AIzaSy..."

    type Route struct {
        DurationSeconds  int    `json:"duration_seconds"`
        DistanceMeters   int    `json:"distance_meters"`
        OverviewPolyline string `json:"overview_polyline"`
    }

    type DirectionsResponse struct {
        Routes []Route `json:"routes"`
    }

    func directions(origin, dest, mode string, waypoints []string) (*DirectionsResponse, error) {
        u, err := url.Parse(API + "/directions")
        if err != nil {
            return nil, err
        }
        qs := u.Query()
        qs.Set("origin", origin)
        qs.Set("destination", dest)
        qs.Set("mode", mode)
        qs.Set("key", Key)
        if len(waypoints) > 0 {
            qs.Set("waypoints", strings.Join(waypoints, "|"))
        }
        u.RawQuery = qs.Encode()

        resp, err := http.Get(u.String())
        if err != nil {
            return nil, err
        }
        defer resp.Body.Close()
        // Treat anything other than 200 as an error instead of decoding an error body.
        if resp.StatusCode != http.StatusOK {
            return nil, fmt.Errorf("directions: HTTP %d", resp.StatusCode)
        }
        var out DirectionsResponse
        if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
            return nil, err
        }
        return &out, nil
    }

    func main() {
        r, err := directions("37.7749,-122.4194", "37.3382,-121.8863", "driving", nil)
        if err != nil {
            log.Fatal(err)
        }
        for _, x := range r.Routes {
            fmt.Printf("%ds, %dm\n", x.DurationSeconds, x.DistanceMeters)
        }
    }

Node:

    const API = "https://maps.example/v1";
    const KEY = "AIzaSy...";

    export async function directions(origin, destination, opts = {}) {
      const { mode = "driving", waypoints, alternatives = false } = opts;
      const params = new URLSearchParams({
        origin,
        destination,
        mode,
        alternatives: String(alternatives),
        key: KEY,
      });
      if (waypoints) params.set("waypoints", waypoints.join("|"));
      const resp = await fetch(`${API}/directions?${params}`);
      if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
      return resp.json();
    }

    // Top-level await: run this file as an ES module.
    const route = await directions("37.7749,-122.4194", "37.3382,-121.8863", {
      waypoints: ["Palo Alto, CA"],
    });
    for (const r of route.routes) {
      console.log(`${r.duration_seconds}s, ${r.distance_meters}m`);
    }

Latency budget — per sub-API#
Each sub-API has its own budget. The discriminating insight is that they do not share a budget; you cannot trade routing latency for cheaper tiles.
| Sub-API | p95 target | Dominant cost | Cache strategy |
|---|---|---|---|
| Tiles | 50 ms (edge) | CDN egress | CDN, 7-day TTL, URL versioned |
| Geocoding | 150 ms | Parsing + index lookup | App-level cache, 1-hour TTL keyed on canonical form |
| Reverse geocoding | 150 ms | Spatial index lookup | App-level, 1-hour TTL keyed on (lat, lng) rounded to ~10m |
| Directions | 300 ms | Route computation + traffic fetch | None for routes (traffic moves); 5-min traffic snapshot |
| Distance Matrix | 500 ms | N×M route computations | None (combinatorial; never repeat exactly) |
| Places — text search | 250 ms | Ranking pass | App cache 5 min on (query, bias) |
| Places — autocomplete | 100 ms | Prefix index | App cache, very short TTL (typeahead) |
| Places — details | 200 ms | DB fetch + hydration | App cache 24 h keyed on place_id |
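The two geocoding rows depend on how the cache key is formed. Here is a minimal sketch under stated assumptions: the key prefixes and function names are invented for illustration, the canonical form is deliberately naive (real address canonicalisation is far more involved), and "rounded to ~10m" is approximated by four decimal places, since 0.0001 degrees of latitude is about 11 m.

```python
import re


def geocode_cache_key(address: str) -> str:
    """Naive canonical form: trim, lowercase, collapse whitespace and commas."""
    canon = re.sub(r"[\s,]+", " ", address.strip().lower()).strip()
    return f"geocode:{canon}"


def reverse_geocode_cache_key(lat: float, lng: float, decimals: int = 4) -> str:
    """Round coordinates so nearby lookups share one cache entry."""
    return f"revgeo:{lat:.{decimals}f},{lng:.{decimals}f}"


print(geocode_cache_key("  1600 Amphitheatre Pkwy,  Mountain View "))
print(reverse_geocode_cache_key(37.77493, -122.41942))
print(reverse_geocode_cache_key(37.77491, -122.41938))  # same key as the line above
```

Coarser rounding raises the hit rate but risks returning the address of the building next door, which is why the rounding radius belongs in the contract rather than being an implementation detail.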
The routing budget breaks down further:
| Phase | Budget |
|---|---|
| Auth + quota check | 5 ms |
| Resolve origin/dest (geocode if free-text) | 30 ms |
| Traffic snapshot fetch | 25 ms |
| Route compute | 180 ms |
| Polyline encode + serialize | 30 ms |
| Margin | 30 ms |
| Total | 300 ms |
If origin and dest are already lat/lng, the geocode step drops to zero, a meaningful 30 ms saved. That is why SDKs encourage clients to pass coordinates or place_ids rather than free-text addresses when they have them.
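The routing budget is simple enough to keep as data and check mechanically. A small sketch, with phase names invented for this example and the numbers copied from the table above:

```python
ROUTING_BUDGET_MS = {
    "auth_quota": 5,
    "resolve_endpoints": 30,   # geocode if origin/destination are free-text
    "traffic_snapshot": 25,
    "route_compute": 180,
    "encode_serialize": 30,
    "margin": 30,
}


def total_ms(budget: dict, skip=()) -> int:
    """Sum the budget, optionally skipping phases that a request does not need."""
    return sum(ms for phase, ms in budget.items() if phase not in skip)


print(total_ms(ROUTING_BUDGET_MS))                              # prints: 300
print(total_ms(ROUTING_BUDGET_MS, skip={"resolve_endpoints"}))  # prints: 270
print(ROUTING_BUDGET_MS["route_compute"] / total_ms(ROUTING_BUDGET_MS))  # prints: 0.6
```

Route computation takes 60% of the budget, so it is the phase where any regression shows up first in the p95.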
Trade-offs and extensions#
| Decision | Why | Cost if requirements change |
|---|---|---|
| Per-sub-API budget + quota | Each has different cost profile | Shared budgets can cross-subsidise; mistake at scale |
| Tiles via CDN at the edge | 70% of traffic; cacheability is enormous | CDN versioning is a coordination burden on tile-data refresh |
| URL-versioned tile paths | Trivial cache busting | URL changes propagate slowly through embeds |
| GET for everything except Distance Matrix | Cacheable, idempotent, debuggable | Distance Matrix payloads can be very large |
| Per-key quotas, billable per request | Honest cost attribution | Heavy users need carve-outs; sales engineering territory |
| `session_token` for autocomplete | Bills one search instead of N keystrokes | Adds a client-side bookkeeping requirement |
| Polyline encoding (lossy) | Saves 90% of route payload size | Coordinates are rounded to 5 decimal places (about 1 m); a trade-off for any consumer doing offline routing |
| Geocoding returns top match by default | Faster, simpler | Ambiguous addresses need a second result_components walkthrough |
| 25×25 (625-cell) limit on matrix | Bounds compute cost per request | Larger problems force client-side tiling |
| No long-poll / streaming | Stateless, CDN-friendly | Live traffic updates require client polling |
Likely follow-up extensions and how the API absorbs them:
- Snap-to-roads. A new endpoint `POST /v1/roads/snap` taking a GPS trail and returning the road-graph-projected coordinates. New sub-API; same quota / auth shape.
- Elevation. `GET /v1/elevation?path=...` returning altitude samples along a path. Pure read endpoint; cacheable.
- Time zones. `GET /v1/timezone?location=lat,lng&timestamp=...` — pure transform; CDN-friendly.
- Geofencing webhooks. Subscribe to a polygon and receive a webhook when an authenticated user crosses it. New write endpoint (`POST /v1/geofences`) + event-channel; significant scope creep for a 45-min round.
- Real-time location sharing. WebSocket channel for live location streams between participants. Different architecture; not a fit for the existing read-only Maps surface.
Mock interview follow-ups#
- “Why is Distance Matrix `POST` when everything else is `GET`?” — URL length. Twenty-five origins plus twenty-five destinations, especially as free-text addresses, plus the API key can exceed the roughly 2 KB URL limit that some gateways and older clients enforce. POST also lets us pass complex departure-time arrays without serialising them into the query.
- “How do tiles handle map-data updates?” — Each tile-set release ships with a new version segment in the URL (`/v1/tiles/roadmap/v_2026_w22/{z}/{x}/{y}.png`). Clients fetch the active version from a tiny manifest endpoint, and from then on every tile URL is effectively immutable for that release. Old tiles age out of the CDN; no purge needed.
- “How does autocomplete bias by user location?” — Optional `location=lat,lng&radius=...` parameters. The ranker weights candidates inside the radius higher; without bias, it falls back to global popularity. The biasing is a hint, not a filter — a query for “London” near San Francisco still returns London, UK.
- “What’s `session_token` actually for?” — Billing. A typical “find a place” flow is autocomplete (N keystrokes) → details (1 fetch). Without a session token, that’s N + 1 billable calls. With a session token, the API charges as one session, capping autocomplete cost. The token’s lifetime is 3 minutes.
- “How does the routing API handle a traffic-data outage?” — Falls back to static (free-flow) ETAs and sets `duration_in_traffic_seconds: null`. The client renders a warning. Better degraded data than no route.
- “How would you support a 100×100 distance matrix?” — You wouldn’t from one endpoint. The client tiles into 16 sub-matrices of 25×25 and stitches client-side. The API limit exists because of the per-request compute budget; lifting it without per-cell billing would invite abuse.
- “How do you prevent API-key leakage from a JavaScript map embed?” — Two answers: (1) HTTP-Referer restrictions baked into the key configuration; (2) per-key URL signing for sensitive endpoints. Client-side keys are inherently semi-public; we limit the blast radius.
- “At 10x scale, what breaks first?” — The Directions service’s traffic-snapshot fetch. We’d pre-shard the traffic state by H3 hexagons and locate the route engine in the same region as the relevant hex’s owner. Tiles already scale at CDN economics. Geocoding scales by sharding the index.
- “How do you handle places that disappear (closed restaurants)?” — Details endpoint returns `business_status: CLOSED_PERMANENTLY`. Autocomplete and text search down-rank but don’t suppress, since some users search by name to confirm closure. The `place_id` is stable across status changes.
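The 100×100 answer above (tile into 25×25 blocks and stitch client-side) can be sketched as follows. The `fetch` callable is a stand-in for one `POST /v1/distancematrix` call and is an assumption of this example, as is the shape of the cells it returns; the stitching logic is the point.

```python
MAX_SIDE = 25  # per-request limit on origins and on destinations


def chunks(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def tiled_matrix(origins, destinations, fetch):
    """Fill an N x M matrix by calling fetch(origin_chunk, destination_chunk) per block.

    `fetch` must return one row per origin in the chunk, one cell per destination.
    Returns the stitched matrix and the number of API calls made.
    """
    result = [[None] * len(destinations) for _ in origins]
    calls = 0
    for oi, o_chunk in enumerate(chunks(origins, MAX_SIDE)):
        for di, d_chunk in enumerate(chunks(destinations, MAX_SIDE)):
            block = fetch(o_chunk, d_chunk)
            calls += 1
            for r, row in enumerate(block):
                for c, cell in enumerate(row):
                    result[oi * MAX_SIDE + r][di * MAX_SIDE + c] = cell
    return result, calls


# Fake backend for illustration: each cell just records which pair it represents.
def fake_fetch(origin_chunk, destination_chunk):
    return [[(o, d) for d in destination_chunk] for o in origin_chunk]


matrix, api_calls = tiled_matrix(list(range(100)), list(range(100)), fake_fetch)
print(api_calls)        # prints: 16
print(matrix[30][77])   # prints: (30, 77)
```

Each of the 16 calls is metered on its own, so the client pays for the full 10,000 cells; the tiling bounds per-request compute without changing the total bill.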
One unified /v1/maps endpoint. Sounds clean, ages terribly. Tiles want CDN caching, geocoding wants short app-cache TTLs, routing has no cache at all. One endpoint forces one cache policy, which means one of the three is wrong.
Six sub-APIs with per-category contracts. Each has the cache, quota, latency budget, and shape that fits its workload. Developers pay attention to the one or two they use and the others stay invisible. Maps has shipped this shape for over a decade; the contract is the architecture.
Related#
- Caching at Different Layers — the CDN-edge-app-cache stack the Tiles sub-API depends on entirely.
- Rate Limiting — the per-key, per-API quota mechanism.
- Estimating API Latency — Back-of-Envelope — the back-of-envelope numbers that justify the per-sub-API budgets.
- The API-Design Walk-through — the seven-step recipe this writeup followed.
- REST — The Architectural Style — the architectural style behind the endpoint shape.