Design the YouTube Streaming API
Upload pipeline, transcode, ABR manifests, CDN, recommendation. The biggest video service in the world, from the API's side.
Context#
YouTube is the canonical large-scale video question and an Advanced-tier prompt for a reason: most candidates conflate the HLD round with the API round and try to design every box on the page. The trick is to acknowledge the overlap up front and cut hard.
This writeup is an API-design round, not an HLD round. That means:
- The transcode pipeline is a black box behind an async webhook. We don’t design the encoder fleet.
- The CDN is a black box that returns a hostname; we don’t design eviction, peering, or fill paths.
- The recommendation engine is a black box that returns a list of video IDs; we don’t design candidate generation, ranking, or the embedding store.
- Monetisation, Content-ID copyright matching, comment moderation — all out of scope. The interviewer will respect a clear bound far more than a vague attempt to cover everything.
What remains is still a rich API surface:
- A three-sub-API design — Upload, Playback, Recommendation — each with its own latency profile and contract.
- A resumable multipart upload protocol that has to survive the user closing the laptop with 4 GB left to send.
- A playback API that hands back signed URLs to ABR manifests (HLS / DASH) without leaking the underlying CDN topology.
- A transcode-completion webhook that the upload pipeline posts as a separate event, decoupling the synchronous upload from the async work.
- A per-region CDN selection the API has to make at request time using the client’s IP.
The interviewer’s hidden objectives, roughly in order:
- Can you decompose a giant product into three clean sub-APIs without panic-merging them?
- Can you produce a resumable-upload protocol that survives partial failures?
- Can you separate the synchronous upload contract from the async transcode contract with a clear webhook seam?
- Can you give a sensible playback latency budget (sub-2-second start-to-first-frame)?
- Can you talk about CDN selection as an API concern without designing the CDN itself?
Requirements (functional and non-functional)#
Functional — in scope:
- Initiate, perform, and complete a resumable upload of a video file up to 256 GB.
- Trigger an async transcode on upload completion; deliver result via webhook to the uploader’s server (or app push to the mobile client).
- Return a playback session for an authenticated viewer: ABR manifest URL, DRM token if required, geo-routed CDN host.
- Return a per-channel recommendation list keyed on viewer + channel context.
- Expose video metadata read endpoints (title, description, duration, available qualities, captions).
- Range-request friendly download is explicitly not in scope for end users — the only legitimate read path is the ABR playback manifest.
Functional — out of scope:
- Monetisation, ads-decisioning, partner programs.
- Content-ID copyright matching (a separate background pipeline; not API-exposed).
- Comment moderation API (separate workbook: see Comment Service API).
- Live streaming. VOD only in this round.
- The transcode pipeline’s internals (encoder fleet, codec selection, the ABR ladder construction).
- The recommendation model itself; we expose a thin endpoint over a black-box ranker.
Non-functional:
- Upload: support files up to 256 GB; resumable across network drops; chunks up to 64 MB.
- Transcode latency: not user-facing; SLA is “transcode completes within 4x video duration p95”. Webhook fires inside 5 s of completion.
- Playback start: time-to-first-frame <= 2 s p95 from the moment the manifest URL is requested.
- Manifest latency: manifest fetch <= 100 ms p95 (it’s a small text playlist for HLS or XML document for DASH, fetched once per session).
- Recommendation latency: <= 200 ms p95 (it’s a hot read at app open).
- Throughput: 500k playback session opens per second globally; 50k uploads per second.
- Availability: 99.99% on playback; 99.9% on upload (uploads are resumable so a brief outage is recoverable).
- Durability: 11-nines on the original video bytes; transcode outputs are regeneratable.
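The upload limits above bound how many parts one session can ever hold. A quick arithmetic sketch, assuming binary units (1 GB = 2^30 bytes, which matches the 274877906944-byte maximum in the schema excerpt); `part_count` is a hypothetical helper name, not part of the API:

```python
PART_SIZE_BYTES = 64 * 2**20   # 64 MB part size from the requirements
MAX_FILE_BYTES = 256 * 2**30   # 256 GB upload ceiling


def part_count(file_size_bytes: int, part_size_bytes: int = PART_SIZE_BYTES) -> int:
    """Number of parts needed; the last part may be shorter than the rest."""
    if file_size_bytes < 1:
        raise ValueError("file_size_bytes must be positive")
    # Integer ceiling division avoids float rounding on very large sizes.
    return (file_size_bytes + part_size_bytes - 1) // part_size_bytes


print(part_count(MAX_FILE_BYTES))   # 4096
print(part_count(4 * 2**30 + 1))    # 65
```

A ceiling of 4096 parts per session keeps the parts_received list small enough to return in full from the session-state endpoint.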
Use case diagram#
                     ┌─────────────────┐
                     │  Creator (UA)   │
                     └────────┬────────┘
                              │
               ┌──────────────┼──────────────┐
               ▼              ▼              ▼
           [initiate   [upload parts]    [complete
            upload]                       upload]
               │              │              │
               └──────────────┴──────────────┘
                              │
                              ▼
                     ┌─────────────────┐
                     │   Upload API    │──── webhook ───► creator's server
                     └─────────────────┘

                     ┌─────────────────┐
                     │   Viewer (UA)   │
                     └────────┬────────┘
                              │
               ┌──────────────┼──────────────┐
               ▼              ▼              ▼
         [get playback    [get rec.     [get video
           session]         list]        metadata]
               │              │              │
               └──────────────┴──────────────┘
                              │
                              ▼
                     ┌─────────────────┐
                     │   Playback /    │
                     │  Recommend API  │
                     └─────────────────┘

Two actors. Creator’s surface is the Upload API. Viewer’s surface is Playback + Recommendation + Metadata. The webhook from Upload back to the creator’s server is the seam between the sync upload contract and the async transcode work.
Class diagram#
    ┌──────────────────────────┐
    │      UploadService       │
    ├──────────────────────────┤
    │ initiateUpload(req)      │
    │ putPart(session, idx, b) │
    │ completeUpload(session)  │
    │ cancelUpload(session)    │
    └──────────────┬───────────┘
                   │ creates
                   ▼
    ┌──────────────────────────┐
    │      UploadSession       │
    ├──────────────────────────┤
    │ id : UUID                │
    │ video_id : str           │
    │ owner_id : str           │
    │ total_bytes : int        │
    │ part_size : int (=64MB)  │
    │ parts_received : [int]   │
    │ state : enum             │
    │ expires_at : timestamp   │
    └──────────────────────────┘

    ┌──────────────────────────┐         ┌─────────────────────┐
    │     PlaybackService      │         │   PlaybackSession   │
    ├──────────────────────────┤         ├─────────────────────┤
    │ openSession(vid, viewer) │ returns │ manifest_url : str  │
    │ heartbeat(session)       │────────►│ drm_token? : str    │
    │ closeSession(session)    │         │ cdn_host : str      │
    └──────────────────────────┘         │ ttl_seconds : int   │
                                         │ session_id : UUID   │
                                         └─────────────────────┘

    ┌──────────────────────────┐         ┌─────────────────────┐
    │  RecommendationService   │         │    VideoSummary     │
    ├──────────────────────────┤         ├─────────────────────┤
    │ forChannel(ch, viewer)   │ returns │ id, title           │
    │ forHomepage(viewer)      │────────►│ duration_seconds    │
    │ relatedTo(video_id)      │         │ thumbnail_url       │
    └──────────────────────────┘         │ channel_id          │
                                         └─────────────────────┘

Three services, each owning one sub-API. The UploadSession is the only resource with non-trivial state (the others are read-mostly). PlaybackSession is short-lived (TTL on the order of an hour) and the manifest_url is signed so it can’t be re-used out of context.
Sequence diagram (key flows)#
Flow 1: resumable upload.
    Creator                    UploadAPI          BlobStore       TranscodeQueue
    │ POST /videos:initiateUpload  │                  │                  │
    │ { file_size, mime, title }   │                  │                  │
    │─────────────────────────────►│                  │                  │
    │ 201 + session, part_size     │                  │                  │
    │◄─────────────────────────────│                  │                  │
    │                              │                  │                  │
    │ PUT /uploads/{s}/parts/0     │                  │                  │
    │ (64MB chunk)                 │                  │                  │
    │─────────────────────────────►│ stage part 0     │                  │
    │                              │─────────────────►│ (no)             │
    │ 204 + etag                   │                  │                  │
    │◄─────────────────────────────│                  │                  │
    │                              │                  │                  │
    │ ... PUT part 1, 2, ..., N-1  │                  │                  │
    │                              │                  │                  │
    │ POST /uploads/{s}:complete   │                  │                  │
    │─────────────────────────────►│ finalize blob    │                  │
    │                              │─────────────────►│                  │
    │                              │ enqueue transcode                   │
    │                              │────────────────────────────────────►│
    │ 202 Accepted + video_id      │                  │                  │
    │◄─────────────────────────────│                  │                  │
    │                              │                  │                  │
    │ (some time later)            │                  │                  │
    │                              │◄─────── transcode done ─────────────│
    │ POST {creator_webhook_url}   │                  │                  │
    │ { video_id, status: ready }  │                  │                  │
    │◄─────────────────────────────│                  │                  │

Note the 202 Accepted on completion: the bytes are durable but the video is not yet playable. The webhook is the contract that says “now it is”.
Flow 2: viewer playback.
    Viewer                    PlaybackAPI                     SignedURL     CDN
    │ POST /videos/{id}/playback   │                              │          │
    │ (auth: bearer token)         │                              │          │
    │─────────────────────────────►│                              │          │
    │                              │ geo-pick CDN from client IP  │          │
    │                              │ mint signed manifest URL     │          │
    │                              │─────────────────────────────►│          │
    │                              │ signed URL                   │          │
    │                              │◄─────────────────────────────│          │
    │ 200 + manifest_url, ttl      │                              │          │
    │◄─────────────────────────────│                              │          │
    │                              │                              │          │
    │ GET {manifest_url}           │                              │          │
    │───────────────────────────────────────────────────────────────────────►│
    │ HLS / DASH manifest                                                    │
    │◄───────────────────────────────────────────────────────────────────────│
    │                              │                              │          │
    │ GET segment .ts / .m4s ...   │                              │          │
    │───────────────────────────────────────────────────────────────────────►│
    │ ABR client picks rendition per bandwidth                               │

The PlaybackAPI does the CDN selection, choosing among iad.cdn.youtube.example, fra.cdn.youtube.example, sin.cdn.youtube.example, etc., based on a GeoIP lookup of the client IP. The signed URL ties the manifest to that specific CDN host plus the session ID, so it can’t be replayed from another region.
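A minimal sketch of the CDN pick and URL signing described above, assuming an HMAC-SHA256 signature over the video ID, session ID, CDN host, and expiry. The region codes, URL layout, query parameter names, signing key, and the `pick_cdn_host`, `sign_manifest_url`, and `verify` helpers are all hypothetical; only the host names come from the text.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

# Hypothetical region-to-host table built from the example hosts in the text.
CDN_HOSTS = {
    "us": "iad.cdn.youtube.example",
    "eu": "fra.cdn.youtube.example",
    "ap": "sin.cdn.youtube.example",
}
SIGNING_KEY = b"demo-secret-not-for-production"  # assumption: shared with the CDN edge


def pick_cdn_host(region: str) -> str:
    """Map a GeoIP-derived region code to a CDN host, defaulting to 'us'."""
    return CDN_HOSTS.get(region, CDN_HOSTS["us"])


def _signature(video_id, session_id, cdn_host, expires_at) -> str:
    message = f"{video_id}|{session_id}|{cdn_host}|{expires_at}".encode()
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()


def sign_manifest_url(video_id, session_id, cdn_host, ttl_seconds=3600, now=None):
    """Return a manifest URL whose signature binds video, session, host, and expiry."""
    expires_at = int(now if now is not None else time.time()) + ttl_seconds
    sig = _signature(video_id, session_id, cdn_host, expires_at)
    query = urlencode({"session": session_id, "exp": expires_at, "sig": sig})
    return f"https://{cdn_host}/v/{video_id}/master.m3u8?{query}"


def verify(video_id, session_id, cdn_host, expires_at, sig, now=None) -> bool:
    """Edge-side check: the signature must match and the URL must not be expired."""
    expected = _signature(video_id, session_id, cdn_host, expires_at)
    current = now if now is not None else time.time()
    return hmac.compare_digest(expected, sig) and current < expires_at
```

Because the host is part of the signed message, a URL minted for the Frankfurt edge fails verification if it is replayed against the Singapore edge.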
Activity diagram (for non-trivial state)#
The UploadSession state machine is the part of this design with real lifecycle logic. Everything else is request/response.
     [initiateUpload]
             │
             ▼
    ┌────────────────┐
    │    PENDING     │── 24h idle ─► EXPIRED
    └────────┬───────┘
             │ first putPart succeeds
             ▼
    ┌────────────────┐
    │  IN_PROGRESS   │── 7d idle ─► EXPIRED
    └────────┬───────┘
             │ completeUpload
             ▼
    ┌────────────────┐
    │   FINALIZING   │ ── blob fail ─► FAILED
    └────────┬───────┘
             │ blob ok, transcode enqueued
             ▼
    ┌────────────────┐
    │  TRANSCODING   │ ── transcode fail ─► FAILED
    └────────┬───────┘
             │ transcode ok
             ▼
    ┌────────────────┐
    │     READY      │
    └────────────────┘

    [from any state, except READY]
             │ cancelUpload
             ▼
    ┌──────────────┐
    │  CANCELLED   │
    └──────────────┘

A few invariants the API enforces around these states:
- putPart is rejected with 409 Conflict unless state is PENDING or IN_PROGRESS.
- completeUpload is idempotent: calling it twice on a FINALIZING / TRANSCODING / READY session returns the same video_id (with current status). This matters because clients retry the completion call when the response gets lost.
- The transition from TRANSCODING to READY is what fires the webhook. The webhook payload includes the final video_id, the available renditions (e.g. [240p, 360p, 480p, 720p, 1080p, 1440p, 2160p]), and the manifest URL template.
- A FAILED state is non-terminal in the sense that the creator can re-trigger transcode (a different operation), but the upload itself can’t be revived.
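The first two invariants can be sketched as an in-memory model. This is an illustration under assumptions, not the service's implementation: the `Conflict` exception stands in for an HTTP 409, and rejecting completion while parts are missing is an assumed rule that the writeup does not state.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    IN_PROGRESS = "IN_PROGRESS"
    FINALIZING = "FINALIZING"
    TRANSCODING = "TRANSCODING"
    READY = "READY"
    FAILED = "FAILED"
    EXPIRED = "EXPIRED"
    CANCELLED = "CANCELLED"


class Conflict(Exception):
    """Stands in for an HTTP 409 Conflict response."""


class UploadSession:
    def __init__(self, video_id: str, total_parts: int):
        self.video_id = video_id
        self.total_parts = total_parts
        self.parts_received = set()
        self.state = State.PENDING

    def put_part(self, idx: int) -> None:
        if self.state not in (State.PENDING, State.IN_PROGRESS):
            raise Conflict(f"putPart not allowed in {self.state.value}")
        if not 0 <= idx < self.total_parts:
            raise IndexError("part index out of range")
        self.parts_received.add(idx)  # re-sending the same part is a no-op
        self.state = State.IN_PROGRESS

    def complete(self) -> dict:
        # Idempotent: a retried call after completion returns the same video_id.
        if self.state in (State.FINALIZING, State.TRANSCODING, State.READY):
            return {"video_id": self.video_id, "status": self.state.value}
        if self.state is not State.IN_PROGRESS:
            raise Conflict(f"complete not allowed in {self.state.value}")
        if len(self.parts_received) != self.total_parts:
            raise Conflict("missing parts")  # assumed rule, see lead-in
        self.state = State.FINALIZING
        return {"video_id": self.video_id, "status": self.state.value}
```

Keeping the idempotent branch first means a lost 202 response costs the client one retry and nothing else.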
API implementation#
Endpoint catalogue#
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/videos:initiateUpload | Begin a resumable upload; returns session |
| PUT | /v1/uploads/{session}/parts/{idx} | Upload one part (64 MB) |
| GET | /v1/uploads/{session} | Query session state (resume-from inspection) |
| POST | /v1/uploads/{session}:complete | Finalize; triggers async transcode |
| DELETE | /v1/uploads/{session} | Cancel an in-progress upload |
| GET | /v1/videos/{id} | Video metadata (title, durations, thumbnails) |
| POST | /v1/videos/{id}/playback | Open a playback session; returns manifest URL |
| GET | /v1/channels/{id}/recommendations | Per-channel recs for an authenticated viewer |
| GET | /v1/recommendations/home | Homepage recs for an authenticated viewer |
| GET | /v1/videos/{id}/related | Related videos for an open watch session |
Ten endpoints across three sub-APIs. The :complete and :initiateUpload paths use Google’s verb-suffix convention (the colon-verb form) — appropriate here since these aren’t pure REST operations on a videos collection; they are state transitions.
OpenAPI schema (excerpt)#
    paths:
      /v1/videos:initiateUpload:
        post:
          operationId: initiateUpload
          security: [{ bearerAuth: [upload.write] }]
          requestBody:
            required: true
            content:
              application/json:
                schema:
                  type: object
                  required: [file_size_bytes, mime_type, title]
                  properties:
                    file_size_bytes:
                      type: integer
                      minimum: 1
                      maximum: 274877906944
                    mime_type:
                      type: string
                      enum: [video/mp4, video/quicktime, video/webm, video/x-matroska]
                    title: { type: string, maxLength: 100 }
                    description: { type: string, maxLength: 5000 }
                    webhook_url: { type: string, format: uri, nullable: true }
          responses:
            '201':
              description: Upload session created
              content:
                application/json:
                  schema:
                    $ref: '#/components/schemas/UploadSession'
            '400': { description: Invalid request }
            '413': { description: File too large }

      /v1/uploads/{session}/parts/{idx}:
        put:
          operationId: uploadPart
          security: [{ bearerAuth: [upload.write] }]
          parameters:
            - { name: session, in: path, required: true, schema: { type: string, format: uuid } }
            - { name: idx, in: path, required: true, schema: { type: integer, minimum: 0 } }
            - name: Content-Length
              in: header
              required: true
              schema: { type: integer, maximum: 67108864 }
          requestBody:
            required: true
            content:
              application/octet-stream:
                schema: { type: string, format: binary }
          responses:
            '204':
              description: Part stored
              headers:
                ETag: { schema: { type: string } }
            '409': { description: Session not in PENDING or IN_PROGRESS state }
            '416': { description: Part index out of range }

      /v1/videos/{id}/playback:
        post:
          operationId: openPlayback
          security: [{ bearerAuth: [playback.read] }]
          parameters:
            - { name: id, in: path, required: true, schema: { type: string } }
          requestBody:
            content:
              application/json:
                schema:
                  type: object
                  properties:
                    preferred_format:
                      type: string
                      enum: [hls, dash, auto]
                      default: auto
                    client_capabilities:
                      type: object
                      properties:
                        max_resolution: { type: string, enum: [480p, 720p, 1080p, 1440p, 2160p] }
                        codecs: { type: array, items: { type: string } }
          responses:
            '200':
              description: Playback session opened
              content:
                application/json:
                  schema:
                    $ref: '#/components/schemas/PlaybackSession'
            '404': { description: Video not found or not ready }
            '451': { description: Unavailable in viewer's region }

    components:
      schemas:
        UploadSession:
          type: object
          required: [id, part_size_bytes, expires_at, parts_url_template]
          properties:
            id: { type: string, format: uuid }
            video_id: { type: string }
            part_size_bytes: { type: integer, default: 67108864 }
            total_parts: { type: integer }
            expires_at: { type: string, format: date-time }
            parts_url_template:
              type: string
              example: "https://api.youtube.example/v1/uploads/{session}/parts/{idx}"
            state:
              type: string
              enum: [PENDING, IN_PROGRESS, FINALIZING, TRANSCODING, READY, FAILED, EXPIRED, CANCELLED]
        PlaybackSession:
          type: object
          required: [session_id, manifest_url, ttl_seconds]
          properties:
            session_id: { type: string, format: uuid }
            manifest_url: { type: string, format: uri }
            manifest_format: { type: string, enum: [hls, dash] }
            cdn_host: { type: string }
            ttl_seconds: { type: integer, example: 3600 }
            drm_token: { type: string, nullable: true }

Webhook contract for transcode completion#
When the transcode pipeline finishes a video, the Upload API posts a small JSON document to the webhook_url the creator supplied at upload time. The shape is intentionally minimal:
    {
      "event": "video.ready",
      "video_id": "v_8h2N9c0qK",
      "session_id": "01HFY3...",
      "renditions": ["240p", "360p", "480p", "720p", "1080p"],
      "manifest_url_template": "https://api.youtube.example/v1/videos/v_8h2N9c0qK/playback",
      "ts": "2026-05-30T17:42:11Z",
      "signature": "sha256=..."
    }

The signature is HMAC-SHA256 over the body using a secret shared with the creator at app registration, the same pattern as Stripe and GitHub webhooks. Because a signature cannot cover its own value, it has to be computed with the signature field excluded (Stripe and GitHub sidestep this by sending the signature in a request header). Delivery is at-least-once with exponential backoff up to 24 hours. A video.failed event with an error code is the alternative terminal event.
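On the receiving side, the creator’s server has to check the signature and tolerate duplicate deliveries. A minimal sketch, assuming the signature covers the payload with the signature field removed and serialised as compact JSON with sorted keys. That canonical form is an assumption for illustration, since the writeup does not specify one, and the `canonical_body`, `sign`, `verify`, and `handle` helpers are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_body(payload: dict) -> bytes:
    """Serialise the payload without its signature field (assumed canonical form)."""
    unsigned = {k: v for k, v in payload.items() if k != "signature"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode()


def sign(payload: dict, secret: bytes) -> str:
    digest = hmac.new(secret, canonical_body(payload), hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify(payload: dict, secret: bytes) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    received = str(payload.get("signature", ""))
    return hmac.compare_digest(sign(payload, secret), received)


_seen = set()  # in-memory dedupe store; a real receiver would persist this


def handle(payload: dict, secret: bytes) -> str:
    """Reject bad signatures, then drop repeats of an already-processed event."""
    if not verify(payload, secret):
        return "rejected"
    key = (payload["event"], payload["video_id"])
    if key in _seen:
        return "duplicate"
    _seen.add(key)
    return "processed"
```

Deduplicating on (event, video_id) is enough here because each video reaches one terminal event; a receiver handling re-triggered transcodes would need a per-delivery identifier instead.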
Client samples — three languages#
The end-to-end “initiate, upload one part, complete” flow in Python, Go, and Node.
Python:

    import os

    import requests

    API = "https://api.youtube.example"
    TOKEN = "Bearer eyJhbGciOi..."


    def initiate(file_path, title):
        """Open an upload session sized to the file on disk."""
        size = os.path.getsize(file_path)
        resp = requests.post(
            f"{API}/v1/videos:initiateUpload",
            json={"file_size_bytes": size, "mime_type": "video/mp4", "title": title},
            headers={"Authorization": TOKEN},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()


    def upload_part(session_id, idx, chunk):
        """PUT one part; the caller checks the response status."""
        return requests.put(
            f"{API}/v1/uploads/{session_id}/parts/{idx}",
            data=chunk,
            headers={"Authorization": TOKEN, "Content-Type": "application/octet-stream"},
            timeout=120,
        )


    def complete(session_id):
        """Finalize the upload; the server answers before transcode finishes."""
        resp = requests.post(
            f"{API}/v1/uploads/{session_id}:complete",
            headers={"Authorization": TOKEN},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()


    def upload_video(file_path, title):
        session = initiate(file_path, title)
        sid = session["id"]
        part_size = session["part_size_bytes"]
        with open(file_path, "rb") as f:
            idx = 0
            while True:
                chunk = f.read(part_size)  # one part in memory at a time
                if not chunk:
                    break
                r = upload_part(sid, idx, chunk)
                r.raise_for_status()
                idx += 1
        return complete(sid)

Go:

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "io"
        "log"
        "net/http"
        "os"
    )

    const API = "https://api.youtube.example"
    const Token = "Bearer eyJhbGciOi..."

    type Session struct {
        ID            string `json:"id"`
        PartSizeBytes int    `json:"part_size_bytes"`
        TotalParts    int    `json:"total_parts"`
    }

    func initiate(size int64, title string) (*Session, error) {
        body, _ := json.Marshal(map[string]any{
            "file_size_bytes": size,
            "mime_type":       "video/mp4",
            "title":           title,
        })
        req, _ := http.NewRequest("POST", API+"/v1/videos:initiateUpload", bytes.NewReader(body))
        req.Header.Set("Authorization", Token)
        req.Header.Set("Content-Type", "application/json")
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return nil, err
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusCreated {
            return nil, fmt.Errorf("initiate: HTTP %d", resp.StatusCode)
        }
        var s Session
        if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
            return nil, err
        }
        return &s, nil
    }

    func uploadPart(sid string, idx int, chunk []byte) error {
        url := fmt.Sprintf("%s/v1/uploads/%s/parts/%d", API, sid, idx)
        req, _ := http.NewRequest("PUT", url, bytes.NewReader(chunk))
        req.Header.Set("Authorization", Token)
        req.Header.Set("Content-Type", "application/octet-stream")
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return err
        }
        resp.Body.Close()
        if resp.StatusCode != http.StatusNoContent {
            return fmt.Errorf("part %d: HTTP %d", idx, resp.StatusCode)
        }
        return nil
    }

    func complete(sid string) (map[string]any, error) {
        req, _ := http.NewRequest("POST", API+"/v1/uploads/"+sid+":complete", nil)
        req.Header.Set("Authorization", Token)
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return nil, err
        }
        defer resp.Body.Close()
        if resp.StatusCode >= 300 {
            return nil, fmt.Errorf("complete: HTTP %d", resp.StatusCode)
        }
        var out map[string]any
        if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
            return nil, err
        }
        return out, nil
    }

    func uploadVideo(path, title string) error {
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()
        info, err := f.Stat()
        if err != nil {
            return err
        }
        s, err := initiate(info.Size(), title)
        if err != nil {
            return err
        }
        if s.PartSizeBytes <= 0 {
            return fmt.Errorf("invalid part size %d", s.PartSizeBytes)
        }
        buf := make([]byte, s.PartSizeBytes)
        for i := 0; ; i++ {
            // io.ReadFull fills the buffer unless the file ends first,
            // so only the last part can be short.
            n, err := io.ReadFull(f, buf)
            if n > 0 {
                if perr := uploadPart(s.ID, i, buf[:n]); perr != nil {
                    return perr
                }
            }
            if err == io.EOF || err == io.ErrUnexpectedEOF {
                break
            }
            if err != nil {
                return err
            }
        }
        _, err = complete(s.ID)
        return err
    }

    func main() {
        if len(os.Args) != 3 {
            log.Fatalf("usage: %s <file> <title>", os.Args[0])
        }
        if err := uploadVideo(os.Args[1], os.Args[2]); err != nil {
            log.Fatal(err)
        }
    }

Node:

    import fs from "node:fs/promises";

    const API = "https://api.youtube.example";
    const TOKEN = "Bearer eyJhbGciOi...";

    async function initiate(size, title) {
      const resp = await fetch(`${API}/v1/videos:initiateUpload`, {
        method: "POST",
        headers: { Authorization: TOKEN, "Content-Type": "application/json" },
        body: JSON.stringify({ file_size_bytes: size, mime_type: "video/mp4", title }),
      });
      if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
      return resp.json();
    }

    async function uploadPart(sid, idx, chunk) {
      const resp = await fetch(`${API}/v1/uploads/${sid}/parts/${idx}`, {
        method: "PUT",
        headers: { Authorization: TOKEN, "Content-Type": "application/octet-stream" },
        body: chunk,
      });
      if (resp.status !== 204) throw new Error(`Part ${idx}: HTTP ${resp.status}`);
    }

    async function complete(sid) {
      const resp = await fetch(`${API}/v1/uploads/${sid}:complete`, {
        method: "POST",
        headers: { Authorization: TOKEN },
      });
      if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
      return resp.json();
    }

    export async function uploadVideo(path, title) {
      // Read one part at a time; a multi-GB file cannot be buffered whole.
      const file = await fs.open(path, "r");
      try {
        const { size } = await file.stat();
        const session = await initiate(size, title);
        const partSize = session.part_size_bytes;
        const buf = Buffer.alloc(partSize);
        for (let i = 0, off = 0; off < size; i++, off += partSize) {
          const { bytesRead } = await file.read(buf, 0, partSize, off);
          await uploadPart(session.id, i, buf.subarray(0, bytesRead));
        }
        return await complete(session.id);
      } finally {
        await file.close();
      }
    }

Latency budget: playback start#
Playback start (POST /v1/videos/{id}/playback through first decoded frame) breaks down as:
| Phase | Budget | Notes |
|---|---|---|
| TLS / HTTP setup | 30 ms | Warm connection from app start |
| Auth + entitlement check | 15 ms | JWT verify + per-region availability check |
| GeoIP lookup + CDN pick | 5 ms | In-process MaxMind-shaped DB |
| Sign manifest URL | 5 ms | HMAC over (video_id, viewer_id, cdn_host, ttl) |
| Manifest fetch (separate request) | 100 ms | From CDN edge, cached |
| Player decision (rendition pick) | 50 ms | Client-side |
| First segment fetch | 300 ms | 2-second segment from edge |
| Decode + paint | 50 ms | Hardware decoder warm-up |
| Total | 555 ms | Well under the 2 s p95 target |
The 2 s target has 1.4 s of margin for the long tail (cold CDN cache miss on the first segment, retry of a dropped TCP connection, DNS lookup on a fresh app install).
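The table’s total and the margin can be checked mechanically. A small sketch using the per-phase figures from the table; the dictionary keys and the grouping of phases into an API-owned subset are labels chosen for this illustration:

```python
PLAYBACK_BUDGET_MS = {
    "tls_http_setup": 30,
    "auth_entitlement": 15,
    "geoip_cdn_pick": 5,
    "sign_manifest_url": 5,
    "manifest_fetch": 100,
    "player_decision": 50,
    "first_segment_fetch": 300,
    "decode_paint": 50,
}
TARGET_P95_MS = 2000

# Phases that run inside the Playback API itself; the rest is network,
# CDN, or client time (this grouping is an assumption for illustration).
API_OWNED = ("auth_entitlement", "geoip_cdn_pick", "sign_manifest_url")

total_ms = sum(PLAYBACK_BUDGET_MS.values())
api_ms = sum(PLAYBACK_BUDGET_MS[phase] for phase in API_OWNED)
margin_ms = TARGET_P95_MS - total_ms

print(total_ms, api_ms, margin_ms)  # 555 25 1445
```

Only 25 ms of the 555 ms sits in API-side processing, so the first-segment fetch is where tail latency is most likely to consume the margin.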
Recommendation latency budget#
| Phase | Budget |
|---|---|
| Auth + cache key build | 5 ms |
| Cache lookup (per-viewer top-N) | 5 ms |
| Black-box ranker RPC | 120 ms p95 |
| Hydrate VideoSummary for top 50 | 40 ms |
| Serialize + transport | 20 ms |
| Total | 190 ms |
The ranker is the dominant cost and is out of the API’s control — the API contract is forChannel(channel_id, viewer_id) -> [video_id, score, reason_tag][]. If the model team needs more time per call, they have to push it into batch precomputation, not extend the synchronous budget.
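One way the API can hold the synchronous budget is to put a hard deadline on the ranker RPC. The sketch below is an assumption about how that could look, not the writeup’s design: falling back to a cached or precomputed list on timeout is a hypothetical policy, and `recommend`, `slow_ranker`, and `fast_ranker` are illustrative names.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_POOL = ThreadPoolExecutor(max_workers=4)


def recommend(ranker, cached_top_n, channel_id, viewer_id, budget_s=0.120):
    """Call the ranker, but never wait past the synchronous budget."""
    future = _POOL.submit(ranker, channel_id, viewer_id)
    try:
        return future.result(timeout=budget_s), "ranker"
    except FutureTimeout:
        future.cancel()  # best effort; a call that already started keeps running
        return cached_top_n, "cache"


def slow_ranker(channel_id, viewer_id):
    """Stand-in for a ranker call that blows the budget."""
    time.sleep(0.5)
    return [("v_slow", 0.9, "trending")]


def fast_ranker(channel_id, viewer_id):
    """Stand-in for a ranker call that answers immediately."""
    return [("v_fast", 0.9, "subscribed")]
```

The second element of the returned tuple records which path served the request, which makes the fallback rate easy to monitor.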
Trade-offs and extensions#
| Decision | Why | Cost if requirements change |
|---|---|---|
| Resumable multipart (vs single PUT) | Files up to 256 GB; cannot retry from scratch on drop | More client logic, more session state to track |
| 64 MB part size | Tradeoff between TCP-window-fill and retry cost | Smaller parts on mobile would mean more round-trips |
| Webhook for transcode completion | Decouples sync upload from async work | Webhook delivery is at-least-once, creators must dedupe |
| Signed manifest URL with TTL | Prevents URL sharing / hot-linking | Re-signing on session refresh adds 1 RTT |
| CDN selection in PlaybackAPI | API can route around regional CDN issues | API now owns the GeoIP database refresh |
| Separate HLS + DASH manifests | Apple devices want HLS; everything else DASH | Cost of running two transcode output formats |
| :complete is idempotent | Network drops mid-completion are routine | Server must store the result keyed on session id |
| Recommendation as a thin pass-through | Decouples API from ranker iteration | API can’t add cross-cutting concerns (e.g. blocklist) without becoming smart |
| No range-request download endpoint | Discourages scraping, simplifies licensing | Power users who want offline cannot DIY; needs a dedicated mobile API |
Likely follow-up extensions and how the API absorbs them:
- Live streaming. A new sub-API with POST /v1/streams:create, PUT /v1/streams/{id}/ingest, and a manifest pointing to a moving window. Different latency profile (sub-3 s glass-to-glass), different player. Cleanest as a sibling API, not a flag on the VOD path.
- DVR for live. Live + lookback within a 4-hour window. The manifest grows; the API contract is unchanged.
- Subtitles / captions. A GET /v1/videos/{id}/captions endpoint returning a list of {lang, url, format} entries. WebVTT + TTML.
- Watch history sync. heartbeat(session) is already on PlaybackService in the class diagram and would surface as POST /v1/videos/{id}/playback/heartbeat; a richer write side would persist watch progress to a separate user-data service.
- Family content / age gating. A claim on the JWT (age_band) gates /playback. The API responds 451 Unavailable for Legal Reasons rather than hiding the video.
- Adaptive ABR ladder. Today the renditions are a fixed [240p…2160p] ladder; per-content ladders (“only encode what’s worth encoding”) need a wider transcode-result schema but no playback API change.
Mock interview follow-ups#
- “What happens if the creator’s webhook endpoint is down when transcode completes?” At-least-once retry with exponential backoff over 24 hours; the creator can also poll GET /v1/uploads/{session} or GET /v1/videos/{id} to learn state. The creator dashboard does the same.
- “How does the client know to use HLS vs DASH?” Client passes preferred_format (or auto); server picks based on User-Agent heuristics if auto. iOS / tvOS always get HLS by App Store policy.
- “How do you prevent someone from leeching the manifest URL?” HMAC signature over (video_id, viewer_id_hash, cdn_host, expires_at); TTL of 1 hour; CDN refuses signature mismatch. Premium content uses Widevine / FairPlay DRM tokens on top.
- “How does the upload survive a network drop?” Client polls GET /v1/uploads/{session} to learn parts_received; resumes from the first missing part. Each putPart is independently retriable since they’re idempotent on (session, idx).
- “What’s the recommendation endpoint’s cache strategy?” Per-viewer cached for 5 minutes for homepage; not cached for relatedTo(video_id) since context shifts every video. Cache key is (viewer_id, surface, locale, device_class).
- “How do you handle a viewer in a region where the video is geo-blocked?” /playback returns 451. The video metadata endpoint also reflects unavailability so the client doesn’t even try to play.
- “What about hot-linked thumbnails?” Thumbnails use a separate signed-URL scheme; same HMAC pattern, longer TTL. The CDN sees the signature and rejects unauthorised requests at the edge.
- “At 10x scale, what breaks first?” — The synchronous part of the recommendation endpoint. We’d move from on-request ranking to precomputed per-user candidate sets with online re-ranking on a smaller candidate pool, keeping the API contract identical.
- “How does the API contract evolve when we add a new codec (e.g. AV1)?” Additive only. The client_capabilities.codecs field already exists; the server returns an AV1-bearing manifest when both client capability and codec availability match. Old clients keep getting H.264 / VP9 manifests unchanged.
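The network-drop answer above can be sketched as a resume routine: read parts_received from the session, then re-send only what is missing. The transport is injected as callables so the logic stays testable offline; `missing_parts`, `resume_upload`, `read_part`, and `put_part` are hypothetical names for this illustration.

```python
def missing_parts(total_parts, parts_received):
    """Indices the server has not acknowledged, in ascending order."""
    received = set(parts_received)
    return [i for i in range(total_parts) if i not in received]


def resume_upload(read_part, put_part, total_parts, parts_received):
    """Re-send only the missing parts and return the indices that were sent.

    read_part(idx) returns the bytes of one part; put_part(idx, data) uploads
    it. Both are supplied by the caller. put_part is safe to repeat because
    the API treats (session, idx) as idempotent.
    """
    sent = []
    for idx in missing_parts(total_parts, parts_received):
        put_part(idx, read_part(idx))
        sent.append(idx)
    return sent


# Usage with an in-memory "file" and a fake server, part size 4 bytes.
data = bytes(range(10))
part_size = 4
uploaded = {}
sent = resume_upload(
    read_part=lambda i: data[i * part_size:(i + 1) * part_size],
    put_part=lambda i, chunk: uploaded.__setitem__(i, chunk),
    total_parts=3,
    parts_received=[0, 2],
)
print(sent)  # [1]
```

Sending every missing index, not just the tail after the first gap, covers the case where parts were uploaded out of order before the drop.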
Single /videos endpoint covering upload and playback. Common candidate mistake. Upload is a write-heavy state machine with multi-GB request bodies and a multi-second SLA. Playback is a read-heavy session whose API call has to fit inside a sub-2-second start-to-first-frame budget. They share nothing operationally; merging them makes one budget impossible to honour.
Three sub-APIs with explicit seams. Upload owns the resumable protocol and the session state machine; the webhook is the only thing it pushes downstream. Playback owns the signed-URL + CDN pick. Recommendation is a thin pass-through. Each can scale, cache, fail, and version independently.
Related#
- Design a File Service API — the upload-half of this design generalised. Range requests, signed URLs, multipart.
- Design a Pub-Sub Service API — the asynchronous backbone that carries the transcode-completion event under the hood.
- Event-Driven Architecture Protocols — the webhook contract details.
- The API-Design Walk-through — the seven-step recipe this writeup followed.
- REST — The Architectural Style — the architectural style behind the endpoint shape.