Design a File Service API — API Design

Context#

A file service is the second-most reinvented backend in our industry, after auth. Every product eventually needs to accept user uploads, store the bytes durably, hand back a URL the browser can render, and stream the file out again — sometimes from byte offset 1.2 GB into a 4 GB video. The shape of this contract has been stable since Amazon S3 launched in 2006 and every cloud has copied it: GCS, Azure Blob, R2, Backblaze B2 all expose the same primitives.

So why is this a fresh interview question? Because the API surface is wider than it looks:

Small files (<= 5 MB) want a single-shot upload — one request, done.
Large files want multipart — chunked, parallel, resumable.
Browsers downloading a 2 GB video want range requests, not the whole body.
Mobile clients on flaky networks want resumable uploads — pick up after a TCP reset.
Third parties want signed URLs — time-bounded, capability-style access without sharing credentials.

The interviewer’s hidden objectives:

Can you separate the metadata plane from the data plane — the small-fast API from the large-streaming one?
Do you know when to redirect to object storage versus stream through your service?
Can you specify a resumable-upload protocol without inventing one from scratch?
Can you produce a signed-URL design that’s secure by default (short TTL, signed path, no leaked keys)?
Can you defend the multipart threshold and the part size with numbers?

S3 is the reference implementation; this writeup uses its vocabulary directly.

Requirements (functional and non-functional)#

Functional — in scope:

Upload a file (single-shot for small, multipart for large).
Download a file (full body or a byte range).
Resume an interrupted upload from the last completed part.
Generate a signed URL for direct upload or download with a TTL.
Delete a file. List a user’s files with cursor pagination.
Query file metadata (size, content-type, checksum, created-at).

Functional — out of scope:

Image transforms / resizing — separate service.
Virus scanning — async pipeline, not part of the synchronous API.
Folder hierarchies — flat namespace; clients model folders with key prefixes.
Versioning — a follow-up axis; v1 is “last write wins”.

Non-functional:

Latency: metadata calls <= 200 ms p95; data calls bounded by network and content size, served via CDN.
Durability: 11 nines (S3-equivalent) via the underlying object store.
Availability: 99.99% on the read path, 99.9% on the write path.
Throughput: 100 GB/s aggregate read across the fleet.
Maximum file size: 5 TB (matches S3’s per-object cap).
Signed-URL TTL: 1 minute to 7 days; default 15 minutes.

Use case diagram#

                ┌──────────────┐
                │   End user   │
                └──────┬───────┘
                       │
       ┌───────────────┼──────────────────┐
       ▼               ▼                  ▼
  [upload file]   [download file]    [share signed URL]
       │               │                  │
       ▼               ▼                  ▼
  ┌──────────────────────────────────────────┐
  │            File Service API              │
  └─────────┬─────────────────────┬──────────┘
            │                     │
            ▼                     ▼
     ┌────────────┐         ┌─────────────┐
     │ Metadata DB│         │ Object store│
     └────────────┘         │   (S3/GCS)  │
                            └─────────────┘
                                  │
                                  ▼
                            ┌─────────┐
                            │   CDN   │  …  edge cache for hot reads
                            └─────────┘

Two actors: end user and a third party who receives a signed URL. The third party never authenticates against the service — the signature in the URL is the capability.

Class diagram#

   ┌───────────────────────┐
   │     FileService       │
   ├───────────────────────┤
   │ initUpload(req): File │
   │ uploadPart(req): Part │
   │ completeUpload(req)   │
   │ resumeUpload(req)     │
   │ download(id, range?)  │
   │ signedURL(req): URL   │
   └──────────┬────────────┘
              │ owns
              ▼
   ┌───────────────────────┐         ┌─────────────────────┐
   │        File           │ 1 ─── * │       Upload        │
   ├───────────────────────┤         ├─────────────────────┤
   │ id, owner_id          │         │ upload_id           │
   │ name, content_type    │         │ status              │
   │ size, sha256          │         │ part_size           │
   │ created_at            │         │ total_parts         │
   │ object_key            │         │ completed_parts[]   │
   └───────────────────────┘         │ expires_at          │
              │                      └─────────────────────┘
              │
              ▼
   ┌───────────────────────┐         ┌─────────────────────┐
   │     SignedURL         │         │        Part         │
   ├───────────────────────┤         ├─────────────────────┤
   │ url                   │         │ part_number         │
   │ expires_at            │         │ etag                │
   │ method (GET|PUT)      │         │ size                │
   │ max_bytes             │         │ uploaded_at         │
   └───────────────────────┘         └─────────────────────┘

File is the metadata-plane entity; the bytes themselves live in the object store keyed by object_key. Upload tracks an in-flight multipart upload — once completeUpload runs, the Upload row is deleted and only the File remains. SignedURL is ephemeral — never persisted; generated on demand and verified by signature.
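The entities above can be sketched as plain Python dataclasses. This is an illustrative rendering, not the service's actual schema: the field names come from the class diagram, while the types and the `record_part` helper are assumptions added here to show how a retried part replaces an earlier attempt with the same number.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Part:
    part_number: int
    etag: str
    size: int
    uploaded_at: datetime


@dataclass
class Upload:
    upload_id: str
    status: str
    part_size: int
    total_parts: int
    expires_at: datetime
    completed_parts: List[Part] = field(default_factory=list)


@dataclass
class File:
    id: str
    owner_id: str
    name: str
    content_type: str
    size: int
    object_key: str
    created_at: datetime
    sha256: Optional[str] = None  # may be absent until the upload completes


def record_part(upload: Upload, part: Part) -> None:
    """Hypothetical helper: store a part, replacing any earlier attempt
    that used the same part number, and keep the list ordered."""
    kept = [p for p in upload.completed_parts
            if p.part_number != part.part_number]
    kept.append(part)
    kept.sort(key=lambda p: p.part_number)
    upload.completed_parts = kept
```

Keying parts by number is what makes a client retry of the same part safe: the second attempt overwrites the first instead of appending a duplicate.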

Sequence diagram (key flows)#

The multipart upload flow — the discriminating one:

 Client            FileAPI          MetaDB         ObjectStore
   │ POST /files (init)│              │                  │
   │──────────────────►│              │                  │
   │                   │ insert Upload│                  │
   │                   │─────────────►│                  │
   │                   │  create MPU  │                  │
   │                   │─────────────────────────────────►│
   │  upload_id, part_size            │                  │
   │◄──────────────────│              │                  │
   │                                                      │
   │  PUT /files/{id}/parts/1 (5MB)                       │
   │─────────────────────────────────────────────────────►│
   │  etag_1                                              │
   │◄─────────────────────────────────────────────────────│
   │  PUT /files/{id}/parts/2 (5MB) ─── in parallel ─────►│
   │─────────────────────────────────────────────────────►│
   │  etag_2                                              │
   │◄─────────────────────────────────────────────────────│
   │  ... (N parts in parallel) ...                       │
   │                                                      │
   │  POST /files/{id}/complete                           │
   │  body: [{part:1, etag:e1}, {part:2, etag:e2}, ...]   │
   │──────────────────►│              │                  │
   │                   │ completeMPU  │                  │
   │                   │─────────────────────────────────►│
   │                   │  object_key, sha256             │
   │                   │◄─────────────────────────────────│
   │                   │ update File  │                  │
   │                   │─────────────►│                  │
   │  200 OK + File resource          │                  │
   │◄──────────────────│              │                  │

The range download flow is simpler — the gateway proxies to the object store with the Range header forwarded, and the response carries 206 Partial Content:

 Client            FileAPI / CDN          ObjectStore
   │ GET /files/{id}  │                       │
   │ Range: bytes=1048576-2097151             │
   │─────────────────►│                       │
   │                  │ check auth, signature │
   │                  │ GET object Range:...  │
   │                  │──────────────────────►│
   │                  │  206 Partial Content  │
   │                  │  Content-Range: ...   │
   │                  │◄──────────────────────│
   │  206 + bytes     │                       │
   │◄─────────────────│                       │

The signed-URL flow takes the API out of the data path entirely — client uploads directly to the object store:

 Client            FileAPI          ObjectStore
   │ POST /files/{id}/signed-url     │
   │ method: PUT, ttl: 900           │
   │──────────────────►│             │
   │ signed URL       │              │
   │◄──────────────────│             │
   │                                 │
   │ PUT <signed-url> body=bytes ───────►
   │                                 │
   │ 200 OK ◄────────────────────────│
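A minimal sketch of how such a URL could be minted and checked, assuming an HMAC-SHA256 over method, path, expiry, max_bytes and user_id (the fields this writeup names in its follow-up answers). The query parameter names, the `SECRET` value and both function names are hypothetical; a real deployment would more likely delegate to the object store's own presigning scheme.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"server-side-secret"  # hypothetical; load from a secret store


def _signature(method: str, path: str, expires: int,
               max_bytes: str, user_id: str) -> str:
    # Every field that grants capability is covered by the MAC.
    msg = "\n".join([method, path, str(expires), max_bytes, user_id])
    return hmac.new(SECRET, msg.encode(), hashlib.sha256).hexdigest()


def sign_url(base: str, method: str, path: str, user_id: str,
             ttl_seconds: int = 900, max_bytes: Optional[int] = None,
             now: Optional[float] = None) -> str:
    if not 60 <= ttl_seconds <= 604800:
        raise ValueError("ttl_seconds must be between 60 and 604800")
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    mb = "" if max_bytes is None else str(max_bytes)
    query = {"expires": expires, "max_bytes": mb, "user_id": user_id,
             "sig": _signature(method, path, expires, mb, user_id)}
    return f"{base}{path}?{urlencode(query)}"


def verify_url(url: str, method: str, now: Optional[float] = None) -> bool:
    parts = urlsplit(url)
    q = parse_qs(parts.query, keep_blank_values=True)
    try:
        expires = int(q["expires"][0])
        mb, user_id, sig = q["max_bytes"][0], q["user_id"][0], q["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(method, parts.path, expires, mb, user_id)
    # Constant-time comparison avoids leaking the signature byte by byte.
    return hmac.compare_digest(expected.encode(), sig.encode())
```

Because the method and path are inside the signed message, a URL minted for PUT on one file cannot be replayed as a GET or pointed at another file.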

Activity diagram (for non-trivial state)#

The Upload resource has a meaningful state machine — most of the interview is defending this picture:

              [client: POST /files]
                       │
                       ▼
              ┌─────────────────┐
              │     Pending     │
              │  (init complete,│
              │   no parts yet) │
              └────────┬────────┘
                       │ first part uploaded
                       ▼
              ┌─────────────────┐
              │   InProgress    │
              │  (≥ 1 part OK,  │
              │   < N parts)    │
              └────────┬────────┘
              │        │        │
   abort      │        │        │ TTL elapses
   ┌──────────┘        │        └──────────┐
   ▼                   ▼                   ▼
┌─────────┐    ┌───────────────┐     ┌──────────┐
│ Aborted │    │   Completed   │     │  Failed  │
│ (DELETE)│    │ (all parts +  │     │  (expired│
└─────────┘    │  complete OK) │     │   no GC) │
               └───────────────┘     └──────────┘
                       │
                       ▼
                 [File visible]

One terminal-success state (Completed) and two terminal-failure states (Aborted, Failed). The garbage-collector job is critical: an abandoned InProgress upload holds storage but no File exists yet; after 7 days the GC aborts the underlying multipart upload to reclaim bytes. S3’s lifecycle rules expose this directly as AbortIncompleteMultipartUpload.
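The state machine can be written down as a transition table, which makes illegal moves easy to reject. This is a sketch under assumptions: the event names are invented for illustration, and the two transitions out of Pending for abort and expiry are not drawn in the diagram but are assumed here so that an upload with no parts can still be cleaned up.

```python
from enum import Enum


class UploadState(Enum):
    PENDING = "Pending"
    IN_PROGRESS = "InProgress"
    COMPLETED = "Completed"
    ABORTED = "Aborted"
    FAILED = "Failed"


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (UploadState.PENDING, "part_uploaded"): UploadState.IN_PROGRESS,
    (UploadState.IN_PROGRESS, "part_uploaded"): UploadState.IN_PROGRESS,
    (UploadState.IN_PROGRESS, "complete"): UploadState.COMPLETED,
    (UploadState.IN_PROGRESS, "abort"): UploadState.ABORTED,
    (UploadState.IN_PROGRESS, "ttl_elapsed"): UploadState.FAILED,
    # Assumption, not shown in the diagram: an upload with no parts
    # can also be aborted or expire.
    (UploadState.PENDING, "abort"): UploadState.ABORTED,
    (UploadState.PENDING, "ttl_elapsed"): UploadState.FAILED,
}


def transition(state: UploadState, event: str) -> UploadState:
    """Return the next state, or raise if the move is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"illegal transition: {event!r} from {state.value}") from None
```

Completed, Aborted and Failed have no outgoing entries, so any event arriving after a terminal state is rejected rather than silently applied.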

API implementation#

Endpoint catalogue#

Method	Path	Purpose
`POST`	`/v1/files`	Initiate upload; returns `upload_id` and `part_size`
`PUT`	`/v1/files/{id}/parts/{n}`	Upload one part (multipart only)
`POST`	`/v1/files/{id}/complete`	Finalise multipart upload
`POST`	`/v1/files/{id}/uploads:resume`	Resume an interrupted upload; returns the parts already received
`DELETE`	`/v1/files/{id}/uploads/{upload_id}`	Abort a multipart upload
`GET`	`/v1/files/{id}`	Download (full or `Range`-windowed)
`GET`	`/v1/files/{id}/metadata`	Metadata only — no bytes
`POST`	`/v1/files/{id}/signed-url`	Generate signed URL
`DELETE`	`/v1/files/{id}`	Delete file
`GET`	`/v1/files`	List files; cursor pagination

The single-shot upload path is POST /v1/files with Content-Type: multipart/form-data and the body inline. For payloads of 5 MB or less this skips the part/complete dance.

Multipart threshold#

The 5 MB threshold matches S3’s minimum part size. Below 5 MB, multipart adds round-trip overhead without parallelism benefit. Above 5 MB, parallel parts win on throughput. The server tells the client which mode applies in the init response — clients should not hardcode it.
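One way to put numbers behind the part size, assuming the service inherits S3's documented limit of 10,000 parts per multipart upload: at 5 MiB per part the largest object is about 48.8 GiB, so the server has to grow the part size for anything bigger. The `plan_upload` function and its return shape are illustrative only, loosely mirroring the UploadInit fields.

```python
import math

MIB = 1024 * 1024
MIN_PART = 5 * MIB   # the threshold from the text, taken as 5 MiB
MAX_PARTS = 10_000   # S3's cap on parts per multipart upload


def plan_upload(size: int) -> dict:
    """Pick single-shot or multipart and a part size that fits MAX_PARTS."""
    if size <= 0:
        raise ValueError("size must be positive")
    if size <= MIN_PART:
        return {"mode": "single_shot", "part_size": None, "total_parts": None}
    # Smallest part size that keeps the part count within the cap,
    # never below the minimum, rounded up to a whole MiB.
    part_size = max(MIN_PART, math.ceil(size / MAX_PARTS))
    part_size = math.ceil(part_size / MIB) * MIB
    return {
        "mode": "multipart",
        "part_size": part_size,
        "total_parts": math.ceil(size / part_size),
    }
```

Under these assumptions a 100 MiB file gets 20 parts of 5 MiB, while a 5 TiB file needs parts of 525 MiB to stay under 10,000 parts. That is why the init response, not the client, should decide the part size.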

OpenAPI schema (excerpt)#

paths:
  /v1/files:
    post:
      operationId: initUpload
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [name, size, content_type]
              properties:
                name: { type: string, maxLength: 1024 }
                size: { type: integer, minimum: 1, maximum: 5497558138880 }
                content_type: { type: string }
                sha256: { type: string, nullable: true }
      responses:
        '201':
          description: Upload initiated
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UploadInit'
  /v1/files/{id}/complete:
    post:
      operationId: completeUpload
      parameters:
        - { name: id, in: path, required: true, schema: { type: string } }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [upload_id, parts]
              properties:
                upload_id: { type: string }
                parts:
                  type: array
                  items:
                    type: object
                    required: [part_number, etag]
                    properties:
                      part_number: { type: integer, minimum: 1 }
                      etag: { type: string }
      responses:
        '200':
          description: File finalised
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/File'
  /v1/files/{id}/signed-url:
    post:
      operationId: signedURL
      parameters:
        - { name: id, in: path, required: true, schema: { type: string } }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [method, ttl_seconds]
              properties:
                method:
                  type: string
                  enum: [GET, PUT]
                ttl_seconds:
                  type: integer
                  minimum: 60
                  maximum: 604800
                  default: 900
                max_bytes:
                  type: integer
                  nullable: true
      responses:
        '200':
          description: Signed URL
          content:
            application/json:
              schema:
                type: object
                required: [url, expires_at]
                properties:
                  url: { type: string, format: uri }
                  expires_at: { type: string, format: date-time }
components:
  schemas:
    UploadInit:
      type: object
      required: [file_id, upload_id, mode]
      properties:
        file_id: { type: string }
        upload_id: { type: string }
        mode:
          type: string
          enum: [single_shot, multipart]
        part_size: { type: integer, nullable: true }
        total_parts: { type: integer, nullable: true }
    File:
      type: object
      required: [id, owner_id, size, content_type, sha256]
      properties:
        id: { type: string }
        owner_id: { type: string }
        name: { type: string }
        size: { type: integer }
        content_type: { type: string }
        sha256: { type: string }
        created_at: { type: string, format: date-time }

Range request — raw HTTP#

The wire-level range download is plain HTTP/1.1; nothing custom:

GET /v1/files/abc123 HTTP/1.1
Host: api.example.com
Authorization: Bearer eyJhbGciOi...
Range: bytes=1048576-2097151

HTTP/1.1 206 Partial Content
Content-Type: video/mp4
Content-Range: bytes 1048576-2097151/4194304000
Content-Length: 1048576
ETag: "9b4f7e..."
Accept-Ranges: bytes

<binary>
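On the server side, the Range header has to be resolved against the object size before the request is forwarded. The sketch below handles only the single-range forms (`bytes=a-b`, `bytes=a-`, `bytes=-n`); the function names are hypothetical, and it returns None for both malformed and unsatisfiable headers, leaving the choice between ignoring the header and answering 416 to the caller.

```python
import re
from typing import Optional, Tuple

_RANGE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive (start, end) byte window, or None."""
    m = _RANGE.match(header.strip())
    if not m or size <= 0:
        return None
    first, last = m.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: the last N bytes of the object.
        n = int(last)
        if n == 0:
            return None
        return max(size - n, 0), size - 1
    start = int(first)
    if start >= size:
        return None
    # An open-ended or overlong range is clamped to the last byte.
    end = size - 1 if last == "" else min(int(last), size - 1)
    if end < start:
        return None
    return start, end


def content_range(start: int, end: int, size: int) -> str:
    """Format the Content-Range value for a 206 response."""
    return f"bytes {start}-{end}/{size}"
```

For the exchange above, `resolve_range("bytes=1048576-2097151", 4194304000)` yields `(1048576, 2097151)`, and the Content-Length is `end - start + 1`, which is 1048576.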

Client samples — three languages#

The init-multipart-complete dance is the discriminating client-side flow. Each sample below (Python, Go, then Node.js) shows: init, upload the parts sequentially, complete. Production clients parallelise the parts.

import hashlib
import os

import requests

API = "https://api.example.com"
TOKEN = "Bearer eyJhbGciOi..."
HEADERS = {"Authorization": TOKEN}

def upload(path, content_type):
    # Reads the whole file into memory for brevity; a production client
    # would stream each part from disk instead.
    with open(path, "rb") as f:
        body = f.read()
    sha = hashlib.sha256(body).hexdigest()

    # 1. init: the server decides single-shot vs multipart and the part size
    r = requests.post(
        f"{API}/v1/files",
        json={"name": os.path.basename(path), "size": len(body),
              "content_type": content_type, "sha256": sha},
        headers=HEADERS,
    )
    r.raise_for_status()
    init = r.json()
    parts_url = f"{API}/v1/files/{init['file_id']}/parts"

    # 2. upload the bytes
    if init["mode"] == "single_shot":
        r = requests.put(f"{parts_url}/1", data=body, headers=HEADERS)
        r.raise_for_status()
        parts = [{"part_number": 1, "etag": "single"}]
    else:
        size = init["part_size"]
        parts = []
        for i, off in enumerate(range(0, len(body), size), 1):
            chunk = body[off:off + size]
            r = requests.put(f"{parts_url}/{i}", data=chunk, headers=HEADERS)
            r.raise_for_status()
            parts.append({"part_number": i, "etag": r.headers["ETag"]})

    # 3. complete: hand back every part number with its ETag
    r = requests.post(
        f"{API}/v1/files/{init['file_id']}/complete",
        json={"upload_id": init["upload_id"], "parts": parts},
        headers=HEADERS,
    )
    r.raise_for_status()
    return r.json()

print(upload("./video.mp4", "video/mp4"))

package main

import (
    "bytes"
    "crypto/sha256"
    "encoding/hex"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
    "os"
    "path/filepath"
)

const API = "https://api.example.com"
const TOKEN = "Bearer eyJhbGciOi..."

type Init struct {
    FileID   string `json:"file_id"`
    UploadID string `json:"upload_id"`
    Mode     string `json:"mode"`
    PartSize int    `json:"part_size"` // null decodes to 0 in single_shot mode
}

type Part struct {
    PartNumber int    `json:"part_number"`
    ETag       string `json:"etag"`
}

// send issues one authenticated request, reads the whole response body,
// and turns any non-2xx status into an error.
func send(method, url, contentType string, payload []byte) ([]byte, http.Header, error) {
    req, err := http.NewRequest(method, url, bytes.NewReader(payload))
    if err != nil { return nil, nil, err }
    req.Header.Set("Authorization", TOKEN)
    if contentType != "" { req.Header.Set("Content-Type", contentType) }
    resp, err := http.DefaultClient.Do(req)
    if err != nil { return nil, nil, err }
    defer resp.Body.Close()
    out, err := io.ReadAll(resp.Body)
    if err != nil { return nil, nil, err }
    if resp.StatusCode < 200 || resp.StatusCode > 299 {
        return nil, nil, fmt.Errorf("%s %s: %s", method, url, resp.Status)
    }
    return out, resp.Header, nil
}

func upload(path, ct string) error {
    body, err := os.ReadFile(path)
    if err != nil { return err }
    sum := sha256.Sum256(body)

    // 1. init
    initBody, err := json.Marshal(map[string]any{
        "name": filepath.Base(path), "size": len(body),
        "content_type": ct, "sha256": hex.EncodeToString(sum[:]),
    })
    if err != nil { return err }
    raw, _, err := send("POST", API+"/v1/files", "application/json", initBody)
    if err != nil { return err }
    var up Init
    if err := json.Unmarshal(raw, &up); err != nil { return err }

    // single_shot carries a null part_size (decoded as 0); send the whole
    // body as part 1 so the loop below always advances.
    size := up.PartSize
    if up.Mode == "single_shot" || size <= 0 { size = len(body) }

    // 2. upload the parts, sequentially in this sample
    var parts []Part
    for n, off := 1, 0; off < len(body); n, off = n+1, off+size {
        end := off + size
        if end > len(body) { end = len(body) }
        url := fmt.Sprintf("%s/v1/files/%s/parts/%d", API, up.FileID, n)
        _, hdr, err := send("PUT", url, "", body[off:end])
        if err != nil { return err }
        etag := hdr.Get("ETag")
        if up.Mode == "single_shot" { etag = "single" } // same placeholder as the other samples
        parts = append(parts, Part{n, etag})
    }

    // 3. complete
    cmp, err := json.Marshal(map[string]any{
        "upload_id": up.UploadID, "parts": parts,
    })
    if err != nil { return err }
    _, _, err = send("POST",
        fmt.Sprintf("%s/v1/files/%s/complete", API, up.FileID),
        "application/json", cmp)
    return err
}

func main() {
    if err := upload("./video.mp4", "video/mp4"); err != nil { log.Fatal(err) }
}

import { readFile } from "node:fs/promises";
import { createHash } from "node:crypto";
import { basename } from "node:path";

const API = "https://api.example.com";
const TOKEN = "Bearer eyJhbGciOi...";

// fetch() resolves on 4xx/5xx as well, so fail loudly on any non-2xx status.
async function send(url, options) {
  const r = await fetch(url, options);
  if (!r.ok) throw new Error(`${options.method} ${url}: ${r.status}`);
  return r;
}

async function upload(path, contentType) {
  // Reads the whole file into memory for brevity.
  const body = await readFile(path);
  const sha = createHash("sha256").update(body).digest("hex");

  // 1. init
  const init = await send(`${API}/v1/files`, {
    method: "POST",
    headers: { Authorization: TOKEN, "Content-Type": "application/json" },
    body: JSON.stringify({
      name: basename(path), size: body.length,
      content_type: contentType, sha256: sha,
    }),
  }).then((r) => r.json());

  // 2. upload the bytes
  const parts = [];
  if (init.mode === "single_shot") {
    await send(`${API}/v1/files/${init.file_id}/parts/1`, {
      method: "PUT", headers: { Authorization: TOKEN }, body,
    });
    parts.push({ part_number: 1, etag: "single" });
  } else {
    const size = init.part_size;
    for (let i = 0, off = 0; off < body.length; i++, off += size) {
      const chunk = body.subarray(off, off + size);
      const r = await send(
        `${API}/v1/files/${init.file_id}/parts/${i + 1}`,
        { method: "PUT", headers: { Authorization: TOKEN }, body: chunk },
      );
      parts.push({ part_number: i + 1, etag: r.headers.get("etag") });
    }
  }

  // 3. complete
  return send(`${API}/v1/files/${init.file_id}/complete`, {
    method: "POST",
    headers: { Authorization: TOKEN, "Content-Type": "application/json" },
    body: JSON.stringify({ upload_id: init.upload_id, parts }),
  }).then((r) => r.json());
}

console.log(await upload("./video.mp4", "video/mp4"));
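The samples above upload parts one after another. A rough sketch of the parallel variant in Python follows, using a thread pool from the standard library. The `put_part` callable is a stand-in (an assumption of this sketch) for whatever issues the PUT for one part and returns its ETag, which keeps the chunking logic testable without a network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def upload_parts_parallel(body: bytes, part_size: int,
                          put_part: Callable[[int, bytes], str],
                          workers: int = 4) -> List[Dict]:
    """Split body into parts, upload them concurrently, and return the
    part list in the shape the complete call expects."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    chunks = [
        (n, body[off:off + part_size])
        for n, off in enumerate(range(0, len(body), part_size), 1)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so ETags line up with
        # their part numbers even if uploads finish out of order.
        etags = list(pool.map(lambda c: put_part(c[0], c[1]), chunks))
    return [{"part_number": n, "etag": e}
            for (n, _), e in zip(chunks, etags)]
```

An exception raised by any `put_part` call propagates out of `pool.map`, so a failed part aborts the batch; a production client would more likely retry that part individually.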

Latency budget#

Phase	Budget	Notes
`POST /v1/files` (init)	120 ms p95	DB write + metadata row
`PUT .../parts/{n}`	bytes ÷ link bandwidth + 30 ms	Direct-to-object-store
`POST .../complete`	200 ms p95	Validates all parts, computes final ETag
`GET /v1/files/{id}/metadata`	50 ms p95	Single DB read; cacheable
`GET /v1/files/{id}` (range)	100 ms TTFB	CDN edge for hot objects
`POST .../signed-url`	20 ms	Pure crypto + DB read for ACL

The metadata plane sits behind the 200 ms p95 ceiling. The data plane is governed by content size and the CDN; we don’t promise an SLO on data-plane throughput beyond “1 Gbps per connection”.
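The part-upload row is a formula rather than a fixed number, so it helps to plug values in. The helper below is a back-of-envelope sketch that assumes decimal megabytes, a link rate in bits per second, and the 30 ms fixed overhead from the table; the function name is made up for this example.

```python
def part_transfer_ms(part_bytes: int, link_bps: float,
                     overhead_ms: float = 30.0) -> float:
    """Estimated time for one part: bytes / bandwidth plus fixed overhead."""
    return part_bytes * 8 / link_bps * 1000 + overhead_ms
```

Under these assumptions a 5 MB part takes roughly 70 ms on a 1 Gbps link and roughly 2 seconds on a 20 Mbps mobile link, which is why the data plane carries no fixed latency SLO.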

Trade-offs and extensions#

Decision	Why	Cost if requirements change
Two planes (metadata + data)	Hot reads bypass the app	Adds operational surface — two storage systems
5 MB multipart threshold	Matches S3; below this, single-shot wins	Smaller chunks would saturate metadata DB
Signed URLs default 15 min TTL	Short enough to revoke, long enough for big uploads	Longer TTLs leak capability if logs are stolen
sha256 in metadata, not enforced	Trust client; verify on read	Re-hashing 5 TB server-side is prohibitive
Direct upload via signed URL	Removes API from data path	Loses opportunity to scan / transform inline
Flat namespace, no folders	Matches S3, GCS, Azure	UX layer must reconstruct hierarchy from prefixes
7-day TTL on incomplete uploads	Bounds storage cost	Long-running multi-day uploads need a re-init

A few cleaner contrasts:

Stream through the API

One auth model
Inline virus scan, transforms, watermark
Easy to throttle per-user
Caps at API tier bandwidth
100% of bytes traverse your network

Signed-URL direct upload

Client uploads to S3 directly
API only mints a URL
Scales to object-store bandwidth
Inline scanning needs an async callback
Lower egress cost

Most production designs run both — the signed-URL path for known-large content (videos, dataset uploads); the proxy path for small content that benefits from inline processing.

Likely follow-up extensions:

Versioning. Add a version_id to File. Each upload to the same name creates a new version; delete becomes a tombstone. Same shape as S3 Versioning.
Resumable uploads on flaky mobile networks. Already covered by POST .../uploads:resume; client calls it after reconnect and learns which parts succeeded. Comparable to the open tus protocol and to Google Cloud Storage's resumable uploads.
Cross-region replication. Asynchronous backplane; same API contract; File gains a region field that holds a list of regions.
Lifecycle rules. A lifecycle config per “bucket” — auto-delete after N days, move to cold storage after M days. Mirrors S3 lifecycle config.
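For the resumable-upload item, the server-side answer to a resume call reduces to a set difference between the expected part numbers and those already recorded. The sketch below assumes the Upload row's total_parts and completed part numbers are available; the function name and response keys are hypothetical.

```python
from typing import Dict, Iterable


def resume_plan(total_parts: int, completed: Iterable[int]) -> Dict:
    """Work out which parts a reconnecting client still has to send."""
    # Ignore anything outside the valid 1..total_parts range.
    done = {n for n in completed if 1 <= n <= total_parts}
    missing = [n for n in range(1, total_parts + 1) if n not in done]
    return {
        "completed_parts": sorted(done),
        "missing_parts": missing,
        "ready_to_complete": not missing,
    }
```

Returning the full missing list rather than a single next part number matters once parts are uploaded in parallel, because the gaps need not be contiguous.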

Mock interview follow-ups#

“What happens if part 3 fails after part 2 succeeded?” — The client retries part 3; idempotency by part number. Once all parts have an ETag, complete is called. If the client never returns, the GC sweep aborts the upload after 7 days.
“How do you stop someone from uploading a 5 TB file just to fill your bucket?” — Per-user quota checked at init time; declared size is the authoritative cap; if the upload exceeds it the parts beyond the cap are rejected 413.
“What’s stopping me from forging a signed URL?” — HMAC of (method, path, expiry, max_bytes, user_id) with a server-side secret. Tampering with any field breaks the signature.
“How do you serve a 50 GB ML dataset to 10k researchers?” — Origin-pull CDN on GET /v1/files/{id}; the CDN handles range requests and dedup at the edge. The API only authorises the first request per signature.
“Why not gRPC?” — HTTP is the lingua franca of object storage; every browser and CDN speaks it; range requests and resumable uploads are first-class. gRPC would force you to reinvent CDN integration.
“How do you handle deletes that need to survive an accidental rm?” — Soft delete: set deleted_at, hide from list, leave the object intact for 30 days. A reaper job purges after the retention window. Matches S3’s MFA-delete + lifecycle policy.
“What about end-to-end encryption?” — Client-side encryption: the client encrypts before upload using a key the API never sees. The API stores ciphertext + IV + a wrapped DEK. The sha256 in metadata is over the ciphertext for integrity.

Related#

Design a Search Service API — sibling foundational system; read-only with cursor pagination.
Design a Comment Service API — write-heavy CRUD analogue; same auth model.
Design the YouTube Streaming API — builds on range requests for adaptive bitrate.
The API-Design Walk-through — the seven-step recipe this writeup followed.
REST — The Architectural Style — the architectural style behind the endpoint shape.
