Design a File Service API

Upload, download, range requests, multipart, signed URLs, resumability. The S3-shaped contract every backend reinvents.

System Intermediate
16 min read
file-service api-design multipart-upload signed-urls

Context#

A file service is the second-most reinvented backend in our industry, after auth. Every product eventually needs to accept user uploads, store the bytes durably, hand back a URL the browser can render, and stream the file out again — sometimes from byte offset 1.2 GB into a 4 GB video. The shape of this contract has been stable since Amazon S3 launched in 2006 and every cloud has copied it: GCS, Azure Blob, R2, Backblaze B2 all expose the same primitives.

So why is this a fresh interview question? Because the API surface is wider than it looks:

  • Small files (<= 5 MB) want a single-shot upload — one request, done.
  • Large files want multipart — chunked, parallel, resumable.
  • Browsers downloading a 2 GB video want range requests, not the whole body.
  • Mobile clients on flaky networks want resumable uploads — pick up after a TCP reset.
  • Third parties want signed URLs — time-bounded, capability-style access without sharing credentials.

The interviewer’s hidden objectives:

  • Can you separate the metadata plane from the data plane — the small-fast API from the large-streaming one?
  • Do you know when to redirect to object storage versus stream through your service?
  • Can you specify a resumable-upload protocol without inventing one from scratch?
  • Can you produce a signed-URL design that’s secure by default (short TTL, signed path, no leaked keys)?
  • Can you defend the multipart threshold and the part size with numbers?

S3 is the reference implementation; this writeup uses its vocabulary directly.

Requirements (functional and non-functional)#

Functional — in scope:

  • Upload a file (single-shot for small, multipart for large).
  • Download a file (full body or a byte range).
  • Resume an interrupted upload from the last completed part.
  • Generate a signed URL for direct upload or download with a TTL.
  • Delete a file. List a user’s files with cursor pagination.
  • Query file metadata (size, content-type, checksum, created-at).

Functional — out of scope:

  • Image transforms / resizing — separate service.
  • Virus scanning — async pipeline, not part of the synchronous API.
  • Folder hierarchies — flat namespace; clients model folders with key prefixes.
  • Versioning — a follow-up axis; v1 is “last write wins”.

Non-functional:

  • Latency: metadata calls <= 200 ms p95; data calls bounded by network and content size, served via CDN.
  • Durability: 11 nines (S3-equivalent) via the underlying object store.
  • Availability: 99.99% on the read path, 99.9% on the write path.
  • Throughput: 100 GB/s aggregate read across the fleet.
  • Maximum file size: 5 TB (matches S3’s per-object cap).
  • Signed-URL TTL: 1 minute to 7 days; default 15 minutes.

Use case diagram#

                 ┌──────────────┐
                 │   End user   │
                 └──────┬───────┘
        ┌───────────────┼──────────────────┐
        ▼               ▼                  ▼
  [upload file]  [download file]  [share signed URL]
        │               │                  │
        ▼               ▼                  ▼
    ┌──────────────────────────────────────────┐
    │             File Service API             │
    └─────────┬─────────────────────┬──────────┘
              │                     │
              ▼                     ▼
        ┌────────────┐      ┌──────────────┐
        │ Metadata DB│      │ Object store │
        └────────────┘      │   (S3/GCS)   │
                            └──────────────┘
                              ┌─────────┐
                              │   CDN   │ … edge cache for hot reads
                              └─────────┘

Two actors: end user and a third party who receives a signed URL. The third party never authenticates against the service — the signature in the URL is the capability.

Class diagram#

┌───────────────────────┐
│ FileService           │
├───────────────────────┤
│ initUpload(req): File │
│ uploadPart(req): Part │
│ completeUpload(req)   │
│ resumeUpload(req)     │
│ download(id, range?)  │
│ signedURL(req): URL   │
└──────────┬────────────┘
           │ owns
┌───────────────────────┐         ┌─────────────────────┐
│ File                  │ 1 ─── * │ Upload              │
├───────────────────────┤         ├─────────────────────┤
│ id, owner_id          │         │ upload_id           │
│ name, content_type    │         │ status              │
│ size, sha256          │         │ part_size           │
│ created_at            │         │ total_parts         │
│ object_key            │         │ completed_parts[]   │
└───────────────────────┘         │ expires_at          │
           │                      └─────────────────────┘
┌───────────────────────┐         ┌─────────────────────┐
│ SignedURL             │         │ Part                │
├───────────────────────┤         ├─────────────────────┤
│ url                   │         │ part_number         │
│ expires_at            │         │ etag                │
│ method (GET|PUT)      │         │ size                │
│ max_bytes             │         │ uploaded_at         │
└───────────────────────┘         └─────────────────────┘

File is the metadata-plane entity; the bytes themselves live in the object store keyed by object_key. Upload tracks an in-flight multipart upload — once completeUpload runs, the Upload row is deleted and only the File remains. SignedURL is ephemeral — never persisted; generated on demand and verified by signature.

Sequence diagram (key flows)#

The multipart upload flow — the discriminating one:

Client              FileAPI        MetaDB              ObjectStore
│ POST /files (init)│              │                   │
│──────────────────►│              │                   │
│                   │ insert Upload│                   │
│                   │─────────────►│                   │
│                   │ create MPU   │                   │
│                   │─────────────────────────────────►│
│ upload_id, part_size             │                   │
│◄──────────────────│              │                   │
│                                                      │
│ PUT /files/{id}/parts/1 (5MB)                        │
│─────────────────────────────────────────────────────►│
│ etag_1                                               │
│◄─────────────────────────────────────────────────────│
│ PUT /files/{id}/parts/2 (5MB) ─── in parallel ──────►│
│─────────────────────────────────────────────────────►│
│ etag_2                                               │
│◄─────────────────────────────────────────────────────│
│ ... (N parts in parallel) ...                        │
│                                                      │
│ POST /files/{id}/complete                            │
│ body: [{part:1, etag:e1}, {part:2, etag:e2}, ...]    │
│──────────────────►│              │                   │
│                   │ completeMPU  │                   │
│                   │─────────────────────────────────►│
│                   │ object_key, sha256               │
│                   │◄─────────────────────────────────│
│                   │ update File  │                   │
│                   │─────────────►│                   │
│ 200 OK + File resource           │                   │
│◄──────────────────│              │                   │
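
Before the API forwards completeMPU in the flow above, it has to decide whether the client's part list is acceptable. The following is a minimal sketch of that check, assuming the service recorded each part's ETag when the part landed; the names validate_complete and StoredPart, and the choice to raise ValueError, are illustrative and not part of the contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredPart:
    part_number: int
    etag: str


def validate_complete(total_parts, stored, claimed):
    """Check the `parts` array of a complete request.

    stored  : dict mapping part_number -> etag recorded by the service
    claimed : list of {"part_number": int, "etag": str} sent by the client
    Raises ValueError with a reason the API could map to a 4xx response.
    """
    numbers = [p["part_number"] for p in claimed]
    # Every part 1..N must appear exactly once, in ascending order.
    if numbers != list(range(1, total_parts + 1)):
        raise ValueError("parts must be 1..N, each exactly once, ascending")
    result = []
    for p in claimed:
        n, etag = p["part_number"], p["etag"]
        # A missing or different ETag means the client is completing with
        # bytes the service never stored under that part number.
        if stored.get(n) != etag:
            raise ValueError(f"etag mismatch or missing part {n}")
        result.append(StoredPart(n, etag))
    return result
```

Rejecting gaps and mismatched ETags here keeps a half-uploaded object from ever becoming a visible File.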

The range download flow is simpler — the gateway proxies to the object store with the Range header forwarded, and the response carries 206 Partial Content:

Client        FileAPI / CDN            ObjectStore
│ GET /files/{id}  │                       │
│ Range: bytes=1048576-2097151             │
│─────────────────►│                       │
│                  │ check auth, signature │
│                  │ GET object Range:...  │
│                  │──────────────────────►│
│                  │ 206 Partial Content   │
│                  │ Content-Range: ...    │
│                  │◄──────────────────────│
│ 206 + bytes      │                       │
│◄─────────────────│                       │

The signed-URL flow takes the API out of the data path entirely — client uploads directly to the object store:

Client              FileAPI             ObjectStore
│ POST /files/{id}/signed-url           │
│ method: PUT, ttl: 900                 │
│──────────────────►│                   │
│ signed URL        │                   │
│◄──────────────────│                   │
│                                       │
│ PUT <signed-url> body=bytes ─────────►│
│                                       │
│ 200 OK ◄──────────────────────────────│
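
The URL minted in this flow can be sketched as an HMAC over the request's method, path, expiry, size cap, and user. The code below is a minimal illustration under stated assumptions: SECRET, the query parameter names (expires, max_bytes, user_id, sig), and the newline-joined canonical string are all hypothetical choices, and a managed object store would use its own presigning scheme instead.

```python
from __future__ import annotations

import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical; a real service loads this from a secret store


def _signature(method: str, path: str, expires: int, max_bytes: int, user_id: str) -> str:
    # Canonical string: fixed field order, one field per line.
    msg = "\n".join([method, path, str(expires), str(max_bytes), user_id]).encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def sign_url(base: str, method: str, path: str, user_id: str,
             ttl_seconds: int = 900, max_bytes: int = 0, now: int | None = None) -> str:
    expires = (int(time.time()) if now is None else now) + ttl_seconds
    query = urlencode({
        "expires": expires,
        "max_bytes": max_bytes,
        "user_id": user_id,
        "sig": _signature(method, path, expires, max_bytes, user_id),
    })
    return f"{base}{path}?{query}"


def verify_url(method: str, url: str, now: int | None = None) -> bool:
    parsed = urlparse(url)
    q = {k: v[0] for k, v in parse_qs(parsed.query).items()}
    try:
        expires, max_bytes = int(q["expires"]), int(q["max_bytes"])
        expected = _signature(method, parsed.path, expires, max_bytes, q["user_id"])
        supplied = q["sig"]
    except (KeyError, ValueError):
        return False
    current = int(time.time()) if now is None else now
    # Constant-time comparison, then the expiry check.
    return hmac.compare_digest(supplied.encode(), expected.encode()) and current < expires
```

Because the method and max_bytes are inside the signed string, a URL minted for a bounded PUT cannot be replayed as a GET or stretched to a larger upload.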

Activity diagram (for non-trivial state)#

The Upload resource has a meaningful state machine — most of the interview is defending this picture:

            [client: POST /files]
             ┌─────────────────┐
             │     Pending     │
             │ (init complete, │
             │  no parts yet)  │
             └────────┬────────┘
                      │ first part uploaded
             ┌─────────────────┐
             │   InProgress    │
             │  (≥ 1 part OK,  │
             │   < N parts)    │
             └────────┬────────┘
                │     │     │
          abort │     │     │ TTL elapses
     ┌──────────┘     │     └──────────┐
     ▼                ▼                ▼
┌─────────┐   ┌───────────────┐   ┌──────────┐
│ Aborted │   │   Completed   │   │  Failed  │
│ (DELETE)│   │ (all parts +  │   │ (expired │
└─────────┘   │  complete OK) │   │  no GC)  │
              └───────────────┘   └──────────┘
               [File visible]

One terminal-success state (Completed) and two terminal-failure states (Aborted and Failed). The garbage-collector job is critical: an abandoned InProgress upload holds storage but no File exists yet; after 7 days the GC aborts the underlying multipart upload to reclaim bytes. S3’s lifecycle rules expose this directly as AbortIncompleteMultipartUpload.
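
The state machine can be written down as a transition table, which makes illegal moves easy to reject. This is a sketch only: the event names are invented for illustration, and letting a Pending upload abort or expire is an assumption that goes beyond what the diagram draws.

```python
from enum import Enum


class UploadState(Enum):
    PENDING = "Pending"
    IN_PROGRESS = "InProgress"
    COMPLETED = "Completed"
    ABORTED = "Aborted"
    FAILED = "Failed"


S = UploadState

# event -> {state the event is legal in: resulting state}
TRANSITIONS = {
    "part_uploaded": {S.PENDING: S.IN_PROGRESS, S.IN_PROGRESS: S.IN_PROGRESS},
    "complete": {S.IN_PROGRESS: S.COMPLETED},
    # Assumption: an upload with no parts yet can also be aborted or expire.
    "abort": {S.PENDING: S.ABORTED, S.IN_PROGRESS: S.ABORTED},
    "ttl_elapsed": {S.PENDING: S.FAILED, S.IN_PROGRESS: S.FAILED},
}


def apply(state: UploadState, event: str) -> UploadState:
    """Return the next state, or raise ValueError for an illegal transition."""
    try:
        return TRANSITIONS[event][state]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} from {state.value}") from None
```

Terminal states have no outgoing entries, so a late part upload or a second complete call against a finished upload is rejected rather than silently accepted.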

API implementation#

Endpoint catalogue#

Method  Path                                Purpose
POST    /v1/files                           Initiate upload; returns upload_id and part_size
PUT     /v1/files/{id}/parts/{n}            Upload one part (multipart only)
POST    /v1/files/{id}/complete             Finalise multipart upload
POST    /v1/files/{id}/uploads:resume       Resume an interrupted upload; returns next part number
DELETE  /v1/files/{id}/uploads/{upload_id}  Abort a multipart upload
GET     /v1/files/{id}                      Download (full or Range-windowed)
GET     /v1/files/{id}/metadata             Metadata only; no bytes
POST    /v1/files/{id}/signed-url           Generate signed URL
DELETE  /v1/files/{id}                      Delete file
GET     /v1/files                           List files; cursor pagination

The single-shot upload path is POST /v1/files with Content-Type: multipart/form-data and the body inline. For payloads of 5 MB or less this skips the part/complete dance.
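
The catalogue's list endpoint promises cursor pagination without saying what the cursor contains. One common shape is a keyset cursor: an opaque token that encodes the sort key of the last row returned. The sketch below assumes files are ordered by (created_at, id) and that created_at is an ISO 8601 UTC string; encode_cursor, decode_cursor, and list_page are hypothetical names, and the in-memory sort stands in for a database query.

```python
import base64
import json


def encode_cursor(created_at, file_id):
    raw = json.dumps({"c": created_at, "i": file_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor):
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["c"], data["i"]


def list_page(rows, limit, cursor=None):
    """Return (page, next_cursor); next_cursor is None on the last page.

    In a database this would be roughly:
    WHERE (created_at, id) > (:c, :i) ORDER BY created_at, id LIMIT :limit
    """
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]))
    if cursor is not None:
        after = decode_cursor(cursor)
        ordered = [r for r in ordered if (r["created_at"], r["id"]) > after]
    page = ordered[:limit]
    has_more = len(ordered) > limit
    next_cursor = encode_cursor(page[-1]["created_at"], page[-1]["id"]) if has_more else None
    return page, next_cursor
```

Unlike offset pagination, this stays stable when files are inserted or deleted between page fetches, because each page resumes from a key rather than a position.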

Multipart threshold#

The 5 MB threshold matches S3’s minimum part size (5 MiB, which applies to every part except the last). Below it, multipart adds round trips without any parallelism benefit; above it, parallel parts win on throughput. The server tells the client which mode applies in the init response, so clients should not hardcode the threshold.
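
The init handler's mode and part-size decision can be sketched with integer arithmetic. The constants below assume S3's published limits (5 MiB minimum part size, at most 10,000 parts per upload); plan_upload and the rounding to whole MiB are illustrative choices, not part of the contract.

```python
MIB = 1024 * 1024
THRESHOLD = 5 * MIB   # at or below this size, single-shot
MIN_PART = 5 * MIB    # S3 minimum part size (all parts except the last)
MAX_PARTS = 10_000    # S3 maximum number of parts per multipart upload


def ceil_div(a, b):
    return -(-a // b)


def plan_upload(size):
    """Return the mode, part_size, and total_parts for a declared file size."""
    if size <= 0:
        raise ValueError("size must be positive")
    if size <= THRESHOLD:
        return {"mode": "single_shot", "part_size": None, "total_parts": None}
    # Smallest part size that keeps the count within MAX_PARTS, rounded up
    # to a whole MiB and never below the 5 MiB floor.
    part_size = max(MIN_PART, ceil_div(ceil_div(size, MAX_PARTS), MIB) * MIB)
    return {"mode": "multipart", "part_size": part_size,
            "total_parts": ceil_div(size, part_size)}
```

Under these assumptions a 100 MiB file gets twenty 5 MiB parts, while a 5 TiB file cannot use 5 MiB parts at all: the 10,000-part cap forces parts of 525 MiB, which is the kind of number the interviewer wants defended.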

OpenAPI schema (excerpt)#

OpenAPI 3.1 — File API (init + complete + signed-URL)
paths:
  /v1/files:
    post:
      operationId: initUpload
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [name, size, content_type]
              properties:
                name: { type: string, maxLength: 1024 }
                size: { type: integer, minimum: 1, maximum: 5497558138880 }
                content_type: { type: string }
                sha256: { type: [string, 'null'] }
      responses:
        '201':
          description: Upload initiated
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UploadInit'
  /v1/files/{id}/complete:
    post:
      operationId: completeUpload
      parameters:
        - { name: id, in: path, required: true, schema: { type: string } }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [upload_id, parts]
              properties:
                upload_id: { type: string }
                parts:
                  type: array
                  items:
                    type: object
                    required: [part_number, etag]
                    properties:
                      part_number: { type: integer, minimum: 1 }
                      etag: { type: string }
      responses:
        '200':
          description: File finalised
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/File'
  /v1/files/{id}/signed-url:
    post:
      operationId: signedURL
      parameters:
        - { name: id, in: path, required: true, schema: { type: string } }
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [method, ttl_seconds]
              properties:
                method:
                  type: string
                  enum: [GET, PUT]
                ttl_seconds:
                  type: integer
                  minimum: 60
                  maximum: 604800
                  default: 900
                max_bytes:
                  type: [integer, 'null']
      responses:
        '200':
          description: Signed URL
          content:
            application/json:
              schema:
                type: object
                required: [url, expires_at]
                properties:
                  url: { type: string, format: uri }
                  expires_at: { type: string, format: date-time }
components:
  schemas:
    UploadInit:
      type: object
      required: [file_id, upload_id, mode]
      properties:
        file_id: { type: string }
        upload_id: { type: string }
        mode:
          type: string
          enum: [single_shot, multipart]
        part_size: { type: [integer, 'null'] }
        total_parts: { type: [integer, 'null'] }
    File:
      type: object
      required: [id, owner_id, size, content_type, sha256]
      properties:
        id: { type: string }
        owner_id: { type: string }
        name: { type: string }
        size: { type: integer }
        content_type: { type: string }
        sha256: { type: string }
        created_at: { type: string, format: date-time }

Range request — raw HTTP#

The wire-level range download is plain HTTP/1.1; nothing custom:

GET /v1/files/abc123 HTTP/1.1
Host: api.example.com
Authorization: Bearer eyJhbGciOi...
Range: bytes=1048576-2097151

HTTP/1.1 206 Partial Content
Content-Type: video/mp4
Content-Range: bytes 1048576-2097151/4194304000
Content-Length: 1048576
ETag: "9b4f7e..."
Accept-Ranges: bytes

<binary>
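
On the server side, the exchange above comes down to resolving the Range header against the object size and echoing the result in Content-Range. Here is a minimal sketch for single-range requests; resolve_range and partial_headers are hypothetical helper names. It returns None for both malformed and unsatisfiable headers, which is a simplification: a stricter server would answer 416 only for the unsatisfiable case and could ignore a malformed header and send the full body with 200.

```python
import re

_RANGE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(header, size):
    """Return inclusive (start, end) for a single byte range, or None.

    Handles bytes=a-b, bytes=a- (open-ended) and bytes=-n (last n bytes).
    Multi-range requests (comma-separated) are not handled in this sketch.
    """
    m = _RANGE.match(header.strip())
    if not m or size <= 0:
        return None
    first, last = m.group(1), m.group(2)
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: the final n bytes of the object.
        n = int(last)
        if n == 0:
            return None
        return max(0, size - n), size - 1
    start = int(first)
    # An end past the last byte is clamped to the object size.
    end = size - 1 if last == "" else min(int(last), size - 1)
    if start >= size or start > end:
        return None
    return start, end


def partial_headers(start, end, size):
    """Headers for a 206 response covering bytes start..end inclusive."""
    return {
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(end - start + 1),
        "Accept-Ranges": "bytes",
    }
```

Feeding it the request shown above reproduces the same Content-Range and Content-Length values as the raw response.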

Client sample (Python)#

The init-multipart-complete dance is the discriminating client-side flow. The sample covers init, part upload, and complete. It sends parts one after another for clarity; production clients parallelise the parts.

Multipart upload — Python
import hashlib
import os

import requests

API = "https://api.example.com"
TOKEN = "Bearer eyJhbGciOi..."
HEADERS = {"Authorization": TOKEN}


def upload(path, content_type):
    # Hash in 1 MiB blocks so a multi-GB file is never held in memory at once.
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            sha.update(block)

    # 1. init: the server picks single_shot or multipart and the part size
    resp = requests.post(
        f"{API}/v1/files",
        json={"name": os.path.basename(path), "size": os.path.getsize(path),
              "content_type": content_type, "sha256": sha.hexdigest()},
        headers=HEADERS,
    )
    resp.raise_for_status()
    init = resp.json()
    base = f"{API}/v1/files/{init['file_id']}"

    # 2. upload the bytes
    with open(path, "rb") as f:
        if init["mode"] == "single_shot":
            # Small file: one PUT streams the whole body.
            requests.put(f"{base}/parts/1", data=f, headers=HEADERS).raise_for_status()
            parts = [{"part_number": 1, "etag": "single"}]
        else:
            parts = []
            # Sequential for clarity; read one part at a time from disk.
            for i, chunk in enumerate(iter(lambda: f.read(init["part_size"]), b""), 1):
                r = requests.put(f"{base}/parts/{i}", data=chunk, headers=HEADERS)
                r.raise_for_status()
                parts.append({"part_number": i, "etag": r.headers["ETag"]})

    # 3. complete: hand back the ordered part list
    resp = requests.post(
        f"{base}/complete",
        json={"upload_id": init["upload_id"], "parts": parts},
        headers=HEADERS,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(upload("./video.mp4", "video/mp4"))

Latency budget#

Phase                        Budget                          Notes
POST /v1/files (init)        120 ms p95                      DB write + metadata row
PUT .../parts/{n}            bytes ÷ link bandwidth + 30 ms  Direct-to-object-store
POST .../complete            200 ms p95                      Validates all parts, computes final ETag
GET /v1/files/{id}/metadata  50 ms p95                       Single DB read; cacheable
GET /v1/files/{id} (range)   100 ms TTFB                     CDN edge for hot objects
POST .../signed-url          20 ms                           Pure crypto + DB read for ACL

The metadata plane sits behind the 200 ms p95 ceiling. The data plane is governed by content size and the CDN; we don’t promise an SLO on data-plane throughput beyond “1 Gbps per connection”.
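
The part-upload row of the budget is a formula rather than a number, so a worked example helps. The sketch below evaluates bytes ÷ link bandwidth + 30 ms for one part; the function name is illustrative, and the model assumes the link is the only bottleneck.

```python
MIB = 1024 * 1024


def part_upload_seconds(part_bytes, link_bits_per_second, overhead_seconds=0.030):
    """Transfer time for one part: payload bits over the link plus fixed overhead."""
    return part_bytes * 8 / link_bits_per_second + overhead_seconds


# One 5 MiB part on a 100 Mbit/s link versus a 1 Gbit/s link.
slow = part_upload_seconds(5 * MIB, 100e6)
fast = part_upload_seconds(5 * MIB, 1e9)
print(f"{slow:.3f} s at 100 Mbit/s, {fast:.3f} s at 1 Gbit/s")
```

At 100 Mbit/s the part takes about 0.45 s and the 30 ms overhead is noise; at 1 Gbit/s it takes about 0.07 s and the fixed overhead is over 40% of the total, which is one argument for larger parts on fast links.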

Trade-offs and extensions#

Decision                          Why                                                  Cost if requirements change
Two planes (metadata + data)      Hot reads bypass the app                             Adds operational surface: two storage systems
5 MB multipart threshold          Matches S3; below this, single-shot wins             Smaller chunks would saturate metadata DB
Signed URLs default 15 min TTL    Short enough to revoke, long enough for big uploads  Longer TTLs leak capability if logs are stolen
sha256 in metadata, not enforced  Trust client; verify on read                         Re-hashing 5 TB server-side is prohibitive
Direct upload via signed URL      Removes API from data path                           Loses opportunity to scan / transform inline
Flat namespace, no folders        Matches S3, GCS, Azure                               UX layer must reconstruct hierarchy from prefixes
7-day TTL on incomplete uploads   Bounds storage cost                                  Long-running multi-day uploads need a re-init

A few cleaner contrasts:

Stream through the API

  • One auth model
  • Inline virus scan, transforms, watermark
  • Easy to throttle per-user
  • Caps at API tier bandwidth
  • 100% of bytes traverse your network

Signed-URL direct upload

  • Client uploads to S3 directly
  • API only mints a URL
  • Scales to object-store bandwidth
  • Inline scanning needs an async callback
  • Lower egress cost

Most production designs run both — the signed-URL path for known-large content (videos, dataset uploads); the proxy path for small content that benefits from inline processing.

Likely follow-up extensions:

  • Versioning. Add a version_id to File. Each upload to the same name creates a new version; delete becomes a tombstone. Same shape as S3 Versioning.
  • Resumable uploads on flaky mobile networks. Already covered by POST .../uploads:resume; client calls it after reconnect and learns which parts succeeded. Comparable in spirit to the open tus protocol and to Google Cloud Storage’s resumable uploads.
  • Cross-region replication. Asynchronous backplane; same API contract; the region field in File becomes a list.
  • Lifecycle rules. A lifecycle config per “bucket” — auto-delete after N days, move to cold storage after M days. Mirrors S3 lifecycle config.
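
For the resume extension above, the client's work after reconnecting is to compute which parts are still missing and which byte ranges they cover. A minimal sketch, assuming the resume response lets the client learn the set of completed part numbers; missing_parts and part_byte_range are hypothetical helpers.

```python
def missing_parts(total_parts, completed):
    """Part numbers (1-based) that still need uploading, in ascending order."""
    return [n for n in range(1, total_parts + 1) if n not in completed]


def part_byte_range(part_number, part_size, file_size):
    """Half-open byte range [start, end) of a part within the source file."""
    start = (part_number - 1) * part_size
    end = min(start + part_size, file_size)
    return start, end
```

With parallel uploads the completed set can have gaps (parts 1, 2, and 4 done, part 3 lost to a TCP reset), so a set of completed parts is safer to resume from than a single next part number.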

Mock interview follow-ups#

  • “What happens if part 3 fails after part 2 succeeded?” — The client retries part 3; idempotency by part number. Once all parts have an ETag, complete is called. If the client never returns, the GC sweep aborts the upload after 7 days.
  • “How do you stop someone from uploading a 5 TB file just to fill your bucket?” — Per-user quota checked at init time; the declared size is the authoritative cap; if the upload exceeds it, parts beyond the cap are rejected with HTTP 413.
  • “What’s stopping me from forging a signed URL?” — HMAC of (method, path, expiry, max_bytes, user_id) with a server-side secret. Tampering with any field breaks the signature.
  • “How do you serve a 50 GB ML dataset to 10k researchers?” — Origin-pull CDN on GET /v1/files/{id}; the CDN handles range requests and dedup at the edge. The API only authorises the first request per signature.
  • “Why not gRPC?” — HTTP is the lingua franca of object storage; every browser and CDN speaks it; range requests and resumable uploads are first-class. gRPC would force you to reinvent CDN integration.
  • “How do you handle deletes that need to survive an accidental rm?” — Soft delete: set deleted_at, hide from list, leave the object intact for 30 days. A reaper job purges after the retention window. Similar in intent to S3 versioning delete markers combined with a lifecycle expiration rule, with MFA Delete as a further guard on permanent deletion.
  • “What about end-to-end encryption?” — Client-side encryption: the client encrypts before upload using a key the API never sees. The API stores ciphertext + IV + a wrapped DEK. The sha256 in metadata is over the ciphertext for integrity.
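
The quota answer above implies two checks: one at init against the user's quota and one on each part against the declared size. The sketch below is illustrative; the exception names and the bookkeeping of per-part sizes in a dict are assumptions, and the status returned for a quota failure is a design choice the writeup leaves open.

```python
class QuotaExceeded(Exception):
    """Raised at init time; the API maps it to a 4xx of its choosing."""


class PartTooLarge(Exception):
    """Raised on a part upload; the API maps it to HTTP 413."""


def check_init(declared_size, used_bytes, quota_bytes):
    # The declared size is reserved up front, before any bytes arrive.
    if used_bytes + declared_size > quota_bytes:
        raise QuotaExceeded(f"{declared_size} bytes would exceed the quota")


def check_part(declared_size, part_sizes, part_number, part_len):
    """part_sizes maps part_number -> bytes already stored for this upload."""
    # Exclude any earlier copy of this part so a retry is not double-counted.
    others = sum(size for n, size in part_sizes.items() if n != part_number)
    if others + part_len > declared_size:
        raise PartTooLarge(f"part {part_number} pushes the upload past {declared_size} bytes")
```

Counting a retried part once keeps the idempotent-retry behaviour from the first follow-up compatible with the size cap.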