Speeding Up Web Page Loading

Critical render path, third-party blockers, what an API designer can give the front-end to win the LCP and CLS scores.

Concept Intermediate
10 min read
performance lcp cls web-vitals optimization

Summary#

Page loading is the sequence: DNS resolves → TCP connects → TLS handshakes → HTTP request → HTTP response → HTML parses → CSS loads → JS loads → JS executes → API calls fire → data renders. Every step has a latency cost and any step can block the next. The Critical Render Path (CRP) is the minimal subset that has to finish before the user sees pixels.

Google’s Core Web Vitals measure three observable properties of this path:

  • Largest Contentful Paint (LCP) — time until the largest visible element (hero image, headline, video) is painted. Target: < 2.5s. This is what “the page loaded” feels like to the user.
  • Cumulative Layout Shift (CLS) — how much the page jumps around after first paint. Target: < 0.1. Measured as the summed score of unexpected layout shifts within the worst burst of shifts (a session window), not over the whole page lifetime.
  • Interaction to Next Paint (INP) — replaced First Input Delay in 2024. Time from any user input to the next paint. Target: < 200ms. Measures responsiveness.

An API designer’s contribution to page loading is more direct than the usual front-end framing suggests. The API designer controls:

  • Payload size — compression, sparse fieldsets, fragments.
  • Round-trip count — batch endpoints, eager embedding, GraphQL.
  • Cacheability — Cache-Control and ETag headers, CDN headers, immutable assets.
  • Connection reuse — HTTP/2 multiplexing, keep-alive, server push (deprecated) and 103 Early Hints (its successor).
  • First-byte time — origin processing time, edge function placement.

The page-load conversation is half front-end work (bundling, lazy-loading, image formats) and half API work (payload, round-trips, caching). The senior API designer knows where the seams are.

Why it matters#

Three reasons page-load speed sits on the API designer’s plate:

  • LCP is dominated by data. The “largest contentful” element is almost always a product image, a hero text block, or a primary content payload. The image is served by the API (or the CDN behind it); the text block is rendered from JSON the API returned. A slow API call is a slow LCP, full stop.
  • CLS comes from missing dimensions and late-arriving content. An API that returns a list of items without telling the front-end how many there will be forces a layout reflow when the last item arrives. APIs that return the count up front (or paginate to a known page size) help; APIs that stream results of unknown length hurt.
  • Core Web Vitals are a ranking factor. Google promoted them into the search algorithm in 2021. A slow LCP affects search position. The front-end team is judged on numbers the API meaningfully shapes.

The senior signal in an interview: “I can give the front-end an LCP win by trimming the payload, embedding the hero data, and shipping Cache-Control with a CDN-tunable max-age.”

How it works#

The Critical Render Path — what blocks what#

DNS       TCP       TLS        HTTP req    HTTP resp
~20ms     1 RTT     1-2 RTT    1 RTT       first byte
 ▼         ▼         ▼          ▼           ▼
─────► connection established ─────► server processes ─────► bytes arrive

HTML parsing
    discover CSS, JS, images in head → fetch in parallel
    CSS arrives → render-blocked until CSSOM built
    JS arrives → parser-blocked unless async/defer
First paint (FCP)
    JS executes → fetches API data
    API responds → render
Largest contentful paint (LCP)
    Image lazy-loads → layout shift
    CLS accumulates

The blockers in order of impact:

  1. DNS resolution — 20-100ms cold. dns-prefetch and the user’s resolver cache hide it on warm connections.
  2. TCP handshake — 1 RTT. On a 50ms RTT, that’s 50ms. HTTP/2 reuses the connection; HTTP/3 (QUIC) runs over UDP and folds the transport and TLS handshakes into a single round trip.
  3. TLS handshake — 1-2 RTT for TLS 1.2, 1 RTT for TLS 1.3, 0 RTT for resumed TLS 1.3.
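  4. Server-side processing time (the origin’s share of TTFB) — the API designer’s number. From the first byte the server receives to the first byte it sends back; TTFB as the browser measures it also includes the DNS, TCP, and TLS steps above. Target: < 200ms for warm origins, < 800ms for cold serverless.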
  5. Payload size + download time — bytes / bandwidth. On a 10 Mbps connection, a 1 MB payload is 800ms.
  6. Render-blocking CSS/JS — synchronous <script> and stylesheets block parsing.
  7. Subsequent API calls — for CSR pages, the first paint is empty until the API responds.
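
The arithmetic behind these numbers can be sketched as a rough budget for one cold request. This is a back-of-the-envelope model, not a measurement tool: the function name, its parameters, and the example figures (50 ms RTT, 20 ms DNS, 200 ms server time) are assumptions chosen to match the list above, and it ignores TCP slow start and render-blocking resources.

```python
def first_response_ms(
    rtt_ms: float,
    dns_ms: float,
    tls_round_trips: int,
    server_ms: float,
    payload_bytes: int,
    bandwidth_mbps: float,
) -> float:
    """Rough time until one response is fully downloaded on a cold connection."""
    tcp_ms = rtt_ms                     # TCP handshake: 1 RTT
    tls_ms = tls_round_trips * rtt_ms   # 1-2 for TLS 1.2, 1 for TLS 1.3, 0 when resumed
    request_ms = rtt_ms                 # request out, first byte back
    # bits to transfer divided by bits per millisecond (1 Mbps = 1000 bits/ms)
    download_ms = payload_bytes * 8 / (bandwidth_mbps * 1000)
    return dns_ms + tcp_ms + tls_ms + request_ms + server_ms + download_ms


# 1 MB (decimal) over 10 Mbps is 800 ms of pure transfer time.
cold = first_response_ms(
    rtt_ms=50, dns_ms=20, tls_round_trips=1,
    server_ms=200, payload_bytes=1_000_000, bandwidth_mbps=10,
)
print(cold)  # 20 + 50 + 50 + 50 + 200 + 800 = 1170.0
```

In this model the payload download dominates everything else combined, which is why trimming and compression come first among the levers.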

What an API designer can do — the levers#

1. Trim the payload#

Many APIs return 5-10x more data than the page renders. Three trimming patterns:

# Sparse fieldset — client picks fields
GET /products/42?fields=id,name,price,heroImage
→ { "id": 42, "name": "...", "price": 4999, "heroImage": "..." }

# GraphQL — client picks fields by query
POST /graphql
{ "query": "{ product(id:42) { id name price heroImage } }" }

# Endpoint variant — pre-trimmed for a known use case
GET /products/42/card    # tiny payload for catalogue tiles
GET /products/42         # full payload for the product page

Pre-trimmed variants are the simplest; sparse fieldsets are the most flexible; GraphQL is the most powerful and the most operationally expensive.
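
On the server, a sparse fieldset can be as small as a filter over the serialised resource. The sketch below is one possible shape, not a standard: `select_fields` and the sample product are hypothetical, it only handles top-level fields, and it quietly ignores unknown names where a stricter API might answer 400.

```python
from typing import Optional


def select_fields(resource: dict, fields_param: Optional[str]) -> dict:
    """Return only the requested top-level fields; no parameter means the full resource."""
    if not fields_param:
        return resource
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    # Unknown field names are skipped here; rejecting them is an equally valid choice.
    return {name: resource[name] for name in wanted if name in resource}


product = {
    "id": 42,
    "name": "Desk lamp",
    "price": 4999,
    "heroImage": "/img/42.jpg",
    "description": "long marketing copy",
    "inventory": {"warehouse": "A", "count": 17},
}

# Equivalent of GET /products/42?fields=id,name,price,heroImage
card = select_fields(product, "id,name,price,heroImage")
```

The filter saves bytes on the wire; saving server work as well requires pushing the field list down into the query that loads the resource.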

2. Reduce round-trips#

Same idea as the data-fetching-patterns piece: batch endpoints, eager embedding, page-bundle endpoints for SSR. A page that makes 12 sequential API calls is slow regardless of how fast each call is; a page that makes 1-3 parallel calls hits LCP fast.

# Batch + embedded — one round-trip for the whole product page
GET /pages/product/42
→ {
    "product": { ... },
    "related": [ ... ],
    "reviews": { "items": [ ... ], "next": "abc" },
    "you_may_also_like": [ ... ]
  }
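
A page-bundle endpoint only pays off if the server gathers its parts concurrently rather than one after another. The following is a minimal sketch using `asyncio.gather`; the three `fetch_*` coroutines are hypothetical stand-ins for real database or service calls, with a sleep in place of I/O.

```python
import asyncio


async def fetch_product(product_id: int) -> dict:
    await asyncio.sleep(0.05)  # stand-in for a database or service call
    return {"id": product_id, "name": "Desk lamp"}


async def fetch_related(product_id: int) -> list:
    await asyncio.sleep(0.05)
    return [{"id": 7}, {"id": 9}]


async def fetch_reviews(product_id: int) -> dict:
    await asyncio.sleep(0.05)
    return {"items": [{"rating": 5}], "next": "abc"}


async def product_page_bundle(product_id: int) -> dict:
    """Assemble the payload for GET /pages/product/{id} with the lookups running concurrently."""
    product, related, reviews = await asyncio.gather(
        fetch_product(product_id),
        fetch_related(product_id),
        fetch_reviews(product_id),
    )
    return {"product": product, "related": related, "reviews": reviews}


bundle = asyncio.run(product_page_bundle(42))
```

Because the lookups overlap, the bundle costs roughly as much as its slowest part instead of the sum of all three.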

3. Compress#

Almost free. Every modern client supports gzip; brotli is supported by every modern browser and beats gzip by 15-25% for text. Set the API gateway to honour the client’s Accept-Encoding: br, gzip and reply with the matching Content-Encoding; this typically trims 60-80% of the bytes off JSON responses.
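
The effect is easy to check locally. The sketch below uses only the standard library, so it measures gzip rather than brotli (which needs a third-party package). The sample payload and the `pick_encoding` helper are illustrative assumptions, and the helper ignores q-values for brevity, which a production gateway should not.

```python
import gzip
import json


def pick_encoding(accept_encoding: str) -> str:
    """Prefer br, then gzip, else identity. Simplified: q-values are ignored."""
    offered = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    for candidate in ("br", "gzip"):
        if candidate in offered:
            return candidate
    return "identity"


# Repetitive JSON, typical of list endpoints, compresses well.
items = [{"id": i, "name": f"Product {i}", "price": 4999, "inStock": True} for i in range(200)]
raw = json.dumps({"items": items}).encode("utf-8")
compressed = gzip.compress(raw)

ratio = 1 - len(compressed) / len(raw)
print(f"{len(raw)} bytes raw, {len(compressed)} bytes gzipped ({ratio:.0%} smaller)")
```

The exact saving depends on how repetitive the payload is, so measure real responses before quoting a number.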

4. Cache aggressively where it’s safe#

Cache-Control is the API designer’s biggest LCP lever for repeat visits.

HTTP/1.1 200 OK
Cache-Control: public, max-age=300, s-maxage=3600, stale-while-revalidate=60
ETag: "abc123"
Vary: Accept-Encoding

  • public — CDN may cache.
  • max-age=300 — browser caches for 5 minutes.
  • s-maxage=3600 — CDN caches for 1 hour (overrides max-age for CDNs).
  • stale-while-revalidate=60 — serve stale up to 60s past expiry while revalidating in background.
  • ETag — conditional GET returns 304 Not Modified (no body) when the resource hasn’t changed.

User-specific responses use private + Vary: Cookie. Anonymous public endpoints use public and live in the CDN.
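
The conditional-GET half of this can be sketched in a few lines. This is an illustration under assumptions, not a framework API: `respond` is a hypothetical handler, the ETag is derived from a hash of the body (a version column would work as well), and the comparison handles a single strong validator only, not the lists or weak validators that If-None-Match also allows.

```python
import hashlib
import json
from typing import Dict, Optional, Tuple

CACHE_CONTROL = "public, max-age=300, s-maxage=3600, stale-while-revalidate=60"


def respond(resource: dict, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body), answering 304 when the client's copy is current."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    # A content hash is one simple way to derive an ETag.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"Cache-Control": CACHE_CONTROL, "ETag": etag, "Vary": "Accept-Encoding"}
    if if_none_match == etag:
        return 304, headers, b""  # no body: the cached copy is still valid
    return 200, headers, body


# First visit: full response. Revalidation with the returned ETag: empty 304.
status, headers, body = respond({"id": 42, "price": 4999}, None)
status_again, _, body_again = respond({"id": 42, "price": 4999}, headers["ETag"])
```

A 304 still costs a round-trip, so it complements max-age rather than replacing it.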

5. Use HTTP/2 or HTTP/3#

HTTP/1.1 head-of-line blocks; one slow request blocks the others on the same connection (the browser opens 6 connections per host to mitigate). HTTP/2 multiplexes — one connection, many parallel streams. HTTP/3 (QUIC) goes further — handshake in one RTT, no TCP head-of-line blocking.

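This is mostly an infrastructure-level decision (terminate TLS with an HTTP/2 or HTTP/3 frontend), but it’s the API designer’s job to know it’s on. Edge providers such as Cloudflare and AWS CloudFront offer HTTP/3, but whether it is enabled by default varies by provider and plan, so verify the setting rather than assume it.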

6. Ship 103 Early Hints#

A relatively recent HTTP status code (RFC 8297, published in 2017). The server sends 103 Early Hints with preload links before it sends the final response — letting the browser start fetching critical resources while the server is still processing.

HTTP/1.1 103 Early Hints
Link: </styles/main.css>; rel=preload; as=style
Link: </hero.jpg>; rel=preload; as=image; fetchpriority=high

HTTP/1.1 200 OK
Content-Type: text/html
...

Cloudflare, Fastly, and Vercel all support this. Real-world wins: 100-300ms LCP improvement on slow origins.
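
In most deployments the web server or CDN emits the 103 itself, so application code usually only supplies the list of critical resources. The sketch below just formats that list into the wire shape shown above; `preload_link` and `early_hints_head` are hypothetical helpers, not part of any server library.

```python
from typing import List, Optional


def preload_link(path: str, as_type: str, fetchpriority: Optional[str] = None) -> str:
    """Format one Link header value for a preload hint."""
    parts = [f"<{path}>", "rel=preload", f"as={as_type}"]
    if fetchpriority:
        parts.append(f"fetchpriority={fetchpriority}")
    return "; ".join(parts)


def early_hints_head(links: List[str]) -> str:
    """Render the 103 response head; the final response is sent separately afterwards."""
    lines = ["HTTP/1.1 103 Early Hints"] + [f"Link: {link}" for link in links]
    return "\r\n".join(lines) + "\r\n\r\n"


head = early_hints_head([
    preload_link("/styles/main.css", "style"),
    preload_link("/hero.jpg", "image", fetchpriority="high"),
])
```

Keeping the resource list next to the route definition makes it harder for the hints to drift from what the page actually loads.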

What’s outside the API designer’s box (mention but don’t claim)#

The front-end’s contributions are larger in aggregate:

  • Image format — WebP, AVIF beat JPEG by 30-50%. The CDN does the conversion.
  • Image sizing — srcset and sizes attributes let the browser pick the right resolution.
  • Lazy-loading — loading="lazy" on offscreen images.
  • Bundle splitting — code-split the JS bundle so the route’s bundle is small.
  • Third-party scripts — analytics, ads, A/B tests. Often the biggest blocker; defer or remove.
  • Font loading — font-display: swap to avoid invisible text.

The senior framing: API performance is necessary; front-end performance is the other half. The biggest LCP improvements come from coordinating both.

Core Web Vitals — by the numbers#

Metric              Good      Needs improvement   Poor
LCP                 < 2.5s    2.5-4.0s            > 4.0s
CLS                 < 0.1     0.1-0.25            > 0.25
INP                 < 200ms   200-500ms           > 500ms
TTFB (sub-metric)   < 800ms   800-1800ms          > 1800ms

Measured at the 75th percentile of real-user traffic. Google’s Chrome User Experience Report (CrUX) collects this anonymously from Chrome users and feeds it into the search-ranking signal.
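
To make the 75th-percentile rule concrete, here is a small sketch that rates a set of field samples against the thresholds in the table. The sample values are invented for illustration, the nearest-rank percentile is a simplifying assumption rather than CrUX's documented method, and values exactly on a boundary are treated as the better rating.

```python
import math
from typing import List

# (upper bound for "good", upper bound for "needs improvement") per metric.
THRESHOLDS = {
    "LCP": (2500, 4000),   # milliseconds
    "CLS": (0.1, 0.25),    # unitless score
    "INP": (200, 500),     # milliseconds
    "TTFB": (800, 1800),   # milliseconds
}


def p75(samples: List[float]) -> float:
    """75th percentile by the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def rate(metric: str, value: float) -> str:
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


lcp_samples_ms = [1200, 1800, 2100, 2400, 2600, 3100, 3900, 5200]
lcp_p75 = p75(lcp_samples_ms)
print(lcp_p75, rate("LCP", lcp_p75))  # 3100 needs improvement
```

Half of these samples are under 2.5s, yet the page still rates "needs improvement", because the 75th percentile is set by the slower quarter of visits.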

Variants and trade-offs#

Optimise for first paint. SSR + page-bundle endpoint + heavy CDN caching + 103 Early Hints. The full payload arrives in one round-trip; no JS blocks on data; LCP wins.

Optimise for repeat visits and interactivity. CSR with a service worker, aggressive client-side caching, prefetched fragment endpoints, code-split bundles. First paint is slower; subsequent navigation is near-instant.

The trade-offs that matter for the API designer:

Lever                             Win                                      Cost
Trim payload (sparse fieldsets)   Smaller bytes, faster LCP                Per-endpoint variant or flexible query layer
Embed related (eager)             Fewer round-trips                        Larger payload, server work
Cache-Control: public + CDN       Near-zero latency on repeats             Cache invalidation complexity
stale-while-revalidate            No user-visible refresh latency          Briefly stale data
HTTP/2 multiplexing               No per-host connection limit             Server must terminate H2 (usually free)
HTTP/3 (QUIC)                     One-RTT handshake                        Edge support; older clients fall back to H2
103 Early Hints                   Browser preloads while origin computes   Origin must know critical resources up front
gzip / brotli                     60-80% byte reduction                    CPU cost (small)

When this is asked in interviews#

Page-load speed comes up in three places in API-design interviews:

  • Anywhere consumer-facing performance matters — e-commerce, news, streaming, search. The interviewer asks “how do we make the page load fast?” The senior answer is structured: trim the payload, reduce round-trips, cache at the CDN, use HTTP/2 or HTTP/3, use 103 Early Hints if the platform supports it. Then hand it to the front-end for the rest.
  • In any LCP / Core Web Vitals discussion. Name the LCP target (< 2.5s), the CLS target (< 0.1), the INP target (< 200ms). Explain what the API can do for each.
  • In any “our page is slow on cellular” question. Senior answer focuses on round-trips (cellular RTT is high), payload size (bandwidth is low), and CDN proximity (long-haul TLS handshake is the killer).

Specific points to make:

  • Name the LCP / CLS / INP triad. Reference Google as the source.
  • Name the API levers explicitly. Compression, sparse fieldsets, batch endpoints, Cache-Control, HTTP/2, 103 Early Hints.
  • Distinguish API performance from front-end performance. API is half the story; image format, bundle splitting, third-party scripts are the other half. Don’t claim ownership of what’s not yours.
  • Tie cacheability to the resource shape. Public catalogue: CDN, s-maxage=3600. User-specific: private, short TTL. Personalised first-paint SSR: Vary: Cookie, careful CDN config.

The strongest one-liner: “Trim the payload, reduce round-trips, cache at the CDN, run on HTTP/2 or HTTP/3, ship 103 Early Hints. The rest is the front-end team’s call — and the LCP number is theirs and mine.”
