Remote Procedure Calls (RPC)
Network abstractions over function-call semantics: gRPC, Thrift, REST-as-RPC, the leaks they hide and don't.
Summary
A Remote Procedure Call dresses a network round-trip in the syntax of a function call. The promise — user.get(id) looks the same whether it’s local or remote — is a useful illusion, until it isn’t. Every serious distributed-systems failure mode (partial failure, timeout ambiguity, retries, ordering) lives in the gap between the illusion and the wire.
Why it matters
RPC is the default vocabulary for service-to-service communication in modern systems, so the interviewer assumes you understand its leaks. The leaks aren’t bugs; they’re the eight fallacies of distributed computing (the network is reliable, latency is zero, bandwidth is infinite…) restated as a critique of the abstraction itself.
If you can name which specific guarantees of a local function call don’t carry across the wire — and what your RPC framework does about each — you’ve already cleared the bar for an SDE-2 system-design loop.
How it works
Every RPC framework, regardless of branding, is three layers stacked:
- Interface Definition Language (IDL). A .proto (gRPC), .thrift (Thrift), or OpenAPI schema. Describes services, methods, request/response types. Generates client and server stubs in N languages.
- Serialization format. Protobuf, Thrift binary, JSON, MessagePack, Avro. Determines wire size, schema evolution rules, and CPU cost of encode/decode.
- Transport. HTTP/1.1, HTTP/2 (gRPC’s default), QUIC/HTTP/3, raw TCP. Determines multiplexing, head-of-line blocking, and what middleware (proxies, load balancers, observability) can inspect.
A call on the wire is: encode arguments → send over transport → server decodes → executes handler → encodes response → returns. The client stub hides this behind a method signature. Generated stubs handle connection pooling, retries (sometimes), deadlines, and metadata propagation.
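The round trip above can be sketched in a few lines. This is a minimal in-process illustration, not any framework's real API: `HANDLERS`, `rpc_method`, `server_dispatch`, `UserClient`, and `RpcError` are hypothetical names, JSON stands in for the serialization format, and the "transport" is a plain function call rather than a socket.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical server-side registry: method name -> handler function.
HANDLERS: Dict[str, Callable[..., Any]] = {}


def rpc_method(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function as a remotely callable method."""
    HANDLERS[fn.__name__] = fn
    return fn


@rpc_method
def get_user(user_id: int) -> dict:
    return {"id": user_id, "name": f"user-{user_id}"}


def server_dispatch(request_bytes: bytes) -> bytes:
    """Server side: decode request -> execute handler -> encode response."""
    request = json.loads(request_bytes.decode("utf-8"))
    handler = HANDLERS.get(request["method"])
    if handler is None:
        response = {"ok": False, "error": f"unknown method {request['method']}"}
    else:
        try:
            response = {"ok": True, "result": handler(**request["args"])}
        except Exception as exc:  # handler errors travel back as data
            response = {"ok": False, "error": str(exc)}
    return json.dumps(response).encode("utf-8")


class RpcError(Exception):
    """Raised by the stub when the server reports a failure."""


class UserClient:
    """Client stub: hides encode/send/decode behind a method signature."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self._transport = transport  # bytes in, bytes out

    def get_user(self, user_id: int) -> dict:
        request = {"method": "get_user", "args": {"user_id": user_id}}
        response_bytes = self._transport(json.dumps(request).encode("utf-8"))
        response = json.loads(response_bytes.decode("utf-8"))
        if not response["ok"]:
            raise RpcError(response["error"])
        return response["result"]


# Here the transport is a direct call; a real one would cross a network.
client = UserClient(transport=server_dispatch)
user = client.get_user(42)
```

The caller only sees `client.get_user(42)`. Everything that can go wrong on a real network sits inside the `transport` call, which is exactly the part this sketch fakes.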
The four guarantees that don’t carry across the wire
- No partial failure locally. A function call either returns or throws. A network call can also time out without a verdict — the server may have executed, may have crashed mid-execution, may be slow. This single fact drives every retry, idempotency, and deduplication discussion.
- No latency variance locally. In-process calls are nanoseconds; an RPC over a load balancer in the same region is 1–5 ms p50, 50–500 ms p99 on a bad day. Latency tails are not optional knowledge.
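- No bandwidth limit locally. RPC payloads are split into MTU-sized packets and compete for a real cross-region bandwidth budget, and frameworks cap message size (gRPC rejects inbound messages over 4 MB by default). Designs that pass 10 MB blobs as RPC arguments will be drilled.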
- No version skew locally. Client and server are deployed independently, so the IDL must evolve in compatible ways. Adding a required field is a deploy-order trap.
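The version-skew point can be made concrete with a small sketch. It assumes a JSON wire format and a hypothetical `decode` helper that ignores unknown fields; the `UserV1`, `UserV2`, and `UserV2Strict` types are illustrative only. Protobuf and Thrift have their own, stricter evolution rules, so treat this as the general idea rather than either format's behavior.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class UserV1:
    id: int
    name: str


@dataclass
class UserV2:
    id: int
    name: str
    email: str = ""  # new field is optional and carries a default


@dataclass
class UserV2Strict:
    id: int
    name: str
    email: str  # new field is required: no default


def decode(cls, payload: bytes):
    """Tolerant reader: drop unknown fields, let defaults fill missing ones."""
    raw = json.loads(payload)
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in raw.items() if k in known})


old_payload = json.dumps({"id": 1, "name": "ada"}).encode()
new_payload = json.dumps(
    {"id": 1, "name": "ada", "email": "ada@example.com"}
).encode()

# New reader, old writer: the missing field falls back to its default.
new_reader_old_writer = decode(UserV2, old_payload)

# Old reader, new writer: the unknown field is ignored.
old_reader_new_writer = decode(UserV1, new_payload)

# Required field: a new reader cannot decode what an old writer sends.
try:
    decode(UserV2Strict, old_payload)
    strict_failed = False
except TypeError:
    strict_failed = True
```

With the optional field, either side can be deployed first. With the required field, every writer has to be upgraded before any reader, which is the deploy-order trap.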
Variants and trade-offs
REST-as-RPC is the common middle ground: the wire format is HTTP+JSON, but the API is verb-shaped (POST /orders/cancel) rather than resource-shaped. Most “REST APIs” in production are this — and that’s fine; the dogmatic resource model rarely pays off.
GraphQL is RPC-shaped from the client’s perspective (queries are typed function calls returning typed data), but pushes selection of fields to the caller. Useful for many-clients / one-backend; over-engineered for service-to-service.
Streaming RPC matters when responses are inherently sequential (LLM token streams, real-time location, log tails). HTTP/2 server-streaming and bidi-streaming are gRPC’s killer feature versus REST.
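As a rough in-process analogy for server streaming, a Python generator shows the property that matters: the consumer can act on the first message before the producer has created the rest. `stream_tokens` and `produced` are hypothetical names, and a real streaming RPC adds flow control, cancellation, and mid-stream failure that this sketch leaves out.

```python
from typing import Iterator, List

produced: List[str] = []  # records what the "server" has generated so far


def stream_tokens(prompt: str) -> Iterator[str]:
    """Hypothetical server-streaming handler: one message per token."""
    for token in prompt.split():
        produced.append(token)
        yield token  # handed to the client before the next token exists


stream = stream_tokens("the quick brown fox")

first = next(stream)                   # client receives the first message
produced_after_first = list(produced)  # only one token has been generated

rest = list(stream)                    # drain the remaining messages
```

A unary call would have to build the whole response before returning anything, so time to first token would equal time to last token.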
Why HTTP/2 specifically
HTTP/1.1 has head-of-line blocking per connection: one slow response blocks the queue behind it. HTTP/2 multiplexes streams over a single TCP connection, so an RPC framework can run many concurrent calls over one socket, up to the peer's advertised SETTINGS_MAX_CONCURRENT_STREAMS limit. The catch: TCP-level head-of-line blocking still exists (a dropped packet stalls all streams), which is why HTTP/3 over QUIC, which runs on UDP, is the next step.
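A toy model makes the difference visible. It assumes a single connection, zero network delay, no packet loss, and a server that can work on all multiplexed streams at once; the function names and the service times are made up for illustration.

```python
from typing import List


def completion_times_serial(service_ms: List[int]) -> List[int]:
    """One connection, no multiplexing: each response waits for those ahead."""
    done: List[int] = []
    clock = 0
    for ms in service_ms:
        clock += ms
        done.append(clock)
    return done


def completion_times_multiplexed(service_ms: List[int]) -> List[int]:
    """Idealized multiplexed streams: each finishes after its own work."""
    return list(service_ms)


requests_ms = [200, 5, 5, 5]  # one slow call queued ahead of three fast ones

serial = completion_times_serial(requests_ms)      # [200, 205, 210, 215]
muxed = completion_times_multiplexed(requests_ms)  # [200, 5, 5, 5]
```

In the serial case the three fast calls inherit the slow call's latency. The model deliberately ignores the TCP-level stall described above, where one lost packet delays every stream on the connection.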
When this is asked in interviews
Three flavors show up:
- The interface step (Step 3 of the walk-through). “Define the API.” A senior candidate uses an IDL-shape sketch (method, args, return, error type, idempotency, deadlines), not a vague “POST /foo”.
- The retry / idempotency drill. “Your createOrder RPC times out. What do you do?” Wrong answer: “retry”. Right answer: “retry with an idempotency key the server dedupes by; if no key, expose a status endpoint and reconcile.”
- The framework-choice question. Common at companies with polyglot stacks (Uber, Google, Coinbase). “Why gRPC over REST here?” The right answer cites payload size, streaming, schema discipline, or codegen, not “it’s faster”, which isn’t always true.
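The "right answer" to the retry drill can be sketched as follows. Everything here is hypothetical: `OrderServer`, `FlakyTransport`, `RpcTimeout`, and `create_order_with_retry` are illustrative names, the dedupe table is an in-memory dict, and there is no backoff. A production version would presumably need a durable store, key expiry, and jittered retries.

```python
import uuid
from typing import Dict


class RpcTimeout(Exception):
    """The client got no verdict: the server may or may not have executed."""


class OrderServer:
    def __init__(self) -> None:
        self._by_key: Dict[str, dict] = {}  # idempotency key -> stored result
        self.orders_created = 0

    def create_order(self, idempotency_key: str, item: str) -> dict:
        # A repeated key returns the original result without re-executing.
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        self.orders_created += 1
        order = {"order_id": self.orders_created, "item": item}
        self._by_key[idempotency_key] = order
        return order


class FlakyTransport:
    """Executes every call on the server but loses the first few responses."""

    def __init__(self, server: OrderServer, drop_first: int) -> None:
        self._server = server
        self._drops_left = drop_first

    def create_order(self, idempotency_key: str, item: str) -> dict:
        result = self._server.create_order(idempotency_key, item)
        if self._drops_left > 0:
            self._drops_left -= 1
            raise RpcTimeout("response lost; server outcome unknown")
        return result


def create_order_with_retry(transport, item: str, max_attempts: int = 3) -> dict:
    # One key per logical operation, reused unchanged on every retry.
    key = str(uuid.uuid4())
    for _ in range(max_attempts):
        try:
            return transport.create_order(key, item)
        except RpcTimeout:
            continue
    raise RpcTimeout(f"gave up after {max_attempts} attempts")


server = OrderServer()
order = create_order_with_retry(FlakyTransport(server, drop_first=2), "book")
```

The server executed the first attempt even though the client saw a timeout. Because all three attempts carry the same key, only one order exists at the end; a fresh key per attempt would have created three.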
More common at infrastructure-leaning companies and at any platform/SRE-track loop. Frontend-leaning loops will use REST/GraphQL framing and ask roughly the same questions in different vocabulary.
Related concepts