GraphQL — A Query Language for APIs — API Design

What it is#

GraphQL is a query language for APIs and a runtime for fulfilling those queries against a typed schema. Facebook built it internally in 2012 to handle the over-fetching and N+1 round-trip problems it hit while building the iOS News Feed, and open-sourced it in 2015. It’s now used by GitHub, Shopify, Netflix, Airbnb, and many others.

The shape of GraphQL is different from REST in three load-bearing ways:

One endpoint, not many. Every query and mutation goes to a single URL, conventionally POST /graphql. There are no /users or /orders URLs.
The client picks the fields. The query specifies exactly which fields it wants from each object in the response. The server returns exactly those, in the shape the client requested.
The schema is the contract. A typed schema written in SDL (Schema Definition Language) defines every type, every field, every operation. The schema is introspectable at runtime — clients can ask the server “what queries can I run?” and get back a machine-readable answer.

Where REST is “resources over HTTP,” GraphQL is “a typed query language with one endpoint.” Both are valid; both have their place. The decision is about consumer shape.

When to use it#

Reach for GraphQL when:

The consumer is a rich, screen-shaped UI (mobile apps, complex SPAs). Each screen needs a different slice of the resource graph; REST forces over-fetching or N+1 round-trips.
The clients evolve faster than the backend. A mobile app shipped six months ago should be able to keep using the fields it was built against while the new app uses new ones. GraphQL’s additive-only evolution model fits this.
The data is graph-shaped. A social product, a marketplace, a content site with rich cross-references. When user.posts.comments.author.followers is a natural query, GraphQL is the natural API.
Multiple frontends share one backend. Web, iOS, Android, partner integrations — each wants different fields. One GraphQL schema serves them all without per-client backend logic.

Avoid GraphQL (or be careful) when:

The consumer is a simple resource-CRUD integration. A partner backend syncing orders doesn’t need GraphQL’s flexibility; REST is simpler.
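Caching at the HTTP layer matters. GraphQL queries are usually sent as POSTs with bodies, so CDNs cannot cache them by URL. You need application-level caching (for example the client-side caches in Apollo or Relay) or persisted queries sent as cacheable GETs.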
Authorization is granular per field across many roles. GraphQL’s per-field auth is doable but expensive — every field on every type needs a permission check. REST’s per-endpoint auth is coarser but cheaper.
Public partner API with conservative integrators. REST has lower friction for a partner who just wants to integrate quickly.

How it works#

The schema (SDL)#

A GraphQL schema is a typed description of every queryable shape. Written in SDL:

type User {
  id: ID!
  name: String!
  email: String!
  orders(last: Int = 10, status: OrderStatus): [Order!]!
}

type Order {
  id: ID!
  status: OrderStatus!
  total: Money!
  customer: User!
  items: [OrderItem!]!
  createdAt: DateTime!
}

type OrderItem {
  sku: String!
  quantity: Int!
  product: Product!
}

type Money {
  currency: String!
  valueMinor: Int!
}

enum OrderStatus {
  PENDING
  CONFIRMED
  SHIPPED
  DELIVERED
  CANCELLED
}

type Query {
  user(id: ID!): User
  order(id: ID!): Order
}

type Mutation {
  createOrder(input: CreateOrderInput!): CreateOrderPayload!
  cancelOrder(id: ID!): Order!
}

input CreateOrderInput {
  customerId: ID!
  items: [OrderItemInput!]!
  shippingAddressId: ID!
}

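type CreateOrderPayload {
  # Nullable: when validation fails, order is null and userErrors is non-empty.
  order: Order
  userErrors: [UserError!]!
}

# The types below are referenced above but not defined in this excerpt.
# Their fields are assumed here so that the schema is complete.
scalar DateTime

type Product {
  id: ID!
  name: String!
}

input OrderItemInput {
  sku: String!
  quantity: Int!
}

type UserError {
  field: [String!]
  message: String!
}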

A few things to notice:

! means non-null. String! is required; String is nullable.
[Order!]! is a non-null array of non-null orders. Both the array and its elements are required.
Query, Mutation, and Subscription are the three root operation types, the entry points to the schema. (The schema above defines only the first two.)
Inputs are separate types. CreateOrderInput is the shape the client sends; CreateOrderPayload is what comes back. Convention: every mutation has a userErrors array for validation errors so they don’t show up as top-level GraphQL errors.
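One way a client might consume the userErrors convention, as a minimal sketch. The helper name `unwrap_create_order` and the `message` field on each user error are assumptions made for illustration, not part of any particular library.

```python
class MutationFailed(Exception):
    """Raised when a mutation fails, either at the GraphQL level or in validation."""


def unwrap_create_order(body: dict) -> dict:
    """Return the created order from a decoded createOrder response, or raise.

    Top-level `errors` mean the request itself failed. A non-empty
    `userErrors` list means the request ran but validation rejected the input.
    """
    if body.get("errors"):
        raise MutationFailed(f"GraphQL error: {body['errors']}")

    payload = body["data"]["createOrder"]
    if payload["userErrors"]:
        # Assumed shape: each user error carries a human-readable `message`.
        messages = [error["message"] for error in payload["userErrors"]]
        raise MutationFailed("; ".join(messages))
    return payload["order"]


ok = {"data": {"createOrder": {"order": {"id": "ord_a3f9c2"}, "userErrors": []}}}
rejected = {
    "data": {
        "createOrder": {
            "order": None,
            "userErrors": [{"message": "items must not be empty"}],
        }
    }
}

print(unwrap_create_order(ok)["id"])
try:
    unwrap_create_order(rejected)
except MutationFailed as exc:
    print("rejected:", exc)
```

Keeping validation failures inside the payload lets the client treat them as ordinary data, while top-level errors stay reserved for requests that could not run at all.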

Queries, mutations, subscriptions#

The three operation kinds:

Queries are read-only. They can be parallelised by the server, cached, retried.
Mutations are writes. Their top-level fields execute serially, in the order they appear in the operation.
Subscriptions are streams. Typically delivered over WebSocket; the server pushes updates when the underlying data changes.

A query, its variables, and the corresponding response follow. The response mirrors the shape of the query, which is the magic of GraphQL.

query GetUserDashboard($userId: ID!, $orderLimit: Int!) {
  user(id: $userId) {
    name
    email
    orders(last: $orderLimit, status: CONFIRMED) {
      id
      total {
        currency
        valueMinor
      }
      items {
        product {
          name
        }
        quantity
      }
    }
  }
}

{
  "userId": "usr_18df",
  "orderLimit": 5
}

{
  "data": {
    "user": {
      "name": "Alex Chen",
      "email": "alex@example.com",
      "orders": [
        {
          "id": "ord_a3f9c2",
          "total": { "currency": "USD", "valueMinor": 4999 },
          "items": [{ "product": { "name": "Wireless mouse" }, "quantity": 1 }]
        }
      ]
    }
  }
}

The client asked for name, email, and orders.id/total/items.product.name/items.quantity. The server returned exactly that — no id on the user (not asked for), no customer field on the order (not asked for), no fields on items other than product.name and quantity. That precision is the whole point.
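The shape-matching rule can be imitated in a few lines. The sketch below is a deliberate simplification: a real server runs resolvers only for the requested fields rather than filtering a complete object, and the `project` helper, the dict-based selection format, and the extra stored values (`sku`, `status`, the product `id`) are all assumptions made for illustration.

```python
def project(value, selection):
    """Keep only the selected fields of `value`.

    `selection` maps a field name to None (a leaf field) or to a nested
    selection dict, mirroring the nesting of a GraphQL selection set.
    """
    if value is None or selection is None:
        return value
    if isinstance(value, list):
        return [project(item, selection) for item in value]
    return {field: project(value[field], sub) for field, sub in selection.items()}


# Everything the backend knows about the user (hypothetical stored data).
stored_user = {
    "id": "usr_18df",
    "name": "Alex Chen",
    "email": "alex@example.com",
    "orders": [
        {
            "id": "ord_a3f9c2",
            "status": "CONFIRMED",
            "total": {"currency": "USD", "valueMinor": 4999},
            "items": [
                {
                    "sku": "MOUSE-01",
                    "quantity": 1,
                    "product": {"id": "prod_1", "name": "Wireless mouse"},
                }
            ],
        }
    ],
}

# The selection set of GetUserDashboard, written as nested dicts.
selection = {
    "name": None,
    "email": None,
    "orders": {
        "id": None,
        "total": {"currency": None, "valueMinor": None},
        "items": {"product": {"name": None}, "quantity": None},
    },
}

result = project(stored_user, selection)
print(result)
```

The projected result contains no user `id`, no order `status`, and no item `sku`, because the selection never named them.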

The N+1 problem and DataLoader#

GraphQL’s biggest operational footgun. Consider the query above. The naive server implementation looks like:

Fetch the user (1 query).
Fetch the user’s orders (1 query).
For each order, fetch its items (N queries).
For each item, fetch its product (N×M queries).

A simple-looking query has fanned out into hundreds of database round-trips. This is the N+1 query problem, and it’s the single biggest reason GraphQL servers fall over in production.
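The arithmetic behind that fan-out is worth making explicit. This is a small sketch of the naive plan listed above; the function name and the sample sizes are illustrative.

```python
def naive_round_trips(orders: int, items_per_order: int) -> int:
    """Count database queries issued by the naive resolver plan."""
    user_query = 1                             # fetch the user
    orders_query = 1                           # fetch the user's orders
    item_queries = orders                      # one items query per order (N)
    product_queries = orders * items_per_order  # one product query per item (N x M)
    return user_query + orders_query + item_queries + product_queries


print(naive_round_trips(orders=5, items_per_order=4))    # 27
print(naive_round_trips(orders=50, items_per_order=10))  # 552
```

Even the five-order dashboard query costs 27 round-trips, and the count grows with the product of the list sizes rather than with the number of types involved.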

The fix is DataLoader (Facebook’s open-source library, ported to most major languages). DataLoader batches per-request resolver calls within a single tick of the event loop:

The first product fetch doesn’t execute immediately — it’s queued.
The next product fetch joins the queue.
At the end of the tick, DataLoader fires one batched query: SELECT * FROM products WHERE id IN (...).

Result: the N+1 collapses into roughly one batched round-trip per type at each level of the query. Nearly every production GraphQL server relies on DataLoader or a language-specific equivalent. Without it, you do not have a viable API; you have a denial-of-service surface on yourself.
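The batching idea can be sketched with asyncio. This is not the DataLoader library's API: `BatchLoader`, `fetch_products`, and the in-memory `PRODUCTS` table are assumptions made for illustration, and the per-request result cache that real implementations add is left out.

```python
import asyncio


class BatchLoader:
    """Collects keys requested in the same event-loop tick into one batch call."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn   # async: list of keys -> list of values, same order
        self._pending = {}          # key -> Future waiting for that key's value
        self._dispatch_task = None  # reference to the dispatch currently in flight

    def load(self, key):
        """Queue `key` and return a Future that resolves once the batch has run."""
        loop = asyncio.get_running_loop()
        if not self._pending:
            # First key of this tick: schedule one dispatch behind the callbacks
            # already queued, so sibling resolvers can add their keys first.
            loop.call_soon(self._start_dispatch)
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        return self._pending[key]

    def _start_dispatch(self):
        self._dispatch_task = asyncio.get_running_loop().create_task(self._dispatch())

    async def _dispatch(self):
        pending, self._pending = self._pending, {}
        keys = list(pending)
        try:
            values = await self._batch_fn(keys)
        except Exception as exc:
            for future in pending.values():
                if not future.done():
                    future.set_exception(exc)
            return
        for key, value in zip(keys, values):
            if not pending[key].done():
                pending[key].set_result(value)


PRODUCTS = {"p1": "Wireless mouse", "p2": "Keyboard", "p3": "Monitor"}  # stand-in table
batches = []  # records every batched lookup that was issued


async def fetch_products(ids):
    # Stand-in for: SELECT * FROM products WHERE id IN (...)
    batches.append(list(ids))
    return [PRODUCTS.get(product_id) for product_id in ids]


async def resolve_product_name(loader, product_id):
    # What a per-item resolver would do: ask the loader, not the database.
    return await loader.load(product_id)


async def main():
    loader = BatchLoader(fetch_products)  # one loader per request
    item_product_ids = ["p1", "p2", "p1", "p3"]
    return await asyncio.gather(
        *(resolve_product_name(loader, pid) for pid in item_product_ids)
    )


names = asyncio.run(main())
print(names)
print(batches)
```

Four resolver calls produce a single batch containing the three distinct ids. Resolvers that only start in a later tick (for example a deeper level of the query) land in a later batch, which is why the saving is per level rather than per request.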

Schema as the contract#

The schema is the source of truth. Three downstream consequences:

Introspection. A client can query the server for the schema itself (query IntrospectionQuery { __schema { ... } }). This is what powers GraphiQL, Apollo Sandbox, IDE autocomplete, and code generators.
Versioning by deprecation, not URL prefixes. GraphQL APIs typically don’t have /v1, /v2. Instead, fields are marked @deprecated(reason: "use newField") and removed only after the deprecation timeline ends. Clients keep working because the schema is additive.
Codegen on the client. Tools like graphql-codegen read the schema and produce strongly-typed clients in TypeScript, Swift, Kotlin. The client gets compile-time guarantees about the response shape.
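Introspection and deprecation combine usefully: a client or CI job can ask which fields of a type are deprecated. The sketch below uses the standard introspection fields `__type`, `fields(includeDeprecated:)`, `isDeprecated`, and `deprecationReason`. The sample response, including the deprecated `username` field, is made up for illustration.

```python
DEPRECATION_QUERY = """
query DeprecatedFields($typeName: String!) {
  __type(name: $typeName) {
    fields(includeDeprecated: true) {
      name
      isDeprecated
      deprecationReason
    }
  }
}
"""


def deprecated_fields(body: dict) -> dict:
    """Map each deprecated field name to its reason, from a decoded response."""
    type_info = body["data"]["__type"]
    if type_info is None:  # the server does not know this type
        return {}
    return {
        field["name"]: field["deprecationReason"]
        for field in type_info["fields"]
        if field["isDeprecated"]
    }


# Hypothetical response for {"typeName": "User"}.
sample_response = {
    "data": {
        "__type": {
            "fields": [
                {"name": "id", "isDeprecated": False, "deprecationReason": None},
                {"name": "name", "isDeprecated": False, "deprecationReason": None},
                {"name": "username", "isDeprecated": True, "deprecationReason": "use name"},
            ]
        }
    }
}

print(deprecated_fields(sample_response))
```

A check like this could run against each deployed schema to flag client queries that still select fields scheduled for removal.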

Authorization (the unsolved part)#

REST’s auth is per endpoint: “this caller can hit GET /orders or it can’t.” GraphQL’s auth is per field: “this caller can read Order.total but not Order.customer.email.” That granularity is powerful and also expensive — every field on every type needs a permission check.
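A per-field check might look like the following sketch. The `(parent, context)` resolver signature, the `roles` entry in the context, and the `SUPPORT` role are simplifying assumptions; real GraphQL libraries each define their own resolver signature.

```python
from functools import wraps


class Forbidden(Exception):
    """Raised when the caller lacks the role a field requires."""


def requires_role(role):
    """Wrap a resolver so it runs only when the caller holds `role`."""
    def decorator(resolver):
        @wraps(resolver)
        def guarded(parent, context):
            if role not in context["roles"]:
                raise Forbidden(f"{resolver.__name__} requires role {role}")
            return resolver(parent, context)
        return guarded
    return decorator


def resolve_order_total(order, context):
    # Order.total: readable by any authenticated caller in this sketch.
    return order["total"]


@requires_role("SUPPORT")
def resolve_customer_email(order, context):
    # Order.customer.email: restricted to support staff in this sketch.
    return order["customer"]["email"]


order = {
    "total": {"currency": "USD", "valueMinor": 4999},
    "customer": {"email": "alex@example.com"},
}

print(resolve_order_total(order, {"roles": set()}))
print(resolve_customer_email(order, {"roles": {"SUPPORT"}}))
try:
    resolve_customer_email(order, {"roles": set()})
except Forbidden as exc:
    print("denied:", exc)
```

The cost the text describes is visible here: the guard runs once per guarded field per object, so a list of 100 orders means 100 checks unless the decision is cached on the request context.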

Patterns in production:

Field-level resolvers with auth in each one. Verbose but explicit.
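Schema directives like @auth(role: ADMIN) that wrap resolvers transparently; several Apollo plugins do this. Hasura and PostGraphile reach a similar declarative result by other means (role-based permission rules and Postgres row-level security, respectively).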
Persisted queries: the client sends only a query hash, and the server runs only pre-approved queries. This removes the surface where a hostile client crafts a query that reaches a field it shouldn’t see. The technique is often adopted for performance reasons (smaller request payloads); many teams also use it for auth.
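The persisted-query pattern reduces to an allowlist keyed by hash. In this minimal sketch, the `PersistedQueryStore` class and the `queryHash` request key are hypothetical names; concrete servers and clients define their own wire format.

```python
import hashlib


def query_hash(query: str) -> str:
    """SHA-256 hex digest of the query text, used as its identifier."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


class PersistedQueryStore:
    """Allowlist of pre-approved queries, looked up by hash."""

    def __init__(self, approved_queries):
        self._by_hash = {query_hash(q): q for q in approved_queries}

    def resolve(self, request: dict) -> str:
        """Return the approved query text for request["queryHash"], else raise."""
        try:
            return self._by_hash[request["queryHash"]]
        except KeyError:
            raise PermissionError("unknown or missing query hash") from None


APPROVED = ["query GetUserName($id: ID!) { user(id: $id) { name } }"]
store = PersistedQueryStore(APPROVED)

request = {"queryHash": query_hash(APPROVED[0]), "variables": {"id": "usr_18df"}}
print(store.resolve(request))

hostile = {"queryHash": query_hash("{ user(id: 1) { email } }")}
try:
    store.resolve(hostile)
except PermissionError as exc:
    print("rejected:", exc)
```

Because the server never parses client-supplied query text, a query that was not registered at build time cannot run, whatever fields it names.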

A representative client across languages#

The same query as above, executed from Python, Go, and Node. Client libraries such as Apollo and Relay are common; the examples here use direct HTTP calls to stay framework-agnostic.

import requests

QUERY = """
query GetUserDashboard($userId: ID!, $orderLimit: Int!) {
  user(id: $userId) {
    name
    email
    orders(last: $orderLimit, status: CONFIRMED) {
      id
      total { currency valueMinor }
      items { product { name } quantity }
    }
  }
}
"""

resp = requests.post(
    "https://api.example.com/graphql",
    headers={
        "Authorization": "Bearer eyJhbGciOi...",
        "Content-Type": "application/json",
    },
    json={
        "query": QUERY,
        "variables": {"userId": "usr_18df", "orderLimit": 5},
        "operationName": "GetUserDashboard",
    },
    timeout=10,
)
resp.raise_for_status()
body = resp.json()

if "errors" in body:
    raise RuntimeError(body["errors"])

user = body["data"]["user"]
print(user["name"], len(user["orders"]))

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
)

const query = `
query GetUserDashboard($userId: ID!, $orderLimit: Int!) {
  user(id: $userId) {
    name
    email
    orders(last: $orderLimit, status: CONFIRMED) {
      id
      total { currency valueMinor }
      items { product { name } quantity }
    }
  }
}`

func main() {
    // Encode the request body: query text, variables, and operation name.
    body, err := json.Marshal(map[string]any{
        "query":         query,
        "variables":     map[string]any{"userId": "usr_18df", "orderLimit": 5},
        "operationName": "GetUserDashboard",
    })
    if err != nil {
        panic(err)
    }

    req, err := http.NewRequest("POST", "https://api.example.com/graphql", bytes.NewReader(body))
    if err != nil {
        panic(err)
    }
    req.Header.Set("Authorization", "Bearer eyJhbGciOi...")
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // Keep data and errors raw; a GraphQL response may carry both.
    var out struct {
        Data   json.RawMessage   `json:"data"`
        Errors []json.RawMessage `json:"errors"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        panic(err)
    }
    if len(out.Errors) > 0 {
        panic(string(out.Errors[0]))
    }
    fmt.Println(string(out.Data))
}

const query = `
query GetUserDashboard($userId: ID!, $orderLimit: Int!) {
  user(id: $userId) {
    name
    email
    orders(last: $orderLimit, status: CONFIRMED) {
      id
      total { currency valueMinor }
      items { product { name } quantity }
    }
  }
}`;

const resp = await fetch("https://api.example.com/graphql", {
  method: "POST",
  headers: {
    Authorization: "Bearer eyJhbGciOi...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    query,
    variables: { userId: "usr_18df", orderLimit: 5 },
    operationName: "GetUserDashboard",
  }),
});

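// Fail on transport-level errors before trying to parse the body.
if (!resp.ok) throw new Error(`GraphQL request failed with HTTP ${resp.status}`);

const body = await resp.json();
if (body.errors) throw new Error(JSON.stringify(body.errors));
console.log(body.data.user.name, body.data.user.orders.length);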

Variants#

Variant	Mechanism	When it fits
Apollo Federation	Multiple GraphQL services compose into one supergraph; a federation gateway routes each query to the services that own its fields	Large orgs with many teams owning slices of one big graph
Schema stitching	The pre-Federation way to combine multiple schemas	Largely superseded; Federation v2 is the more common path today
Persisted queries	Client sends a query hash; server runs only pre-approved queries	Production hardening; reduces payload, locks down query surface
Relay-style cursor connections	A standardised pagination spec (`edges`, `node`, `cursor`, `pageInfo`)	Any list field that paginates; GitHub’s API uses this
Hasura / PostGraphile	Auto-generate a GraphQL schema from a Postgres database	Internal tools, admin panels, prototypes
GraphQL subscriptions over WebSocket	Real-time streams of typed events	Live dashboards, collaborative editing
`graphql-ws` (library) / `graphql-transport-ws` (its WebSocket subprotocol)	Subscription transport over WebSocket	Supported by Apollo Client, among others, for subscriptions

Trade-offs#

What GraphQL gives you:

One round-trip per screen. The mobile-feed problem that motivated GraphQL’s invention. Real wins for rich UIs.
No over-fetching. Clients pay only for what they ask for.
Schema as documentation. Introspection means the schema is the docs.
Strongly-typed clients via codegen. TypeScript, Swift, Kotlin, all with compile-time field checks.
Frontend-driven iteration. Backend ships fields; frontend uses what it wants when it’s ready. Decoupled release cycles.

What GraphQL costs you:

N+1 risk on every query. DataLoader is mandatory, not optional.
HTTP-layer caching is mostly gone by default. Queries are usually sent as POSTs, so CDNs can’t cache them by URL. You need application-level caching (such as the Apollo Client cache) or persisted queries sent as CDN-cacheable GETs.
Authorization complexity. Per-field auth across hundreds of fields is real work.
Query-cost analysis. A pathological query (user { friends { friends { friends { ... } } } }) can DoS the server. Production servers need depth limits, cost limits, or persisted-queries-only mode.
Error semantics are ambiguous. A GraphQL response can carry both data and errors, usually with HTTP 200, so a partial failure on one field has the same overall shape as a full success. Clients must inspect errors on every response rather than rely on the status code.
Browser dev-tools are weaker than for REST. A REST request is one URL in the network tab; every GraphQL request is a POST to the same URL with the query in the body. Apollo DevTools and GraphiQL fill the gap, but it’s not free.
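A depth limit, the simplest of the query-cost defences listed above, can be approximated by counting brace nesting. This is a rough sketch only: a production server would walk the parsed query instead, because brace counting is fooled by input-object arguments, string literals, and fragments. Both function names are illustrative.

```python
def selection_depth(query: str) -> int:
    """Approximate the nesting depth of a query by tracking curly braces."""
    depth = 0
    max_depth = 0
    for char in query:
        if char == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif char == "}":
            depth -= 1
    return max_depth


def enforce_depth_limit(query: str, limit: int) -> int:
    """Return the query depth, or raise ValueError when it exceeds `limit`."""
    depth = selection_depth(query)
    if depth > limit:
        raise ValueError(f"query depth {depth} exceeds limit {limit}")
    return depth


shallow = "{ user { name } }"
pathological = "{ user { friends { friends { friends { name } } } } }"

print(enforce_depth_limit(shallow, limit=4))
try:
    enforce_depth_limit(pathological, limit=4)
except ValueError as exc:
    print("rejected:", exc)
```

Depth alone does not bound cost, since a shallow query can still request a huge list. That is why cost limits and persisted queries appear alongside depth limits in the list above.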

Common pitfalls#

No DataLoader. A single client query can fan out into hundreds of database calls and overwhelm the database. Always batch resolvers.
No query depth or cost limit. A hostile or buggy client crafts a friends.friends.friends.friends... query and your server thrashes.
Per-endpoint auth ported wholesale to per-field. Re-running the same permission check on every field is a perf problem; consolidate.
Treating data: null, errors: [...] and data: { foo: null }, errors: [...] the same. They are different — one is a top-level failure, the other is partial. Document which fields can be null.
Versioning by URL (/v1/graphql, /v2/graphql). GraphQL’s whole evolution model is @deprecated; URL versioning fights it.
No persisted queries on a public mobile API. Every client can craft any query, including expensive ones. Persisted queries lock the surface.
Resolvers that hide REST behind GraphQL. If your GraphQL server makes 10 HTTP calls to fulfill one query, you’ve inverted the problem — now the GraphQL server is the over-fetching one. The win only materialises when the resolvers talk to a database (or batched microservices) directly.
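The data: null versus data: { foo: null } distinction from the pitfalls above can be made explicit on the client. This is a minimal sketch; the `classify` helper and its three labels are illustrative rather than taken from any client library.

```python
def classify(body: dict) -> str:
    """Label a decoded GraphQL response as success, partial, or failed."""
    errors = body.get("errors") or []
    data = body.get("data")
    if not errors:
        return "success"
    if data is None:
        return "failed"   # nothing usable came back
    return "partial"      # some fields resolved, others errored


full = {"data": {"user": {"name": "Alex Chen"}}}
partial = {
    "data": {"user": None},
    "errors": [{"message": "not authorised", "path": ["user"]}],
}
failed = {"data": None, "errors": [{"message": "syntax error"}]}

for body in (full, partial, failed):
    print(classify(body))
```

A client would typically render what it has in the partial case and show an error state only in the failed case.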

REST — The Architectural Style — the alternative architectural style for resource APIs.
RESTful API Design in Practice — REST’s practical playbook.
gRPC — Protobuf over HTTP/2 — the other typed-schema option, for internal RPC.
REST vs GraphQL vs gRPC — Comparison — REST vs GraphQL vs gRPC, the trade-offs in one table.
HTTP — The Foundational Protocol for APIs — the HTTP layer GraphQL runs over.
WebSockets — Bidirectional Streaming — the transport for GraphQL subscriptions.