DNS — Hierarchy, Records, and Query Resolution
Root → TLD → authoritative → recursive; A / AAAA / CNAME / MX / NS / TXT; the recursive resolution walk.
What it is
The Domain Name System is the Internet’s distributed phone book. It translates human-readable names (example.com) into the machine-readable identifiers that protocols actually use — most commonly IPv4 addresses (93.184.216.34) and IPv6 addresses, but also mail-exchange hosts, service endpoints, certificate-authority authorisation lists, and arbitrary text records. It is a hierarchical, globally distributed, eventually-consistent database — replicated and cached at every level — with a query/response protocol defined by RFCs 1034 and 1035 (1987) and continuously extended.
DNS is what makes the rest of the web tractable for humans. Without it, every URL would be an IP address, every IP renumbering would break every link, and load-balancing across many backend servers would require client-side awareness. DNS hides all of that behind a name.
When to use it
DNS is invoked whenever you address something by name rather than by IP. In practice that’s nearly always. Specifically:
- Browser fetches a URL. First step is a DNS lookup for the host portion.
- Service discovery in the cloud. Internal services advertise via DNS (private hosted zones, Kubernetes’ CoreDNS), clients resolve by name.
- Traffic steering. Latency-based, geo-based, or weighted DNS routing sends users to the closest healthy endpoint.
- Email delivery. SMTP senders look up MX records to find the destination mail server.
- Certificate validation. ACME clients (e.g. for Let’s Encrypt) can prove domain control by publishing a TXT record (the DNS-01 challenge).
Skip DNS only when you really mean to address a specific IP (debugging, hardcoded edge nodes, peer-to-peer rendezvous). Even then, the cost of going through DNS is tiny: a cached lookup is sub-millisecond.
How it works
The hierarchy
DNS is a tree. The root is the unnamed top; below it are top-level domains (TLDs) like com, org, io, uk; below those are the registered names (example.com); below those are subdomains the owner controls (api.example.com, mail.example.com).
              . (root)
                /|\
               / | \
            com org uk ...   (TLDs)
             |       |
          example    ac
             |       |
            www     cam
                     |
                    ...

Each level is independently administered: ICANN delegates TLDs to registries; registries sell domains to registrants; registrants run their own authoritative nameservers for everything below their domain.
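To make the tree concrete, here is a minimal sketch that lists the suffixes of a name from the root downward. The function name `delegation_chain` is hypothetical, and the sketch assumes every label boundary is a potential delegation point, which is not always true in a real zone layout.

```python
def delegation_chain(name: str) -> list[str]:
    """Return the suffixes of `name` from the root down, fully qualified."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]  # the unnamed root
    # Build suffixes right to left: com. -> example.com. -> www.example.com.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("www.example.com"))
# ['.', 'com.', 'example.com.', 'www.example.com.']
```

Reading the result left to right mirrors the order in which a cold resolver learns about each level.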
The four kinds of nameserver
    +--------------------+------------------------------------+
    | root nameservers   | 13 logical, anycast-replicated     |
    |                    | served by 12 operators worldwide   |
    +--------------------+------------------------------------+
    | TLD nameservers    | one set per TLD                    |
    |                    | (.com servers, .uk servers, ...)   |
    +--------------------+------------------------------------+
    | authoritative NS   | own the records for a zone         |
    |                    | (e.g. Route 53 hosts example.com)  |
    +--------------------+------------------------------------+
    | recursive resolver | does the walk on your behalf       |
    |                    | (your ISP, 8.8.8.8, 1.1.1.1)       |
    +--------------------+------------------------------------+

The recursive resolver is the one your laptop actually talks to. The others are walked transitively by the resolver.
The resolution walk
A fresh lookup for www.example.com from a cold-cache resolver:
    client (stub) ---> recursive resolver
                         |
                         | 1. Where is .com?
                         v
                       root NS -> "ask .com NS at a.gtld-servers.net"
                         |
                         | 2. Where is example.com?
                         v
                       .com NS -> "ask ns1.example.com (the authoritative)"
                         |
                         | 3. What's www.example.com?
                         v
                       authoritative NS -> "www.example.com A 93.184.216.34"
                         |
                         v
    client (stub) <--- recursive resolver returns the answer + caches it

After the first lookup, every layer’s answer is cached at the resolver for the TTL of the record. The next lookup of images.example.com skips straight to the authoritative NS (the resolver already knows where example.com lives).
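The walk and its caching effect can be sketched with a toy in-memory model. Everything here is an assumption for illustration: the `SERVERS` table, the server keys, the `ToyResolver` class, and the address for images.example.com are hypothetical, and TTL expiry is left out to keep the control flow visible.

```python
# Toy data: each "server" holds referrals to a lower level or final answers.
SERVERS = {
    "root": {"referrals": {"com.": "tld-com"}},
    "tld-com": {"referrals": {"example.com.": "auth-example"}},
    "auth-example": {"answers": {
        "www.example.com.": "93.184.216.34",
        "images.example.com.": "93.184.216.35",  # hypothetical address
    }},
}


def in_zone(name, zone):
    """True if `name` is the zone itself or sits below it."""
    return name == zone or name.endswith("." + zone)


class ToyResolver:
    def __init__(self, servers):
        self.servers = servers
        self.zone_cache = {}    # zone -> server learned from a referral
        self.answer_cache = {}  # full name -> address
        self.queries_sent = 0

    def _closest_known_server(self, name):
        # Start from the deepest cached delegation, else from the root.
        best_zone, best_server = "", "root"
        for zone, server in self.zone_cache.items():
            if in_zone(name, zone) and len(zone) > len(best_zone):
                best_zone, best_server = zone, server
        return best_server

    def resolve(self, name):
        if name in self.answer_cache:
            return self.answer_cache[name]
        server = self._closest_known_server(name)
        while True:
            self.queries_sent += 1
            data = self.servers[server]
            if name in data.get("answers", {}):
                self.answer_cache[name] = data["answers"][name]
                return self.answer_cache[name]
            for zone, next_server in data.get("referrals", {}).items():
                if in_zone(name, zone):
                    self.zone_cache[zone] = next_server  # remember the referral
                    server = next_server
                    break
            else:
                raise LookupError(f"no answer or referral for {name}")


resolver = ToyResolver(SERVERS)
resolver.resolve("www.example.com.")     # cold: root, .com, authoritative
resolver.resolve("images.example.com.")  # warm: authoritative only
print(resolver.queries_sent)             # 4
```

The cold lookup costs three upstream queries and the second name costs one, which is the shortcut described above.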
Wire details: queries go to port 53 over UDP for small responses, falling back to TCP when the answer doesn’t fit in a UDP packet (> 512 bytes without EDNS0, larger with). DNS-over-HTTPS (DoH) and DNS-over-TLS (DoT) encrypt the channel — both became common after 2018.
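As a rough illustration of what goes on the wire, the sketch below encodes a single-question query in the RFC 1035 message format and checks a response size against the UDP limit. The function names `build_query` and `fits_in_udp` and the fixed transaction ID are assumptions for the example; 1232 is only a sample EDNS0 buffer size.

```python
import struct


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Encode a one-question DNS query (qtype 1 = A, class IN)."""
    flags = 0x0100  # standard query with RD (recursion desired) set
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label length in {name!r}")
        qname += bytes([len(raw)]) + raw  # length-prefixed label
    qname += b"\x00"  # the root label terminates the name
    return header + qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN


def fits_in_udp(response_size, edns_udp_size=None):
    """True if the response fits the UDP payload limit the client advertised."""
    limit = edns_udp_size if edns_udp_size is not None else 512
    return response_size <= limit


packet = build_query("www.example.com")
print(len(packet), packet[:12].hex())
# 33 123401000001000000000000
print(fits_in_udp(900), fits_in_udp(900, edns_udp_size=1232))
# False True
```

A real client would send `packet` to port 53 over UDP and retry over TCP when the response arrives truncated.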
Record types
The records you’ll actually use:
    A      IPv4 address      www.example.com             A      93.184.216.34
    AAAA   IPv6 address      www.example.com             AAAA   2606:2800:220:1:248:1893:25c8:1946
    CNAME  alias             shop.example.com            CNAME  store.shopify.com
    NS     nameserver        example.com                 NS     ns1.example.com
    MX     mail exchanger    example.com                 MX     10 mail.example.com
    TXT    arbitrary text    example.com                 TXT    "v=spf1 include:_spf.google.com ~all"
    SRV    service location  _sip._tcp.example.com       SRV    10 60 5060 sip.example.com
    PTR    reverse lookup    34.216.184.93.in-addr.arpa  PTR    www.example.com
    SOA    zone authority    example.com                 SOA    ns1.example.com hostmaster.example.com (...)
    CAA    cert authority    example.com                 CAA    0 issue "letsencrypt.org"

Two tips that catch most engineers:
- CNAME chains: a CNAME points to another name, which must be resolved separately. Browsers and resolvers chase the chain. A CNAME cannot coexist with other record types at the same name (e.g., no MX on a name that already has a CNAME).
- MX has a priority: MX 10 mail.example.com gives that host preference 10. Senders try the lowest number first, so lower number = higher priority. Multiple MXes give failover.
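Both tips can be sketched over a small in-memory record table. The `RECORDS` dictionary, the helper names, the backup MX host, and the 203.0.113.10 address (a documentation range) are hypothetical; a real resolver does this against live servers.

```python
RECORDS = {
    ("shop.example.com", "CNAME"): ["store.shopify.com"],
    ("store.shopify.com", "A"): ["203.0.113.10"],  # hypothetical address
    ("example.com", "MX"): [(20, "backup.example.com"),  # hypothetical host
                            (10, "mail.example.com")],
}


def resolve_a(name, records, max_hops=8):
    """Follow CNAMEs until an A record is found; refuse loops and long chains."""
    seen = set()
    for _ in range(max_hops + 1):
        if name in seen:
            raise RuntimeError(f"CNAME loop at {name}")
        seen.add(name)
        if (name, "A") in records:
            return records[(name, "A")]
        target = records.get((name, "CNAME"))
        if target is None:
            raise LookupError(f"no A or CNAME record for {name}")
        name = target[0]  # chase the alias
    raise RuntimeError("CNAME chain too long")


def mx_in_try_order(name, records):
    """Return MX hosts sorted by preference, lowest number first."""
    return [host for _, host in sorted(records.get((name, "MX"), []))]


print(resolve_a("shop.example.com", RECORDS))
# ['203.0.113.10']
print(mx_in_try_order("example.com", RECORDS))
# ['mail.example.com', 'backup.example.com']
```

The hop limit is a design choice for the sketch; it bounds the work done on a misconfigured chain.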
TTL and caching
Every record carries a TTL — how long resolvers should cache it before re-querying. Long TTL = better performance, slower propagation. Short TTL = the opposite. Typical values: 300s (5 minutes) for records that might change; 86400s (24h) for stable records; 30s during a planned migration so you can revert quickly.
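A minimal sketch of TTL-bounded caching follows. The `TtlCache` class is hypothetical, and the clock is injectable only so that expiry can be demonstrated without waiting.

```python
import time


class TtlCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, rtype) -> (expires_at, value)

    def put(self, name, rtype, value, ttl):
        """Store a record for `ttl` seconds from now."""
        self._entries[(name, rtype)] = (self._clock() + ttl, value)

    def get(self, name, rtype):
        """Return the cached value, or None if missing or expired."""
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[(name, rtype)]  # expired: force a re-query
            return None
        return value


now = [0.0]  # fake clock for the demonstration
cache = TtlCache(clock=lambda: now[0])
cache.put("www.example.com", "A", "93.184.216.34", ttl=300)
now[0] = 299.0
print(cache.get("www.example.com", "A"))  # 93.184.216.34
now[0] = 300.0
print(cache.get("www.example.com", "A"))  # None
```

A changed record becomes visible to this cache only after the old entry expires, which is why a long TTL slows propagation.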
Variants
- Authoritative servers: own the actual records for a zone. You configure these (or your DNS provider does). PowerDNS, NSD, BIND, Cloudflare, AWS Route 53.
- Recursive resolvers: do the walk for clients. Your ISP runs one; public ones include Google 8.8.8.8, Cloudflare 1.1.1.1, Quad9 9.9.9.9.
- Anycast DNS: one IP advertised from many locations; BGP routes you to the nearest. Used for root and TLD servers, and by major resolvers.
- DNSSEC: cryptographic signatures on records, validated by resolvers. Defends against cache poisoning. Adoption is mixed: most TLDs are signed, but relatively few zone owners sign their zones and few stub resolvers validate.
- DNS-over-HTTPS (DoH) and DNS-over-TLS (DoT): encrypted transport for DNS queries. DoH (RFC 8484) uses HTTPS, so it is hard to distinguish from ordinary web traffic; DoT (RFC 7858) uses a dedicated port, TCP 853, which makes it easy for firewalls to detect and block.
- Split-horizon DNS: different answers depending on who’s asking. Common for internal vs external views of corporate domains.
- Latency / geo / weighted routing: extensions that return different answers based on resolver location (EDNS Client Subnet helps), measured latency, or weighted shares. Route 53, NS1, and similar managed DNS offer these.
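Weighted routing from the last item can be sketched as a weighted random choice among endpoints. This is an illustration of the idea, not any provider's actual algorithm; `pick_weighted` and the 192.0.2.x addresses (a documentation range) are hypothetical.

```python
import random


def pick_weighted(endpoints, rng=random.random):
    """Pick one address from [(address, weight), ...] with odds weight/total."""
    total = sum(weight for _, weight in endpoints)
    if total <= 0:
        raise ValueError("total weight must be positive")
    threshold = rng() * total  # a point in [0, total)
    running = 0
    for address, weight in endpoints:
        running += weight
        if threshold < running:
            return address
    return endpoints[-1][0]  # guard against floating-point edge cases


ENDPOINTS = [("192.0.2.1", 90), ("192.0.2.2", 10)]  # hypothetical 90/10 split
print(pick_weighted(ENDPOINTS, rng=lambda: 0.50))  # 192.0.2.1
print(pick_weighted(ENDPOINTS, rng=lambda: 0.95))  # 192.0.2.2
```

Because resolvers cache the answer they receive, the real traffic split only approximates the configured weights.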
Trade-offs
Beyond the TTL trade-off between cache performance and propagation speed, other trade-offs:
- UDP vs TCP: UDP is the fast path; TCP is the fallback for large responses (DNSSEC-signed answers often exceed 512 bytes). DoT always runs over TCP, as does DoH unless it is carried over HTTP/3. Most resolvers handle both transparently.
- Caching invariance: DNS answers can change between two consecutive lookups. Clients that pin to a single resolved IP (Java’s default JVM cache used to be forever) miss failovers. Re-resolve on connect failure.
- Trust model: DNSSEC protects data between the authoritative server and the validating resolver, not the hop from resolver to client (DoH/DoT cover that hop). The honest threat model: the resolver is part of your trusted base.
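The "re-resolve on connect failure" advice can be sketched as a connect helper that performs a fresh lookup on every attempt instead of pinning the first IP. The names `resolve_ips` and `connect_with_reresolve` are hypothetical; the `resolve` and `connect` parameters exist so the logic can be exercised without a network.

```python
import socket


def resolve_ips(host, port):
    """Look the name up through the OS resolver and return its addresses."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def connect_with_reresolve(host, port, attempts=3, timeout=3.0,
                           resolve=resolve_ips,
                           connect=socket.create_connection):
    """Try to connect, repeating the DNS lookup before every attempt."""
    last_error = None
    for _ in range(attempts):
        try:
            addresses = resolve(host, port)  # fresh lookup, never a pinned IP
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            last_error = exc
            continue
        for ip in addresses:
            try:
                return connect((ip, port), timeout=timeout)
            except OSError as exc:
                last_error = exc  # dead host: try the next address
    raise ConnectionError(f"could not reach {host}:{port}") from last_error
```

A fresh lookup only helps once the cached record has expired, so this pairs naturally with short TTLs on records that take part in failover.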
Common pitfalls
- CNAME at the apex. Standards forbid example.com CNAME ... because the apex must carry SOA and NS records, and a CNAME cannot coexist with other records. Providers offer workarounds (Cloudflare CNAME flattening, Route 53 ALIAS records).
- MX pointing to a CNAME. Not allowed by the RFCs (RFC 2181 requires that an MX target not be an alias); some receivers will ignore the MX. Always point MX at a name that has A/AAAA records directly.
- Forgetting to lower TTL before a cutover. You announce the change, but old resolvers still serve the old answer for the previous TTL. Always pre-lower.
- Trusting dig +short from one location. Different recursive resolvers hold different cached states. Test from multiple vantage points (a public resolver, your laptop, a remote VPN).
- Hardcoding IPs after DNS lookup. Common in long-running processes (Java JVMs, some Python services). The IP changes (Route 53 failover, blue/green); the process keeps hitting the dead host. Re-resolve on errors.
- Wildcard records masking subdomain typos. *.example.com A 1.2.3.4 answers every name under example.com that has no records of its own, including misspellings. Useful sometimes, mostly a debugging hazard.
- Long TTL on the negative cache. SOA’s minimum field (or the SOA record’s own TTL, whichever is lower) controls how long resolvers cache NXDOMAIN. Set it too high and a typo’d test record stays NXDOMAIN even after you add it.
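Several of these pitfalls can be caught mechanically before a zone is published. The sketch below is a hypothetical linter, `lint_zone`, over a simplified record list; it assumes records are given as (name, type, value) tuples and checks only the CNAME and MX rules above.

```python
def lint_zone(apex, records):
    """Return warnings for CNAME and MX misuse in [(name, rtype, value), ...]."""
    problems = []
    types_by_name = {}
    for name, rtype, _ in records:
        types_by_name.setdefault(name, set()).add(rtype)
    cname_names = {n for n, types in types_by_name.items() if "CNAME" in types}

    for name in sorted(cname_names):
        if name == apex:
            problems.append(f"{name}: CNAME at the zone apex")
        others = types_by_name[name] - {"CNAME"}
        if others:
            listed = ", ".join(sorted(others))
            problems.append(f"{name}: CNAME coexists with {listed}")

    for name, rtype, value in records:
        if rtype == "MX":
            target = value.split()[-1]  # value looks like "10 mail.example.com"
            if target in cname_names:
                problems.append(f"{name}: MX target {target} is a CNAME")
    return problems


ZONE = [  # hypothetical zone contents with deliberate mistakes
    ("example.com", "SOA", "ns1.example.com hostmaster.example.com"),
    ("example.com", "NS", "ns1.example.com"),
    ("example.com", "CNAME", "lb.example.net"),
    ("example.com", "MX", "10 mail.example.com"),
    ("mail.example.com", "CNAME", "mx.example.net"),
]
for problem in lint_zone("example.com", ZONE):
    print(problem)
```

The check only sees CNAMEs defined inside the given record list, so an MX target that is an alias in another zone would go unnoticed.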
What actually happens in 1 ms when you cache-hit a name
Your stub resolver (the OS) is asked for www.example.com. The OS checks its own cache (nscd, systemd-resolved, dnsmasq, mDNSResponder); if the record is present and not expired, it returns the cached A record with no network traffic at all. If absent, it sends a UDP query to the configured recursive resolver (often a forwarder on the LAN gateway, or an ISP resolver roughly 10 ms away). The resolver checks its cache; if present, it returns immediately. Total: well under 1 ms for a local cache hit, 10-30 ms for a resolver round-trip on a popular name, 100-300 ms for a fully cold walk from the root. The cold path is rare; the warm path is what users see.
Related building blocks