Systems & Postmortems
Named systems — PostgreSQL, MySQL/InnoDB, DynamoDB, Spanner — and public postmortems worth reading.
Knowing one specific database deeply is more useful than knowing the model only in the abstract. The Systems & Postmortems topic covers the four databases interviewers most often pivot on (PostgreSQL, MySQL/InnoDB, DynamoDB, Spanner) and the two postmortems that every working database engineer should be able to summarise (GitLab 2017, Knight Capital 2012).
For system writeups, the structure is: how it stores data, how it executes queries, how it handles transactions, and how it scales. For postmortems: what was the contract, where did it break, what did the team learn, and which lesson generalises to your own work.
Key concepts
- PostgreSQL is the open-source reference — MVCC, extensible, sound defaults
- MySQL/InnoDB is the largest deployed RDBMS — clustered index, group replication
- DynamoDB is the canonical managed key-value DB — partition+sort key, single-table
- Spanner is the breakthrough for global ACID — TrueTime + Paxos
- Public postmortems teach what survives contact with production
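To make the "partition+sort key, single-table" idea concrete, here is a minimal in-memory sketch. It is not the DynamoDB API: `ToyTable`, `put`, `query`, and the `CUSTOMER#` / `ORDER#` key prefixes are hypothetical names chosen for illustration. It only shows the access pattern: items of different kinds share one partition and are retrieved by a sort-key prefix.

```python
from bisect import bisect_left
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a partition + sort key table (not the DynamoDB API)."""

    def __init__(self):
        # partition key -> list of (sort_key, item), kept ordered by sort key
        self._partitions = defaultdict(list)

    def put(self, pk, sk, item):
        rows = self._partitions[pk]
        # (sk,) compares below any (sk, item), so this finds the first row
        # whose sort key is >= sk without ever comparing the items themselves.
        i = bisect_left(rows, (sk,))
        if i < len(rows) and rows[i][0] == sk:
            rows[i] = (sk, item)        # same full key: overwrite
        else:
            rows.insert(i, (sk, item))

    def query(self, pk, sk_prefix=""):
        """Items in one partition whose sort key starts with sk_prefix, in key order."""
        rows = self._partitions.get(pk, [])
        start = bisect_left(rows, (sk_prefix,))
        out = []
        for sk, item in rows[start:]:
            if not sk.startswith(sk_prefix):
                break                   # rows are sorted, so the prefix run has ended
            out.append(item)
        return out


table = ToyTable()
table.put("CUSTOMER#42", "PROFILE", {"name": "Ada"})
table.put("CUSTOMER#42", "ORDER#2024-02-11", {"total": 30})
table.put("CUSTOMER#42", "ORDER#2024-01-05", {"total": 12})

# One partition read returns every order for the customer, oldest first.
orders = table.query("CUSTOMER#42", "ORDER#")
```

The design choice worth noticing is that the query never crosses partitions, which is why this pattern does not transfer directly to a relational schema built around joins.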
Reference template
// Reading a database postmortem
1. What was the contract? (ACID? availability target?)
2. Where did the contract break? (which property failed first?)
3. What was the root cause? (human? system? interaction?)
4. What did recovery look like? (RPO and RTO actual vs target)
5. What changed afterwards? (technical, process, monitoring)
Adapt to your problem; the structure is the load-bearing part.
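One way to keep notes in this shape is a small record type. The sketch below is an assumption about how you might encode the five questions. `PostmortemNotes` and its field names are hypothetical, and the incident in the usage example is invented, not taken from any real postmortem.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class PostmortemNotes:
    contract: str            # 1. what was promised (ACID? availability target?)
    first_break: str         # 2. which property failed first
    root_cause: str          # 3. human, system, or interaction
    rpo_target: timedelta    # 4. tolerated data-loss window ...
    rpo_actual: timedelta    #    ... and what was actually lost
    rto_target: timedelta    #    tolerated downtime ...
    rto_actual: timedelta    #    ... and how long recovery really took
    changes: list[str] = field(default_factory=list)  # 5. follow-up actions

    def missed_targets(self) -> list[str]:
        """Names of the recovery objectives that were exceeded."""
        missed = []
        if self.rpo_actual > self.rpo_target:
            missed.append("RPO")
        if self.rto_actual > self.rto_target:
            missed.append("RTO")
        return missed


# Invented incident, purely to show the shape of the record.
notes = PostmortemNotes(
    contract="durable writes, 99.9% availability",
    first_break="durability",
    root_cause="interaction: manual cleanup ran against the wrong host",
    rpo_target=timedelta(minutes=15),
    rpo_actual=timedelta(hours=2),
    rto_target=timedelta(hours=1),
    rto_actual=timedelta(minutes=50),
    changes=["scheduled restore drills", "alert on failed backup jobs"],
)
print(notes.missed_targets())
```

Comparing actual against target for RPO and RTO separately keeps question 4 honest: a fast recovery can still lose more data than the contract allowed.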
Common pitfalls
- Treating each database as a black box — internals matter for capacity and cost
- Copying DynamoDB's single-table pattern into a relational schema — they're different worlds
- Trusting backups without restoration drills — GitLab had five backup and replication mechanisms; none worked reliably when needed
- Believing 'serverless' means 'no operational cost' — provisioning and quotas remain
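The restoration-drill pitfall can be illustrated with a small, runnable sketch. It uses SQLite from the Python standard library purely as a stand-in for a production database. `take_backup`, `restore_drill`, and the `projects` table are hypothetical names, and a real drill would restore into a scratch instance of your actual system and run application-level checks.

```python
import os
import sqlite3
import tempfile


def take_backup(live_path: str, backup_path: str) -> None:
    """Copy the live database into backup_path using SQLite's online backup API."""
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def restore_drill(backup_path: str, table: str, min_rows: int) -> bool:
    """Open the backup as a restore would and check it is intact and populated."""
    # A missing or zero-byte file means the backup job failed without anyone noticing.
    if not os.path.exists(backup_path) or os.path.getsize(backup_path) == 0:
        return False
    conn = sqlite3.connect(backup_path)
    try:
        ok = conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
        rows = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        return ok and rows >= min_rows
    except sqlite3.DatabaseError:
        return False  # corrupt file or missing table: the drill fails
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        live = os.path.join(tmp, "live.db")
        conn = sqlite3.connect(live)
        conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO projects (name) VALUES ('demo')")
        conn.commit()
        conn.close()

        backup = os.path.join(tmp, "backup.db")
        take_backup(live, backup)
        print("drill passed:", restore_drill(backup, "projects", min_rows=1))
```

The point of the sketch is that the check reads the backup itself rather than trusting the exit status of the job that produced it.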
Items (6)
- PostgreSQL — The Reference Open-Source RDBMS
Process-per-connection model, MVCC, the planner, extensibility (FDW, custom types), and what makes Postgres the default.
System Intermediate
- MySQL / InnoDB
The B+ tree clustered-index storage engine, group replication, the historical mistakes and how 8.x undid them.
System Intermediate
- Amazon DynamoDB
Partition + sort key model, single-table design, on-demand vs provisioned capacity, GSIs, transactions, the cost model.
System Advanced
- Google Spanner
TrueTime, externally consistent transactions, Paxos groups, the breakthrough that made global ACID actually work.
System Advanced
- GitLab 2017 — The Database Outage
A mistaken `rm -rf` on the primary; five backup mechanisms that all failed; the public postmortem everyone should read.
Postmortem Foundational
- Knight Capital 2012 — $440M in 45 Minutes
An old code path enabled by a deploy; loose-state assumptions in a trading system; what 'shared mutable state' costs at scale.
Postmortem Foundational