DBMS Workbook

53 topics across foundations, non-functional requirements, building blocks, full system designs, and public postmortems. Every system uses the same 7-step interview walk-through; every building block has a consistent design template.

22 Foundational 25 Intermediate 6 Advanced 8 topics RSS

Foundations

4 items

What a database is, what a DBMS adds over a pile of files, the three-schema architecture, and the family of database systems.

  • What Is a Database?

    A shared, structured, queryable, persistent collection of data — and the DBMS that gives those four properties teeth.

    Concept Foundational
  • Why Not Files? The Case for a DBMS

    The limits of file-based systems that drove the move to databases — redundancy, integrity, concurrency, recovery.

    Concept Foundational
  • DBMS Architecture — The Three-Schema Model

    Conceptual, logical, and physical schemas; data independence; how a schema change at one level shouldn't break the others.

    Concept Foundational
  • Classification of Database Systems

    Relational vs document vs key-value vs graph vs columnar vs time-series. What each is good at and where the boundaries blur.

    Concept Foundational

Data Modeling

5 items

ER modeling — entities, attributes, relationships, constraints, weak entities — and translating ER to relational tables.

Relations, keys, integrity constraints, the algebra, transactions, ACID, isolation levels.

  • Relational Model Concepts

    Relations, tuples, attributes, domains. Why the relational model won — composability and a closed algebra.

    Concept Foundational
  • Keys — Primary, Candidate, Foreign, Composite

    What each key is for, how to choose a primary key, why surrogate keys are usually the right default.

    Concept Foundational
  • Integrity Constraints

    Entity integrity, referential integrity, domain constraints, CHECK constraints, and triggers as the catch-all.

    Concept Foundational
  • Relational Algebra Basics

    Selection, projection, join, union, difference. The algebra SQL is compiled into.

    Building Block Intermediate
  • Transactions and ACID

    Atomicity, Consistency, Isolation, Durability. What each one promises and where each one bends in practice.

    Concept Foundational
  • Isolation Levels and Anomalies

    Read-uncommitted to serializable; the anomalies each prevents (dirty read, non-repeatable, phantom, write skew).

    Concept Intermediate

Normalization

3 items

Functional dependencies and the normal forms (1NF, 2NF, 3NF, BCNF) — and when stopping is the right answer.

SQL

8 items

SQL as the user-facing surface — DDL, DML, SELECT, joins, aggregates, subqueries, CTEs, window functions, EXPLAIN.

  • SQL DDL — CREATE, ALTER, DROP

    Defining the schema. Tables, columns, constraints, indexes — and the ALTER traps on a live system.

    Building Block Foundational
  • SQL DML — INSERT, UPDATE, DELETE, MERGE

    Mutating data. RETURNING, UPSERT / ON CONFLICT, the things to know about row-locking on writes.

    Building Block Foundational
  • SELECT and the WHERE Clause

    The shape of every read. Predicate logic, NULL handling, the difference between AND/OR and short-circuit eval.

    Building Block Foundational
  • JOIN — INNER, LEFT, RIGHT, FULL, CROSS

    Connecting tables. When LEFT JOIN is the right tool, what FULL OUTER is for, and how CROSS JOIN should be feared.

    Building Block Foundational
  • Aggregates, GROUP BY, HAVING

    COUNT / SUM / AVG / MIN / MAX, the GROUP BY contract, why HAVING is not WHERE.

    Building Block Foundational
  • Subqueries and CTEs

    Correlated vs non-correlated subqueries, the readability case for CTEs, recursive CTEs for trees.

    Building Block Intermediate
  • Window Functions

    OVER, PARTITION BY, ORDER BY, frame clauses. Running totals, ranks, lag/lead — the part of SQL most engineers underuse.

    Building Block Intermediate
  • Query Execution and EXPLAIN Plans

    How the planner reads a query, the cost-based optimizer, EXPLAIN/EXPLAIN ANALYZE, the four operators you'll see most.

    Concept Intermediate

Row-store vs column-store, B-trees, hash and bitmap indexes, LSM-trees, WAL, MVCC.

  • Row-Store vs Column-Store

    Why OLTP is row-oriented and OLAP is column-oriented; the storage layout that explains the performance gap.

    Building Block Intermediate
  • B-Tree Indexes

    The classic disk-friendly tree, fanout, leaf-page linking, why every relational DBMS defaults to B-tree.

    Building Block Intermediate
  • Hash and Bitmap Indexes

    Hash indexes for equality lookups, bitmap indexes for low-cardinality columns, the trade-offs vs B-tree.

    Building Block Intermediate
  • LSM-Tree Storage

    Memtables, SSTables, compaction. The structure behind LevelDB, RocksDB, Cassandra, ScyllaDB, and DynamoDB's storage.

    Building Block Advanced
  • Write-Ahead Logging and Recovery

    Why every durable database writes a WAL, ARIES, redo/undo, checkpointing, crash recovery.

    Building Block Intermediate
  • MVCC — Multi-Version Concurrency Control

    Snapshot isolation, vacuum, the trade-off vs lock-based concurrency. Why Postgres, MySQL InnoDB, and Spanner all use MVCC.

    Building Block Advanced

Named systems — PostgreSQL, MySQL/InnoDB, DynamoDB, Spanner — and public postmortems worth reading.

  • PostgreSQL — The Reference Open-Source RDBMS

    Process-per-connection model, MVCC, the planner, extensibility (FDW, custom types), and what makes Postgres the default.

    System Intermediate
  • MySQL / InnoDB

    The B+ tree clustered-index storage engine, group replication, the historical mistakes and how 8.x undid them.

    System Intermediate
  • Amazon DynamoDB

    Partition + sort key model, single-table design, on-demand vs provisioned capacity, GSIs, transactions, the cost model.

    System Advanced
  • Google Spanner

    TrueTime, externally consistent transactions, Paxos groups, the breakthrough that made global ACID actually work.

    System Advanced
  • GitLab 2017 — The Database Outage

    A mistaken `rm -rf` on the primary; five backup mechanisms that all failed; the public postmortem everyone should read.

    Postmortem Foundational
  • Knight Capital 2012 — $440M in 45 Minutes

    An old code path enabled by a deploy; loose-state assumptions in a trading system; what 'shared mutable state' costs at scale.

    Postmortem Foundational

SQL Patterns

15 items

Fifteen reusable query templates that show up in every SQL interview — aggregation, filtering, comparison, sequencing/hierarchical, transformation. The decomposition vocabulary for data problems.

  • Tally Count

    Count occurrences of values in a column. The COUNT(*) GROUP BY template every SQL interview opens with.

    Pattern Foundational
  • Group Bucket

    Bucket continuous values into ranges (age bands, price tiers) using CASE WHEN inside GROUP BY.

    Pattern Foundational
  • Rolling Totals

    Cumulative aggregations over a window. SUM() OVER (ORDER BY ... ROWS BETWEEN ...). The time-series running-total template.

    Pattern Intermediate
  • Patch Gaps

    Fill missing rows in a sequence with default values via LEFT JOIN against a generated calendar or series.

    Pattern Intermediate
  • Existence Check

    EXISTS vs NOT EXISTS vs IN vs LEFT JOIN IS NULL — four ways to say where this related row does or does not exist.

    Pattern Foundational
  • Deduplication

    ROW_NUMBER() OVER (PARTITION BY ...) = 1 and the DISTINCT ON (Postgres) alternatives for keeping one row per group.

    Pattern Foundational
  • Pattern Search

    LIKE, ILIKE, regex, trigram, full-text — the spectrum of find rows that match a string pattern.

    Pattern Foundational
  • Set Compare

    INTERSECT / EXCEPT / UNION and the equivalent JOIN / NOT EXISTS forms. When set algebra beats joins.

    Pattern Intermediate
  • Join Variants

    Self-join, anti-join, semi-join, lateral join, theta-join. Beyond the standard four kinds of join.

    Pattern Intermediate
  • Nested Query

    Correlated vs uncorrelated subqueries; subquery in SELECT vs WHERE vs FROM. When CTEs beat nested subqueries.

    Pattern Intermediate
  • Rank and Row

    ROW_NUMBER vs RANK vs DENSE_RANK. Per-partition top-N selection. The second-highest-salary canon.

    Pattern Intermediate
  • Lead-Lag Navigation

    LEAD() and LAG() for previous/next-row comparison. Streak detection, change-from-previous metrics, run-length tracking.

    Pattern Intermediate
  • Recursive Tree

    Recursive CTEs (WITH RECURSIVE) for tree traversal, transitive closures, organisational hierarchies.

    Pattern Advanced
  • Gaps and Islands

    The streak-detection template: subtract row number from a value, group by the difference. Identifies runs of consecutive rows.

    Pattern Advanced
  • Pivot Flip

    Long-to-wide and wide-to-long reshapes. PIVOT (T-SQL), crosstab (Postgres), conditional aggregation as the portable form.

    Pattern Intermediate
Search ESC

Keyboard shortcuts

Shortcuts are disabled while typing in inputs.