DBMS SQL

SQL DDL — CREATE, ALTER, DROP

Defining the schema. Tables, columns, constraints, indexes — and the ALTER traps on a live system.

Building Block Foundational
8 min read
ddl schema alter-table migrations ddl-locks

What it is#

DDL is the data-definition subset of SQL: statements that change schema rather than data. The big three are CREATE, ALTER, and DROP, each applicable to tables, indexes, views, schemas, sequences, types, functions, triggers, materialized views, and more. A fourth verb, TRUNCATE, technically lives in DDL because it deallocates storage and resets sequences rather than walking rows — it’s a fast delete that bypasses triggers.

DDL is what migrations are made of. Every relational database in production is shaped by a chronological sequence of DDL statements applied via some migration tool (Alembic, Flyway, Liquibase, Rails, Prisma, Knex). The statements themselves are usually small; the danger lives in how they interact with a running system — most DDL takes locks that can stall every query on the table, sometimes for the duration of a full table rewrite.

When to use it#

  • Initial schema creation. CREATE TABLE for every entity, with all constraints inline. Add indexes for the access patterns you know about; resist the urge to pre-index everything.
  • Schema evolution. ALTER TABLE ... ADD COLUMN, DROP COLUMN, ADD CONSTRAINT, ALTER COLUMN TYPE. Each carries different lock and rewrite costs.
  • Index management. CREATE INDEX, DROP INDEX, REINDEX. Almost always run CREATE INDEX CONCURRENTLY on Postgres for production tables.
  • Refactoring. Renaming a table, splitting one table into two, partitioning, switching constraint definitions. Always plan as a multi-step migration with the old shape still queryable.
  • Tear-down. DROP TABLE, DROP SCHEMA CASCADE for non-prod. In production, the strong recommendation is to rename-then-drop after a soak period rather than drop directly.

Avoid raw DDL outside a migration tool. Hand-typed ALTER against production is how schemas drift between environments and how someone forgets that staging needs the same change.

How it works#

CREATE TABLE#

The canonical shape, with constraints and defaults declared inline:

CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'active'
CHECK (status IN ('active', 'suspended', 'deleted')),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX users_status_idx ON users (status)
WHERE status <> 'deleted';

Every column has a type, NOT NULL, and a default where the business logic has one. Constraints with explicit names (CONSTRAINT chk_status CHECK (...)) make error messages and later ALTER ... DROP CONSTRAINT calls less brittle.

CREATE INDEX#

CREATE INDEX orders_user_created_idx ON orders (user_id, created_at DESC);
CREATE UNIQUE INDEX users_email_lower_idx ON users (LOWER(email));
-- Production-safe, non-blocking (Postgres):
CREATE INDEX CONCURRENTLY large_table_col_idx ON large_table (col);

CONCURRENTLY builds the index without blocking writes, at the cost of taking longer and requiring a brief retry if it hits an error mid-build. On any table that’s actively being written to, this is the only acceptable choice.

ALTER TABLE — the dangerous one#

The cheap variants (metadata-only, no table scan):

ALTER TABLE users ADD COLUMN nickname TEXT; -- nullable, no default: instant
ALTER TABLE users RENAME COLUMN name TO full_name; -- metadata only
ALTER TABLE users ADD CONSTRAINT chk_email_len
CHECK (length(email) BETWEEN 3 AND 320) NOT VALID; -- defer the scan
ALTER TABLE users VALIDATE CONSTRAINT chk_email_len; -- scan online

The expensive variants (full table rewrite, ACCESS EXCLUSIVE lock):

ALTER TABLE users ALTER COLUMN id TYPE BIGINT; -- type change: rewrite
ALTER TABLE orders ADD COLUMN total_cents INT NOT NULL DEFAULT 0; -- pre-PG11: rewrite
ALTER TABLE users ADD COLUMN region TEXT NOT NULL DEFAULT 'us-east-1'; -- if default isn't constant: rewrite

Postgres 11+ added fast non-rewriting ADD COLUMN ... DEFAULT for constant defaults. Volatile defaults (now(), gen_random_uuid()) still rewrite. MySQL 8 has its own online-DDL rules; SQL Server has a different set; always check the engine.

Multi-step migrations#

For unavoidable expensive changes on hot tables, do them in steps so each step is cheap:

-- Goal: ALTER COLUMN id TYPE BIGINT (would take ACCESS EXCLUSIVE for the rewrite)
-- Step 1: add new column nullable, no default
ALTER TABLE users ADD COLUMN id_new BIGINT;
-- Step 2: backfill in batches (application-driven)
UPDATE users SET id_new = id WHERE id_new IS NULL AND id BETWEEN ? AND ?;
-- Step 3: add trigger to keep new column in sync for new writes
-- Step 4: once backfill done, NOT NULL + index it
ALTER TABLE users ALTER COLUMN id_new SET NOT NULL;
CREATE UNIQUE INDEX CONCURRENTLY users_id_new_idx ON users (id_new);
-- Step 5: swap primary key in a short transaction
BEGIN;
ALTER TABLE users DROP CONSTRAINT users_pkey;
ALTER TABLE users RENAME COLUMN id TO id_old;
ALTER TABLE users RENAME COLUMN id_new TO id;
ALTER TABLE users ADD PRIMARY KEY USING INDEX users_id_new_idx;
COMMIT;
-- Step 6: drop old column after soak
ALTER TABLE users DROP COLUMN id_old;

This is more work than ALTER COLUMN TYPE, and it’s how every team that has shipped a primary-key migration on a live system has done it.

DROP and TRUNCATE#

DROP TABLE deprecated_events; -- gone, with all indexes
DROP INDEX CONCURRENTLY old_unused_idx; -- non-blocking
TRUNCATE TABLE staging_events RESTART IDENTITY; -- fast empty, resets sequence

DROP TABLE and TRUNCATE both take ACCESS EXCLUSIVE locks; TRUNCATE doesn’t fire row-level triggers and doesn’t generate per-row WAL on most engines (still fully durable via metadata).

Variants#

  • CREATE TABLE ... AS SELECT — create and populate in one go. Useful for snapshots and intermediate tables; copies data but not constraints or indexes.
  • CREATE TABLE LIKE / INCLUDING ALL — clone the structure including constraints and indexes. The right tool for partitioning or sharding setups.
  • Generated columns (GENERATED ALWAYS AS (expr) STORED) — column whose value is computed from other columns. Replaces a common pattern of trigger-maintained columns.
  • Partitioned tables — declarative partitioning (PARTITION BY RANGE / LIST / HASH) since Postgres 10 and MySQL for a long time. Each partition is a child table; queries plan over them.
  • Temporary tables (CREATE TEMP TABLE) — session-scoped, often used for staging in ETL workflows.
  • Schema-qualified namesCREATE TABLE app.users ... puts the table in the app schema (Postgres) or database (MySQL). Useful for multi-tenant separation and migration safety.

Trade-offs#

Inline constraints at CREATE TABLE time — clean, atomic, every row that ever exists in this table satisfies the constraint. The right default. Cost: you can’t change them without an ALTER, and renaming inline-named constraints uses an auto-generated name.
Separate ALTER ADD CONSTRAINT statements — let you name constraints explicitly, add them to existing tables, and use NOT VALID to defer the scan on huge tables. Slightly more verbose; far more flexible operationally.

Other lever choices:

  • Adding a column with a default — pre-Postgres 11 was a full table rewrite; 11+ rewrites only if the default is volatile. MySQL InnoDB has its own ALGORITHM=INSTANT rules. Always check.
  • Concurrent index creation — slower wall-clock, safer for live systems. Almost always the right choice in production.
  • Foreign-key validation timing — adding an FK NOT VALID then VALIDATE CONSTRAINT separately splits the lock-heavy scan into a step you can run during low-traffic hours.
  • Renames vs drops — renaming a soon-to-be-deprecated table (ALTER TABLE x RENAME TO x_deprecated_20260524) before dropping gives you a soak window. Rolling back a rename is one statement; rolling back a drop is restore-from-backup.

Common pitfalls#

  • ALTER TABLE ... ADD COLUMN col TYPE NOT NULL without a default on a populated table — Postgres requires a default (or pre-fills with NULL then rejects). Either add nullable, backfill, then set NOT NULL — or use a constant default if your engine supports fast ADD COLUMN.
  • ALTER TABLE ... ALTER COLUMN ... TYPE on a wide table during business hours — full rewrite, ACCESS EXCLUSIVE lock. Use the multi-step pattern above.
  • CREATE INDEX without CONCURRENTLY in production — blocks writes for the build duration. Almost never what you want.
  • Forgetting DEFAULT semanticsDEFAULT now() evaluates at insert time, not at column-creation time. A column added with DEFAULT now() to existing rows fills them all with the same timestamp (the ALTER’s time), not their original creation time.
  • DROP COLUMN reclaiming space. Postgres marks the column dead but doesn’t reclaim its storage until VACUUM FULL or a column rewrite. Don’t expect disk to shrink immediately.
  • TRUNCATE cascading. TRUNCATE TABLE parent fails if any FK references it; TRUNCATE ... CASCADE truncates the referenced tables too. Easy to nuke more than intended.
  • Hand-typed DDL bypassing migrations. Every team learns this once. Lock down production DDL to the migration tool; treat ad-hoc ALTER as an incident.
The lock cheat sheet for Postgres ALTER

Most metadata-only: ADD COLUMN nullable no default, RENAME COLUMN, RENAME TABLE, DROP COLUMN, ADD CONSTRAINT NOT VALID, SET DEFAULT, DROP DEFAULT. SHARE UPDATE EXCLUSIVE (blocks autovacuum but not queries): VALIDATE CONSTRAINT, CREATE INDEX CONCURRENTLY, ALTER INDEX SET. ACCESS EXCLUSIVE (blocks everything): ALTER COLUMN TYPE, ADD COLUMN with non-constant DEFAULT (pre-PG11 for any default), DROP TABLE, TRUNCATE, REINDEX. The official Postgres docs have a per-statement table; if you’re about to run an ALTER on a hot table, look it up.

Search ESC

Keyboard shortcuts

Shortcuts are disabled while typing in inputs.