Patch Gaps
Fill missing rows in a sequence with default values via LEFT JOIN against a generated calendar or series.
What it is#
The patch gaps pattern fills missing rows in an ordered sequence — usually dates — with default values (typically zero or null). It is built from two ingredients: a complete sequence generated on the fly, then a LEFT JOIN against the sparse data.
-- Postgres: daily signups for the last 30 days, including days with zero signupsWITH calendar AS ( SELECT generate_series( CURRENT_DATE - INTERVAL '29 days', CURRENT_DATE, INTERVAL '1 day' )::date AS day)SELECT c.day, COALESCE(COUNT(s.id), 0) AS signupsFROM calendar cLEFT JOIN signups s ON DATE(s.signed_up_at) = c.dayGROUP BY c.dayORDER BY c.day;Without this pattern, a daily-signup query produces a row only for days where at least one signup happened. A dashboard with a sawtooth gap pattern is almost always missing a calendar join.
It is also a critical companion to the Rolling Totals pattern: a 7-day moving average computed over a series with missing days gives wrong answers — the missing days were zero, not “not in the window.”
When to use it#
Use patch gaps when:
- A time-series report needs to show every period in a range, including periods with no events.
- A funnel report needs all steps even if some steps had no users.
- A grid join is needed: every combination of two dimensions (every user × every product, every region × every month).
- A downstream window function (moving average, rolling sum) must treat missing periods as zero.
- A line chart visualiser needs evenly-spaced x-axis values; missing days produce a misleading slope.
Don’t use it when:
- The downstream consumer can interpret missing periods correctly on its own.
- The “gap” is actually an artifact of an upstream filter that should be relaxed instead.
- The series cardinality is huge and most rows would be zero — sparse representations are more efficient.
How it works#
Three steps:
- Generate the complete sequence as a CTE or subquery. Date series for time, integer series for sequence numbers, cross-join for grids.
LEFT JOINthe sparse table onto the sequence. The sequence is on the left; the data is on the right. Missing rows on the right come back asNULL.COALESCEthe right-side columns to apply the default value — usually0for counts and sums, butNULLis sometimes the correct default.
SELECT c.day, COALESCE(COUNT(o.id), 0) AS orders, COALESCE(SUM(o.amount), 0) AS revenueFROM calendar cLEFT JOIN orders o ON DATE(o.created_at) = c.dayGROUP BY c.dayORDER BY c.day;The filter must be in the ON clause, not the WHERE clause. This is the single most common bug in this pattern.
Correct — filter in ON:
SELECT c.day, COUNT(o.id) AS ordersFROM calendar cLEFT JOIN orders o ON DATE(o.created_at) = c.day AND o.status = 'paid'GROUP BY c.day;Days with no paid orders still appear with count 0. The status filter is part of the join condition.
Wrong — filter in WHERE:
SELECT c.day, COUNT(o.id) AS ordersFROM calendar cLEFT JOIN orders o ON DATE(o.created_at) = c.dayWHERE o.status = 'paid'GROUP BY c.day;On days with no orders at all, the LEFT JOIN produces a row with o.status = NULL. The WHERE clause then drops it (because NULL = 'paid' is NULL, not TRUE), turning the LEFT JOIN into an effective INNER JOIN and re-introducing the gaps.
Mnemonic: filters on the right side of a LEFT JOIN belong in ON; filters on the left side belong in WHERE. The right-side filter in WHERE defeats the LEFT JOIN.
Variants#
Generate a series — by engine#
Postgres / Redshift: generate_series is the canonical helper.
SELECT day::dateFROM generate_series(DATE '2026-01-01', DATE '2026-12-31', INTERVAL '1 day') AS day;It also generates integer series, useful for grid joins or numbers-table substitutes.
SELECT n FROM generate_series(1, 100) AS n;BigQuery: GENERATE_DATE_ARRAY, then UNNEST.
SELECT dayFROM UNNEST(GENERATE_DATE_ARRAY('2026-01-01', '2026-12-31', INTERVAL 1 DAY)) AS day;Snowflake: GENERATOR + ROW_NUMBER arithmetic, or DATEADD from a row sequence.
SELECT DATEADD(day, SEQ4(), DATE '2026-01-01') AS dayFROM TABLE(GENERATOR(ROWCOUNT => 365));SQL Server: recursive CTE or sys.all_objects row trick.
WITH calendar AS ( SELECT CAST('2026-01-01' AS DATE) AS day UNION ALL SELECT DATEADD(day, 1, day) FROM calendar WHERE day < '2026-12-31')SELECT day FROM calendarOPTION (MAXRECURSION 366);MAXRECURSION 0 disables the default 100-row cap entirely.
MySQL 8+: recursive CTE.
WITH RECURSIVE calendar(day) AS ( SELECT DATE('2026-01-01') UNION ALL SELECT DATE_ADD(day, INTERVAL 1 DAY) FROM calendar WHERE day < '2026-12-31')SELECT day FROM calendar;The default cte_max_recursion_depth is 1000; raise it with SET cte_max_recursion_depth = 100000; for longer ranges.
MySQL pre-8.0 / SQLite: persistent numbers table.
-- Materialised once: integers 0..99999CREATE TABLE numbers (n INT PRIMARY KEY);INSERT INTO numbers SELECT n FROM ( -- some join over information_schema or hand-rolled population) src;
-- Use it as a seriesSELECT DATE_ADD('2026-01-01', INTERVAL n DAY) AS dayFROM numbersWHERE n < 365;A persistent numbers table is the most portable, cheapest source of an integer series across engines — useful even on MySQL 8+ where it can be a view.
Patch gaps in a non-date sequence#
The pattern is not date-specific. Anywhere you have a known full set joined to a partial set, the same shape applies.
-- All product × month pairs, including months with zero salesWITH products AS (SELECT id FROM products), months AS ( SELECT day FROM generate_series(DATE '2026-01-01', DATE '2026-12-01', INTERVAL '1 month') AS day )SELECT p.id, m.day AS month, COALESCE(SUM(s.amount), 0) AS revenueFROM products pCROSS JOIN months mLEFT JOIN sales s ON s.product_id = p.id AND DATE_TRUNC('month', s.sold_at) = m.dayGROUP BY p.id, m.dayORDER BY p.id, m.day;The CROSS JOIN between products and months produces the complete grid; the LEFT JOIN to sales then fills it.
Continuous gaps — finding which values are missing#
If the goal is not to patch the gaps but to find them, the inverse query uses NOT EXISTS or LEFT JOIN ... IS NULL against the same calendar.
-- Which days in the range had zero signupsSELECT c.dayFROM calendar cLEFT JOIN signups s ON DATE(s.signed_up_at) = c.dayWHERE s.id IS NULL;This is a use of the Existence Check pattern on top of the calendar.
Common pitfalls#
Right-side filter in WHERE (covered above). The single most common bug. If your LEFT JOIN suddenly looks like an INNER JOIN, check the WHERE clause.
Timezone bugs when generating the calendar. CURRENT_DATE and DATE(timestamp) follow the session timezone. If your data is stored in UTC and users see local time, a day in user-local time may straddle two days in UTC. Decide on one timezone (usually the report’s timezone), and convert both the calendar and the timestamp into it:
-- Postgres: report in America/Los_AngelesWITH calendar AS ( SELECT generate_series( (CURRENT_DATE - INTERVAL '29 days') AT TIME ZONE 'UTC' AT TIME ZONE 'America/Los_Angeles', CURRENT_DATE AT TIME ZONE 'UTC' AT TIME ZONE 'America/Los_Angeles', INTERVAL '1 day' )::date AS day)SELECT c.day, COUNT(s.id)FROM calendar cLEFT JOIN signups s ON DATE(s.signed_up_at AT TIME ZONE 'America/Los_Angeles') = c.dayGROUP BY c.day;COALESCE on the aggregate, not the column. When patching counts, COUNT(s.id) is already 0 for unmatched rows (because COUNT of zero rows is 0, not NULL). The COALESCE is redundant. But SUM(s.amount) over zero rows is NULL — there COALESCE(SUM(s.amount), 0) is required.
COUNT(s.id) -- 0 when no match, no COALESCE neededCOALESCE(SUM(s.amount), 0) -- NULL when no match, COALESCE neededCOALESCE(AVG(s.amount), 0) -- NULL when no match (cannot average zero rows)COALESCE(MAX(s.amount), 0) -- NULL when no matchCalendar too narrow. If the range is parameterised, an off-by-one on the upper bound silently drops the most recent day. Test with a query like SELECT MIN(day), MAX(day), COUNT(*) FROM calendar before joining.
Performance on huge calendars. A daily calendar across 10 years is 3650 rows — tiny. A minute-by-minute calendar across the same range is 5.3 million rows. If a fact table has 100M rows and the LEFT JOIN doesn’t have a good index on the join column, the plan can degrade to a nested loop with a 5.3M outer side. Aggregate to a coarser grain first.
Generated series ignores DST. Postgres generate_series(..., INTERVAL '1 day') in a session timezone with DST produces some 23-hour and 25-hour days, but the ::date cast smooths over that. For minute-level or hour-level series, generate in UTC and convert per-row.
Practice problem#
You have a signups table:
CREATE TABLE signups ( id BIGSERIAL PRIMARY KEY, user_id BIGINT NOT NULL, signed_up_at TIMESTAMPTZ NOT NULL);Write a Postgres query that returns daily signup counts for the last 30 days, including the current day, with one row per day even when no one signed up. Days with zero signups should show signups = 0, not be absent.
Solution
WITH calendar AS ( SELECT generate_series( CURRENT_DATE - INTERVAL '29 days', CURRENT_DATE, INTERVAL '1 day' )::date AS day)SELECT c.day, COUNT(s.id) AS signupsFROM calendar cLEFT JOIN signups s ON DATE(s.signed_up_at) = c.dayGROUP BY c.dayORDER BY c.day;generate_series produces 30 dates, the LEFT JOIN matches each against signup rows by truncated date, and COUNT(s.id) correctly returns 0 for unmatched calendar rows (no COALESCE needed because COUNT of zero rows is 0, not NULL).
Related patterns#
- Joins (inner, outer, cross) — the LEFT JOIN semantics this pattern relies on.
- Rolling Totals — frequently consumes the gap-patched output for moving averages over time windows.
- Existence Check — the inverse use case: finding which days are missing instead of patching them.
- Tally Count — the per-period aggregate this pattern wraps with a calendar.
- Subqueries and CTEs — the CTE structure used to define the calendar cleanly.