Weak Entities and Identifying Relationships — DBMS

Summary#

A weak entity has no candidate key of its own; its identity exists only in relation to another entity called its owner (or identifying parent). The full primary key of a weak entity is (owner's primary key, partial key) — where the partial key (a “discriminator”) is unique only within the scope of one owner. The relationship to the owner is called the identifying relationship, and it always has total participation on the weak side: a weak entity that loses its owner ceases to exist.

The pattern is more common than the textbook treatment suggests. Order line items, room numbers within hotels, sections within courses, comments on posts (if scoped per-post) — all of these are at least candidate weak entities. The deciding question is whether the entity can be identified without naming its parent.

Why it matters#

Recognising the weak-entity pattern early changes three downstream decisions: the primary key shape (composite, including the owner’s key), the foreign key behaviour (ON DELETE CASCADE, because orphans are meaningless), and the indexing strategy (often clustered or co-located by owner for query locality). Missing the pattern leads to artificially surrogate-keyed children that allow orphans and lose the natural co-location benefit.

In interviews, weak entities surface in schema-design questions where the child genuinely depends on the parent. “Model invoices and line items” is a classic — a line item has no meaning without its invoice. Interviewers checking for ER fluency expect the candidate to call it a weak entity, not just “a child table with a FK”.

How it works#

The structure#

A weak entity has:

No standalone candidate key. The values it stores are not unique across the whole entity set — multiple owners can have children with the same partial key.
A partial key (discriminator). An attribute that’s unique within one owner. Line numbers within an invoice; floor + room numbers within a hotel; chapter numbers within a book.
An identifying relationship to an owner. A binary relationship to a strong (or another weak) entity, with total participation on the weak side and 1 cardinality on the owner side.
A composite primary key (owner_pk, partial_key). The owner’s primary key plus the discriminator.

Notation#

Chen notation draws weak entities as double-bordered rectangles, the identifying relationship as a double-bordered diamond, and the partial key as a dashed underline. Crow’s-foot doesn’t have a specific weak-entity glyph; it shows the same fact via the composite primary key and 1..1 participation on the owner side.

Examples#

Invoice (strong)
  invoice_id  PK
  customer_id FK
  total
  issued_on

LineItem (weak, owned by Invoice)
  invoice_id     PK + FK to Invoice
  line_number    PK (partial key — unique within an invoice)
  product_id
  quantity
  unit_price

Other recurring shapes:

Hotel —has→ Room with room_number unique per hotel.
Section —has→ Lecture where lectures are numbered within the section.
User —has→ Address if addresses are slot-indexed (address_type ∈ {home, work, billing}) rather than independently identified.
Order —has→ Shipment if shipments are numbered per order (shipment_seq 1, 2, 3).

Cascading behaviour#

The owner-child dependency makes the cascade choice almost automatic:

CREATE TABLE invoice (
  invoice_id  bigserial PRIMARY KEY,
  customer_id bigint NOT NULL REFERENCES customer (id),
  total       numeric(12, 2) NOT NULL
);

CREATE TABLE line_item (
  invoice_id  bigint NOT NULL,
  line_number int    NOT NULL,
  product_id  bigint NOT NULL REFERENCES product (id),
  quantity    int    NOT NULL,
  unit_price  numeric(12, 2) NOT NULL,
  PRIMARY KEY (invoice_id, line_number),
  FOREIGN KEY (invoice_id) REFERENCES invoice (invoice_id) ON DELETE CASCADE
);

ON DELETE CASCADE is the natural choice — deleting the invoice should delete its line items. With a non-weak child (a child that exists independently), ON DELETE RESTRICT or ON DELETE SET NULL are more common.

Co-location#

A weak entity benefits from being stored physically near its owner. In InnoDB, declaring the primary key as (invoice_id, line_number) clusters all line items for one invoice together on disk — fetching the invoice and its line items is one contiguous range scan. In Postgres, CLUSTER can achieve the same on a heap table. Spanner uses interleaved tables for exactly this pattern.

Variants and trade-offs#

Composite key from owner + discriminator — natural, enforces uniqueness within owner directly, makes ON DELETE CASCADE the obvious choice, co-locates rows by owner. Slightly heavier as a FK source from other tables (composite FKs everywhere it’s referenced).

Surrogate key on the child — single-column FK from anywhere referencing the child, simpler ORM mapping. Loses the natural uniqueness-within-owner; needs a separate UNIQUE (owner_id, partial_key) constraint to preserve it. Loses the co-location story unless explicitly clustered.

The right call usually depends on whether other tables reference the child. If line items are referenced by LineItemReturn(invoice_id, line_number, ...), the composite key is fine. If line items are referenced by ten downstream systems each carrying a line_item_id, the surrogate wins on usage.

Other variants and edge cases:

Chained weak entities. Course —has→ Section —has→ Lecture. Lecture’s primary key is (course_id, section_id, lecture_number). The chain works as long as each level has total participation on its owner.
Weak entity with its own identity later. Sometimes a row promoted from “owned by parent” to “first-class addressable thing” (a line item that becomes a returnable unit). The migration adds a surrogate column, keeps the existing composite as a uniqueness constraint, and dual-keys for the transition.
Aggregate-root / DDD framing. Domain-driven-design’s “aggregate” maps closely onto the weak-entity + identifying-relationship pattern — invoice as aggregate root, line items as internal entities. The aggregate boundary is the transactional boundary.
Document model alternative. In a document store, the weak entity often becomes a nested array inside the owner document (line items as an items: [...] array in the invoice document). Same lifecycle, different representation.

When 'weak entity' is wrong even if the child has total participation

Total participation alone does not make an entity weak. A row in OrderShipment that always has an order_id is not weak if the shipment has a globally unique tracking number — that tracking number is its candidate key, and the relationship to Order is a regular FK, not an identifying relationship. The diagnostic is the absence of a standalone candidate key. Total participation is necessary but not sufficient.

When this is asked in interviews#

Two main shapes. As a schema design — “model invoices, customers, and products” — the candidate is expected to recognise line items as weak and put the composite key on. A junior answer adds a surrogate line_item_id; a senior answer puts (invoice_id, line_number) with ON DELETE CASCADE and explains why.

As a pure ER question — “what’s a weak entity?” — the answer covers the three properties (no candidate key, partial key, identifying relationship with total participation) and gives an example.

A common follow-up: “why not just give it a surrogate ID?” The expected answer is the cascade and co-location arguments, plus the explicit note that the composite key encodes the dependency in the schema itself rather than relying on the application to remember it.