Commit ceda2e9

harmonize Foreign Key narrative with Schema-as-a-Workflow paradigm
1 parent 0d0c829 commit ceda2e9

1 file changed

+35
-12
lines changed

book/30-database-design/030-foreign-keys.ipynb

Lines changed: 35 additions & 12 deletions
@@ -21,6 +21,17 @@
2121
"\n",
2222
"A foreign key is a column (or set of columns) in the child table that refers to the primary key of the parent table. In DataJoint, a foreign key *always* references a parent's primary key, which is a highly recommended practice for clarity and consistency.\n",
2323
"\n",
24+
"## Foreign Keys in DataJoint: Referential Integrity + Workflow Dependencies\n",
25+
"\n",
26+
"In DataJoint, foreign keys serve a **dual role** that extends beyond traditional relational databases:\n",
27+
"\n",
28+
"1. **Referential integrity** (like traditional databases): Ensures that every value referenced by the child exists in the parent table\n",
29+
"2. **Workflow dependencies** (DataJoint addition): Prescribes the order of operations—the parent must be created before the child\n",
30+
"\n",
31+
"This transforms the schema into a **directed acyclic graph (DAG)** representing valid workflow execution sequences. The foreign key `-> Title` in `Employee` not only ensures that each employee has a valid title, but also establishes that titles must be created before employees can be assigned to them.\n",
32+
"\n",
33+
"For more on how DataJoint extends foreign keys with workflow semantics, see [Relational Workflows](../20-concepts/04-workflows.md).\n",
34+
"\n",
2435
"In the following example, we define the parent table `Title` and the child table `Employee`, which references `Title`."
2536
]
2637
},
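The DAG claim in this hunk can be illustrated without a database. The following is a minimal sketch, not DataJoint's own mechanism: it assumes the foreign keys are written out by hand as a plain dictionary (the `foreign_keys` name is hypothetical) and uses Python's standard `graphlib` to derive an order in which parents come before children.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each table maps to the set of parent tables
# it references through foreign keys. Only Title and Employee come from
# the notebook; the dictionary itself is an illustration.
foreign_keys = {
    "Title": set(),
    "Employee": {"Title"},
}

# static_order() yields every parent before its children, which is one
# valid order in which the tables can be populated.
population_order = list(TopologicalSorter(foreign_keys).static_order())
print(population_order)
```

A cyclic dependency map would make `static_order()` raise `graphlib.CycleError`, which is why the schema has to remain acyclic for such an order to exist.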
@@ -77,7 +88,9 @@
7788
"cell_type": "markdown",
7889
"metadata": {},
7990
"source": [
80-
"Here the arrow `-> Title` creates a foreign key from `Employee` (child) to `Title` (parent).\n",
91+
"Here the arrow `-> Title` creates a foreign key from `Employee` (child) to `Title` (parent). This foreign key:\n",
92+
"- **Enforces referential integrity**: Ensures each employee has a valid title that exists in the `Title` table\n",
93+
"- **Establishes workflow dependency**: Requires that titles be created before employees can be assigned to them\n",
8194
"\n",
8295
"We can use the `dj.Diagram` class to visualize the relationships created by the foreign keys."
8396
]
@@ -136,7 +149,7 @@
136149
"cell_type": "markdown",
137150
"metadata": {},
138151
"source": [
139-
"The parent table `Title` is above and the child table `Employee` is below."
152+
"The parent table `Title` is above and the child table `Employee` is below. The arrow direction indicates both the referential relationship (Employee references Title) and the workflow dependency (Title must be created before Employee)."
140153
]
141154
},
142155
{
@@ -193,14 +206,14 @@
193206
"source": [
194207
"## The Five Effects of a Foreign Key\n",
195208
"\n",
196-
"Foreign keys enforce **referential integrity** by regulating the relationships between a **parent table** (referenced entity set) and a **child table** (dependent entity set). In addition to defining how entities relate, foreign keys also impose important constraints on data operations. \n",
209+
"Foreign keys enforce **referential integrity** by regulating the relationships between a **parent table** (referenced entity set) and a **child table** (dependent entity set). In DataJoint, they also establish **workflow dependencies** that prescribe the order of operations. In addition to defining how entities relate, foreign keys also impose important constraints on data operations. \n",
197210
"\n",
198211
"Below are the five key effects of foreign keys:\n",
199212
"\n",
200213
"### Effect 1. The primary key columns from the parent become embedded as foreign key columns in the child \n",
201214
"When a foreign key relationship is established, the **primary key** (or unique key) of the parent table becomes part of the child table’s schema. The child table includes the foreign key attribute(s) with **matching name and datatype** to ensure that each row in the child table refers to a valid parent record.\n",
202215
"\n",
203-
"If you examine the heading of `Employee`, you will find that it now contains a `title_code` field. It will have the same data type as the \n"
216+
"If you examine the heading of `Employee`, you will find that it now contains a `title_code` field. It will have the same data type as the corresponding field in `Title`. \n"
204217
]
205218
},
206219
{
@@ -309,7 +322,9 @@
309322
"\n",
310323
"A foreign key ensures that no \"orphaned\" records are created. An insert into the child table is only permitted if the foreign key value corresponds to an existing primary key in the parent table.\n",
311324
"\n",
312-
"The rule is simple: **Inserts are restricted in the child, not the parent.** You can always add new job titles, but you cannot add an employee with a `title_code` that doesn't exist in the `Title` table."
325+
"The rule is simple: **Inserts are restricted in the child, not the parent.** You can always add new job titles, but you cannot add an employee with a `title_code` that doesn't exist in the `Title` table.\n",
326+
"\n",
327+
"**In DataJoint, this enforces workflow order**: The parent entity must be created before the child entity can reference it. This ensures workflows execute in the correct sequence."
313328
]
314329
},
315330
{
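The insert rule described in this hunk can be reproduced in any relational database. The sketch below uses SQLite from the Python standard library as a stand-in for DataJoint's backend, so it shows the referential constraint only, not DataJoint's API. The `title_code` column comes from the notebook; `person_id` and `full_title` are assumed names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE title (title_code TEXT PRIMARY KEY, full_title TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " person_id INTEGER PRIMARY KEY,"
    " title_code TEXT NOT NULL REFERENCES title (title_code))"
)

# Child first: rejected, because no matching parent row exists yet.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Prof')")
    child_first_ok = True
except sqlite3.IntegrityError:
    child_first_ok = False

# Parent first, then child: both inserts succeed.
conn.execute("INSERT INTO title VALUES ('Prof', 'Professor')")
conn.execute("INSERT INTO employee VALUES (1, 'Prof')")
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The failed first insert is the database-level form of the workflow order: the parent row has to exist before the child can reference it.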
@@ -351,7 +366,9 @@
351366
"\n",
352367
"The rule is the inverse of the insert rule: **Deletes are restricted in the parent, not the child.** You can always delete an employee, but you cannot delete a title if it is still assigned to an employee.\n",
353368
"\n",
354-
"In standard SQL, this operation would fail with a constraint error. DataJoint, however, implements a **cascading delete**. It will warn you that deleting the parent record will also delete all dependent child records, which can cascade through many levels of a deep hierarchy."
369+
"In standard SQL, this operation would fail with a constraint error. DataJoint, however, implements a **cascading delete**. It will warn you that deleting the parent record will also delete all dependent child records, which can cascade through many levels of a deep hierarchy.\n",
370+
"\n",
371+
"**In DataJoint, this maintains workflow consistency**: When you delete a parent entity, all downstream workflow artifacts that depend on it are also deleted. This ensures computational validity: if the inputs are gone, any results based on those inputs must be removed as well."
355372
]
356373
},
357374
{
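The contrast between a rejected delete and a cascading delete can be sketched with SQLite from the Python standard library. This is an illustration under assumptions: SQLite's `ON DELETE CASCADE` clause stands in for the cascading behaviour, and how DataJoint carries out its cascade internally is not shown. The helper `delete_parent` and the `person_id` column are hypothetical.

```python
import sqlite3


def delete_parent(on_delete: str):
    """Delete the only title row and report what happened to the employee row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
    conn.execute("CREATE TABLE title (title_code TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE employee ("
        " person_id INTEGER PRIMARY KEY,"
        f" title_code TEXT NOT NULL REFERENCES title (title_code) {on_delete})"
    )
    conn.execute("INSERT INTO title VALUES ('Prof')")
    conn.execute("INSERT INTO employee VALUES (1, 'Prof')")
    try:
        conn.execute("DELETE FROM title WHERE title_code = 'Prof'")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    remaining = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
    conn.close()
    return outcome, remaining


restricted = delete_parent("")                  # default action: the delete is rejected
cascaded = delete_parent("ON DELETE CASCADE")   # the dependent row is removed as well
```

With the default action the parent delete fails and the employee row survives; with the cascade clause the parent delete succeeds and no employee rows remain.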
@@ -393,7 +410,9 @@
393410
"\n",
394411
"In general relational theory, databases can be configured to handle this with **cascading updates**, where changing a parent's primary key automatically propagates that change to all child records.\n",
395412
"\n",
396-
"However, DataJoint does not support updating primary key values, as this can risk breaking referential integrity in complex scientific workflows. The preferred and safer pattern in DataJoint is to **delete the old record and insert a new one** with the updated information."
413+
"However, DataJoint does not support updating primary key values, as this can risk breaking referential integrity in complex scientific workflows. The preferred and safer pattern in DataJoint is to **delete the old record and insert a new one** with the updated information.\n",
414+
"\n",
415+
"**In DataJoint, this preserves workflow immutability**: Workflow artifacts are treated as immutable once created. If upstream data changes, the workflow must be re-executed from that point forward. This ensures that all downstream results remain consistent with their inputs, maintaining computational validity throughout the workflow."
397416
]
398417
},
399418
{
@@ -469,13 +488,17 @@
469488
"source": [
470489
"# Summary\n",
471490
"\n",
472-
"Foreign keys ensure referential integrity by linking a child table to a parent table. This link imposes five key effects:\n",
491+
"Foreign keys ensure referential integrity by linking a child table to a parent table. In DataJoint, they also establish **workflow dependencies** that prescribe the order of operations. This link imposes five key effects:\n",
473492
"\n",
474493
"1. **Schema Embedding**: The parent's primary key is added as columns to the child table.\n",
475-
"2. **Insert Restriction**: A row cannot be added to the **child** if its foreign key doesn't match a primary key in the **parent**.\n",
476-
"3. **Delete Restriction**: A row cannot be deleted from the **parent** if it is still referenced by any rows in the **child**.\n",
477-
"4. **Update Restriction**: Updates to the primary and foreign keys are restricted to prevent inconsistencies.\n",
478-
"5. **Performance Optimization**: An index is automatically created on the foreign key in the child table to speed up searches and joins."
494+
"2. **Insert Restriction**: A row cannot be added to the **child** if its foreign key doesn't match a primary key in the **parent**. In DataJoint, this enforces workflow order—the parent must be created before the child.\n",
495+
"3. **Delete Restriction**: A row cannot be deleted from the **parent** if it is still referenced by any rows in the **child**. In DataJoint, cascading deletes maintain workflow consistency by removing dependent downstream artifacts.\n",
496+
"4. **Update Restriction**: Updates to the primary and foreign keys are restricted to prevent inconsistencies. In DataJoint, this preserves workflow immutability: the workflow must be re-executed rather than its artifacts updated in place.\n",
497+
"5. **Performance Optimization**: An index is automatically created on the foreign key in the child table to speed up searches and joins.\n",
498+
"\n",
499+
"**In DataJoint, foreign keys transform the schema into a directed acyclic graph (DAG)** that represents valid workflow execution sequences. The schema becomes an executable specification of your workflow, where foreign keys not only enforce referential integrity but also prescribe the order of operations and maintain computational validity throughout the workflow.\n",
500+
"\n",
501+
"For more on how DataJoint extends foreign keys with workflow semantics, see [Relational Workflows](../20-concepts/04-workflows.md)."
479502
]
480503
}
481504
],
