|
21 | 21 | "\n", |
22 | 22 | "A foreign key is a column (or set of columns) in the child table that refers to the primary key of the parent table. In DataJoint, a foreign key *always* references the parent's primary key. General SQL also permits references to other unique keys, but targeting the primary key is the recommended practice for clarity and consistency.\n", |
23 | 23 | "\n", |
24 | | - "## Foreign Keys in DataJoint: Referential Integrity + Workflow Dependencies\n", |
| 24 | + "## Referential Integrity + Workflow Dependencies\n", |
25 | 25 | "\n", |
26 | 26 | "In DataJoint, foreign keys serve a **dual role** that extends beyond traditional relational databases:\n", |
27 | 27 | "\n", |
28 | 28 | "1. **Referential integrity** (like traditional databases): Ensures that every reference in the child table matches an existing row in the parent table\n", |
29 | | - "2. **Workflow dependencies** (DataJoint addition): Prescribes the order of operations—the parent must be created before the child\n", |
| 29 | + "2. **Workflow dependencies** (DataJoint's addition): Prescribes the order of operations—the parent must be created before the child\n", |
30 | 30 | "\n", |
31 | 31 | "This transforms the schema into a **directed acyclic graph (DAG)** representing valid workflow execution sequences. The foreign key `-> Title` in `Employee` not only ensures that each employee has a valid title, but also establishes that titles must be created before employees can be assigned to them.\n", |
32 | 32 | "\n", |
|
415 | 415 | "**In DataJoint, this preserves workflow immutability**: Workflow artifacts are treated as immutable once created. If upstream data changes, the workflow must be re-executed from that point forward. This ensures that all downstream results remain consistent with their inputs, maintaining computational validity throughout the workflow." |
416 | 416 | ] |
417 | 417 | }, |
| 418 | + { |
| 419 | + "cell_type": "markdown", |
| 420 | + "metadata": {}, |
| 421 | + "source": [ |
| 422 | + "### Effect 5: Performance Optimization with Secondary Indexes\n", |
| 423 | + "\n", |
| 424 | + "A secondary index is automatically created on the foreign key in the child table to accelerate common operations and queries associated with foreign keys.\n", |
| 425 | + "\n", |
| 426 | + "**Why indexes matter:**\n", |
| 427 | + "\n", |
| 428 | + "1. **Delete operations**: When deleting from the parent table, the database must look up all matching child records to enforce referential integrity. An index on the foreign key makes these lookups fast, even when dealing with large child tables.\n", |
| 429 | + "\n", |
| 430 | + "2. **Join operations**: When joining a parent table with a child table, the database matches the foreign key in the child to the primary key in the parent. An index on the foreign key allows the database to quickly locate matching rows, dramatically improving join performance.\n", |
| 431 | + "\n", |
| 432 | + "3. **Subqueries**: When a query checks whether a parent row has matching child rows (for example, to find parents with or without children), the database uses the index on the foreign key to locate those child rows quickly. Validating a foreign key on insert into the child works in the other direction: it looks up the referenced record through the parent's primary key index.\n", |
| 433 | + "\n", |
| 434 | + "**In DataJoint, this optimization is automatic**: When you define a foreign key with `-> Parent`, the necessary index is created without any extra declaration. This keeps workflow operations, from populating tables to cascading deletes, efficient as your data grows.\n", |
| 435 | + "\n", |
| 436 | + "The index is created automatically by the database system, so you don't need to explicitly define it. However, understanding its existence helps you appreciate why foreign key operations remain performant even with large datasets.\n" |
| 437 | + ] |
| 438 | + }, |
418 | 439 | { |
419 | 440 | "cell_type": "markdown", |
420 | 441 | "metadata": {}, |
|
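The effect described in the "Effect 5" cell can be observed with a small experiment. The sketch below is illustrative only and makes several assumptions: it uses Python's built-in `sqlite3` as a stand-in for DataJoint's database backend, the column names `title_code` and `employee_id` are hypothetical, and the index is created by hand because SQLite, unlike the automatic behavior the notebook describes, does not add one for a foreign key. It compares the query plan for a lookup on the child's foreign key before and after the index exists.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE title (title_code TEXT PRIMARY KEY);
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        title_code  TEXT NOT NULL REFERENCES title (title_code)
    );
    """
)


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# The lookup a parent-side delete or a join needs: find the children
# that reference a given parent key.
lookup = "SELECT employee_id FROM employee WHERE title_code = 'Prof'"

plan_without_index = plan(lookup)

# Hypothetical index name; created explicitly because SQLite does not
# create an index on foreign key columns by itself.
conn.execute("CREATE INDEX employee_title_idx ON employee (title_code)")

plan_with_index = plan(lookup)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Without the index the plan reports a scan of the whole `employee` table. Once the index exists, the plan reports a search that uses `employee_title_idx`, which is the kind of lookup the notebook says stays fast as the child table grows.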