|
285 | 285 | "cell_type": "markdown", |
286 | 286 | "metadata": {}, |
287 | 287 | "source": [ |
288 | | - "## Practical Example: Animal Research Lab\n", |
| 288 | + "## DataJoint's Workflow Perspective\n", |
289 | 289 | "\n", |
290 | | - "Let's apply these principles to design a schema for tracking mice in a research lab.\n" |
291 | | - ] |
292 | | - }, |
293 | | - { |
294 | | - "cell_type": "markdown", |
295 | | - "metadata": {}, |
296 | | - "source": [ |
297 | | - "### ❌ Poor Design (Violates Normalization)\n" |
| 290 | + "A fundamental insight underlying DataJoint's normalization approach: **databases are workflows where downstream data depends on the integrity of upstream data**.\n", |
| 291 | + "\n", |
| 292 | + "This workflow-centric view fundamentally shapes normalization principles and explains why DataJoint emphasizes immutability and avoidance of updates.\n" |
298 | 293 | ] |
299 | 294 | }, |
300 | 295 | { |
301 | 296 | "cell_type": "markdown", |
302 | 297 | "metadata": {}, |
303 | 298 | "source": [ |
| 299 | + "## Practical Example: Animal Research Lab\n", |
| 300 | + "\n", |
| 301 | + "Let's apply these principles to design a schema for tracking mice in a research lab.\n", |
| 302 | + "\n", |
| 303 | + "### ❌ Poor Design (Violates Normalization)\n", |
| 304 | + "\n", |
304 | 305 | "```python\n", |
305 | 306 | "@schema\n", |
306 | 307 | "class Mouse(dj.Manual):\n", |
|
317 | 318 | "```" |
318 | 319 | ] |
319 | 320 | }, |
320 | | - { |
321 | | - "cell_type": "markdown", |
322 | | - "metadata": {}, |
323 | | - "source": [ |
324 | | - "## DataJoint's Workflow Perspective\n", |
325 | | - "\n", |
326 | | - "A fundamental insight underlying DataJoint's normalization approach: **databases are workflows where downstream data depends on the integrity of upstream data**.\n", |
327 | | - "\n", |
328 | | - "This workflow-centric view fundamentally shapes normalization principles and explains why DataJoint emphasizes immutability and avoidance of updates.\n" |
329 | | - ] |
330 | | - }, |
331 | 321 | { |
332 | 322 | "cell_type": "markdown", |
333 | 323 | "metadata": {}, |
|
423 | 413 | "The dependency chain is **explicit** and **enforced**.\n" |
424 | 414 | ] |
425 | 415 | }, |
426 | | - { |
427 | | - "cell_type": "markdown", |
428 | | - "metadata": {}, |
429 | | - "source": [ |
430 | | - "### How Workflow Thinking Leads to Normalization Principles\n", |
431 | | - "\n", |
432 | | - "The workflow perspective directly motivates DataJoint's normalization principles:\n", |
433 | | - "\n", |
434 | | - "**1. Immutability (INSERT/DELETE, not UPDATE)**\n", |
435 | | - "- **Why**: Updates hide broken dependencies in the workflow\n", |
436 | | - "- **Workflow view**: Upstream data is \"input\" to downstream computations—changing input invalidates output\n", |
437 | | - "- **Solution**: DELETE forces explicit handling of all dependent data\n", |
438 | | - "\n", |
439 | | - "**2. Separate Changeable Attributes (Rule 3)** \n", |
440 | | - "- **Why**: Time-varying properties represent different states in the workflow\n", |
441 | | - "- **Workflow view**: Each state is a distinct input that produces distinct outputs\n", |
442 | | - "- **Solution**: Model states as separate records (INSERTs), not updates to existing records\n", |
443 | | - "\n", |
444 | | - "**3. Entities with Only Intrinsic Properties (Rules 1 & 2)**\n", |
445 | | - "- **Why**: Properties of different entities are at different nodes in the dependency graph\n", |
446 | | - "- **Workflow view**: Each entity type represents a distinct stage or data type in the pipeline\n", |
447 | | - "- **Solution**: Separate entity types into separate tables to make dependencies explicit\n", |
448 | | - "\n", |
449 | | - "### The Key Insight\n", |
450 | | - "\n", |
451 | | - "> **In workflow-centric databases, referential integrity isn't just about preventing orphaned records—it's about ensuring computational validity.**\n", |
452 | | - "\n", |
453 | | - "Foreign keys don't just link data; they represent **data provenance**:\n", |
454 | | - "- \"This result was computed FROM this input\"\n", |
455 | | - "- \"This analysis is BASED ON this measurement\" \n", |
456 | | - "- \"This conclusion DEPENDS ON this observation\"\n", |
457 | | - "\n", |
458 | | - "When you UPDATE input data but leave outputs unchanged, you break the provenance chain. The outputs claim to be based on inputs that no longer exist (in their original form).\n", |
459 | | - "\n", |
460 | | - "**DataJoint's normalization principles ensure that data dependencies remain explicit and enforceable, making workflows scientifically reproducible and computationally sound.**\n" |
461 | | - ] |
462 | | - }, |
463 | 416 | { |
464 | 417 | "cell_type": "markdown", |
465 | 418 | "metadata": {}, |
|