Commit 2880042

further work on framing normalization
1 parent f2859f9 commit 2880042

1 file changed: +19 -7 lines changed

book/30-schema-design/055-normalization.ipynb

Lines changed: 19 additions & 7 deletions
@@ -25,17 +25,29 @@
25 25
"source": [
26 26
"## Approach 1: Mathematical Normalization\n",
27 27
"\n",
28-
"Edgar F. Codd developed formal normalization theory in the early 1970s, rooted in the mathematical foundations of the relational model [@10.1145/358024.358054]. This approach is deeply tied to predicate calculus and the original conceptualization of relations.\n",
28+
"Edgar F. Codd developed formal normalization theory in the early 1970s, rooted in the mathematical foundations of the relational model [@10.1145/358024.358054].\n",
29+
"This approach is deeply tied to **predicate calculus** and the original conceptualization of relations.\n",
29 30
"\n",
30 31
"### The Predicate Calculus Foundation\n",
32+
"The intellectual foundation of the relational model lies in predicate calculus, a branch of mathematical logic.\n",
31 33
"\n",
32-
"In Codd's original formulation:\n",
33-
"- **Relations are predicates**: A table represents a predicate with attributes as its parameters\n",
34-
"- **Tuples are true propositions**: Each row asserts that the predicate is true for those specific attribute values\n",
35-
"- **Attributes are subject to functional dependencies**: Some attributes determine the values of others\n",
36-
"- **Normalization prevents complex dependencies in predicate variables**: Well-designed relations avoid storing data with tangled functional dependencies\n",
34+
"1. **Predicates and Relations**: A **predicate** is a function or statement about one or more variables that can be determined to be either true or false. In a database, a table (relation) is the representation of a logical predicate; it represents the complete set of all facts (propositions) that make the predicate true.\n",
37 35
"\n",
38-
"This is an **abstract, mathematical approach** that requires reasoning about attribute-level dependencies independent of real-world entities.\n",
36+
"2. **Tuples and Truth**: Each **row (tuple)** is a specific set of attribute values that asserts a true proposition for the predicate. For example, if a table's predicate is \"Employee $x$ works on Project $y$,\" the row (Alice, P1) asserts the truth: \"Employee Alice works on Project P1.\"\n",
37+
"\n",
38+
"### The Normalization Link: Derivability and Integrity\n",
39+
"The power of predicate calculus is the ability to **derive new true propositions** from a minimal set of existing true propositions using rules of inference (which correspond to relational operations like **projection** or **join**). Normalization frames the database design choice this way:\n",
40+
"\n",
41+
"**The Design Goal**: Decide **which predicates should become base relations (stored tables)** so that:\n",
42+
"* All other valid true propositions (facts) can be **most easily and efficiently derived** through relational operations.\n",
43+
"* The total number of stored facts is minimized to reduce redundancy.\n",
44+
"* The chance of making mistakes in creating true propositions (data anomalies) is minimized.\n",
45+
"\n",
46+
"**Attributes are subject to functional dependencies:** some attributes determine the values of others. Normalization ensures well-designed relations avoid storing data with tangled functional dependencies.\n",
47+
"\n",
48+
"Mathematical normalization is founded on the Closed World Assumption (CWA) [@10.1145/320107.32010]. Under the CWA, the only true facts are those explicitly stated in the database; any fact not stated is assumed to be false. For example, if a student enrollment is missing from the database, we assume that the student is not enrolled in that course. This simplifying assumption lets us reason about the data in the database more precisely.\n",
49+
"\n",
50+
"This is an abstract, mathematical approach that requires reasoning about attribute-level dependencies independent of real-world entities.\n",
39 51
"\n",
40 52
"### Functional Dependencies\n",
41 53
"\n",

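The ideas added in this hunk (relations as predicates, rows as true propositions, derivation by projection and join, and the Closed World Assumption) can be illustrated with a minimal Python sketch. It is not part of the commit. The `works_on` relation follows the "Employee x works on Project y" example from the diff; the `led_by` relation, its rows, and the helper names `holds`, `project`, and `join` are hypothetical and introduced only for illustration.

```python
# Base relation for the predicate "Employee x works on Project y".
# Each stored tuple asserts one true proposition.
works_on = {("Alice", "P1"), ("Bob", "P1"), ("Alice", "P2")}

# Hypothetical second base relation: "Project y is led by Manager z".
led_by = {("P1", "Carol"), ("P2", "Dave")}


def holds(relation, row):
    """Closed World Assumption: a proposition is true only if its tuple is stored."""
    return row in relation


def project(relation, *positions):
    """Projection: keep only the attributes at the given positions."""
    return {tuple(row[i] for i in positions) for row in relation}


def join(left, right):
    """Join rows where the last attribute of `left` equals the first of `right`."""
    return {l + r[1:] for l in left for r in right if l[-1] == r[0]}


# Derived facts: nothing below is stored, it is inferred from the base relations.
employees = project(works_on, 0)                    # "x is an employee on some project"
reports_to = project(join(works_on, led_by), 0, 2)  # "x works under manager z"

print(holds(works_on, ("Alice", "P1")))  # stored, so true
print(holds(works_on, ("Bob", "P2")))    # not stored, so assumed false under the CWA
print(sorted(reports_to))
```

Only the two base relations are stored; `employees` and `reports_to` are derived on demand, which is the design choice the normalization text describes.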