+- Start from Neuer's distinction between white-box, grey-box, and black-box models and explain why materials genomics cannot treat screening as a pure black-box exercise.
+- Use Sandfeld to define materials data science as a domain-knowledge-guided workflow from data to information to knowledge.
+- Use McClarren, Bishop, and Murphy only to stabilize the ML vocabulary: task definition, model selection, validation, and scientific interpretation.
+- Keep the focus on databases, discovery loops, and validity criteria rather than on algorithm derivations.
+- Exclude detailed probability theory and optimization proofs; they belong to MFML.
materials_genomics/02_crystal_structure_fundamentals/unit2_content_50slides.md
16 additions & 2 deletions
@@ -1,6 +1,20 @@
-# Materials Genomics Unit 2 — 50-Slide Scaffold Pack
+# Materials Genomics Unit 2 — 50-Slide Teaching Scaffold (book-backed)

-## Slide-by-slide scaffold
+## Book-backed content summary (for this unit)
+- Crystal structures become ML-ready only after careful choices about lattice, basis, coordinate systems, and periodic representation.
+- Symmetry reduces redundant degrees of freedom but also creates canonicalization and leakage challenges when equivalent structures appear multiple times.
+- CIF-like containers mix geometry, chemistry, and metadata; these fields must be parsed into structured representations without discarding provenance.
+- Low-dimensional organization of structural data helps students see why crystal families and prototypes cluster, but these projections are not substitutes for crystallographic reasoning.
+- The unit prepares students for descriptor and graph models by clarifying what information a crystal representation must preserve.
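The canonicalization and leakage point in the summary above can be made concrete with a small sketch. It is not taken from the unit or from any library: the `(cell, sites)` tuple layout, the function names, and the toy entries are all assumptions made for illustration. The key only absorbs atom reordering and shifts by whole lattice vectors. Origin shifts, supercells, and alternative cell settings would need a proper symmetry-aware matcher.

```python
from collections import defaultdict


def canonical_key(cell, sites, decimals=4):
    """Build an order-independent key for a periodic structure.

    cell  : (a, b, c, alpha, beta, gamma), hypothetical layout
    sites : iterable of (label, (x, y, z)) in fractional coordinates

    Assumption: two entries count as "the same" only if they differ by
    atom ordering or by translations of whole lattice vectors.
    """
    wrapped_sites = []
    for label, frac in sites:
        # Wrap into [0, 1); the second modulo maps a rounded 1.0 back to 0.0.
        wrapped = tuple(round(x % 1.0, decimals) % 1.0 for x in frac)
        wrapped_sites.append((label, wrapped))
    rounded_cell = tuple(round(p, decimals) for p in cell)
    return (rounded_cell, tuple(sorted(wrapped_sites)))


def group_duplicates(structures):
    """Group entry names whose canonical keys coincide."""
    groups = defaultdict(list)
    for name, (cell, sites) in structures.items():
        groups[canonical_key(cell, sites)].append(name)
    return list(groups.values())


# Toy entries with made-up labels and cell parameters (illustration only).
toy_cell = (4.0, 4.0, 4.0, 90.0, 90.0, 90.0)
structures = {
    "entry_a": (toy_cell, [("A", (0.0, 0.0, 0.0)), ("B", (0.5, 0.5, 0.5))]),
    "entry_b": (toy_cell, [("B", (-0.5, 1.5, 0.5)), ("A", (1.0, 0.0, 0.0))]),
    "entry_c": (toy_cell, [("C", (0.0, 0.0, 0.0)), ("B", (0.5, 0.5, 0.5))]),
}

for group in group_duplicates(structures):
    print(group)
```

Keeping every group on one side of a train/test split is one way to stop equivalent structures from leaking across it.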
+- Use Sandfeld and Neuer to explain how crystal structures must be represented as structured data objects before any ML method can act on them.
+- Use McClarren, Murphy, and Bishop only to motivate low-dimensional structure, covariance views, and representation choices, not to teach crystallography itself.
+- Keep the domain core on lattice, basis, periodicity, symmetry, and invariance requirements for ML-ready crystal data.
+- Explain why representation choices change both model accuracy and leakage risk.
+- Exclude formal derivations of PCA and latent-variable models; students already meet the mathematics in MFML.
+- Use Sandfeld to motivate descriptor engineering through domain knowledge, feature matrices, and curse-of-dimensionality effects.
+- Use Neuer plus McClarren to explain why autoencoder-style learned representations become attractive when hand-crafted features saturate.
+- Use Bishop and Murphy only for latent-variable intuition, not for deep theory.
+- Keep the materials focus on descriptor families such as Magpie and matminer, invariance requirements, and failure modes from multicollinearity or missing nonlocal physics.
+- Exclude architecture-specific training detail; that belongs in later neural-network units.
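The multicollinearity failure mode named above could be shown to students with a minimal check like the one below. It is a sketch under stated assumptions, not the unit's method: the feature names and all numbers are invented for illustration, the 0.95 threshold is arbitrary, and the code does not call matminer or Magpie. Pairwise Pearson correlation only detects collinearity between two columns at a time. Dependence among three or more descriptors needs other diagnostics.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Constant column: correlation is undefined, so report 0.0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def collinear_pairs(features, threshold=0.95):
    """Return (name_i, name_j, r) for column pairs with |r| >= threshold."""
    flagged = []
    for (name_i, col_i), (name_j, col_j) in combinations(features.items(), 2):
        r = pearson(col_i, col_j)
        if abs(r) >= threshold:
            flagged.append((name_i, name_j, r))
    return flagged


# Made-up descriptor columns for four hypothetical compounds.
features = {
    "mean_atomic_number": [10.0, 20.0, 30.0, 40.0],
    "mean_atomic_mass": [20.5, 41.0, 64.0, 90.0],
    "mean_electronegativity": [2.0, 1.2, 1.9, 1.4],
}

for name_i, name_j, r in collinear_pairs(features):
    print(f"{name_i} vs {name_j}: r = {r:.3f}")
```

In this toy matrix the two composition-averaged columns carry nearly the same information, the situation in which a linear model's coefficients become unstable.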