diff --git a/README.md b/README.md
index 5cc7526..8d29cb5 100644
--- a/README.md
+++ b/README.md
@@ -6,13 +6,13 @@ At a high level the library walks through:

 1. **Perception:** six ordered passes (rings → Kekulé expansion → electron bookkeeping → aromaticity → resonance → hybridization) that upgrade raw connectivity into a rich `AnnotatedMolecule`.
 2. **Typing:** an iterative, priority-sorted rule engine that resolves the final DREIDING atom label for every atom.
-3. **Building:** a pure graph traversal that emits canonical bonds, angles, and torsions as a `MolecularTopology`.
+3. **Building:** a pure graph traversal that emits canonical bonds, angles, torsions, and inversions as a `MolecularTopology`.

 ## Features

 - **Chemically faithful perception:** built-in algorithms cover SSSR ring search, strict Kekulé expansion, charge/lone pair templates for heteroatoms, aromaticity categorization (including anti-aromatic detection), resonance propagation, and hybridization inference.
 - **Deterministic typing engine:** TOML rules are sorted by priority and evaluated until a fixed point, making neighbor-dependent rules (e.g., `H_HB`) converge without guesswork.
-- **Engine-agnostic topology:** outputs canonicalized bonds, angles, proper and improper dihedrals ready for any simulator that consumes DREIDING-style terms.
+- **Engine-agnostic topology:** outputs canonicalized bonds, angles, torsions, and inversions ready for any simulator that consumes DREIDING-style terms.
 - **Extensible ruleset:** ship with curated defaults (`resources/default.rules.toml`) and load or merge custom rule files at runtime.
 - **Rust-first ergonomics:** zero `unsafe`, comprehensive unit/integration tests, and precise error variants for validation, perception, and typing failures.

diff --git a/docs/01_pipeline.md b/docs/01_pipeline.md
index 7d2c52d..f7c6516 100644
--- a/docs/01_pipeline.md
+++ b/docs/01_pipeline.md
@@ -46,13 +46,13 @@ Once a `MolecularGraph` enters the pipeline, it is immediately converted into an

 The `MolecularTopology` is the final product of the pipeline. It is a clean, structured representation tailored specifically for consumption by molecular simulation engines.

-- **Purpose:** To provide a complete list of all particles and interaction terms (bonds, angles, dihedrals) required to define a DREIDING force field model.
+- **Purpose:** To provide a complete list of all particles and interaction terms (bonds, angles, torsions, inversions) required to define a DREIDING force field model.
 - **Structure:**
   - A list of final `Atom`s, now including their assigned `atom_type`.
-  - Deduplicated lists of `Bond`s, `Angle`s, `ProperDihedral`s, and `ImproperDihedral`s.
+  - Deduplicated lists of `Bond`s, `Angle`s, `Torsion`s, and `Inversion`s.
 - **Design Rationale:**
   - **Simulation-Oriented:** The structure directly maps to the needs of a simulation setup. It discards intermediate perception data (like `lone_pairs` or `steric_number`) that is not directly part of the final force field definition.
-  - **Canonical Representation:** Each topological component (`Angle`, `Dihedral`) is stored in a canonical form (e.g., atom indices are sorted). This simplifies consumption by downstream tools, as it eliminates ambiguity and the need for further deduplication.
+  - **Canonical Representation:** Each topological component (`Angle`, `Torsion`, `Inversion`) is stored in a canonical form (e.g., atom indices are sorted). This simplifies consumption by downstream tools, as it eliminates ambiguity and the need for further deduplication.

 ## 2. The Data Flow: A Deterministic Transformation

diff --git a/docs/04_topology_builder.md b/docs/04_topology_builder.md
index 31eb908..e5189f8 100644
--- a/docs/04_topology_builder.md
+++ b/docs/04_topology_builder.md
@@ -1,13 +1,13 @@
 # Phase 3: The Topology Builder

-With atom types resolved, the builder translates the annotated molecule into a `MolecularTopology`. This stage is pure graph traversal — no additional chemistry is inferred — but it’s where canonical force-field terms emerge.
+With atom types resolved, the builder translates the annotated molecule into a `MolecularTopology`. This stage is pure graph traversal — no additional chemistry is inferred — but it's where canonical force-field terms emerge.

 `builder::build_topology` takes two inputs:

 1. The immutable `AnnotatedMolecule` output of perception.
 2. The `Vec` of atom types returned by the typing engine.

-It produces `MolecularTopology { atoms, bonds, angles, propers, impropers }`, all deduplicated and ready for downstream MD engines.
+It produces `MolecularTopology { atoms, bonds, angles, torsions, inversions }`, all deduplicated and ready for downstream MD engines.

 ```mermaid
 graph TD
@@ -18,11 +18,11 @@ graph TD

 ## Atom Table

-`build_atoms` walks the annotated atoms and copies their element, hybridization, and ID while splicing in the final type string (`atom_types[ann_atom.id]`). This produces the topology’s `atoms` vector.
+`build_atoms` walks the annotated atoms and copies their element, hybridization, and ID while splicing in the final type string (`atom_types[ann_atom.id]`). This produces the topology's `atoms` vector.

 ## Connectivity Terms

-Every interaction term uses the molecule’s adjacency lists and bond table, which already reflect Kekulé-expanded bond orders.
+Every interaction term uses the molecule's adjacency lists and bond table, which already reflect Kekulé-expanded bond orders.

 ### Bonds

@@ -41,24 +41,32 @@ for center in atoms:
         angles.insert(Angle::new(i, center, k))
 ```

-### Proper Dihedrals (`build_propers`)
+### Torsions (`build_torsions`)

-Proper torsions are enumerated around each bond `j-k`:
+Torsions are enumerated around each bond `j-k`:

 1. Iterate over every stored bond.
-2. For each neighbor `i` of `j` (excluding `k`) and each neighbor `l` of `k` (excluding `j` and `i`), emit `ProperDihedral::new(i, j, k, l)`.
+2. For each neighbor `i` of `j` (excluding `k`) and each neighbor `l` of `k` (excluding `j` and `i`), emit `Torsion::new(i, j, k, l)`.
 3. The constructor compares `(i, j, k, l)` to its reverse `(l, k, j, i)` and keeps the lexicographically smaller tuple to guarantee uniqueness.

 This approach naturally covers both directions (i.e., `i-j-k-l` and `l-k-j-i`) without generating duplicates.

-### Improper Dihedrals (`build_impropers`)
+### Inversions (`build_inversions`)

-Improper torsions enforce planarity at trigonal centers. The builder scans every atom and checks two conditions:
+Inversions enforce planarity at trigonal centers. The builder scans every atom and checks two conditions:

 1. Degree equals 3.
 2. Hybridization equals `Hybridization::SP2` or `Hybridization::Resonant`.

-If satisfied, the atom’s three neighbors form the outer atoms while the center occupies the third index of `ImproperDihedral::new(p1, p2, center, p3)`. The constructor sorts the three peripheral atoms but keeps the central atom fixed, delivering a canonical key.
+Per the DREIDING paper, **each planar center generates three inversion terms**, with each neighbor taking turn as the "axis":
+
+For center I with neighbors {J, K, L}:
+
+- Inversion(center=I, axis=J, plane={K, L})
+- Inversion(center=I, axis=K, plane={J, L})
+- Inversion(center=I, axis=L, plane={J, K})
+
+The constructor `Inversion::new(center, axis, plane1, plane2)` sorts only the two plane atoms (not the axis), ensuring the three terms per center remain distinct.

 ## Why Canonical Forms Matter

diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index db8a41e..964619a 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -46,14 +46,14 @@ graph TD
     Annotated -- "Provides geometry + resonance context" --> BuilderPhase;
     AtomTypes -- "Provides atom type info" --> BuilderPhase;

-    BuilderPhase -- "Generates bonds, angles, dihedrals" --> OutputTopology;
+    BuilderPhase -- "Generates bonds, angles, torsions, inversions" --> OutputTopology;
 ```

 - **Phase 1: Perception (`perception::perceive`):** Takes the raw `MolecularGraph` and emits an `AnnotatedMolecule`. Six ordered passes (rings, kekulization, electrons, aromaticity, resonance, hybridization) enrich each atom with bonding, charge, lone-pair, ring, and delocalization metadata. The output is immutable and shared with later stages.

 - **Phase 2: Typing (`typing::engine::assign_types`):** Runs a deterministic fixed-point solver over the `AnnotatedMolecule`. It evaluates TOML rules parsed via `typing::rules::parse_rules`, honoring priorities and neighbor-dependent constraints until every atom is assigned a DREIDING type.

-- **Phase 3: Building (`builder::build_topology`):** Consumes the annotated atoms plus the final type vector to produce a canonical `MolecularTopology`. Helper routines enumerate bonds, angles, proper/ improper dihedrals, and collapse duplicates using canonical ordering so downstream engines receive stable identifiers.
+- **Phase 3: Building (`builder::build_topology`):** Consumes the annotated atoms plus the final type vector to produce a canonical `MolecularTopology`. Helper routines enumerate bonds, angles, torsions, and inversions, and collapse duplicates using canonical ordering so downstream engines receive stable identifiers.

 ## Directory of Architectural Documents

diff --git a/src/builder/mod.rs b/src/builder/mod.rs
index 4365656..b5c1326 100644
--- a/src/builder/mod.rs
+++ b/src/builder/mod.rs
@@ -1,12 +1,10 @@
 //! Converts annotated molecules and assigned atom types into a full molecular topology.
 //!
 //! The builder stage takes the perception output and typing assignments, emitting atoms, bonds,
-//! angles, proper dihedrals, and improper dihedrals expected by downstream force-field tooling.
+//! angles, torsions, and inversions expected by downstream force-field tooling.

 use crate::core::properties::{GraphBondOrder, Hybridization, TopologyBondOrder};
-use crate::core::topology::{
-    Angle, Atom, Bond, ImproperDihedral, MolecularTopology, ProperDihedral,
-};
+use crate::core::topology::{Angle, Atom, Bond, Inversion, MolecularTopology, Torsion};
 use crate::perception::{AnnotatedMolecule, ResonanceSystem};
 use std::collections::HashSet;

@@ -22,8 +20,7 @@ use std::collections::HashSet;
 ///
 /// # Returns
 ///
-/// A populated [`MolecularTopology`] containing atoms, bonds, angles, proper dihedrals, and
-/// improper dihedrals.
+/// A populated [`MolecularTopology`] containing atoms, bonds, angles, torsions, and inversions.
 pub fn build_topology(
     annotated_molecule: &AnnotatedMolecule,
     atom_types: &[String],
@@ -31,15 +28,15 @@ pub fn build_topology(
     let atoms = build_atoms(annotated_molecule, atom_types);
     let bonds = build_bonds(annotated_molecule);
     let angles = build_angles(annotated_molecule);
-    let propers = build_propers(annotated_molecule);
-    let impropers = build_impropers(annotated_molecule);
+    let torsions = build_torsions(annotated_molecule);
+    let inversions = build_inversions(annotated_molecule);

     MolecularTopology {
         atoms,
         bonds: bonds.into_iter().collect(),
         angles: angles.into_iter().collect(),
-        propers: propers.into_iter().collect(),
-        impropers: impropers.into_iter().collect(),
+        torsions: torsions.into_iter().collect(),
+        inversions: inversions.into_iter().collect(),
     }
 }

@@ -114,9 +111,9 @@ fn build_angles(annotated_molecule: &AnnotatedMolecule) -> HashSet {
     angles
 }

-/// Builds proper dihedrals by extending each bond to its neighboring atoms.
-fn build_propers(annotated_molecule: &AnnotatedMolecule) -> HashSet {
-    let mut propers = HashSet::new();
+/// Builds torsions by extending each bond to its neighboring atoms.
+fn build_torsions(annotated_molecule: &AnnotatedMolecule) -> HashSet {
+    let mut torsions = HashSet::new();

     for bond_jk in &annotated_molecule.bonds {
         let (j, k) = bond_jk.atom_ids;
@@ -128,16 +125,17 @@ fn build_propers(annotated_molecule: &AnnotatedMolecule) -> HashSet HashSet {
-    let mut impropers = HashSet::new();
+/// Builds inversions by identifying planar centers and generating three
+/// terms per center with each neighbor as axis.
+fn build_inversions(annotated_molecule: &AnnotatedMolecule) -> HashSet {
+    let mut inversions = HashSet::new();

     for atom in &annotated_molecule.atoms {
         if atom.degree == 3
             && matches!(
@@ -146,13 +144,19 @@ fn build_impropers(annotated_molecule: &AnnotatedMolecule) -> HashSet = vec![
-            ProperDihedral::new(0, 1, 2, 4),
-            ProperDihedral::new(3, 1, 2, 4),
-            ProperDihedral::new(1, 2, 4, 5),
+            Torsion::new(0, 1, 2, 4),
+            Torsion::new(3, 1, 2, 4),
+            Torsion::new(1, 2, 4, 5),
         ]
         .into_iter()
         .collect();

-        assert_eq!(propers, expected);
+        assert_eq!(torsions, expected);
     }

     #[test]
-    fn build_impropers_targets_planar_degree_three_centers() {
+    fn build_inversions_generates_three_per_planar_center() {
         let (molecule, _) = planar_fragment();
-        let impropers = build_impropers(&molecule);
-        let expected: HashSet<_> = vec![ImproperDihedral::new(0, 2, 1, 3)]
-            .into_iter()
-            .collect();
+        let inversions = build_inversions(&molecule);
+        let expected: HashSet<_> = vec![
+            Inversion::new(1, 0, 2, 3),
+            Inversion::new(1, 2, 0, 3),
+            Inversion::new(1, 3, 0, 2),
+        ]
+        .into_iter()
+        .collect();

-        assert_eq!(impropers, expected);
+        assert_eq!(inversions.len(), 3);
+        assert_eq!(inversions, expected);
     }
 }
diff --git a/src/core/topology.rs b/src/core/topology.rs
index 3acf5be..488c5b2 100644
--- a/src/core/topology.rs
+++ b/src/core/topology.rs
@@ -16,10 +16,10 @@ pub struct MolecularTopology {
     pub bonds: Vec,
     /// A list of all three-atom angles.
     pub angles: Vec,
-    /// A list of all proper dihedral angles (torsions).
-    pub propers: Vec,
-    /// A list of all improper dihedral angles (out-of-plane bends).
-    pub impropers: Vec,
+    /// A list of all four-atom torsions around rotatable bonds.
+    pub torsions: Vec,
+    /// A list of all four-atom inversions for planar centers.
+    pub inversions: Vec,
 }

 /// Atom entry emitted in the final topology, combining identity and typing.
@@ -71,38 +71,44 @@ impl Angle {
     }
 }

-/// Proper dihedral entry emitted in the final topology.
+/// Torsion entry emitted in the final topology.
 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
-pub struct ProperDihedral {
-    /// The IDs of the four atoms (`a-b-c-d`), sorted lexicographically.
+pub struct Torsion {
+    /// The IDs of the four atoms (`i`, `j`, `k`, `l`) where `j-k` is the rotatable bond,
+    /// with `i-l` sorted.
     pub atom_ids: (usize, usize, usize, usize),
 }

-impl ProperDihedral {
-    /// Creates a new proper dihedral with atom IDs sorted lexicographically.
-    pub fn new(a: usize, b: usize, c: usize, d: usize) -> Self {
-        let fwd = (a, b, c, d);
-        let rev = (d, c, b, a);
+impl Torsion {
+    /// Creates a new torsion with terminal atoms sorted to a canonical order.
+    pub fn new(i: usize, j: usize, k: usize, l: usize) -> Self {
+        let fwd = (i, j, k, l);
+        let rev = (l, k, j, i);
         let atom_ids = if fwd <= rev { fwd } else { rev };
         Self { atom_ids }
     }
 }

-/// Improper dihedral entry emitted in the final topology.
+/// Inversion entry emitted in the final topology.
 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
-pub struct ImproperDihedral {
-    /// The IDs of the four atoms (`plane1`, `plane2`, `center`, `plane3`),
-    /// with plane atoms sorted.
+pub struct Inversion {
+    /// The IDs of the four atoms (`center`, `axis`, `plane1`, `plane2`)
+    /// where `center` is the inversion center, `axis` is the unique neighbor
+    /// defining the axis, with `plane1` and `plane2` sorted.
     pub atom_ids: (usize, usize, usize, usize),
 }

-impl ImproperDihedral {
-    /// Creates a new improper dihedral with plane atoms sorted.
-    pub fn new(p1: usize, p2: usize, center: usize, p3: usize) -> Self {
-        let mut plane_ids = [p1, p2, p3];
-        plane_ids.sort_unstable();
-        let atom_ids = (plane_ids[0], plane_ids[1], center, plane_ids[2]);
-        Self { atom_ids }
+impl Inversion {
+    /// Creates a new inversion with plane atoms sorted to a canonical order.
+    pub fn new(center: usize, axis: usize, plane1: usize, plane2: usize) -> Self {
+        let (p1, p2) = if plane1 < plane2 {
+            (plane1, plane2)
+        } else {
+            (plane2, plane1)
+        };
+        Self {
+            atom_ids: (center, axis, p1, p2),
+        }
     }
 }

@@ -125,18 +131,37 @@ mod tests {
     }

     #[test]
-    fn proper_dihedral_new_canonicalizes_orientation() {
-        let forward = ProperDihedral::new(1, 2, 3, 4);
-        let reversed = ProperDihedral::new(4, 3, 2, 1);
+    fn torsion_new_canonicalizes_orientation() {
+        let forward = Torsion::new(1, 2, 3, 4);
+        let reversed = Torsion::new(4, 3, 2, 1);

         assert_eq!(forward.atom_ids, reversed.atom_ids);
         assert_eq!(forward.atom_ids, (1, 2, 3, 4));
     }

     #[test]
-    fn improper_dihedral_new_sorts_plane_atoms() {
-        let improper = ImproperDihedral::new(9, 1, 5, 4);
-        assert_eq!(improper.atom_ids, (1, 4, 5, 9));
+    fn inversion_new_sorts_only_plane_atoms() {
+        let inv = Inversion::new(5, 9, 4, 1);
+        assert_eq!(inv.atom_ids, (5, 9, 1, 4));
+
+        let inv2 = Inversion::new(5, 1, 9, 4);
+        assert_eq!(inv2.atom_ids, (5, 1, 4, 9));
+        assert_ne!(inv.atom_ids, inv2.atom_ids);
+    }
+
+    #[test]
+    fn inversion_three_terms_per_center_are_distinct() {
+        let inv1 = Inversion::new(0, 1, 2, 3);
+        let inv2 = Inversion::new(0, 2, 1, 3);
+        let inv3 = Inversion::new(0, 3, 1, 2);
+
+        assert_eq!(inv1.atom_ids, (0, 1, 2, 3));
+        assert_eq!(inv2.atom_ids, (0, 2, 1, 3));
+        assert_eq!(inv3.atom_ids, (0, 3, 1, 2));
+
+        assert_ne!(inv1, inv2);
+        assert_ne!(inv2, inv3);
+        assert_ne!(inv1, inv3);
     }

     #[test]
@@ -146,7 +171,7 @@ mod tests {
         assert!(topology.atoms.is_empty());
         assert!(topology.bonds.is_empty());
         assert!(topology.angles.is_empty());
-        assert!(topology.propers.is_empty());
-        assert!(topology.impropers.is_empty());
+        assert!(topology.torsions.is_empty());
+        assert!(topology.inversions.is_empty());
     }
 }
diff --git a/src/lib.rs b/src/lib.rs
index b98da8c..300bb47 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -51,7 +51,7 @@
 //! assert_eq!(topology.atoms.len(), 9);
 //! assert_eq!(topology.bonds.len(), 8);
 //! assert_eq!(topology.angles.len(), 13);
-//! assert_eq!(topology.propers.len(), 12);
+//! assert_eq!(topology.torsions.len(), 12);
 //!
 //! // Check the assigned DREIDING atom types.
 //! assert_eq!(topology.atoms[c1].atom_type, "C_3"); // sp3 Carbon
@@ -72,9 +72,7 @@ pub use crate::core::properties::{
     Element, GraphBondOrder, Hybridization, ParseBondOrderError, ParseElementError,
     ParseHybridizationError, TopologyBondOrder,
 };
-pub use crate::core::topology::{
-    Angle, Atom, Bond, ImproperDihedral, MolecularTopology, ProperDihedral,
-};
+pub use crate::core::topology::{Angle, Atom, Bond, Inversion, MolecularTopology, Torsion};

 /// Rule parsing and customization utilities.
 ///
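
The central behavioral change in this diff is that one planar center now yields three inversion terms instead of a single improper. A minimal standalone sketch of that enumeration follows. The `Inversion` type here is a local stand-in that copies the constructor shown in `src/core/topology.rs`, and `inversions_for_center` is a hypothetical helper written for illustration; it is not the body of `build_inversions`, which this diff does not show in full.

```rust
use std::collections::HashSet;

/// Local stand-in for the crate's `Inversion` (assumption: mirrors the
/// constructor in the diff; only the two plane atoms are sorted).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Inversion {
    atom_ids: (usize, usize, usize, usize),
}

impl Inversion {
    fn new(center: usize, axis: usize, plane1: usize, plane2: usize) -> Self {
        let (p1, p2) = if plane1 < plane2 {
            (plane1, plane2)
        } else {
            (plane2, plane1)
        };
        Self {
            atom_ids: (center, axis, p1, p2),
        }
    }
}

/// Hypothetical helper: rotate each of the three neighbors into the axis
/// slot and use the remaining two as the plane atoms.
fn inversions_for_center(center: usize, neighbors: [usize; 3]) -> HashSet<Inversion> {
    let mut out = HashSet::new();
    for axis_idx in 0..3 {
        let axis = neighbors[axis_idx];
        let plane1 = neighbors[(axis_idx + 1) % 3];
        let plane2 = neighbors[(axis_idx + 2) % 3];
        out.insert(Inversion::new(center, axis, plane1, plane2));
    }
    out
}

fn main() {
    // Center 1 with neighbors {0, 2, 3}, as in the planar test fragment.
    let terms = inversions_for_center(1, [0, 2, 3]);
    assert_eq!(terms.len(), 3);
    // Plane-atom order does not matter; axis choice does.
    assert!(terms.contains(&Inversion::new(1, 0, 3, 2)));
    assert!(terms.contains(&Inversion::new(1, 2, 3, 0)));
    assert!(terms.contains(&Inversion::new(1, 3, 2, 0)));
}
```

Because the axis stays in a fixed slot and only the plane pair is sorted, the set deduplicates on plane order while keeping the three axis choices distinct, which is the property the new tests assert.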