Merged
4 changes: 2 additions & 2 deletions README.md
@@ -6,13 +6,13 @@ At a high level the library walks through:

1. **Perception:** six ordered passes (rings → Kekulé expansion → electron bookkeeping → aromaticity → resonance → hybridization) that upgrade raw connectivity into a rich `AnnotatedMolecule`.
2. **Typing:** an iterative, priority-sorted rule engine that resolves the final DREIDING atom label for every atom.
3. **Building:** a pure graph traversal that emits canonical bonds, angles, and torsions as a `MolecularTopology`.
3. **Building:** a pure graph traversal that emits canonical bonds, angles, torsions, and inversions as a `MolecularTopology`.

## Features

- **Chemically faithful perception:** built-in algorithms cover SSSR ring search, strict Kekulé expansion, charge/lone pair templates for heteroatoms, aromaticity categorization (including anti-aromatic detection), resonance propagation, and hybridization inference.
- **Deterministic typing engine:** TOML rules are sorted by priority and evaluated until a fixed point, making neighbor-dependent rules (e.g., `H_HB`) converge without guesswork.
- **Engine-agnostic topology:** outputs canonicalized bonds, angles, proper and improper dihedrals ready for any simulator that consumes DREIDING-style terms.
- **Engine-agnostic topology:** outputs canonicalized bonds, angles, torsions, and inversions ready for any simulator that consumes DREIDING-style terms.
- **Extensible ruleset:** ship with curated defaults (`resources/default.rules.toml`) and load or merge custom rule files at runtime.
- **Rust-first ergonomics:** zero `unsafe`, comprehensive unit/integration tests, and precise error variants for validation, perception, and typing failures.

6 changes: 3 additions & 3 deletions docs/01_pipeline.md
@@ -46,13 +46,13 @@ Once a `MolecularGraph` enters the pipeline, it is immediately converted into an

The `MolecularTopology` is the final product of the pipeline. It is a clean, structured representation tailored specifically for consumption by molecular simulation engines.

- **Purpose:** To provide a complete list of all particles and interaction terms (bonds, angles, dihedrals) required to define a DREIDING force field model.
- **Purpose:** To provide a complete list of all particles and interaction terms (bonds, angles, torsions, inversions) required to define a DREIDING force field model.
- **Structure:**
- A list of final `Atom`s, now including their assigned `atom_type`.
- Deduplicated lists of `Bond`s, `Angle`s, `ProperDihedral`s, and `ImproperDihedral`s.
- Deduplicated lists of `Bond`s, `Angle`s, `Torsion`s, and `Inversion`s.
- **Design Rationale:**
- **Simulation-Oriented:** The structure directly maps to the needs of a simulation setup. It discards intermediate perception data (like `lone_pairs` or `steric_number`) that is not directly part of the final force field definition.
- **Canonical Representation:** Each topological component (`Angle`, `Dihedral`) is stored in a canonical form (e.g., atom indices are sorted). This simplifies consumption by downstream tools, as it eliminates ambiguity and the need for further deduplication.
- **Canonical Representation:** Each topological component (`Angle`, `Torsion`, `Inversion`) is stored in a canonical form (e.g., atom indices are sorted). This simplifies consumption by downstream tools, as it eliminates ambiguity and the need for further deduplication.
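
As a rough illustration of why canonical forms remove the need for later deduplication, here is a minimal sketch of a canonical angle key. The `Angle` struct below is a hypothetical stand-in, not the crate's actual type: its field layout and the choice to order only the two end atoms (keeping the center in the middle) are assumptions made for this example.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the crate's `Angle`; the field layout is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Angle {
    atoms: (usize, usize, usize),
}

impl Angle {
    /// Orders the two end atoms so that `i-center-k` and `k-center-i`
    /// produce the same key. The center atom never moves.
    fn new(i: usize, center: usize, k: usize) -> Self {
        let (lo, hi) = if i <= k { (i, k) } else { (k, i) };
        Angle { atoms: (lo, center, hi) }
    }
}

fn main() {
    let mut angles = HashSet::new();
    angles.insert(Angle::new(5, 2, 1));
    angles.insert(Angle::new(1, 2, 5)); // same angle, opposite direction
    // Both insertions collapse to one entry because the keys are identical.
    assert_eq!(angles.len(), 1);
}
```

Because equality and hashing operate on the already-ordered tuple, a plain `HashSet` is enough to deduplicate; no separate comparison pass is needed.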

## 2. The Data Flow: A Deterministic Transformation

28 changes: 18 additions & 10 deletions docs/04_topology_builder.md
@@ -1,13 +1,13 @@
# Phase 3: The Topology Builder

With atom types resolved, the builder translates the annotated molecule into a `MolecularTopology`. This stage is pure graph traversal — no additional chemistry is inferred — but its where canonical force-field terms emerge.
With atom types resolved, the builder translates the annotated molecule into a `MolecularTopology`. This stage is pure graph traversal — no additional chemistry is inferred — but it's where canonical force-field terms emerge.

`builder::build_topology` takes two inputs:

1. The immutable `AnnotatedMolecule` output of perception.
2. The `Vec<String>` of atom types returned by the typing engine.

It produces `MolecularTopology { atoms, bonds, angles, propers, impropers }`, all deduplicated and ready for downstream MD engines.
It produces `MolecularTopology { atoms, bonds, angles, torsions, inversions }`, all deduplicated and ready for downstream MD engines.

```mermaid
graph TD
@@ -18,11 +18,11 @@ graph TD

## Atom Table

`build_atoms` walks the annotated atoms and copies their element, hybridization, and ID while splicing in the final type string (`atom_types[ann_atom.id]`). This produces the topologys `atoms` vector.
`build_atoms` walks the annotated atoms and copies their element, hybridization, and ID while splicing in the final type string (`atom_types[ann_atom.id]`). This produces the topology's `atoms` vector.

## Connectivity Terms

Every interaction term uses the molecules adjacency lists and bond table, which already reflect Kekulé-expanded bond orders.
Every interaction term uses the molecule's adjacency lists and bond table, which already reflect Kekulé-expanded bond orders.

### Bonds

@@ -41,24 +41,32 @@ for center in atoms:
angles.insert(Angle::new(i, center, k))
```

### Proper Dihedrals (`build_propers`)
### Torsions (`build_torsions`)

Proper torsions are enumerated around each bond `j-k`:
Torsions are enumerated around each bond `j-k`:

1. Iterate over every stored bond.
2. For each neighbor `i` of `j` (excluding `k`) and each neighbor `l` of `k` (excluding `j` and `i`), emit `ProperDihedral::new(i, j, k, l)`.
2. For each neighbor `i` of `j` (excluding `k`) and each neighbor `l` of `k` (excluding `j` and `i`), emit `Torsion::new(i, j, k, l)`.
3. The constructor compares `(i, j, k, l)` to its reverse `(l, k, j, i)` and keeps the lexicographically smaller tuple to guarantee uniqueness.

This approach naturally covers both directions (i.e., `i-j-k-l` and `l-k-j-i`) without generating duplicates.
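
The reverse-comparison step described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the `Torsion` struct here is a hypothetical stand-in whose single tuple field is an assumption, and only the "keep the lexicographically smaller of the forward and reversed tuples" rule is taken from the text.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the crate's `Torsion`; the field layout is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Torsion {
    atoms: (usize, usize, usize, usize),
}

impl Torsion {
    /// Keeps whichever of `(i, j, k, l)` and `(l, k, j, i)` sorts first.
    fn new(i: usize, j: usize, k: usize, l: usize) -> Self {
        let forward = (i, j, k, l);
        let reverse = (l, k, j, i);
        // Rust tuples compare lexicographically, so `min` selects the canonical direction.
        Torsion { atoms: forward.min(reverse) }
    }
}

fn main() {
    let mut torsions = HashSet::new();
    torsions.insert(Torsion::new(0, 1, 2, 4));
    torsions.insert(Torsion::new(4, 2, 1, 0)); // same torsion walked from the other end
    assert_eq!(torsions.len(), 1);
}
```

Note that the interior pair `j-k` is reversed together with the ends, so the chain order is preserved; only the direction of traversal is normalized.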

### Improper Dihedrals (`build_impropers`)
### Inversions (`build_inversions`)

Improper torsions enforce planarity at trigonal centers. The builder scans every atom and checks two conditions:
Inversions enforce planarity at trigonal centers. The builder scans every atom and checks two conditions:

1. Degree equals 3.
2. Hybridization equals `Hybridization::SP2` or `Hybridization::Resonant`.

If satisfied, the atom’s three neighbors form the outer atoms while the center occupies the third index of `ImproperDihedral::new(p1, p2, center, p3)`. The constructor sorts the three peripheral atoms but keeps the central atom fixed, delivering a canonical key.
Per the DREIDING paper, **each planar center generates three inversion terms**, with each neighbor taking a turn as the "axis":

For center I with neighbors {J, K, L}:

- Inversion(center=I, axis=J, plane={K, L})
- Inversion(center=I, axis=K, plane={J, L})
- Inversion(center=I, axis=L, plane={J, K})

The constructor `Inversion::new(center, axis, plane1, plane2)` sorts only the two plane atoms (not the axis), ensuring the three terms per center remain distinct.
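
To make the three-terms-per-center rule concrete, the sketch below models the constructor behavior described above. The `Inversion` struct and the `inversions_for_center` helper are hypothetical names introduced for illustration; the field layout is assumed, and only the argument order `(center, axis, plane1, plane2)` and the "sort the plane atoms, leave the axis alone" rule come from the text.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the crate's `Inversion`; the field layout is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Inversion {
    center: usize,
    axis: usize,
    plane: (usize, usize),
}

impl Inversion {
    /// Sorts only the two plane atoms; the center and axis keep their roles.
    fn new(center: usize, axis: usize, plane1: usize, plane2: usize) -> Self {
        let plane = if plane1 <= plane2 {
            (plane1, plane2)
        } else {
            (plane2, plane1)
        };
        Inversion { center, axis, plane }
    }
}

/// Illustrative helper: emits one term per neighbor, each taking a turn as the axis.
fn inversions_for_center(center: usize, neighbors: [usize; 3]) -> HashSet<Inversion> {
    let [n0, n1, n2] = neighbors;
    HashSet::from([
        Inversion::new(center, n0, n1, n2),
        Inversion::new(center, n1, n0, n2),
        Inversion::new(center, n2, n0, n1),
    ])
}

fn main() {
    let terms = inversions_for_center(1, [0, 2, 3]);
    // The axis differs in each term, so all three survive deduplication.
    assert_eq!(terms.len(), 3);
    // Swapping the plane atoms does not create a new term.
    assert!(terms.contains(&Inversion::new(1, 3, 2, 0)));
}
```

If the axis were sorted together with the plane atoms, all three terms would collapse to a single key, which is the situation the constructor is designed to avoid.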

## Why Canonical Forms Matter

4 changes: 2 additions & 2 deletions docs/ARCHITECTURE.md
@@ -46,14 +46,14 @@ graph TD
Annotated -- "Provides geometry + resonance context" --> BuilderPhase;
AtomTypes -- "Provides atom type info" --> BuilderPhase;

BuilderPhase -- "Generates bonds, angles, dihedrals" --> OutputTopology;
BuilderPhase -- "Generates bonds, angles, torsions, inversions" --> OutputTopology;
```

- **Phase 1: Perception (`perception::perceive`):** Takes the raw `MolecularGraph` and emits an `AnnotatedMolecule`. Six ordered passes (rings, kekulization, electrons, aromaticity, resonance, hybridization) enrich each atom with bonding, charge, lone-pair, ring, and delocalization metadata. The output is immutable and shared with later stages.

- **Phase 2: Typing (`typing::engine::assign_types`):** Runs a deterministic fixed-point solver over the `AnnotatedMolecule`. It evaluates TOML rules parsed via `typing::rules::parse_rules`, honoring priorities and neighbor-dependent constraints until every atom is assigned a DREIDING type.

- **Phase 3: Building (`builder::build_topology`):** Consumes the annotated atoms plus the final type vector to produce a canonical `MolecularTopology`. Helper routines enumerate bonds, angles, proper/ improper dihedrals, and collapse duplicates using canonical ordering so downstream engines receive stable identifiers.
- **Phase 3: Building (`builder::build_topology`):** Consumes the annotated atoms plus the final type vector to produce a canonical `MolecularTopology`. Helper routines enumerate bonds, angles, torsions, and inversions, and collapse duplicates using canonical ordering so downstream engines receive stable identifiers.

## Directory of Architectural Documents

79 changes: 44 additions & 35 deletions src/builder/mod.rs
@@ -1,12 +1,10 @@
//! Converts annotated molecules and assigned atom types into a full molecular topology.
//!
//! The builder stage takes the perception output and typing assignments, emitting atoms, bonds,
//! angles, proper dihedrals, and improper dihedrals expected by downstream force-field tooling.
//! angles, torsions, and inversions expected by downstream force-field tooling.

use crate::core::properties::{GraphBondOrder, Hybridization, TopologyBondOrder};
use crate::core::topology::{
Angle, Atom, Bond, ImproperDihedral, MolecularTopology, ProperDihedral,
};
use crate::core::topology::{Angle, Atom, Bond, Inversion, MolecularTopology, Torsion};
use crate::perception::{AnnotatedMolecule, ResonanceSystem};
use std::collections::HashSet;

@@ -22,24 +20,23 @@ use std::collections::HashSet;
///
/// # Returns
///
/// A populated [`MolecularTopology`] containing atoms, bonds, angles, proper dihedrals, and
/// improper dihedrals.
/// A populated [`MolecularTopology`] containing atoms, bonds, angles, torsions, and inversions.
pub fn build_topology(
annotated_molecule: &AnnotatedMolecule,
atom_types: &[String],
) -> MolecularTopology {
let atoms = build_atoms(annotated_molecule, atom_types);
let bonds = build_bonds(annotated_molecule);
let angles = build_angles(annotated_molecule);
let propers = build_propers(annotated_molecule);
let impropers = build_impropers(annotated_molecule);
let torsions = build_torsions(annotated_molecule);
let inversions = build_inversions(annotated_molecule);

MolecularTopology {
atoms,
bonds: bonds.into_iter().collect(),
angles: angles.into_iter().collect(),
propers: propers.into_iter().collect(),
impropers: impropers.into_iter().collect(),
torsions: torsions.into_iter().collect(),
inversions: inversions.into_iter().collect(),
}
}

@@ -114,9 +111,9 @@ fn build_angles(annotated_molecule: &AnnotatedMolecule) -> HashSet<Angle> {
angles
}

/// Builds proper dihedrals by extending each bond to its neighboring atoms.
fn build_propers(annotated_molecule: &AnnotatedMolecule) -> HashSet<ProperDihedral> {
let mut propers = HashSet::new();
/// Builds torsions by extending each bond to its neighboring atoms.
fn build_torsions(annotated_molecule: &AnnotatedMolecule) -> HashSet<Torsion> {
let mut torsions = HashSet::new();
for bond_jk in &annotated_molecule.bonds {
let (j, k) = bond_jk.atom_ids;

@@ -128,16 +125,17 @@ fn build_propers(annotated_molecule: &AnnotatedMolecule) -> HashSet<ProperDihedr
if l == j || l == i {
continue;
}
propers.insert(ProperDihedral::new(i, j, k, l));
torsions.insert(Torsion::new(i, j, k, l));
}
}
}
propers
torsions
}

/// Builds improper dihedrals for planar degree-three centers with SP2-like hybridization.
fn build_impropers(annotated_molecule: &AnnotatedMolecule) -> HashSet<ImproperDihedral> {
let mut impropers = HashSet::new();
/// Builds inversions by identifying planar centers and generating three
/// terms per center with each neighbor as axis.
fn build_inversions(annotated_molecule: &AnnotatedMolecule) -> HashSet<Inversion> {
let mut inversions = HashSet::new();
for atom in &annotated_molecule.atoms {
if atom.degree == 3
&& matches!(
@@ -146,13 +144,19 @@ fn build_impropers(annotated_molecule: &AnnotatedMolecule) -> HashSet<ImproperDi
)
{
let neighbors = &annotated_molecule.adjacency[atom.id];
let p1 = neighbors[0].0;
let p2 = neighbors[1].0;
let p3 = neighbors[2].0;
impropers.insert(ImproperDihedral::new(p1, p2, atom.id, p3));
let n0 = neighbors[0].0;
let n1 = neighbors[1].0;
let n2 = neighbors[2].0;

// Term 1: axis=n0, plane={n1, n2}
inversions.insert(Inversion::new(atom.id, n0, n1, n2));
// Term 2: axis=n1, plane={n0, n2}
inversions.insert(Inversion::new(atom.id, n1, n0, n2));
// Term 3: axis=n2, plane={n0, n1}
inversions.insert(Inversion::new(atom.id, n2, n0, n1));
}
}
impropers
inversions
}

#[cfg(test)]
@@ -253,30 +257,35 @@ mod tests {
}

#[test]
fn build_propers_emits_all_valid_dihedrals() {
fn build_torsions_emits_all_valid_dihedrals() {
let (molecule, _) = planar_fragment();

let propers = build_propers(&molecule);
let torsions = build_torsions(&molecule);
let expected: HashSet<_> = vec![
ProperDihedral::new(0, 1, 2, 4),
ProperDihedral::new(3, 1, 2, 4),
ProperDihedral::new(1, 2, 4, 5),
Torsion::new(0, 1, 2, 4),
Torsion::new(3, 1, 2, 4),
Torsion::new(1, 2, 4, 5),
]
.into_iter()
.collect();

assert_eq!(propers, expected);
assert_eq!(torsions, expected);
}

#[test]
fn build_impropers_targets_planar_degree_three_centers() {
fn build_inversions_generates_three_per_planar_center() {
let (molecule, _) = planar_fragment();

let impropers = build_impropers(&molecule);
let expected: HashSet<_> = vec![ImproperDihedral::new(0, 2, 1, 3)]
.into_iter()
.collect();
let inversions = build_inversions(&molecule);
let expected: HashSet<_> = vec![
Inversion::new(1, 0, 2, 3),
Inversion::new(1, 2, 0, 3),
Inversion::new(1, 3, 0, 2),
]
.into_iter()
.collect();

assert_eq!(impropers, expected);
assert_eq!(inversions.len(), 3);
assert_eq!(inversions, expected);
}
}