Read, write, and convert molecular file formats including SMILES, SDF, MOL2, and PDB. Includes a structure standardization pipeline for consistent molecular representations.
pip install rdkit
pip install openbabel-wheel  # For additional format support

Tell your AI agent what you want to do:
- "Load my compound library from this SDF file"
- "Convert these SMILES to an SDF file with 3D coordinates"
- "Standardize the structures in my molecule database"
- "Read MOL2 files and convert to SMILES"
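As a rough illustration of what the first two requests might translate to, the sketch below converts SMILES to an SDF file with 3D coordinates and then loads the file back, counting failed records. The helper names (`smiles_to_3d_sdf`, `load_sdf`), the `input_smiles` data field, and the file name are hypothetical choices for this example, not part of RDKit or of this skill.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_3d_sdf(smiles_list, path, seed=42):
    """Write one 3D conformer per parsable SMILES to an SDF file; return the failures."""
    failed = []
    writer = Chem.SDWriter(path)
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:  # invalid SMILES return None instead of raising
            failed.append(smi)
            continue
        mol = Chem.AddHs(mol)  # explicit hydrogens help the 3D embedding
        if AllChem.EmbedMolecule(mol, randomSeed=seed) == -1:  # -1 signals failure
            failed.append(smi)
            continue
        mol.SetProp("input_smiles", smi)  # hypothetical field, written as SDF data
        writer.write(mol)
    writer.close()
    return failed


def load_sdf(path):
    """Load an SDF file; return the parsed molecules and the number of failed records."""
    mols, n_failed = [], 0
    for mol in Chem.SDMolSupplier(path):  # hydrogens are removed on read by default
        if mol is None:  # unparsable records come back as None
            n_failed += 1
        else:
            mols.append(mol)
    return mols, n_failed


# Round trip through a temporary file.
path = os.path.join(tempfile.mkdtemp(), "compounds.sdf")
failed = smiles_to_3d_sdf(["CCO", "c1ccccc1O", "not_a_smiles"], path)
mols, n_failed = load_sdf(path)
canonical = [Chem.MolToSmiles(m) for m in mols]  # MolToSmiles is canonical by default
```

Properties set with `SetProp` before writing are stored as SDF data fields, so they remain available after reloading.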
"Load all molecules from compounds.sdf and show me how many were successfully parsed."
"Read SMILES from this CSV file where the SMILES column is named 'structure'."
"Save my filtered compounds to an SDF file with properties included."
"Export canonical SMILES for my molecule list."
"Standardize these molecules by removing salts, neutralizing charges, and canonicalizing tautomers."
"Clean up my compound library for consistent representations."
- Parse molecular files using RDKit or Open Babel
- Handle parsing errors gracefully
- Apply standardization pipeline if requested
- Convert between formats as needed
- Preserve molecular properties during conversion
- Use rdMolStandardize module (Python MolStandardize was removed in Q1 2024)
- For Open Babel 3.x, use `from openbabel import pybel`, not `import pybel`
- Standardization order: Sanitize, Normalize, Neutralize, Canonicalize tautomer, Strip salts
- Prefer rdMolDraw2D for molecular drawing over the legacy Draw.MolToImage helper
- Always check for None when loading molecules (invalid structures return None)
- molecular-descriptors - Calculate properties after loading
- similarity-searching - Compare loaded molecules
- virtual-screening - Prepare ligands for docking
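
The standardization order given in the notes above could be sketched with `rdMolStandardize` roughly as follows. This is a minimal sketch under two assumptions: "strip salts" is interpreted as keeping the largest fragment, and the helper name `standardize` is hypothetical. Other readings of the pipeline are possible, and the order of the steps can change the result for charged salts.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def standardize(mol):
    """Apply the five steps in the listed order and return a new molecule."""
    mol = Chem.Mol(mol)  # work on a copy so the input is left untouched
    Chem.SanitizeMol(mol)  # 1. sanitize
    mol = rdMolStandardize.Normalize(mol)  # 2. normalize functional groups
    mol = rdMolStandardize.Uncharger().uncharge(mol)  # 3. neutralize charges
    mol = rdMolStandardize.TautomerEnumerator().Canonicalize(mol)  # 4. canonical tautomer
    # 5. strip salts, assumed here to mean "keep the largest fragment"
    mol = rdMolStandardize.LargestFragmentChooser().choose(mol)
    return mol


raw = Chem.MolFromSmiles("Cl.CCN")  # ethylamine drawn with an HCl fragment
if raw is not None:  # always check for None after parsing
    clean = standardize(raw)
    clean_smiles = Chem.MolToSmiles(clean)
```

Because every step returns a new molecule, the function can be mapped over a list of loaded molecules after invalid (`None`) entries have been filtered out.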