PepFoundry is a Python package designed to streamline peptide modeling beyond natural amino acids and linear topologies. It enables the incorporation of synthetic (non-canonical) amino acids and produces both RDKit molecule objects and peptide graphs, facilitating their use in machine learning applications.
In addition, PepFoundry supports the generation of cyclic peptides. These peptides are also represented as RDKit molecule objects and graphs, making them suitable for advanced computational analysis and ML workflows.
- Dec. 01, 2025, Version 1.1.1: Added a new method, `get_amino_acids`, which returns a list of RDKit molecule objects, each representing a single amino acid. See usage examples in examples_PepFoundry.
- Nov. 26, 2025, Version 1.1.0: Added a new method, `get_smiles_chuckles_format`, which automatically converts peptide SMILES into CHUCKLES format, including map numbers for the terminal residues. This update introduces a new dependency, `openbabel`. Usage and examples of this method can be found in examples_CHUCKLES.ipynb.

To automatically create the environment with all required packages, download the file setup_pepfoundry.sh and run the following command:

    bash setup_pepfoundry.sh

Alternatively, you can create an Anaconda environment manually by running the following commands in the terminal:

    conda create --name pepfoundry python=3.7.16
    conda activate pepfoundry
    pip install rdkit

- If you have a CUDA-compatible GPU:

      pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 -f https://download.pytorch.org/whl/torch_stable.html

- Else:

      pip install torch==1.13.1 torchvision==0.14.1 -f https://download.pytorch.org/whl/torch_stable.html

Then install the remaining dependencies and PepFoundry itself:

    pip install openpyxl
    pip install scikit-learn
    pip install ipykernel
    pip install pandas
    pip install openbabel-wheel
    pip install git+https://github.com/BilodeauGroup/PepFoundry.git

Once installed, you can import and use the package in your Python scripts:

    from pepfoundry.interface import PepFoundry

PepFoundry is the central interface for building peptide RDKit Mol objects. It combines peptide construction and amino acid processing through internal modules.
Before using it, you need to create an instance of the class:

    pepfoundry = PepFoundry()

The class uses the default database unless a custom one is provided.
- Default: Loads the standard amino acid database included with the package (amino_acids_library).
- Custom Database: Optionally, you can provide a custom amino acid database for each class instance by passing the path to an Excel file:

      pepfoundry = PepFoundry(custom_dict_path="path/to/custom_amino_acids.xlsx")

  Important: The Excel file should adhere to the format and conventions defined in the default database, with amino acids defined in the CHUCKLES format, including map numbers. Following this structure ensures that the peptide builder can correctly interpret the amino acids and construct molecules without errors.

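The choice between the default and a custom database can be wrapped in a small pre-check. The sketch below is illustrative only: `builder_kwargs` is a hypothetical helper that is not part of PepFoundry, and the `.xlsx` extension check is an assumption based on the example path above. It does not validate the contents of the workbook.

```python
from pathlib import Path


def builder_kwargs(custom_dict_path=None):
    """Build keyword arguments for PepFoundry().

    Hypothetical helper, not part of PepFoundry. An empty dict means the
    default database bundled with the package is used.
    """
    if custom_dict_path is None:
        return {}
    path = Path(custom_dict_path)
    # Assumption: the custom database is an .xlsx workbook, as in the example path.
    if path.suffix.lower() != ".xlsx":
        raise ValueError(f"Expected an .xlsx file, got: {path.name}")
    if not path.is_file():
        raise FileNotFoundError(f"Custom database not found: {path}")
    return {"custom_dict_path": str(path)}


if __name__ == "__main__":
    try:
        from pepfoundry.interface import PepFoundry
    except ImportError:
        print("PepFoundry is not installed; skipping instantiation.")
    else:
        # No argument: falls back to the default database.
        default_builder = PepFoundry(**builder_kwargs())
```

Failing early on a missing or misnamed file keeps path mistakes separate from errors raised later while the database is being read.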
- Canonical amino acids:
  - L-amino acids are represented with uppercase letters (e.g., `A` for L-Alanine).
  - D-amino acids are represented with lowercase letters (e.g., `a` for D-Alanine).
- Non-canonical amino acids are enclosed in curly braces, e.g., `{Xyz}`.
- Modifications such as acetylation and amidation are also enclosed in `{}`, e.g.:
  - `{ac}` for acetylation
  - `{am}` for amidation

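To make the notation concrete, the sketch below splits a sequence string into tokens following the rules above. It is a hypothetical illustration, not PepFoundry's own parser: the function name `tokenize_peptide`, the token labels, and the example string are all assumptions, and whether a braced token is a non-canonical residue or a modification would still have to be resolved against the amino acid database.

```python
import re

# Either a braced token such as {Xyz} or {ac}, or a single letter.
TOKEN_RE = re.compile(r"\{([^{}]+)\}|([A-Za-z])")


def tokenize_peptide(sequence):
    """Split a peptide string into (kind, symbol) tokens.

    Hypothetical helper for illustration; kinds are "L" (uppercase letter),
    "D" (lowercase letter), and "braced" (anything inside curly braces).
    """
    tokens = []
    pos = 0
    for match in TOKEN_RE.finditer(sequence):
        if match.start() != pos:
            # Characters were skipped, so the string does not follow the notation.
            raise ValueError(f"Unexpected text at position {pos}: {sequence[pos:match.start()]!r}")
        braced, letter = match.groups()
        if braced is not None:
            tokens.append(("braced", braced))
        elif letter.isupper():
            tokens.append(("L", letter))
        else:
            tokens.append(("D", letter))
        pos = match.end()
    if pos != len(sequence):
        raise ValueError(f"Unexpected text at position {pos}: {sequence[pos:]!r}")
    return tokens


if __name__ == "__main__":
    # Illustrative string: acetylation, L-Ala, D-Ala, a non-canonical residue, amidation.
    print(tokenize_peptide("{ac}Aa{Xyz}{am}"))
```

For the illustrative string, the function returns `[("braced", "ac"), ("L", "A"), ("D", "a"), ("braced", "Xyz"), ("braced", "am")]`. An unbalanced brace or a stray character raises `ValueError` instead of being silently skipped.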
Full usage examples are provided in:
SMILES construction or rewriting (CHUCKLES format):
Examples of how to construct or rewrite SMILES for amino acids in CHUCKLES format are provided in:
Examples of how PepFoundry can be used in ML applications are provided in:
Garzon Otero, D.; Akbari, O.; Mandapati, A.; Bilodeau, C. PepFoundry: A Pipeline for Building Machine-Learning Ready Representations of Nonstandard Peptides Containing Cycles, Non-natural Residues, Polymer Units, and More. J. Chem. Inf. Model. ASAP. https://doi.org/10.1021/acs.jcim.5c02629


