PepFoundry

______________________________________________________________________

PepFoundry is a Python package designed to streamline peptide modeling beyond natural amino acids and linear topologies. It enables the incorporation of synthetic (non-canonical) amino acids and produces both RDKit molecule objects and peptide graphs, facilitating their use in machine learning applications.

In addition, PepFoundry supports the generation of cyclic peptides. These peptides are also represented as RDKit molecule objects and graphs, making them suitable for advanced computational analysis and ML workflows.

New Updates

  • Dec.01/2025, Version 1.1.1: We have added a new method, get_amino_acids, which returns a list of RDKit molecule objects, each representing a single amino acid. See usage examples in examples_PepFoundry.
  • Nov.26/2025, Version 1.1.0: We have added a new method get_smiles_chuckles_format that automatically converts peptide SMILES into CHUCKLES format, including mapping numbers for the terminal residues. This update introduces a new dependency, openbabel. Usage and examples of this method can be found in examples_CHUCKLES.ipynb.

1. Installation Guide

1.1. Creating an Environment with PepFoundry

To automatically create the environment with all required packages, download the file setup_pepfoundry.sh and run the following command:

bash setup_pepfoundry.sh 

1.2. Creating an Anaconda Environment Manually

Alternatively, you can create an Anaconda environment manually by running the following commands in the terminal:

1.2.1. Creating the Environment

conda create --name pepfoundry python=3.7.16

1.2.2. Activating the Environment

conda activate pepfoundry

1.2.3. Installing Dependencies

pip install rdkit
  • If you have a CUDA-compatible GPU:
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 -f https://download.pytorch.org/whl/torch_stable.html 
  • Otherwise (CPU only):
pip install torch==1.13.1 torchvision==0.14.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install openpyxl
pip install scikit-learn
pip install ipykernel
pip install pandas
pip install openbabel-wheel
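
After running the commands above, a quick check that each dependency can be imported may save debugging time later. The following is a minimal sketch, not part of PepFoundry: the helper name check_modules is hypothetical, and the module names are the usual import names for the packages listed above (scikit-learn imports as sklearn, openbabel-wheel as openbabel).

```python
import importlib.util
from typing import Dict, Iterable

# Import names for the packages installed in section 1.2.3.
# Note: scikit-learn is imported as "sklearn" and openbabel-wheel as "openbabel".
REQUIRED_MODULES = (
    "rdkit",
    "torch",
    "torchvision",
    "openpyxl",
    "sklearn",
    "ipykernel",
    "pandas",
    "openbabel",
)


def check_modules(names: Iterable[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Return a mapping of module name to True if it can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one status line per dependency.
    for name, found in check_modules().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Run the script inside the activated pepfoundry environment; any module reported as MISSING can be reinstalled with the corresponding pip command.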

1.2.4. Installing PepFoundry from GitHub

pip install git+https://github.com/BilodeauGroup/PepFoundry.git

2. Usage

Once installed, you can import and use the package in your Python scripts:

from pepfoundry.interface import PepFoundry

2.1. PepFoundry Class

PepFoundry is the central interface for building peptide RDKit Mol objects. It combines the functionalities of peptide construction and amino acid processing through internal modules.

Before using it, you need to create an instance of the class:

pepfoundry = PepFoundry()

By default, the class uses the amino acid database bundled with the package.

  • Default:
    Loads the standard amino acid database included with the package (amino_acids_library).

  • Custom Database:
    Optionally, you can provide a custom amino acid database for each class instance by passing the path to an Excel file:

pepfoundry = PepFoundry(custom_dict_path="path/to/custom_amino_acids.xlsx")

Important: The Excel file should adhere to the format and conventions defined in the default database, with amino acids defined in the CHUCKLES format, including Map Numbers. Following this structure ensures that the peptide builder can correctly interpret the amino acids and construct molecules without errors.
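
Because the custom database is passed as a file path, it can help to check the path before constructing the class. The sketch below is an illustration under stated assumptions: the helper validate_database_path is hypothetical (not part of PepFoundry), and it assumes the file uses the .xlsx extension shown in the example above. It does not check the contents or CHUCKLES formatting of the file.

```python
from pathlib import Path


def validate_database_path(path: str) -> str:
    """Return the expanded path if it points to an existing .xlsx file.

    Hypothetical helper: only the location and extension are checked,
    not the sheet layout or the CHUCKLES strings inside the file.
    """
    candidate = Path(path).expanduser()
    if candidate.suffix.lower() != ".xlsx":
        raise ValueError(f"Expected an .xlsx file, got: {candidate.name}")
    if not candidate.is_file():
        raise FileNotFoundError(f"Custom database not found: {candidate}")
    return str(candidate)


# Usage (requires PepFoundry to be installed):
# from pepfoundry.interface import PepFoundry
# db_path = validate_database_path("path/to/custom_amino_acids.xlsx")
# pepfoundry = PepFoundry(custom_dict_path=db_path)
```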

2.2. Amino Acid Convention

Database Convention

  • Canonical Amino Acids:

    • L-amino acids are represented with uppercase letters (e.g., A for L-Alanine).
    • D-amino acids are represented with lowercase letters (e.g., a for D-Alanine).
  • Non-Canonical amino acids are enclosed in curly braces {Xyz}.

  • Modifications such as acetylation and amidation are also enclosed in {}, e.g.:

    • {ac} for acetylation
    • {am} for amidation
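
To illustrate the convention above, the following sketch splits a sequence string into labeled symbols. It is not PepFoundry's own parser: the function tokenize_sequence is hypothetical, and it assumes a peptide is written as a single concatenated string such as {ac}Aa{Xyz}{am}. Since non-canonical residues and modifications share the curly-brace syntax, both are labeled "braced" here.

```python
import re
from typing import List, Tuple

# A token is either a braced name such as {Xyz} or a single letter.
_TOKEN = re.compile(r"\{([^{}]+)\}|([A-Za-z])")


def tokenize_sequence(sequence: str) -> List[Tuple[str, str]]:
    """Split a sequence string into (kind, symbol) pairs.

    kind is "L" for uppercase letters, "D" for lowercase letters,
    and "braced" for anything written inside curly braces.
    """
    tokens = []
    pos = 0
    while pos < len(sequence):
        match = _TOKEN.match(sequence, pos)
        if match is None:
            raise ValueError(
                f"Unexpected character {sequence[pos]!r} at position {pos}"
            )
        braced, letter = match.groups()
        if braced is not None:
            tokens.append(("braced", braced))
        elif letter.isupper():
            tokens.append(("L", letter))
        else:
            tokens.append(("D", letter))
        pos = match.end()
    return tokens


print(tokenize_sequence("{ac}Aa{Xyz}{am}"))
```

Whether a given symbol actually exists in the database is a separate question; this sketch only checks the notation.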

3. Examples

3.1. PepFoundry Implementation

Full usage examples are provided in:

3.2. CHUCKLES Construction

SMILES construction or rewriting (CHUCKLES format):
Examples of how to construct or rewrite SMILES for amino acids in CHUCKLES format are provided in:

3.3. ML Implementation

Examples of how PepFoundry can be used in ML applications are provided in:

4. Cite

Garzon Otero, D.; Akbari, O.; Mandapati, A.; Bilodeau, C. PepFoundry: A Pipeline for Building Machine-Learning Ready Representations of Nonstandard Peptides Containing Cycles, Non-natural Residues, Polymer Units, and More. J. Chem. Inf. Model. ASAP. https://doi.org/10.1021/acs.jcim.5c02629
