EnzyKR is a deep learning framework for predicting the activation free energy of enzyme-substrate complexes.
conda create -n kr python=3.8
conda activate kr
conda install pytorch::pytorch
conda install pandas numpy
pip install torch_geometric rdkit-pypi bidirectional_cross_attention
The model takes the enzyme multiple sequence alignment (MSA), the enzyme-substrate structural complexes, and the substrate SMILES strings as inputs.
- Place the PDB files in the ./structures folder and the MSA files, in A3M format, in the ./msa folder.
- A3M files can be obtained from the HHblits web server.
- Write a CSV file containing the substrate SMILES strings. The ids column of the CSV file must match the A3M and PDB file names.
The CSV file of substrate SMILES strings must be placed in the raw folder. An example is provided in the raw folder; the enzyme sequence column and the dg++ column are not required.
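Because the ids column has to line up with both the PDB and A3M file names, it can help to check the inputs before preprocessing. The following is a minimal sketch, not part of EnzyKR: the function name `check_inputs` is hypothetical, and it assumes that each id maps to `<id>.pdb` in ./structures and `<id>.a3m` in ./msa, which the repository may or may not follow exactly.

```python
import csv
from pathlib import Path


def check_inputs(csv_path, structures_dir="./structures", msa_dir="./msa", id_column="ids"):
    """Return a dict mapping each id to the input files that are missing for it.

    Assumption (not confirmed by the EnzyKR source): files are named
    <id>.pdb and <id>.a3m after the values in the ids column.
    """
    structures_dir = Path(structures_dir)
    msa_dir = Path(msa_dir)
    missing = {}
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or id_column not in reader.fieldnames:
            raise ValueError(f"column {id_column!r} not found in {csv_path}")
        for row in reader:
            entry_id = row[id_column].strip()
            absent = []
            # Each id needs one structure file and one alignment file.
            if not (structures_dir / f"{entry_id}.pdb").is_file():
                absent.append(f"{entry_id}.pdb")
            if not (msa_dir / f"{entry_id}.a3m").is_file():
                absent.append(f"{entry_id}.a3m")
            if absent:
                missing[entry_id] = absent
    return missing
```

Calling `check_inputs` with the path to the CSV file in the raw folder returns an empty dict when every id has both files; otherwise each key lists the file names that still need to be added.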
The model parameters must first be downloaded from Google Drive. The download instructions are in the model folder.
Then run the preprocessing script:
python preprocess.py
The output features are written to the processed folder.
Finally, run the inference script to predict the activation free energy:
python inference.py --dataset ./ --model_path ./model/model1.pt