- numpy==1.19.4
- pandas==0.25.3
- python==3.6.12
- scikit-learn==0.23.2
- torch==1.7.0+cu101
- torchvision==0.8.0+cu101
- xgboost==1.3.0
- tqdm==4.54.1
`pip install -r requirements.txt`

`python generate_sample.py`

Use `sequence_generated.py` in `./sequence_generated` to generate the sequences for a customized search space. We provide the sequences for peptides of length 6, along with a script to generate peptide sequences of length 7, in the folder `./sequence_generated`.
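To give a sense of how large these search spaces are, here is a minimal sketch of enumerating peptide sequences of a fixed length. It is not the code in `sequence_generated.py`: it assumes the search space is every sequence over the 20 canonical amino acids, and the names `AMINO_ACIDS`, `iter_peptides`, and `search_space_size` are hypothetical.

```python
from itertools import islice, product

# Assumption: the 20 canonical amino acids, as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def iter_peptides(length, alphabet=AMINO_ACIDS):
    """Lazily yield every sequence of the given length over the alphabet."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)


def search_space_size(length, alphabet=AMINO_ACIDS):
    """Number of sequences that iter_peptides(length) would yield."""
    return len(alphabet) ** length


if __name__ == "__main__":
    # Size of the full length-6 and length-7 spaces under the assumption above.
    print(search_space_size(6), search_space_size(7))
    # Peek at the first few sequences without materializing the whole space.
    print(list(islice(iter_peptides(6), 3)))
```

Under this assumption the length-6 space has 20^6 = 64,000,000 sequences and the length-7 space has 20^7 = 1,280,000,000, so a generator is used instead of building a list in memory.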
Use `cal_pep_des.py` in `./featured_data_generated` to generate structural data for the Classification and Ranking stages from the sequences derived in the previous step.
Use `train.py` to obtain the parameters for the three models (Classification, Ranking, Regression). You can use customized training data or data generated from the Grampa dataset.
Use `lstm_fine_tune.py` for incremental learning. The augmented data is provided in the folder `./data/origin_data`. Optionally, you can use customized data validated in other wet-lab settings.
Use `predict.py` to get the final search result. For a very large search space, you can use the 'chunk' mechanism to avoid running out of RAM.
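The idea behind chunked prediction can be sketched as follows. This is an illustration only and not the implementation in `predict.py`: the helpers `iter_chunks` and `top_k_over_chunks`, the `score_fn` callback, and the convention that a higher score is better are all assumptions made for the example.

```python
import heapq
from itertools import islice


def iter_chunks(items, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def top_k_over_chunks(sequences, score_fn, k, chunk_size=100000):
    """Score sequences chunk by chunk and keep only the k best results.

    score_fn is a hypothetical stand-in for a model call: it takes a list of
    sequences and returns one score per sequence (assumed: higher is better).
    """
    best = []  # min-heap of (score, sequence); best[0] is the weakest kept entry
    for chunk in iter_chunks(sequences, chunk_size):
        scores = score_fn(chunk)
        for score, seq in zip(scores, chunk):
            if len(best) < k:
                heapq.heappush(best, (score, seq))
            elif (score, seq) > best[0]:
                heapq.heapreplace(best, (score, seq))
    return sorted(best, reverse=True)


if __name__ == "__main__":
    # Toy scoring function for demonstration: count lysine residues.
    def toy_score(chunk):
        return [seq.count("K") for seq in chunk]

    print(top_k_over_chunks(["AK", "KK", "AA", "KA", "KR"], toy_score, k=2, chunk_size=2))
```

Only one chunk of sequences and the current top `k` results are held in memory at a time, so peak RAM use depends on `chunk_size` and not on the size of the search space. If a lower score is better for your model, negate the scores before passing them in.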