Antimicrobial peptides (AMPs) have emerged as promising alternatives to traditional antibiotics due to their broad-spectrum activity and low propensity for inducing resistance. However, the discovery and design of potent, species-aware AMPs remain challenging because of the limited availability of training data. Recent advances in natural language processing, particularly parameter-efficient fine-tuning of Generative Pre-trained Transformer (GPT) models, have shown promise in generating biologically relevant sequences. In this study, we present a protein language modeling approach for generating potent, species-aware AMP sequences using a GPT model enhanced with parameter-efficient fine-tuning. Building on a pre-trained GPT model trained on 44.88 million peptide sequences to capture general peptide characteristics, we implemented Low-Rank Adaptation (LoRA) fine-tuning with curated species-specific AMP sequences to generate AMPs tailored to specific microbial species. The performance of the LoRA fine-tuned GPT model was evaluated using physicochemical property analysis and predictions of AMP activity and hemolytic activity. Our results demonstrate that the parameter-efficient LoRA method significantly improves the model's ability to generate potent AMP sequences, outperforming other deep learning-based AMP generation models. We further analyzed the species-awareness of the generated AMP sequences through species-specific minimum inhibitory concentration (MIC) predictions and confirmed that the species-aware samples are more potent than the non-species-aware samples. This study showcases the potential of leveraging parameter-efficient fine-tuning techniques in language models for the discovery and design of novel, species-specific antibiotics, advancing the development of targeted antimicrobial therapies.
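For reference, the parameter efficiency mentioned above comes from the standard LoRA formulation, in which each adapted weight matrix receives a trainable low-rank update while the pre-trained weights stay frozen (a general sketch of the technique, not values specific to this study):

$$ W' = W_0 + \Delta W = W_0 + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k), $$

where $W_0$ is the frozen pre-trained weight, $\alpha$ is a scaling factor, and only the low-rank factors $A$ and $B$ are updated during fine-tuning, which keeps the number of trainable parameters small relative to full fine-tuning.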
The AMP-LoRAgen model can be reproduced by running the lora_fine-tune notebook. The pre-trained ProtGPT2 model (Ferruz et al., 2022) can be downloaded from https://huggingface.co/nferruz/ProtGPT2/tree/main.
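The sketch below illustrates LoRA fine-tuning of the public ProtGPT2 checkpoint with the Hugging Face transformers and peft libraries; it is not the authors' lora_fine-tune notebook, and the LoRA hyperparameters, the training arguments, and the input file amp_train.txt (curated species-specific AMP sequences, one per line) are assumptions for illustration only.

```python
# Minimal LoRA fine-tuning sketch; all hyperparameters and file paths are assumed.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("nferruz/ProtGPT2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers have no pad token by default
model = AutoModelForCausalLM.from_pretrained("nferruz/ProtGPT2")

# Wrap the frozen GPT-2 backbone with low-rank adapters on the attention projections.
lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                         target_modules=["c_attn"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)

# Hypothetical text file of curated species-specific AMP sequences, one per line.
dataset = load_dataset("text", data_files={"train": "amp_train.txt"})["train"]
dataset = dataset.map(lambda x: tokenizer(x["text"], truncation=True, max_length=128),
                      remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="amp_lora", num_train_epochs=10,
                           per_device_train_batch_size=8, learning_rate=1e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("amp_lora")  # writes only the LoRA adapter weights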
The LoRA fine-tuned adapter weights can be applied to the pre-trained model to generate AMP candidates by running the generate notebook.
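The following sketch shows one way to attach saved LoRA adapter weights to the ProtGPT2 base model and sample candidate sequences; it is not the authors' generate notebook, the adapter path "amp_lora" simply matches the fine-tuning sketch above, and the sampling settings follow the public ProtGPT2 model card rather than this study.

```python
# Minimal generation sketch; adapter path and sampling settings are assumptions.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nferruz/ProtGPT2")
base_model = AutoModelForCausalLM.from_pretrained("nferruz/ProtGPT2")
model = PeftModel.from_pretrained(base_model, "amp_lora")  # attach LoRA adapter weights
model.eval()

# ProtGPT2 was trained on newline-separated sequences, so "\n" serves as a generic prompt.
inputs = tokenizer("\n", return_tensors="pt")
outputs = model.generate(**inputs, max_length=60, do_sample=True, top_k=950,
                         repetition_penalty=1.2, num_return_sequences=10,
                         pad_token_id=tokenizer.eos_token_id)
candidates = [tokenizer.decode(o, skip_special_tokens=True).replace("\n", "")
              for o in outputs]
print(candidates[:3])  # candidate AMP sequences for downstream property and MIC filtering
```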
Hansol Lee ([email protected])
Hojung Nam* ([email protected])
*Corresponding Author