omicsEye/resLens

resLens: a family of genomic language models to enhance antibiotic resistance gene detection

This repo contains the training and inference code for resLens, a family of genomic language models that detect and classify antibiotic resistance genes (ARGs). resLens leverages language models' contextual understanding of gene function to identify ARGs in a way that is less database-dependent than current alignment methods and is capable of flagging potential novel ARGs for further investigation.

The scripts directory contains Python files to train resLens models and evaluate their performance; these can be adapted to perform inference on other DNA sequence data. It also contains the code used for the novel ARG analysis and the whole-genome sequence (WGS) analysis performed in the paper.

The example directory contains a Jupyter notebook that performs inference on new DNA data, both on a mixed ARG/non-ARG dataset and on a purely ARG dataset.
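
As a rough illustration of what adapting inference to your own DNA data might involve, the sketch below reads FASTA records and passes their sequences in batches to a prediction callable. It is not the code from this repo: `read_fasta`, `run_inference`, and `predict_fn` are hypothetical names, and `predict_fn` stands in for whatever wraps a fine-tuned resLens model. The notebook in the example directory remains the reference for the actual workflow.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    record_id = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, "".join(chunks).upper()
            header = line[1:].split()
            record_id = header[0] if header else ""
            chunks = []
        elif record_id is not None:
            chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks).upper()


def run_inference(
    records: Iterable[Tuple[str, str]],
    predict_fn: Callable[[List[str]], List[object]],
    batch_size: int = 8,
) -> Dict[str, object]:
    """Apply predict_fn to sequences in batches and map record IDs to labels.

    predict_fn is a hypothetical placeholder: it takes a list of DNA strings
    and returns one prediction per string.
    """
    records = list(records)
    results: Dict[str, object] = {}
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        labels = predict_fn([seq for _, seq in batch])
        for (rid, _), label in zip(batch, labels):
            results[rid] = label
    return results


if __name__ == "__main__":
    # Stand-in predictor for demonstration only; it is NOT a resLens model.
    def placeholder_predict(seqs: List[str]) -> List[str]:
        return ["unclassified" for _ in seqs]

    fasta_text = [">gene_a example", "ACGTAC", "GTTAGC", ">gene_b", "GGCCTA"]
    print(run_inference(read_fasta(fasta_text), placeholder_predict))
```

Keeping the predictor behind a plain callable means the same loop could serve both cases the notebook covers (a mixed ARG/non-ARG dataset and a purely ARG dataset) by swapping in a different callable, assuming each model exposes a batch prediction step.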

The fine-tuned resLens models, train and test data, and genome IDs and phenotypes for the WGS data can be found at our HuggingFace repo.


Citation:

