himanshu-matharu/Bacteria-Classification-ML

Bacteria Species Classification using Machine Learning Algorithms

Machine learning has taken the world by storm, and microbiology is no exception. Its arrival has been a boon for the field at a time when antibiotics are failing against ever-evolving bacteria. Identification processes that used to take days or even months, with the delays driving up mortality rates, can now be completed within hours using prediction, classification, and feature selection. This report analyzes the accuracy of different machine learning algorithms in processing data from a novel spectroscopic diagnostic device and in identifying bacterial species by comparison with the available DNA sequences.

We select several supervised machine learning models and evaluate their performance by training them on this classification problem. We also tune the models' hyperparameters to improve performance. The study aims to determine whether machine learning models are accurate enough for bacteria classification tasks and, ultimately, for preliminary analysis and fast decision-making in the medical field.

Implementation

The notebook contains the code for training machine learning algorithms to classify bacteria species based on the ATCG BOC in 10-mer DNA samples.

Bacteria Classification Main.ipynb

Dataset used: data.csv

Hyperparameter Tuning:

Most of the tuning was done by manual search because grid search took too long. Code for the methodical searches is available here
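To illustrate why an exhaustive grid search gets expensive, the sketch below counts the model fits a small grid would need and compares that with a fixed budget of randomly sampled candidates. The parameter names and values in `param_grid` are hypothetical and are not taken from the notebook, and random sampling is shown only as one common cheaper alternative, not as the approach used in this project.

```python
import itertools
import random

# Hypothetical search space for a tree ensemble (illustrative values only).
param_grid = {
    "n_estimators": [100, 200, 400, 800],
    "max_depth": [None, 10, 20, 40],
    "min_samples_split": [2, 5, 10],
    "max_features": ["sqrt", "log2", None],
}


def grid_candidates(grid):
    """Yield every parameter combination in the grid as a dict."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def random_candidates(grid, n_iter, seed=0):
    """Return n_iter distinct combinations sampled uniformly from the grid."""
    rng = random.Random(seed)
    all_candidates = list(grid_candidates(grid))
    return rng.sample(all_candidates, min(n_iter, len(all_candidates)))


cv_folds = 5  # assumed number of cross-validation folds per candidate
n_grid = sum(1 for _ in grid_candidates(param_grid))
sampled = random_candidates(param_grid, n_iter=12)

print(f"grid search:   {n_grid} candidates, {n_grid * cv_folds} model fits")
print(f"random search: {len(sampled)} candidates, {len(sampled) * cv_folds} model fits")
```

Each added hyperparameter multiplies the grid size, which is why a manual or sampled search can be more practical when a single fit is slow.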

Machine Learning Algorithms Used

  • Logistic Regression
  • Random Forest Classifier
  • XGBoost Classifier
  • Extra Trees Classifier
  • Neural Network
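A comparison of several of the classifiers listed above can be sketched as follows. This is a minimal example that assumes scikit-learn (the README does not name the library) and uses a synthetic dataset from `make_classification` as a stand-in for `data.csv`. The XGBoost and neural network models are left out to keep the dependencies small, and all hyperparameter values are illustrative, not the tuned values from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for data.csv: 600 samples, 20 features, 4 classes.
X, y = make_classification(
    n_samples=600,
    n_features=20,
    n_informative=10,
    n_classes=4,
    random_state=0,
)

# Illustrative settings only; these are not the notebook's tuned values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Extra Trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    # Mean accuracy over 3 cross-validation folds.
    scores[name] = cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()
    print(f"{name}: {scores[name]:.3f}")
```

Scoring every model with the same folds and the same metric keeps the comparison fair. The numbers from synthetic data say nothing about performance on the real bacteria dataset.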

Requirements
