437-maral/Nephro-Pred

Project Title

Data-driven analysis and development of computational nephrotoxicity model for the prediction of small molecules

Summary

Drug-induced nephrotoxicity poses significant challenges in the research and development of new pharmaceuticals. As it is one of the major causes of drug withdrawal during preclinical and clinical stages, developing reliable methods for early and accurate detection is essential. In this project, computational toxicity prediction models based on machine learning (ML) and deep learning (DL) were implemented to predict nephrotoxicity using both chemical structure data and toxicogenomic profiles of compounds.

  1. Chemical-based Model

Various ML and DL algorithms, including Support Vector Machines (SVM), Random Forest (RF), XGBoost, and Deep Neural Networks (DNN), were trained on two distinct datasets for nephrotoxicity prediction. Among these, the Random Forest model demonstrated the best performance on both training and test datasets, and was therefore selected as the optimal model.

  2. Toxicogenomic-based Model

Toxicogenomic data were obtained from Toxygates, which provides gene expression profiles of rats exposed to 40 compounds for 24 hours at three different dosages. The data were first preprocessed and then subjected to Weighted Gene Co-expression Network Analysis (WGCNA) for gene selection. Based on these selected genes, two different deep learning models were developed to predict drug-induced pathological findings.

  3. Chemo-genomic Model

To further improve model robustness, chemical descriptors were integrated with gene expression profiles from 13 withdrawn compounds, and chemical similarity between compounds was measured using the Tanimoto coefficient. This combined approach was used to evaluate the reproducibility and reliability of the nephrotoxicity classifiers.
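
For readers unfamiliar with the Tanimoto coefficient, the sketch below shows how it is computed. It is an illustration only, not the code used in this repository: fingerprints are assumed to be plain sets of on-bit indices (in practice they would typically be generated with rdkit), and the helper names `tanimoto` and `most_similar`, as well as the toy compounds, are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union


def most_similar(query_fp, reference_fps):
    """Return (name, score) for the reference compound closest to the query."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in reference_fps.items())
    return max(scored, key=lambda pair: pair[1])


# Toy fingerprints with made-up bit indices (not real compounds).
reference = {
    "compound_A": {1, 4, 7, 9},
    "compound_B": {2, 3, 5},
}
query = {1, 4, 9, 12}

name, score = most_similar(query, reference)
print(f"{name}: {score:.2f}")
```

The coefficient is the number of shared bits divided by the number of bits set in either fingerprint, so it ranges from 0 (no overlap) to 1 (identical fingerprints).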

Python / R Packages

Several Python and R packages are required to run the scripts.
Install the Python packages with conda or pip from a terminal, cmd, or PowerShell.
Install the R packages with install.packages() in RStudio.

Python Packages

  • numpy
  • pandas
  • rdkit
  • joblib
  • scikit-learn
  • openpyxl
  • matplotlib
  • tensorflow
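
Before running the scripts, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch, not part of the repository: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names. The import names are assumed to match the install names, except scikit-learn, which is imported as `sklearn`.

```python
from importlib.util import find_spec

# Install name -> module name used in an `import` statement.
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "rdkit": "rdkit",
    "joblib": "joblib",
    "scikit-learn": "sklearn",
    "openpyxl": "openpyxl",
    "matplotlib": "matplotlib",
    "tensorflow": "tensorflow",
}


def missing_packages(required=REQUIRED):
    """Return the install names of packages whose module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required Python packages are available.")
```

The check only locates each module without importing it, so it runs quickly even when tensorflow is installed.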

R Packages

  • tidyverse
  • WGCNA
  • ggplot2
  • CorLevelPlot

Run the Script

First, clone the repository and navigate to the ChemoGenomic_Model folder:

git clone https://github.com/437-maral/Nephro-Pred.git
cd Nephro-Pred/ChemoGenomic_Model

Before running the main script (Chemo_genomic_model.py), make sure that the genomic data, the chemical data, and the genomic SMILES data have been prepared and placed in the appropriate input directory. Then run the script:

python Chemo_genomic_model.py
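
Because the script depends on three user-supplied inputs, a small pre-flight check can catch a missing file before the run starts. This is a sketch under stated assumptions: the directory name `input` and the three file names are hypothetical placeholders, not names taken from this repository, and should be replaced with the paths the script actually reads.

```python
from pathlib import Path


def missing_inputs(input_dir, filenames):
    """Return the names in `filenames` that are not regular files in `input_dir`."""
    base = Path(input_dir)
    return [name for name in filenames if not (base / name).is_file()]


# Hypothetical placeholders; substitute the real input file names.
EXPECTED = ["genomic_data.csv", "chemical_data.csv", "genomic_smiles.csv"]

missing = missing_inputs("input", EXPECTED)
if missing:
    print("Missing input files:", ", ".join(missing))
else:
    print("All input files found.")
```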
