Project Title

Data-driven analysis and development of computational nephrotoxicity model for the prediction of small molecules

Summary

Drug-induced nephrotoxicity poses significant challenges in the research and development of new pharmaceuticals. As it is one of the major causes of drug withdrawal during preclinical and clinical stages, developing reliable methods for early and accurate detection is essential. In this project, computational toxicity prediction models based on machine learning (ML) and deep learning (DL) were implemented to predict nephrotoxicity using both chemical structure data and toxicogenomic profiles of compounds.

Chemical-based Model

Various ML and DL algorithms, including Support Vector Machines (SVM), Random Forest (RF), XGBoost, and Deep Neural Networks (DNN), were trained on two distinct datasets for nephrotoxicity prediction. Among these, the Random Forest model demonstrated the best performance on both training and test datasets, and was therefore selected as the optimal model.

Toxicogenomic-based Model

Toxicogenomic data were obtained from Toxygates, which provides gene expression profiles of rats exposed to 40 compounds for 24 hours at three different dosages. The data were first preprocessed and then subjected to Weighted Gene Co-expression Network Analysis (WGCNA) for gene selection. Based on these selected genes, two different deep learning models were developed to predict drug-induced pathological findings.
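
As background on the gene-selection step: WGCNA builds its co-expression network from pairwise correlations between gene expression profiles, raised to a soft-thresholding power β (for an unsigned network, a_ij = |cor(x_i, x_j)|^β). The sketch below illustrates only that formula in plain Python. The project itself relies on the WGCNA R package, and the gene names, expression values, and β = 6 used here are illustrative assumptions, not values taken from the repository.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def unsigned_adjacency(profiles, beta=6):
    """Return {(gene_i, gene_j): |cor|**beta} for every ordered pair of distinct genes."""
    genes = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Toy expression profiles across four samples (made-up numbers).
toy_profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with gene_a
    "gene_c": [4.0, 1.0, 3.0, 2.0],
}
adjacency = unsigned_adjacency(toy_profiles, beta=6)
```

Raising the correlation to a power greater than 1 shrinks weak correlations toward zero while leaving strong ones close to 1, which is what lets densely connected gene modules stand out.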

Chemo-genomic Model

To further improve model robustness, chemical similarity between compounds was analyzed using the Tanimoto coefficient, and chemical descriptors were integrated with gene expression profiles from 13 withdrawn compounds. This combined approach was used to evaluate the reproducibility and reliability of the nephrotoxicity classifiers.
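
For reference, the Tanimoto coefficient between two binary fingerprints A and B is the number of bits set in both divided by the number of bits set in either, |A ∩ B| / |A ∪ B|. Here is a minimal pure-Python sketch, assuming each fingerprint is represented as the set of its on-bit indices. The function name and the toy fingerprints are illustrative and not taken from the repository, which may instead compute similarity with RDKit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / union


# Toy fingerprints (hypothetical on-bit indices, not real compound data).
fp_query = {1, 4, 7, 9}
fp_reference = {1, 4, 8, 9, 12}

# 3 shared bits out of 6 distinct bits overall.
similarity = tanimoto(fp_query, fp_reference)
```

A value of 1.0 means the two fingerprints are identical, and 0.0 means they share no bits.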

Python / R Packages

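Several Python and R packages are required to run the scripts.
Install the Python packages with conda or pip from a terminal (cmd or PowerShell on Windows).
Install the R packages with install.packages() in R or RStudio. CorLevelPlot may need to be installed from its GitHub repository instead, as it does not appear to be distributed on CRAN.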

Python Packages

numpy
pandas
rdkit
joblib
scikit-learn
openpyxl
matplotlib
tensorflow
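
After installation, a quick way to see which of the packages above are still missing is to check whether each one can be located by the import system. The sketch below is not part of the repository. It uses only the standard library, and the helper name missing_modules is an assumption made for illustration. Note that scikit-learn is imported under the name sklearn.

```python
from importlib.util import find_spec

# Import names for the packages listed above (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = [
    "numpy",
    "pandas",
    "rdkit",
    "joblib",
    "sklearn",
    "openpyxl",
    "matplotlib",
    "tensorflow",
]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python packages:", ", ".join(missing))
    else:
        print("All required Python packages were found.")
```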

R Packages

tidyverse
WGCNA
ggplot2
CorLevelPlot

Run the Script

First, clone the repository and navigate to the ChemoGenomic_Model folder:

git clone https://github.com/437-maral/Nephro-Pred.git
cd Nephro-Pred/ChemoGenomic_Model

Before running the main script (Chemo_genomic_model.py), make sure that the genomic data, the chemical data, and the genomic SMILES data have all been prepared and placed in the appropriate input directory. Then run:

python Chemo_genomic_model.py
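
Because the three inputs are supplied by the user, it can help to confirm they are in place before launching the script. The sketch below is only an illustration: this README does not name the input directory or the files, so find_missing_inputs and every file name in EXPECTED_INPUTS are hypothetical placeholders that should be replaced with the paths Chemo_genomic_model.py actually reads.

```python
from pathlib import Path


def find_missing_inputs(input_dir, expected_files):
    """Return the expected file names that are not present in input_dir."""
    base = Path(input_dir)
    return [name for name in expected_files if not (base / name).is_file()]


# Hypothetical file names; replace them with the real input files.
EXPECTED_INPUTS = ["genomic_data.csv", "chemical_data.csv", "genomic_smiles.csv"]

if __name__ == "__main__":
    # "input" is an assumed directory name, not one confirmed by the repository.
    missing = find_missing_inputs("input", EXPECTED_INPUTS)
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```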
