This repository contains a comprehensive machine learning project for predicting Chronic Kidney Disease (CKD) using various classifiers. The project implements a systematic pipeline including data cleaning, preprocessing, model training, evaluation, and inference, enabling healthcare practitioners to leverage predictive analytics for early detection.

macenkrace/Kidney-Disease-Prediction-Using-ML

Kidney Disease Prediction Using Machine Learning 🌟

Welcome to the Kidney Disease Prediction Using Machine Learning repository! This project aims to help healthcare professionals by providing tools for predicting Chronic Kidney Disease (CKD) using various machine learning classifiers. The comprehensive pipeline includes data cleaning, preprocessing, model training, evaluation, and inference.

Table of Contents

  1. Introduction
  2. Project Overview
  3. Installation
  4. Usage
  5. Data
  6. Modeling
  7. Evaluation
  8. Deployment
  9. Contributing
  10. License
  11. Contact

Introduction

Chronic Kidney Disease (CKD) affects millions worldwide. Early detection can save lives and improve treatment outcomes. This project uses machine learning to predict CKD based on various health metrics. By leveraging predictive analytics, healthcare practitioners can make informed decisions quickly.

Project Overview

This repository contains:

  • Data cleaning scripts
  • Preprocessing steps
  • Multiple machine learning models including:
    • Logistic Regression
    • Random Forest
    • XGBoost
    • Naive Bayes
    • AdaBoost
  • Evaluation metrics to assess model performance
  • Inference tools for real-time predictions

The latest release is available on the repository's Releases page.

Installation

To get started, clone this repository to your local machine:

git clone https://github.com/macenkrace/Kidney-Disease-Prediction-Using-ML.git

Navigate to the project directory:

cd Kidney-Disease-Prediction-Using-ML

Next, install the required packages. It is recommended to use a virtual environment. You can create one using:

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Then install the necessary libraries:

pip install -r requirements.txt

Usage

After setting up the environment, you can start using the Jupyter Notebook for model training and evaluation. Run the following command to launch Jupyter:

jupyter notebook

Open the relevant notebook files and follow the instructions to train your models. For inference, use the provided scripts to make predictions based on new input data.

Data

The dataset used in this project consists of patient health metrics. It includes features such as:

  • Age
  • Blood Pressure
  • Specific Gravity
  • Albumin
  • Sugar
  • Blood Glucose
  • Serum Creatinine
  • Hemoglobin
  • Packed Cell Volume

You can find the dataset in the data folder. Make sure to clean and preprocess the data before training your models.
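
As a rough illustration of the cleaning step, the sketch below fills missing values in a numeric column with that column's median. It is not the repository's actual cleaning script: the `?` missing-value marker, the lowercase column names, and the `impute_median` helper are all assumptions made for the example.

```python
from statistics import median

MISSING = "?"  # assumed missing-value marker; adjust to match the real dataset


def impute_median(rows, column):
    """Return copies of rows with missing entries in `column` set to the median."""
    observed = [float(r[column]) for r in rows if r[column] != MISSING]
    fill = median(observed)
    return [
        {**r, column: fill if r[column] == MISSING else float(r[column])}
        for r in rows
    ]


# Hypothetical raw records, as they might look after reading a CSV file.
rows = [
    {"age": "48", "hemoglobin": "15.4"},
    {"age": "?", "hemoglobin": "11.3"},
    {"age": "62", "hemoglobin": "?"},
]

cleaned = impute_median(impute_median(rows, "age"), "hemoglobin")
print(cleaned)
```

Median imputation is only one option; the right strategy depends on how much data is missing and why.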

Modeling

This project implements several machine learning algorithms. Here’s a brief overview of each:

Logistic Regression

Logistic regression is used for binary classification. It predicts the probability of CKD based on input features.
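
To make the "probability of CKD" concrete, here is a minimal sketch of how a fitted logistic regression turns features into a probability. The weights, bias, and feature values are made up for illustration and do not come from this project's trained model.

```python
import math


def predict_proba(features, weights, bias):
    """Apply the logistic (sigmoid) function to a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for two already-scaled features.
weights = [0.8, -0.5]
bias = -0.2

p = predict_proba([2.0, 1.0], weights, bias)
label = int(p >= 0.5)  # 1 = CKD, 0 = not CKD, using a 0.5 decision threshold
print(round(p, 3), label)
```

The 0.5 threshold is a convention; in a screening setting a lower threshold may be chosen to trade precision for recall.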

Random Forest

Random Forest trains many decision trees on bootstrap samples of the data and aggregates their predictions by majority vote, which improves accuracy and reduces overfitting compared with a single tree.
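
The two ingredients of this ensemble can be sketched in a few lines: drawing a bootstrap sample for each tree, and combining the trees' votes. Tree training itself is omitted, and the helper names and the vote list are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as bagging does for each tree."""
    return rng.choices(rows, k=len(rows))


def majority_vote(tree_predictions):
    """Return the class predicted by the most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sample = bootstrap_sample(list(range(10)), rng)

# Hypothetical predictions from five trees for one patient.
votes = ["ckd", "notckd", "ckd", "ckd", "notckd"]
prediction = majority_vote(votes)
print(sample, prediction)
```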

XGBoost

XGBoost is a gradient boosting library optimized for speed and predictive performance. It builds trees sequentially, with each new tree correcting the errors of the ensemble so far, and it scales well to large datasets.
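
The core idea of gradient boosting can be shown with a toy example: each round fits a small model to the residuals of the current prediction. This is a simplified sketch using one-feature decision stumps and squared error, not XGBoost itself, which adds regularization, second-order gradient information, and many engineering optimizations. The data values are invented.

```python
def fit_stump(xs, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # the largest value would leave one side empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    _, t, lv, rv = best
    return lambda x: lv if x <= t else rv


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit stumps one after another, each to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Hypothetical serum creatinine values and CKD labels (1 = CKD).
xs = [1.0, 1.2, 3.5, 4.1]
ys = [0, 0, 1, 1]
model = boost(xs, ys)
print(round(model(1.0), 3), round(model(4.0), 3))
```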

Naive Bayes

Naive Bayes applies Bayes' theorem for classification. It assumes that features are conditionally independent given the class, which keeps the model simple yet often effective.
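
The independence assumption means the class score is just the log prior plus a sum of per-feature log likelihoods. Below is a minimal Gaussian Naive Bayes sketch; the Gaussian likelihood, the small variance floor, and the four training rows are assumptions for illustration, not details taken from this project.

```python
import math
from statistics import mean, pvariance


def fit(X, y):
    """Estimate a prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(y),
            # 1e-9 is an assumed variance floor to avoid division by zero
            "stats": [(mean(c), pvariance(c) + 1e-9) for c in cols],
        }
    return model


def log_gaussian(x, mu, var):
    """Log density of a normal distribution with mean mu and variance var."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def predict(model, x):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    def score(label):
        m = model[label]
        return math.log(m["prior"]) + sum(
            log_gaussian(v, mu, var) for v, (mu, var) in zip(x, m["stats"])
        )
    return max(model, key=score)


# Hypothetical rows: [hemoglobin, serum creatinine].
X = [[15.0, 1.0], [14.2, 0.9], [9.5, 3.8], [10.1, 4.5]]
y = ["notckd", "notckd", "ckd", "ckd"]
nb = fit(X, y)
print(predict(nb, [9.8, 4.0]), predict(nb, [14.8, 1.0]))
```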

AdaBoost

AdaBoost combines multiple weak classifiers into a strong classifier. After each round it increases the weights of the samples that the previous classifiers misclassified, so later classifiers focus on the harder cases.
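
The weight adjustment can be illustrated with a single boosting round of the classic binary AdaBoost update. The function name and the example inputs are hypothetical, and the sketch assumes the weighted error is strictly between 0 and 1.

```python
import math


def adaboost_round(weights, correct):
    """Return the classifier's vote weight (alpha) and the updated sample weights."""
    # Weighted error of the weak classifier; assumed to satisfy 0 < err < 1.
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1 - err) / err)
    # Shrink weights of correctly classified samples, grow the misclassified ones.
    updated = [
        w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Four samples with equal starting weight; the weak classifier misses the last one.
weights = [0.25, 0.25, 0.25, 0.25]
alpha, new_weights = adaboost_round(weights, [True, True, True, False])
print(round(alpha, 3), [round(w, 3) for w in new_weights])
```

After the update, the misclassified sample carries half of the total weight, which forces the next weak classifier to pay attention to it.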

Evaluation

Model evaluation is crucial for understanding performance. This project includes metrics such as:

  • Accuracy
  • Precision
  • Recall
  • F1 Score
  • ROC-AUC

These metrics help determine which model performs best for CKD prediction.
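
For reference, the listed metrics can be computed directly from labels, predictions, and scores as in the sketch below. It uses plain Python for clarity and is not the project's evaluation code; the function names and the sample labels and scores are made up, with 1 standing for CKD.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6]  # hypothetical predicted probabilities of CKD

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```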

Deployment

For deploying the model as a web application, we use Streamlit. To run the app, execute:

streamlit run app.py

This command will launch a web interface where users can input health metrics and receive predictions on CKD.

Contributing

We welcome contributions! If you want to help improve this project, please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Commit your changes.
  4. Push your branch to your forked repository.
  5. Create a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For any questions or suggestions, feel free to reach out.

You can also check the latest releases on the repository's Releases page.

Thank you for your interest in this project! Together, we can improve healthcare outcomes through data-driven insights.
