PialGhosh2233/Diabetes_prediction_using_ml

Diabetes Prediction Model

This repository hosts a Python-based machine learning project that predicts diabetes from patient health data. The data comes from the Kaggle dataset at https://www.kaggle.com/datasets/mathchi/diabetes-data-set.

The project uses common machine learning techniques and several popular libraries, including pandas, NumPy, scikit-learn, Matplotlib, and Seaborn, to preprocess data, train models, evaluate their performance, and visualize the results.

Project Overview

The dataset used in this project originates from the National Institute of Diabetes and Digestive and Kidney Diseases. It includes several diagnostic measurements such as glucose concentration, blood pressure, skin thickness, insulin level, BMI, age, and more.
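
As a rough illustration of what such a table looks like once loaded, the sketch below parses a few made-up rows with pandas. The column names (`Glucose`, `BloodPressure`, `SkinThickness`, `Insulin`, `BMI`, `Age`, `Outcome`) and all values are assumptions for illustration only; they are not taken from this repository, and the actual Kaggle file may be laid out differently.

```python
import io

import pandas as pd

# Hypothetical rows: column names and values are illustrative assumptions,
# not data from the repository or from Kaggle.
SAMPLE_CSV = """Glucose,BloodPressure,SkinThickness,Insulin,BMI,Age,Outcome
140,70,30,0,32.0,45,1
90,65,25,0,27.5,30,0
170,60,0,0,24.0,35,1
95,68,22,90,29.0,22,0
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Separate the diagnostic measurements (features) from the label column.
features = df.drop(columns=["Outcome"])
target = df["Outcome"]

# Count zeros per column. A zero insulin or skin-thickness reading may be a
# placeholder for a missing measurement (an assumption worth checking).
zero_counts = (features == 0).sum()

print(features.shape)
print(zero_counts.to_dict())
```

Counting zeros this way is only a first look; whether a zero really means "missing" depends on the column and should be confirmed against the dataset description.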

Key Features

  • Data Preprocessing: Includes handling missing values, feature scaling, and data transformations to prepare the dataset for modeling.
  • Model Training and Evaluation: Employs three different machine learning models:
    • Logistic Regression
    • K-Nearest Neighbors (KNN)
    • Support Vector Machine (SVM)
  • Performance Analysis: Evaluates models based on accuracy, precision, and recall. Includes detailed visualizations of model performance.
  • Data Visualization: Uses Matplotlib and Seaborn for insightful visualizations of the dataset distribution and model outcomes.
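
The preprocessing, training, and evaluation steps listed above could be sketched roughly as follows. This is a minimal example under stated assumptions, not the repository's actual code: it uses synthetic data from scikit-learn's `make_classification` in place of the Kaggle file, median imputation for missing values, standard scaling, and default-like hyperparameters chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for patient data (assumption: 8 numeric features, binary label).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=0
)

# Blank out roughly 5% of the values to mimic missing measurements.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three model families named in the list above; hyperparameters are assumptions.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
}

results = {}
for name, model in models.items():
    # Impute, scale, then fit; the pipeline learns imputation and scaling
    # statistics from the training split only.
    pipeline = make_pipeline(SimpleImputer(strategy="median"), StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Wrapping the imputer and scaler in a pipeline keeps test-set statistics out of the preprocessing step, which matters for distance-based models such as KNN and SVM.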

Contributing

Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change. Please make sure to update tests as appropriate.

License

This project is licensed under the MIT License - see the LICENSE file for details.