A Logistic Regression model for heart disease risk classification (Accuracy ≈ 80%). Full technical guide available on Medium: https://medium.com/@jeremiahomueti/build-your-first-machine-learning-model-step-by-step-no-experience-needed-090545175d9b
This project documents the development and evaluation of a supervised machine learning model designed to predict the presence of heart disease in a patient. The goal is to perform a binary classification (0 = no disease, 1 = disease) using clinical data.
The project demonstrates an end-to-end Machine Learning workflow, including data analysis, preparation, model training, and performance evaluation.
A step-by-step technical article covering the development process, the underlying concepts (e.g., train-test split, supervised learning), and a code walkthrough is available on Medium.
Full Article: https://medium.com/@jeremiahomueti/build-your-first-machine-learning-model-step-by-step-no-experience-needed-090545175d9b
- Learning Type: Supervised Learning.
- Algorithm: Logistic Regression, chosen for its effectiveness in binary classification tasks.
- Data Split: The dataset was partitioned into a training set and a testing set using an 80% / 20% split.
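The split and training steps listed above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the data here is synthetic (generated with `make_classification` as a stand-in for the clinical dataset), and the sample count, feature count, `random_state`, `stratify`, and `max_iter` values are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the clinical data (assumption: 300 rows, 13 features).
# In the real project, X would hold the patient features and y the 0/1 target.
X, y = make_classification(n_samples=300, n_features=13, random_state=42)

# Hold out 20% of the rows for testing; stratify keeps the class balance
# similar in both partitions (an optional choice, assumed here).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit a logistic regression classifier on the training partition only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Predict 0 (no disease) or 1 (disease) for the held-out rows.
y_pred = model.predict(X_test)
print(len(X_train), len(X_test))
```

Fixing `random_state` makes the split reproducible, so repeated runs report the same score.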
The model's performance was evaluated on the held-out test set (20% of the data), which the model never saw during training.
- Final Accuracy Score: Approximately 80%.
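Accuracy is the fraction of test samples whose predicted label matches the true label. The small example below illustrates the calculation with scikit-learn's `accuracy_score`; the label lists are hypothetical values made up for illustration, not the project's actual predictions.

```python
from sklearn.metrics import accuracy_score

# Hypothetical true labels and predictions for 10 patients (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]  # positions 3 and 7 are wrong

# 8 of the 10 predictions match, so accuracy = 8 / 10.
accuracy = accuracy_score(y_true, y_pred)
print(accuracy)  # 0.8
```

Accuracy alone does not show whether the errors are missed cases of disease or false alarms, so a confusion matrix is a common complement to this single number.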
This project was developed in a Jupyter Notebook using the Python programming language. The following libraries were used and are required to run the notebook:
- Data Manipulation: pandas, numpy.
- Visualization: matplotlib, seaborn.
- Machine Learning: scikit-learn (for data splitting, model training, and evaluation).