Heart Disease Prediction & Analysis
Problem Statement
Cardiovascular diseases are among the leading causes of death globally. This project aims to identify significant factors influencing heart disease and predict potential heart attacks using machine learning. The dataset contains 14 attributes and 4,000+ records, providing detailed information about patient health indicators.
Dataset
Records: 4,000+
Attributes: 14 (age, sex, cholesterol, resting blood pressure, thalassemia, etc.)
Target Variable: Presence of cardiovascular disease (CVD)
Tech Stack
Language: Python
Libraries: Pandas, NumPy, Matplotlib, Seaborn
Visualization: Tableau
Environment: Jupyter Notebook
Project Workflow
- Data Import & Inspection
  - Checked structure, missing values, and duplicates
  - Treated missing values appropriately
  - Removed duplicate records
  - Generated statistical summaries (mean, median, standard deviation)
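The inspection and cleaning steps above can be sketched with Pandas as follows. This is a minimal illustration, not the project's actual notebook code: the rows are made up, the column names (`age`, `sex`, `chol`, `target`) are assumptions, and median imputation is only one possible way to treat missing values, since the method used in the project is not stated.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame standing in for the real dataset (column names are assumptions).
df = pd.DataFrame({
    "age": [63, 45, 45, 58, np.nan, 52],
    "sex": [1, 0, 0, 1, 1, 0],
    "chol": [233.0, 210.0, 210.0, np.nan, 280.0, 199.0],
    "target": [1, 0, 0, 1, 1, 0],
})

# Structure, missing values, and duplicates
df.info()
missing_per_column = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())
print(missing_per_column)
print("duplicate rows:", duplicate_rows)

# One possible treatment (an assumption): drop exact duplicates,
# then fill numeric gaps with the column median.
clean = df.drop_duplicates().copy()
for col in ["age", "chol"]:
    clean[col] = clean[col].fillna(clean[col].median())

# Statistical summaries: mean, median, standard deviation per column
summary = clean.agg(["mean", "median", "std"])
print(summary)
```

With the real data, the `pd.DataFrame(...)` literal would be replaced by a `pd.read_csv(...)` call on the project's dataset file.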
- Exploratory Data Analysis (EDA)
  - Identified categorical variables (e.g., gender, chest pain type) and analyzed distributions using count plots
  - Studied CVD occurrence across different ages
  - Investigated impact of resting blood pressure on heart disease
  - Analyzed gender distribution of patients
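As a rough sketch of the EDA steps above, the snippet below draws Seaborn count plots for two categorical variables split by CVD status and computes the share of CVD cases per age band. The rows are made up, and the column names, category codings, and age bands are all assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows; column names and codings are assumptions.
df = pd.DataFrame({
    "age": [41, 47, 52, 55, 58, 60, 63, 66],
    "sex": ["M", "F", "M", "M", "F", "M", "M", "F"],
    "chest_pain_type": [0, 1, 2, 0, 0, 3, 0, 2],
    "target": [0, 0, 1, 1, 0, 1, 1, 1],  # 1 = CVD present
})

# Count plots of categorical variables, split by CVD status
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
sns.countplot(data=df, x="sex", hue="target", ax=axes[0])
sns.countplot(data=df, x="chest_pain_type", hue="target", ax=axes[1])
axes[0].set_title("Patients by sex and CVD status")
axes[1].set_title("Chest pain type by CVD status")
fig.tight_layout()

# Share of CVD cases per age band (band edges are arbitrary here)
df["age_band"] = pd.cut(df["age"], bins=[0, 49, 59, 120],
                        labels=["<50", "50-59", "60+"])
cvd_rate_by_age = df.groupby("age_band", observed=True)["target"].mean()
print(cvd_rate_by_age)
```

The same `groupby` pattern applies to a binned resting blood pressure column when looking at its relationship with CVD.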
- Factor Analysis & Insights
  - Explored cholesterol levels vs. CVD
  - Examined how exercise-related measures (oldpeak, exercise-induced angina) relate to CVD
  - Evaluated the role of thalassemia in CVD occurrence
  - Used pair plots to visualize variable relationships
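A numeric complement to the plots in this step might look like the sketch below: group means of cholesterol and oldpeak by CVD status, the CVD share within each thalassemia and exercise-induced angina level, and simple correlations with the target. The data is made up, and the column names (`chol`, `exang`, `thal`, `target`) and thalassemia labels are assumptions; only `oldpeak` is named in the project description.

```python
import pandas as pd

# Made-up rows; column names and category labels are assumptions.
df = pd.DataFrame({
    "chol":    [180, 200, 210, 195, 260, 245, 290, 275],
    "oldpeak": [0.0, 0.2, 0.5, 0.0, 1.8, 2.3, 3.1, 1.4],
    "exang":   [0, 0, 0, 1, 1, 1, 1, 0],  # exercise-induced angina (1 = yes)
    "thal":    ["normal", "normal", "fixed", "normal",
                "reversible", "reversible", "fixed", "reversible"],
    "target":  [0, 0, 0, 0, 1, 1, 1, 1],  # 1 = CVD present
})

# Mean of each numeric factor per CVD status
group_means = df.groupby("target")[["chol", "oldpeak"]].mean()
print(group_means)

# Share of CVD cases within each categorical level
cvd_rate_by_thal = df.groupby("thal")["target"].mean()
exang_table = pd.crosstab(df["exang"], df["target"], normalize="index")
print(cvd_rate_by_thal)
print(exang_table)

# Pearson correlation of each factor with the target
corr_with_target = df[["chol", "oldpeak", "exang"]].corrwith(df["target"])
print(corr_with_target)
```

For the visual side, `seaborn.pairplot(df, hue="target")` produces the pair plots mentioned above on the same kind of frame.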
- Predictive Modeling
  - Model: Logistic Regression
  - Trained on the processed dataset
  - Evaluated with a confusion matrix, from which accuracy, precision, recall, and F1-score were derived
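The modeling step can be illustrated with the sketch below. It assumes scikit-learn, which is not in the listed tech stack, and uses synthetic features generated with NumPy in place of the processed dataset. The train/test split ratio, the feature scaling, and the `max_iter` value are illustrative choices, not settings taken from the project.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the processed dataset (an assumption):
# three numeric features and a binary label drawn from a logistic model.
rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 3))
logits = 1.2 * X[:, 0] + 0.8 * X[:, 1] - 0.5 * X[:, 2]
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Scale features, then fit logistic regression
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Confusion matrix layout for binary labels: [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_test, y_pred)
tn, fp, fn, tp = cm.ravel()

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),    # (TP + TN) / total
    "precision": precision_score(y_test, y_pred),  # TP / (TP + FP)
    "recall": recall_score(y_test, y_pred),        # TP / (TP + FN)
    "f1": f1_score(y_test, y_pred),                # 2TP / (2TP + FP + FN)
}
print(cm)
print(metrics)
```

In a screening context, recall is often the metric to watch, since a false negative means a patient with CVD is missed.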
- Dashboarding in Tableau
  - Created visual comparisons between Diseased vs. Healthy individuals
  - Linked variables to visualize relationships
  - Built interactive CVD risk factor dashboard
Key Insights
High cholesterol, abnormal resting blood pressure, and oldpeak are strong CVD predictors.
Men showed a slightly higher incidence of heart disease than women in this dataset.
Thalassemia and exercise-induced angina are closely related to increased CVD risk.
People aged 50+ had a noticeably higher rate of heart disease in this dataset.