According to statistics from the World Health Organization (WHO), stroke is the second leading cause of death, accounting for about 11% of deaths. Predicting stroke is a classification problem, which makes the research more interesting: many algorithms exist for classification, and they can reach good predictive accuracy. That is why we are going to use machine learning to solve this problem. An accurate predictor could help decrease the death rate due to stroke.
Dataset
GAN: The file “healthcare-dataset-stroke-data.csv” is used as the training dataset.
-The training dataset contains 5110 rows and 12 columns. Here is a brief description of our dataset:
-Features in our dataset: id, gender, age, hypertension, heart disease, ever married, work type, residence type, BMI, average glucose level and smoking status.
-1 id column
-1 stroke column
-10 latent vector columns
-The dataset has been split into training and test sets with a test size of 0.3.
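
The column breakdown and the 0.3 split described above can be sketched as follows. This is only a minimal, dependency-free illustration: the source does not say which tool performed the split, so the helper names (`load_rows`, `split_columns`, `train_test_split_rows`), the shuffling, and the fixed seed are all assumptions. The only column names it relies on are `id` and `stroke`; every other column is treated as a feature.

```python
import csv
import random


def load_rows(path):
    """Read a CSV file into a list of dicts, one per row (hypothetical helper)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def split_columns(rows, id_col="id", label_col="stroke"):
    """Drop the id column and separate the stroke label from the other columns."""
    features, labels = [], []
    for row in rows:
        features.append(
            {key: value for key, value in row.items() if key not in (id_col, label_col)}
        )
        labels.append(int(row[label_col]))
    return features, labels


def train_test_split_rows(features, labels, test_size=0.3, seed=0):
    """Shuffle row indices with a fixed seed (an assumption) and hold out test_size of them."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [features[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test
```

To try it on the real file, one would call `load_rows("healthcare-dataset-stroke-data.csv")`, pass the result through `split_columns`, and then through `train_test_split_rows`. With 5110 rows and a test size of 0.3, this gives 1533 test rows and 3577 training rows. A library routine such as scikit-learn's `train_test_split` with `test_size=0.3` would serve the same purpose.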