What's the use case:

Here, machine learning is applied to the 'Bank Marketing' dataset to predict whether a person will purchase a term deposit, which is the target column 'y'. The project's main task is to predict that target value (yes/no) for unseen data and to measure how accurately models trained on the training data perform. All the models performed well, and Logistic Regression was the best with an accuracy of 90.734%.

What's the process:

EDA: First, the dataset was analyzed to understand its shortcomings and imperfections by plotting graphs, detecting outliers, and drawing a correlation heatmap, density plots, etc. This part was beneficial because it helped to understand the shape and overall scope of the dataset, and which features depend on each other.
Data Preprocessing: Missing data was handled in the preprocessing step. Encoding (turning categorical values into numerical values) and scaling the data were also done in this part.
Data Splitting and Model Training: The entire dataset was split 70/30, with 70% used as training data and the remaining 30% as testing data. In total, 4 models were trained:
```
(a)  Logistic Regression 🎯
(b)  Decision Tree
(c)  K-Nearest Neighbor Algorithm (KNN)
(d)  Neural Network
```
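The preprocessing, splitting and training steps above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the notebook's actual code: the toy DataFrame, the column names `age`, `balance` and `job`, the imputation strategies, and the `stratify` option are all assumptions made for the example. Only the 70/30 split, the target column `y`, and the use of Logistic Regression come from the project description.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in data; the real project reads 'Bank Marketing.csv'.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 80, n).astype(float),
    "balance": rng.normal(1000, 500, n),
    "job": rng.choice(["admin", "technician", "retired"], n),
})
df.loc[::25, "age"] = np.nan  # simulate some missing values
df["y"] = np.where(df["balance"] + rng.normal(0, 300, n) > 1200, "yes", "no")

X = df.drop(columns="y")
y = (df["y"] == "yes").astype(int)  # encode the yes/no target as 1/0

numeric = ["age", "balance"]      # assumed numeric columns
categorical = ["job"]             # assumed categorical columns

# Fill missing values, scale numeric features, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical),
])

# 70/30 split; stratify keeps the yes/no ratio similar in both parts (an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

model = Pipeline([("prep", preprocess),
                  ("clf", LogisticRegression(max_iter=1000))])
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Wrapping the preprocessing in a `Pipeline` means the imputer, scaler and encoder are fitted on the training split only, which avoids leaking information from the test data.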
ROC curve with AUC score, Confusion Matrix, Classification Report and Accuracy:

When the dataset is imbalanced, the accuracy score alone is not enough. To better understand which model performed best, we need to look at the classification report along with the confusion matrix (how many instances of each class were predicted correctly and incorrectly) and also the ROC curve. Together, these give a clearer picture of how each model performs.
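As a rough illustration of why accuracy alone can mislead, the sketch below computes these metrics with scikit-learn. The data is synthetic (generated with `make_classification` at an assumed 90/10 class ratio), so the numbers it prints say nothing about the project's real results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data: roughly 90% class 0 ("no"), 10% class 1 ("yes").
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_score = clf.predict_proba(X_test)[:, 1]  # probability of the "yes" class

# Accuracy of always predicting the majority class: high without any learning.
baseline = (y_test == 0).mean()
print(f"Majority-class baseline accuracy: {baseline:.3f}")

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall and F1 expose weak minority-class performance.
print(classification_report(y_test, y_pred, target_names=["no", "yes"]))

# Area under the ROC curve, computed from scores rather than hard labels.
auc = roc_auc_score(y_test, y_score)
print(f"ROC AUC: {auc:.3f}")
```

If the baseline accuracy is already close to the model's accuracy, the per-class recall and the AUC are the more informative numbers to compare across models.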

Improvements:

A few more things are planned for the near future. One is to turn the imbalanced dataset into a balanced one using oversampling techniques like SMOTE, since synthetic data could be really helpful. Another is a little more emphasis on EDA by removing more outliers. For the neural network, tuning the number of neurons in each layer could boost the accuracy score further. While the current state is promising, there's always room for improvement, and further iterations will aim to make the model even more accurate.
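To give an idea of what balancing the training data looks like, here is a minimal sketch of plain random oversampling with `sklearn.utils.resample`. Note that this is a simpler stand-in, not SMOTE itself: it duplicates existing minority rows instead of synthesizing new ones. SMOTE is available in the separate imbalanced-learn package. The array sizes below are made up for the example.

```python
import numpy as np
from sklearn.utils import resample

# Hypothetical imbalanced training set: 90 "no" rows and 10 "yes" rows.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))
y_train = np.array([0] * 90 + [1] * 10)

X_maj = X_train[y_train == 0]
X_min = X_train[y_train == 1]

# Sample minority rows with replacement until both classes are the same size.
X_min_up = resample(X_min, replace=True, n_samples=len(X_maj), random_state=0)

X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([np.zeros(len(X_maj), dtype=int),
                        np.ones(len(X_min_up), dtype=int)])

print(np.bincount(y_bal))  # class counts after oversampling
```

Oversampling should be applied to the training split only, so that the test data keeps the original class distribution.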

A report for this project is also attached. You can go through it for a better understanding.

Soon, the model will be deployed. Stay tuned for that.
