This project analyzes direct marketing campaigns (phone calls) of a Portuguese banking institution. The goal is to predict whether a client will subscribe to a term deposit based on demographic and behavioral data.
This repository contains an exploratory analysis of the dataset and a machine learning model that addresses this binary classification problem.
The dataset (banking_data.csv) contains 45,211 records with 18 attributes, including:
- Client Demographics: Age, Job, Marital Status, Education.
- Financial Status: Credit Default, Balance, Housing Loan, Personal Loan.
- Campaign Details: Contact Type, Last Contact Day/Month, Duration, Number of Contacts.
- Previous Campaign History: Days since last contact, Poutcome (outcome of previous campaign).
- Target Variable: y (has the client subscribed to a term deposit? 'yes'/'no').
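As a quick sanity check before opening the notebook, the dataset can be loaded and the balance of the target variable inspected. The snippet below is a minimal sketch, not code taken from analysis.ipynb: the helper names `load_banking_data` and `target_share` are hypothetical, the comma separator is an assumption (pass `sep=";"` if the file is semicolon-delimited), and the column names other than `y` in the inline sample are made up for illustration.

```python
import io

import pandas as pd


def load_banking_data(source, sep=","):
    """Read the CSV from a path or file-like object and check that the target exists.

    The separator is an assumption; adjust `sep` to match the actual file.
    """
    df = pd.read_csv(source, sep=sep)
    if "y" not in df.columns:
        raise ValueError("expected a target column named 'y'")
    return df


def target_share(df):
    """Return the fraction of rows in each class of the target column y."""
    return df["y"].value_counts(normalize=True)


# Tiny made-up sample standing in for banking_data.csv (illustrative values only).
sample = io.StringIO(
    "age,job,balance,y\n"
    "30,admin.,100,no\n"
    "45,technician,2500,yes\n"
    "52,retired,0,no\n"
    "38,services,870,no\n"
)
demo = load_banking_data(sample)
print(demo.shape)
print(target_share(demo))
```

With the real file, `load_banking_data("banking_data.csv")` should report the 45,211 rows and 18 columns described above, provided the separator matches.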
The key analysis is performed in analysis.ipynb, which includes:
- Exploratory Data Analysis (EDA): Visualizing distributions of age, job, marital status, and other features.
- Data Preprocessing: Handling categorical variables (One-Hot Encoding) and target variable conversion.
- Model Building: Training a Random Forest Classifier to predict term deposit subscriptions.
- Evaluation: Assessing model performance using Accuracy and Classification Reports.
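The preprocessing, model building, and evaluation steps listed above can be sketched roughly as follows. This is an illustrative outline rather than the notebook's actual code: the function name `train_and_evaluate`, the 80/20 stratified split, the hyperparameters, and the synthetic stand-in data are all assumptions, chosen so the example runs without banking_data.csv.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split


def train_and_evaluate(df, target="y", test_size=0.2, seed=42):
    """One-hot encode features, fit a Random Forest, and score it on a held-out split."""
    # One-Hot Encoding: categorical columns become indicator columns,
    # numeric columns pass through unchanged.
    X = pd.get_dummies(df.drop(columns=[target]))
    # Target conversion: 'yes'/'no' -> 1/0.
    y = df[target].map({"yes": 1, "no": 0})

    # Split size and stratification are assumptions, not taken from the notebook.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )

    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)

    pred = model.predict(X_test)
    acc = accuracy_score(y_test, pred)
    report = classification_report(y_test, pred, zero_division=0)
    return model, acc, report


# Synthetic stand-in data (made up); replace with the real DataFrame.
rng = np.random.default_rng(0)
n = 400
demo = pd.DataFrame({
    "age": rng.integers(18, 90, n),
    "job": rng.choice(["admin.", "technician", "retired"], n),
    "duration": rng.integers(0, 1000, n),
})
# Toy rule so the model has something to learn; not a property of the real data.
demo["y"] = np.where(demo["duration"] > 500, "yes", "no")

model, acc, report = train_and_evaluate(demo)
print(f"accuracy: {acc:.3f}")
print(report)
```

On the real data, the same function could be called on the DataFrame read from banking_data.csv. If the classes turn out to be imbalanced, the per-class precision and recall in the classification report are more informative than accuracy alone.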
- Python 3.x
- Jupyter Notebook
- Required libraries: pandas, numpy, matplotlib, seaborn, scikit-learn
- Clone this repository.
- Install dependencies:
pip install pandas numpy matplotlib seaborn scikit-learn jupyter
- Open the notebook:
jupyter notebook analysis.ipynb
- Run the cells to view the analysis and model results.
- analysis.ipynb: The main Jupyter Notebook with code and visualizations.
- banking_data.csv: The dataset used for analysis.
- Banking/: Contains additional project files and scripts.
- Problem Statement & Data Description.pdf: Detailed description of the problem and data dictionary.