🚀 Fraud Detection in Insurance Claims

📌 Project Overview

This project focuses on detecting fraudulent insurance claims using machine learning techniques. The dataset contains various policyholder details, claim information, and incident reports. The goal is to preprocess the data and build a model that classifies claims as fraudulent or non-fraudulent.

📊 Dataset Details

The dataset consists of multiple categorical and numerical features related to insurance claims.
The target variable (fraud_reported) indicates whether a claim is fraudulent (1) or not (0).
Some features include policy details, incident descriptions, and claim amounts.

🛠️ Data Preprocessing

Several preprocessing steps were applied to clean and prepare the dataset for modeling:

Handling Missing Values
- Replaced '?' placeholders with NaN and imputed missing categorical values with 'Unknown'.
Feature Engineering
- Converted fraud_reported from categorical (Y/N) to binary (1/0).
- Converted policy_bind_date and incident_date into datetime format.
- Dropped unnecessary columns (policy_number, insured_zip, incident_location, etc.).
- Applied one-hot encoding to categorical features.
Data Export
- The cleaned dataset is saved as insurance_claims_preprocessed.csv.
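
The steps above can be sketched roughly as follows with pandas. This is a minimal illustration, not the contents of preprocess.py: the function name `preprocess`, the `DROP_COLS` and `DATE_COLS` constants, and the tiny in-memory sample (including its `collision_type` and `total_claim_amount` columns) are assumptions made for the example. Only the three dropped columns named above are listed, and the date columns are left as datetimes because how they feed the model is not stated here.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype

# Columns named in this README as dropped; the "etc." is not expanded here.
DROP_COLS = ["policy_number", "insured_zip", "incident_location"]
DATE_COLS = ["policy_bind_date", "incident_date"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mirroring the preprocessing steps listed above."""
    # Treat '?' placeholders as missing values.
    out = df.replace("?", np.nan)

    # Convert the target from Y/N to 1/0.
    out["fraud_reported"] = out["fraud_reported"].map({"Y": 1, "N": 0})

    # Parse the date columns into datetime format.
    for col in DATE_COLS:
        out[col] = pd.to_datetime(out[col])

    # Drop columns that are not used for modeling (missing ones are skipped).
    out = out.drop(columns=DROP_COLS, errors="ignore")

    # Everything that is neither numeric nor datetime is treated as categorical.
    cat_cols = [
        c for c in out.columns
        if not is_numeric_dtype(out[c]) and not is_datetime64_any_dtype(out[c])
    ]

    # Impute missing categorical values, then one-hot encode them.
    out[cat_cols] = out[cat_cols].fillna("Unknown")
    return pd.get_dummies(out, columns=cat_cols)


# Made-up sample rows, only to show the function running end to end.
raw = pd.DataFrame({
    "policy_number": [1, 2, 3],
    "policy_bind_date": ["2014-10-17", "2006-06-27", "2000-09-06"],
    "incident_date": ["2015-01-25", "2015-01-21", "2015-02-22"],
    "collision_type": ["Side Collision", "?", "Rear Collision"],
    "total_claim_amount": [71610, 5070, 34650],
    "fraud_reported": ["Y", "N", "N"],
})
clean = preprocess(raw)
```

On the real data, the result could then be written out with `clean.to_csv("insurance_claims_preprocessed.csv", index=False)`.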

📂 Project Structure

📁 Fraud-Detection-Insurance
│── 📂 data
│   ├── insurance_claims.csv  # Raw dataset
│   ├── insurance_claims_preprocessed.csv  # Cleaned dataset
│── 📂 src
│   ├── preprocess.py  # Data preprocessing script
│   ├── model_train.py  # Model training and evaluation
│── README.md  # Project documentation

📈 Expected Outcome

The project aims to train a machine learning model that classifies insurance claims as fraudulent or non-fraudulent. The model will be evaluated using accuracy, precision, recall, and F1-score.
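
As a rough illustration of how these four metrics relate, the sketch below computes them from binary labels in plain Python. The helper name `classification_metrics` and the example labels are hypothetical and are not taken from the project's notebooks; a library such as scikit-learn provides equivalent functions. Undefined ratios (for example, precision when no claim is predicted as fraud) are returned as 0.0 here, which is a convention chosen for this sketch.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud, missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 8 legitimate claims, 2 fraudulent ones.
y_true = [0] * 8 + [1] * 2

# A model that never predicts fraud still looks accurate but catches nothing.
always_legit = classification_metrics(y_true, [0] * 10)

# A model that catches one of the two frauds and raises one false alarm.
mixed = classification_metrics(y_true, [0] * 7 + [1] + [1, 0])
```

Comparing `always_legit` with `mixed` shows why accuracy is reported alongside precision, recall, and F1 when one class is much rarer than the other.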

📌 Future Improvements

- Apply feature selection and dimensionality reduction to improve model performance.
- Try more advanced models, such as ensemble methods and deep learning.
- Deploy the model with Flask or FastAPI for real-time fraud detection.
