An end-to-end machine learning project that detects fraudulent credit card transactions with Random Forest, XGBoost, and SVM classifiers and serves real-time predictions through a FastAPI REST API.
This project includes:
- Data Cleaning & Feature Engineering
- Class Imbalance Handling with SMOTE
- Model Training with Random Forest, XGBoost, and SVM
- Hyperparameter Tuning
- Final Model Evaluation with ROC & Precision-Recall Curves
- Production-Ready API Deployment using FastAPI

Tech stack:
- Python 3.10+
- Scikit-learn
- XGBoost
- Imbalanced-learn (SMOTE)
- FastAPI
- Matplotlib / Pandas / NumPy
- Joblib

Project structure:

fraud-detection-ml/
│
├── data/
│   └── processed/
│       ├── clean_data.csv
│       ├── train_data.csv
│       └── test_data.csv
│
├── notebooks/
│   ├── 01_eda.ipynb
│   ├── 02_feature_engineering.ipynb
│   ├── 03_model_training.ipynb
│   └── 04_hyperparameter_tuning.ipynb
│
├── models/
│   ├── tuned_random_forest.pkl
│   ├── tuned_xgboost.pkl
│   ├── tuned_svm.pkl
│   └── feature_order.txt
│
├── app/
│   └── app.py
│
├── src/
│   └── final_evaluation.py
│
├── requirements.txt
└── README.md

Dataset:
- European Credit Card Fraud Dataset (Kaggle)
- 284,807 transactions
- Only 0.17% fraud cases → highly imbalanced dataset

Preprocessing:
- Duplicate removal
- Standard Scaling of numerical features
- Train-test split with stratification
- Correlation-based feature elimination
- Class imbalance handling using SMOTE

Models trained:
- Random Forest
- XGBoost
- Linear SVM (for large-scale efficiency)

Hyperparameter tuning:
- RandomizedSearchCV for RF & XGBoost
- GridSearchCV for SVM
- Tuning optimized for maximum Recall
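
A recall-oriented search of this kind might look like the sketch below. The data is synthetic, and the parameter ranges, fold count, and iteration count are illustrative assumptions rather than the values used in the notebooks. XGBoost is omitted to keep the example dependent on scikit-learn only; an `XGBClassifier` could presumably be tuned with `RandomizedSearchCV` in the same way.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import LinearSVC

# Synthetic imbalanced data (assumption): roughly 10% positive class.
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=42
)

# Randomized search for Random Forest, scored on recall of the positive class.
rf_search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={
        "n_estimators": randint(20, 60),  # illustrative range
        "max_depth": randint(2, 8),       # illustrative range
    },
    n_iter=5,
    scoring="recall",
    cv=3,
    random_state=42,
)
rf_search.fit(X, y)

# Exhaustive grid search for the linear SVM, also scored on recall.
svm_search = GridSearchCV(
    LinearSVC(dual=False, max_iter=5000),
    param_grid={"C": [0.01, 0.1, 1.0]},  # illustrative grid
    scoring="recall",
    cv=3,
)
svm_search.fit(X, y)

print("RF best params: ", rf_search.best_params_)
print("SVM best params:", svm_search.best_params_)
```

Setting `scoring="recall"` makes each search pick the configuration that misses the fewest positive (fraud) cases across folds, at the possible expense of precision.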

Evaluation metrics:
- Precision, Recall, F1-Score
- ROC-AUC
- Precision–Recall Curves
- Business Expected Loss Analysis
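
To make the business-loss idea concrete, here is one possible cost model, offered as an assumption rather than the formula used in `final_evaluation.py`: a missed fraud costs the full transaction amount, and a false alarm costs a fixed manual-review fee. The function name, the fee, and the sample data are all hypothetical.

```python
def expected_loss(y_true, y_pred, amounts, review_cost=5.0):
    """Total cost of misclassifications under a simple, assumed cost model.

    - False negative (missed fraud): the full transaction amount is lost.
    - False positive (false alarm): a fixed manual-review cost is paid.
    """
    loss = 0.0
    for truth, pred, amount in zip(y_true, y_pred, amounts):
        if truth == 1 and pred == 0:
            loss += amount       # missed fraud
        elif truth == 0 and pred == 1:
            loss += review_cost  # unnecessary review
    return loss


# Hypothetical labels, predictions, and transaction amounts.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1]
amounts = [20.0, 35.0, 500.0, 120.0, 60.0, 80.0]

print(expected_loss(y_true, y_pred, amounts))  # 125.0 (one false alarm + one missed 120.0 fraud)
```

Under a model like this, a single missed fraud can outweigh many false alarms, which is the usual argument for tuning toward recall on this kind of data.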
| Model | ROC-AUC |
|---|---|
| Random Forest | 1.000 |
| XGBoost | 1.000 ✅ (Production Model) |
| SVM | 0.989 |
✅ XGBoost was selected as the production model.

Run the API:
cd app
uvicorn app:app --reload
Open UI
http://127.0.0.1:8000/docs
Sample Request
{
"features": [30 numerical features]
}
Sample Response
{
"fraud_probability": 0.99,
"prediction": "Fraud"
}
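
A small client for the request and response shown above could look like this. It uses only the standard library. The `/predict` route is a hypothetical path, since the README does not name the endpoint; the real route is listed on the `/docs` page. The 30 values are assumed to follow the order recorded in `models/feature_order.txt`.

```python
import json
import urllib.request

# Hypothetical endpoint path (assumption); check /docs for the actual route.
API_URL = "http://127.0.0.1:8000/predict"


def build_payload(features):
    """Encode 30 numerical features as the JSON body shown in the sample request."""
    if len(features) != 30:
        raise ValueError(f"expected 30 features, got {len(features)}")
    return json.dumps({"features": [float(v) for v in features]}).encode("utf-8")


def predict(features, url=API_URL):
    """POST the features to the running API and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=build_payload(features),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling `predict([0.0] * 30)` should return a dictionary with the `fraud_probability` and `prediction` keys shown in the sample response, provided the assumed route matches the one defined in `app.py`.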
🏆 Key Highlights
✔️ Highly imbalanced classification handled correctly
✔️ Recall-optimized model tuning
✔️ Fully production-ready REST API
✔️ Business-impact-based evaluation
✔️ Industry-level project structure
# credit-card-fraud-detection-ml