title	emoji	colorFrom	colorTo	sdk	app_file	pinned	short_description	license
SantanderTransactionClassifcation	🏦	red	red	streamlit	src/streamlit_app.py	false	Predict customer transaction probability with a LightGBM.	mit

🏦 Santander Customer Transaction Prediction (LightGBM)

This project predicts the probability that a customer will make a specific transaction in the future. It is based on the Kaggle competition Santander Customer Transaction Prediction.

🔗 Live Demo & Code

💻 GitHub Repository: https://github.com/EnYa32/SantanderTransactionClassifcation

🏁 Kaggle Notebook: https://www.kaggle.com/code/enesyama/santandertransactionclassifcation

📊 Visual Evaluation

Target Distribution — strong class imbalance → ROC-AUC used as main metric

ROC Curve — LightGBM (AUC ≈ 0.8888)

Confusion Matrix (fixed threshold view)

Top Feature Importances — LightGBM

🔍 Evaluation Notes

Metric: ROC-AUC (competition metric)
Strong class imbalance: the model is judged on predicted probabilities (ranking) rather than hard labels
Threshold used only for interpretability
Final model selected via cross-model comparison
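
To make the point about thresholds concrete, here is a minimal sketch in plain Python of ROC-AUC as a ranking metric. The toy labels and probabilities are made up for illustration and are not taken from the project; in practice a library routine such as scikit-learn's `roc_auc_score` would normally be used instead of this quadratic loop.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, imbalanced data (illustrative only): 2 positives among 8 rows.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.40, 0.90]

# AUC on the raw probabilities uses the full ranking.
auc_probs = roc_auc(y_true, probs)

# Thresholding first collapses the ranking into two groups and creates ties.
hard_labels = [1 if p >= 0.5 else 0 for p in probs]
auc_hard = roc_auc(y_true, hard_labels)

print(f"AUC on probabilities: {auc_probs:.4f}")
print(f"AUC on hard labels:   {auc_hard:.4f}")
```

On this toy data the thresholded labels score lower than the raw probabilities, because the positive at 0.40 can no longer be distinguished from the negatives below the threshold.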

Problem Statement

Given 200 anonymized numeric features (var_0 … var_199), predict the probability that a customer will perform a target transaction.

Challenges:

strong class imbalance

anonymized features (no domain meaning)

probability ranking more important than hard labels

✅ Model

We trained and compared multiple models using ROC-AUC (main metric due to class imbalance):

Logistic Regression: ROC-AUC ≈ 0.86
LightGBM (Final): ROC-AUC ≈ 0.8888
XGBoost: ROC-AUC ≈ 0.8807

LightGBM achieved the best ROC-AUC and was selected as the final model.

📁 Project Files

app.py : Streamlit application
lightgbm_santander_model.pkl : saved LightGBM model (joblib)
requirements.txt : dependencies

Important: Put lightgbm_santander_model.pkl in the same folder as app.py.
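
A minimal sketch of why the location matters, assuming the app resolves the model file relative to its own folder. The helper name `find_model` is hypothetical and not taken from app.py; the actual loading code may differ.

```python
from pathlib import Path

MODEL_FILENAME = "lightgbm_santander_model.pkl"


def find_model(app_dir):
    """Return the model path inside app_dir, or raise a descriptive error.

    Hypothetical helper: it only checks that the file sits next to app.py.
    """
    model_path = Path(app_dir) / MODEL_FILENAME
    if not model_path.is_file():
        raise FileNotFoundError(
            f"{MODEL_FILENAME} not found in {app_dir}; place it next to app.py"
        )
    return model_path


# Inside app.py this could be used roughly as follows (joblib is assumed,
# since the model was saved with it):
#   import joblib
#   model = joblib.load(find_model(Path(__file__).resolve().parent))
```

Resolving the path from the script's own directory, rather than the current working directory, keeps `streamlit run` working regardless of where it is launched from.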

🚀 How to Run Locally

pip install -r requirements.txt
streamlit run app.py

🧪 How to Use the App

Upload a CSV file (e.g., Kaggle test.csv) containing:

ID_code

var_0 ... var_199

The app outputs:

probability (0–1)

predicted label (based on a threshold slider)

Download:

predictions_lightgbm.csv (probability + label)

submission_lightgbm.csv (Kaggle submission format: ID_code, target)
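
The input check and the two downloads described above can be sketched with the standard library alone. This is an illustration, not the app's actual code: the function names, the `probability` and `label` column names in the predictions file, and the 0.5 default threshold are assumptions.

```python
import csv
import io

# The 200 anonymized feature columns expected in the upload.
EXPECTED_FEATURES = [f"var_{i}" for i in range(200)]


def missing_columns(header):
    """Return the required columns that are absent from an uploaded CSV header."""
    required = ["ID_code"] + EXPECTED_FEATURES
    present = set(header)
    return [col for col in required if col not in present]


def build_outputs(id_codes, probabilities, threshold=0.5):
    """Build rows for the predictions file and the Kaggle submission file."""
    predictions = [
        {"ID_code": i, "probability": p, "label": int(p >= threshold)}
        for i, p in zip(id_codes, probabilities)
    ]
    # The submission keeps the raw probability in `target`; no threshold is applied.
    submission = [
        {"ID_code": i, "target": p} for i, p in zip(id_codes, probabilities)
    ]
    return predictions, submission


def to_csv(rows):
    """Serialize a non-empty list of dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Usage with made-up IDs and probabilities:
preds, sub = build_outputs(["test_0", "test_1"], [0.12, 0.83], threshold=0.5)
print(to_csv(preds))
print(to_csv(sub))
```

Only the label column depends on the threshold; the submission rows are identical for any slider position.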

📌 Notes

Kaggle scores submissions by ROC-AUC on the predicted probabilities, so submit the raw probabilities and do not apply a threshold.

The threshold slider in the app only affects the displayed label; it does not change the predicted probabilities.

Due to platform limits, large Kaggle test files should be processed locally. The hosted app demonstrates the deployed model on compatible CSV samples.
