An end-to-end MLOps project for detecting credit card fraud in real time using modern ML engineering best practices.
This project simulates a real-time fraud detection pipeline using AWS services, MLOps tooling, and robust monitoring. It includes:
- DVC for data versioning
- MLflow and Weights & Biases for experiment tracking
- ONNX for portable model inference
- Hydra for configuration management
- GitHub Actions for CI/CD
- uv for Python environment management
- Evidently AI for data drift monitoring
- Lambda + SQS for serverless real-time inference
| Category | Tool/Library |
|---|---|
| Data Versioning | DVC, AWS S3 |
| Model Registry | MLflow, ONNX |
| Experiment Tracking | Weights & Biases, Hydra, Matplotlib |
| Model Training | scikit-learn, PyTorch Lightning, SMOTE |
| Data Monitoring | Evidently AI, CloudWatch |
| Inference | AWS Lambda, ONNX Runtime, SQS |
| CI/CD | GitHub Actions |
| Environment Mgmt | uv, pipx |
pipx install uv
uv venv .venv
uv pip install pytorch-lightning torch pandas scikit-learn numpy hydra-core matplotlib seaborn dvc wandb onnx onnxruntime kaggle
uv add pytorch-lightning torch pandas scikit-learn numpy hydra-core matplotlib seaborn dvc wandb onnx onnxruntime kaggle
kaggle datasets download -d mlg-ulb/creditcardfraud -p data/raw/ --unzip
uv run src/preprocessing.py
dvc init
dvc remote add -d s3remote s3://<your-bucket-name>/dvcstore
dvc remote modify s3remote endpointurl https://s3.us-west-1.amazonaws.com
dvc remote modify s3remote region us-west-1
dvc add data/raw/creditcard.csv
dvc add data/processed/X_train.csv
dvc add data/processed/X_test.csv
dvc add data/processed/Y_train.csv
dvc add data/processed/Y_test.csv
git add .gitignore *.dvc dvc.yaml dvc.lock
git commit -m "Added data files tracked with DVC"
dvc push
uv run src/train.py
You can also use Docker to run the training scripts:
docker build -t fraud-detection-app:latest .
docker run fraud-detection-app:latest
wandb sweep sweep.yaml
wandb agent <sweep-agent-endpoint>
mlflow ui
The tracked results include training loss, AUC, hyperparameter optimization, and model comparison.
- Simulates credit card transactions
- Pushes JSON payloads to SQS queue
- Triggers AWS Lambda for real-time fraud prediction
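The flow above can be sketched in plain Python. This is a minimal illustration, not the project's actual `realtime_data_simulation.py` or Lambda code: the helper names (`make_transaction`, `to_sqs_event`, `score`) are hypothetical, the value ranges are made up, and `score` is a stand-in for the ONNX Runtime model call. The column names (`Time`, `V1` to `V28`, `Amount`) follow the Kaggle `creditcardfraud` dataset, and the `Records`/`body` layout is the shape in which Lambda receives SQS messages.

```python
import json
import random

# Feature columns of the Kaggle creditcardfraud dataset (label "Class" excluded).
FEATURES = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount"]


def make_transaction(rng: random.Random) -> dict:
    """Build one synthetic transaction (hypothetical value ranges)."""
    txn = {f"V{i}": round(rng.gauss(0.0, 1.0), 6) for i in range(1, 29)}
    txn["Time"] = rng.randint(0, 172_800)  # assumed: seconds in a two-day window
    txn["Amount"] = round(rng.uniform(1.0, 500.0), 2)
    return txn


def to_sqs_event(transactions: list) -> dict:
    """Wrap JSON bodies the way Lambda receives a batch of SQS messages.

    A real producer would instead send each body to the queue, for example
    with boto3: sqs.send_message(QueueUrl=..., MessageBody=json.dumps(txn)).
    """
    return {"Records": [{"body": json.dumps(t)} for t in transactions]}


def score(features: list) -> float:
    """Placeholder scorer (assumption): stands in for the ONNX model."""
    return 1.0 if features[-1] > 400.0 else 0.0


def lambda_handler(event, context):
    """Parse each SQS record, order the features, and score the transaction."""
    results = []
    for record in event["Records"]:
        txn = json.loads(record["body"])
        features = [float(txn[name]) for name in FEATURES]
        results.append({"fraud_score": score(features)})
    return {"statusCode": 200, "body": json.dumps(results)}


if __name__ == "__main__":
    rng = random.Random(42)
    event = to_sqs_event([make_transaction(rng) for _ in range(3)])
    print(lambda_handler(event, None))
```

Fixing the feature order in `FEATURES` matters because a model exported to ONNX expects its inputs in the same column order used during training, whereas JSON object keys carry no such guarantee.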
python realtime_data_simulation.py
- Lambda logs → CloudWatch
- Open `results/drift_report.html` in a web browser to monitor data drift in the real-time data.
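
To give a feel for what a drift score measures, here is a rough sketch of the Population Stability Index (PSI) for a single numeric feature. It is an illustration of the idea only: it does not use Evidently's API and is not how the report in this project is generated. The function name `psi`, the bin count, and the smoothing constant are assumptions.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    Bins are equal-width over the reference range; current values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


if __name__ == "__main__":
    training_amounts = [i / 100 for i in range(100)]
    live_amounts = [x + 0.5 for x in training_amounts]  # shifted distribution
    print("no drift:", psi(training_amounts, training_amounts))
    print("shifted: ", psi(training_amounts, live_amounts))
```

A commonly cited rule of thumb treats PSI below 0.1 as stable, 0.1 to 0.25 as moderate shift, and above 0.25 as significant shift, though any threshold should be tuned to the feature and use case.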

Raahul Krishna Durairaju Machine Learning & MLOps Practitioner | MS CS @ Cal State Fullerton





