End-to-End E-Commerce Intelligence Platform

A production-grade machine learning platform featuring Churn Prediction, Demand Forecasting, Customer Segmentation, NLP Sentiment Analysis, Recommendation Engines, and Streaming Fraud Detection.

📖 Project Overview

This repository demonstrates a complete, end-to-end data science lifecycle. Rather than a collection of standalone scripts, it is built as a single (monolithic) intelligence platform whose interconnected machine learning pipelines process relational e-commerce and financial data to support business decisions.

It features robust data engineering (leakage-free validation splitting, cyclical feature extraction), GPU-accelerated algorithms (RAPIDS cuML, CuPy), and advanced modeling techniques including Nested Cross-Validation, Stacking Ensembles, and Streaming Learning via mini-batches.
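
To make two of the data-engineering ideas above concrete, here is a minimal sketch of cyclical feature encoding and a chronological, leakage-free validation split. It is an illustration under stated assumptions, not the repository's actual code: the function names `cyclical_encode` and `chronological_split` and the sample order records are hypothetical.

```python
import math
from datetime import date

def cyclical_encode(value, period):
    """Map a periodic value (e.g. month 1-12) onto the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def chronological_split(rows, date_key, cutoff):
    """Train on rows strictly before `cutoff`, validate on the rest.

    Splitting by time instead of at random keeps future information
    out of the training set.
    """
    train = [r for r in rows if r[date_key] < cutoff]
    valid = [r for r in rows if r[date_key] >= cutoff]
    return train, valid

# Hypothetical order records, for illustration only.
orders = [
    {"order_date": date(2018, 1, 15), "amount": 40.0},
    {"order_date": date(2018, 6, 3), "amount": 25.5},
    {"order_date": date(2018, 12, 20), "amount": 99.9},
]

# Add sin/cos month features so that December and January are neighbours.
for row in orders:
    row["month_sin"], row["month_cos"] = cyclical_encode(row["order_date"].month, 12)

train, valid = chronological_split(orders, "order_date", date(2018, 7, 1))
```

With a raw month number, December (12) and January (1) look far apart to a model; on the sin/cos circle they are adjacent, which is the point of the encoding.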

🎯 Business Objectives

Reduce Customer Churn: Identify at-risk users early via Ensemble Classification.
Optimize Inventory: Predict weekly category demand using Time-Series Regression.
Personalize Marketing: Group users into distinct RFM clusters.
Automate Feedback Analysis: Score Portuguese customer reviews for sentiment and urgency.
Increase Average Order Value: Serve personalized products via Collaborative Filtering (SVD).
Mitigate Financial Risk: Detect anomalous transactions in real time using Isolation Forests and Streaming SGD.
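
The collaborative-filtering objective in the list above can be illustrated with a small Funk SVD sketch. This is a plain-Python stand-in under stated assumptions, not the project's CuPy implementation: the toy ratings, the hyperparameters, and the names `train_funk_svd`, `predict`, and `rmse` are all hypothetical.

```python
import math
import random

def train_funk_svd(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
                   epochs=200, seed=0):
    """Fit user/item latent factors with per-rating SGD (Funk SVD).

    `ratings` is a list of (user, item, rating) triples. Returns the
    global mean and the two factor matrices as lists of lists.
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = mu + sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, P, Q

def predict(mu, P, Q, u, i):
    """Predicted rating: global mean plus the user-item dot product."""
    return mu + sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(ratings, mu, P, Q):
    """Root mean squared error over the observed ratings."""
    se = sum((r - predict(mu, P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))

# Toy (user, item, rating) triples, for illustration only.
toy = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]

# Compare the untrained factors (epochs=0) with the fitted ones.
baseline = rmse(toy, *train_funk_svd(toy, 3, 3, epochs=0))
fitted = rmse(toy, *train_funk_svd(toy, 3, 3))
```

Only observed ratings contribute to the updates, which is what distinguishes Funk SVD from a classical SVD of a dense matrix. The error here is measured on the training ratings; a real evaluation would use held-out folds.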

🏗️ Architecture

graph LR
    A[Raw Data] --> B(Ingestion & Feature Eng)
    B --> C{ML Pipelines}
    C -->|Classify| D[Churn & Conversion]
    C -->|Regress| E[Demand Forecaster]
    C -->|Cluster| F[Segmentation]
    C -->|NLP| G[Review Engine]
    C -->|Recommend| H[SVD Filtering]
    C -->|Stream| I[Fraud Detection]
    D & E & F & G & H & I --> J[(Model Artifacts)]
    J --> K[Streamlit Interactive App]

For an in-depth breakdown of the data flow and model pipelines, please see the docs/ directory.
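
As a rough illustration of the flow in the diagram (pipelines write model artifacts, the app loads them), the following sketch uses only the standard library. Everything in it is hypothetical: the `PIPELINES` registry, the stand-in `train_mean_baseline` "model", the `.pkl` file naming, and the temporary directory are assumptions made for the example and do not describe the repository's real modules.

```python
import pickle
import tempfile
from pathlib import Path

def train_mean_baseline(values):
    """Stand-in 'model': a dict holding the training mean."""
    return {"kind": "mean_baseline", "mean": sum(values) / len(values)}

# Hypothetical registry: pipeline name -> (training function, training data).
PIPELINES = {
    "demand_forecaster": (train_mean_baseline, [10.0, 12.0, 14.0]),
    "churn": (train_mean_baseline, [0.0, 1.0, 0.0, 0.0]),
}

def train_all(models_dir):
    """Train each pipeline and serialize its artifact to disk."""
    models_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, (train_fn, data) in PIPELINES.items():
        path = models_dir / f"{name}.pkl"
        with path.open("wb") as fh:
            pickle.dump(train_fn(data), fh)
        written.append(path)
    return written

def load_artifact(models_dir, name):
    """What a serving layer (e.g. a Streamlit app) might do at startup."""
    with (models_dir / f"{name}.pkl").open("rb") as fh:
        return pickle.load(fh)

# Use a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    models_dir = Path(tmp) / "models"
    paths = train_all(models_dir)
    forecaster = load_artifact(models_dir, "demand_forecaster")
```

Keeping training and serving connected only through files on disk means the app never needs the training code at runtime, at the cost of having to version the artifacts.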

🛠️ Technologies Used

Data Engineering: Pandas, NumPy, SQLite
Machine Learning: Scikit-Learn, XGBoost, LightGBM, RAPIDS cuML, CuPy
Hyperparameter Tuning: Optuna (Bayesian Search)
Model Explainability: SHAP, LIME
Frontend / Deployment: Streamlit

📂 Project Structure

omni-retail-customer-analysis/
├── app/                  # Streamlit frontend application
├── data/                 # Raw and processed datasets
│   ├── raw/              # Raw data (ecommerce/, fashion-mnist/, fraud/)
│   └── processed/        # Processed data (ecommerce_intelligence/, fraud_detection/)
├── docs/                 # Architecture & API documentation
├── models/               # Serialized ML models
│   ├── ecommerce_intelligence/
│   ├── fashion_classifier/
│   └── fraud_detection/
├── notebooks/            # Jupyter notebooks organized by sub-project
│   ├── ecommerce_intelligence/
│   ├── fashion_classifier/
│   └── fraud_detection/
├── reports/              # Generated outputs, matrices, and evaluation plots
│   └── figures/          # Evaluation figures organized by sub-project
│       ├── ecommerce_intelligence/
│       ├── fashion_classifier/
│       └── fraud_detection/
├── src/                  # Reusable Python modules and utility functions
├── tests/                # Unit tests for data engineering logic
├── README.md             # Project overview
└── requirements.txt      # Python dependencies (pip)

📊 Key Findings & Performance Metrics

Pipeline	Core Model	Key Metric	Result (qualitative)
Churn/Conversion	Stacking (XGB, LGBM, RF)	ROC-AUC	Top-tier performance with leakage mitigation
Demand Forecasting	Optuna-tuned LightGBM	SMAPE	High accuracy on weekly seasonality
Segmentation	RAPIDS PCA + GMM / K-Means	Silhouette	4 distinct personas identified
Review NLP	LinearSVC (TF-IDF)	F1-Score	Robust on imbalanced Portuguese text
Recommender	Custom CuPy Funk SVD	RMSE	Tuned via nested CV
Fraud Detection	Streaming SGDClassifier	Recall	Adaptive to incoming data streams
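
For readers unfamiliar with two of the metrics in the table, here is a minimal sketch of SMAPE and RMSE. SMAPE has several variants in the literature; this one assumes the common form that averages |forecast - actual| / ((|actual| + |forecast|) / 2) and reports a percentage, which may differ from the variant used in the notebooks. The sample numbers are hypothetical and are not results from this project.

```python
import math

def smape(actual, forecast):
    """Symmetric MAPE in percent (0 to 200); 0/0 terms count as 0."""
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        total += abs(f - a) / denom if denom else 0.0
    return 100.0 * total / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the units of the target."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical weekly demand and forecasts, for illustration only.
weekly_demand = [100, 120, 80]
weekly_forecast = [110, 120, 60]

print(f"SMAPE: {smape(weekly_demand, weekly_forecast):.2f}%")
print(f"RMSE:  {rmse(weekly_demand, weekly_forecast):.2f}")
```

SMAPE is scale-free, which makes it convenient for comparing demand categories of very different volume, while RMSE stays in the units of the target and penalizes large misses more heavily.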

🚀 Installation & Usage

1. Clone the Repository

git clone https://github.com/ikartiksavaliya/omni-retail-customer-analysis.git
cd omni-retail-customer-analysis

2. Set Up the Environment

Create a conda environment and install the required dependencies with pip:

conda create -n retail-ml python=3.10 -y
conda activate retail-ml
pip install -r requirements.txt

3. Run the Application

Launch the interactive Streamlit dashboard:

streamlit run app/main.py

🔮 Future Improvements

Migrate SQLite database to PostgreSQL.
Dockerize the application and deploy to AWS Elastic Beanstalk or GCP Cloud Run.
Replace LinearSVC with a lightweight Transformer (e.g., BERTimbau) for Portuguese NLP.
Implement Airflow or Prefect for automated retraining pipelines.

Created as a comprehensive portfolio piece demonstrating production-ready Machine Learning engineering.
