iBrokeTheCode/E-Commerce_ELT

title: E-Commerce ELT
emoji: 🍃
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: true
license: mit
short_description: Extract, Load, Transform Pipeline applied to an E-Commerce

📦 E-Commerce ELT Pipeline

Table of Contents

  1. Project Description
  2. Methodology & Key Features
  3. Technology Stack
  4. Dataset

1. Project Description

This project showcases an Extract, Load, and Transform (ELT) pipeline applied to a real-world e-commerce dataset. The primary goal is to extract valuable business insights from transactional data and present them through an interactive dashboard. The pipeline integrates data from the Brazilian E-Commerce Public Dataset by Olist, which contains over 100,000 orders from 2016 to 2018, and also incorporates data from the Public Holiday API to analyze sales performance during national holidays.

The dashboard provides a detailed view of the e-commerce experience, including:

  • Order status, prices, and payment types
  • Freight and delivery performance
  • Customer locations and product categories
  • Customer reviews and satisfaction

Important

  • Check out the deployed app here: 👉️ E-Commerce ELT 👈️
  • Check out the Jupyter Notebook for a detailed walkthrough of the project here: 👉️ Jupyter Notebook 👈️

Dashboard

2. Methodology & Key Features

The ELT pipeline extracts raw data, loads it into a database, and then transforms it to generate key metrics and visualizations. The results are presented in an interactive dashboard built with Marimo, a reactive Python notebook.
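The extract, load, transform order described above can be sketched as follows. This is a minimal illustration and not the project's actual code: the sample rows, the `raw_orders` table, and the function names are all hypothetical, and the standard-library `sqlite3` module stands in for SQLAlchemy to keep the sketch dependency-free. The point it shows is that raw data is loaded unchanged and only transformed afterwards, inside the database.

```python
import sqlite3

# Hypothetical raw rows standing in for extracted order data.
# Prices are kept as text on purpose: ELT loads raw data before cleaning it.
RAW_ORDERS = [
    ("o1", "delivered", "2017-09-07", "29.99"),
    ("o2", "delivered", "2017-09-07", "110.00"),
    ("o3", "canceled", "2017-09-08", "45.50"),
]


def load(conn, rows):
    """Load: store the raw rows untouched in a staging table."""
    conn.execute(
        "CREATE TABLE raw_orders ("
        "order_id TEXT, status TEXT, purchase_date TEXT, price TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def transform(conn):
    """Transform: compute daily revenue for delivered orders with SQL."""
    query = """
        SELECT purchase_date,
               ROUND(SUM(CAST(price AS REAL)), 2) AS revenue
        FROM raw_orders
        WHERE status = 'delivered'
        GROUP BY purchase_date
        ORDER BY purchase_date
    """
    return conn.execute(query).fetchall()


def run_pipeline(rows):
    """Run load and transform against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        load(conn, rows)
        return transform(conn)
    finally:
        conn.close()
```

Calling `run_pipeline(RAW_ORDERS)` returns one `(purchase_date, revenue)` tuple per day that has delivered orders. Because the cast from text to a number happens in the transform step, the staging table can always be re-queried with a different metric without re-extracting the data.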

Key Features:

  • Data Integration: Combines e-commerce order data with public holiday information to analyze temporal sales patterns.
  • Data Transformation: Cleans and prepares raw data for analysis, enabling the calculation of key performance indicators (KPIs).
  • Interactive Dashboard: Provides a dynamic and user-friendly interface for exploring business insights.
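The data integration step listed above might look roughly like this in Pandas. It is a sketch under stated assumptions: the sample frames, the `date` and `price` columns, and both function names are made up for illustration, and the real pipeline may join the holiday data differently (for example in SQL). It flags each order whose purchase date falls on a public holiday and then totals revenue for holiday and non-holiday days.

```python
import pandas as pd

# Hypothetical sample data; column names are assumptions for illustration.
orders = pd.DataFrame(
    {
        "order_id": ["o1", "o2", "o3"],
        "order_purchase_timestamp": [
            "2017-09-07 10:15:00",
            "2017-09-07 18:40:00",
            "2017-09-08 09:05:00",
        ],
        "price": [29.99, 110.00, 45.50],
    }
)
holidays = pd.DataFrame({"date": ["2017-09-07"], "name": ["Independence Day"]})


def flag_holiday_orders(orders, holidays):
    """Return a copy of `orders` with a boolean `is_holiday` column."""
    flagged = orders.copy()
    # Drop the time of day so timestamps compare equal to holiday dates.
    flagged["purchase_date"] = pd.to_datetime(
        flagged["order_purchase_timestamp"]
    ).dt.normalize()
    holiday_dates = pd.to_datetime(holidays["date"])
    flagged["is_holiday"] = flagged["purchase_date"].isin(holiday_dates)
    return flagged


def revenue_by_holiday_flag(flagged):
    """Total revenue keyed by whether the order was placed on a holiday."""
    return flagged.groupby("is_holiday")["price"].sum().to_dict()
```

A typical call chains the two helpers: `revenue_by_holiday_flag(flag_holiday_orders(orders, holidays))`. Normalizing the timestamp to midnight before the membership check matters, since an order placed at 10:15 on a holiday would otherwise never match the holiday's date.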

3. Technology Stack

This project was built using the following technologies and libraries:

Dashboard & Hosting:

  • Marimo: A reactive Python notebook that can be run as an interactive app or dashboard.
  • Hugging Face Spaces: Used for hosting and sharing the interactive dashboard.

Data Analysis & Visualization:

  • Pandas: For data manipulation and analysis.
  • Plotly: For creating interactive data visualizations.
  • Matplotlib: For creating static visualizations.
  • Seaborn: For creating statistical graphics.

Data Handling & Utilities:

  • SQLAlchemy: For interacting with databases.
  • Requests: For making HTTP requests to external APIs.

Development Tools:

  • Ruff: A fast Python linter and code formatter.
  • uv: A fast Python package installer and resolver.

4. Dataset

This project uses the Brazilian E-Commerce Public Dataset by Olist, published on Kaggle. It is a public dataset with details on over 100,000 orders placed from 2016 to 2018 and covers a wide range of transactional information.

Here is the entity-relationship diagram (ERD) for the database schema:

ERD
