Retail Data Pipeline with AWS S3 bucket, Databricks, GCP BigQuery and Looker Dashboard 📊
This is an end-to-end data engineering project in which I built an ELT data pipeline to extract, analyze, and visualize insights from the data of an online retail company based in the UK.
The source is a transnational data set that contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells unique all-occasion gifts, and many of its customers are wholesalers.
The dataset includes the following columns:
| Column | Description |
|---|---|
| InvoiceNo | Invoice number. Nominal, a 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation. |
| StockCode | Product (item) code. Nominal, a 5-digit integral number uniquely assigned to each distinct product. |
| Description | Product (item) name. Nominal. |
| Quantity | The quantities of each product (item) per transaction. Numeric. |
| InvoiceDate | Invoice Date and time. Numeric, the day and time when each transaction was generated. |
| UnitPrice | Unit price. Numeric, Product price per unit in sterling. |
| CustomerID | Customer number. Nominal, a 5-digit integral number uniquely assigned to each customer. |
| Country | Country name. Nominal, the name of the country where each customer resides. |
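
As a rough illustration of how these columns could be typed before any transformation, here is a minimal sketch in plain Python. The sample rows are made up, the `parse_row` helper and the `line_total` field are hypothetical, and the `day/month/year hour:minute` date format is an assumption rather than something this project specifies.

```python
import csv
import io
from datetime import datetime

# Made-up sample in the column layout described above.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
536365,85123,SAMPLE GIFT A,6,01/12/2010 08:26,2.55,17850,United Kingdom
C536379,22222,SAMPLE GIFT B,-1,01/12/2010 09:41,27.50,14527,United Kingdom
"""


def parse_row(row):
    """Convert one CSV row (a dict of strings) into typed fields."""
    invoice_no = row["InvoiceNo"].strip()
    quantity = int(row["Quantity"])
    unit_price = float(row["UnitPrice"])
    return {
        "invoice_no": invoice_no,
        "stock_code": row["StockCode"].strip(),
        "description": row["Description"].strip(),
        "quantity": quantity,
        # Assumed format: day/month/year hour:minute.
        "invoice_date": datetime.strptime(row["InvoiceDate"], "%d/%m/%Y %H:%M"),
        "unit_price": unit_price,
        # An empty CustomerID becomes None.
        "customer_id": row["CustomerID"].strip() or None,
        "country": row["Country"].strip(),
        # Case-insensitive check for the cancellation prefix.
        "is_cancellation": invoice_no.lower().startswith("c"),
        # Hypothetical derived field: quantity times unit price.
        "line_total": round(quantity * unit_price, 2),
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

In the pipeline itself this typing would more likely happen in Spark; the plain-Python version only shows the intent of each column.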
- Ingest CSV files from S3 into Databricks
- Clean and transform the data with Spark into Parquet tables
- Data modeling: implement a star schema for analytical queries
- Load processed tables into BigQuery
- Provide interactive dashboards in Looker Studio
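
The star-schema step can be sketched as follows. This is only an illustration in plain Python so it runs without a cluster; the table names (`dim_customer`, `dim_product`, `fact_sales`), the choice of dimensions, and the surrogate keys are assumptions, since the actual model used in the Spark job is not described here.

```python
def build_star_schema(rows):
    """Split flat transaction rows into two dimensions and one fact table."""
    dim_customer = {}  # (CustomerID, Country) -> surrogate key
    dim_product = {}   # (StockCode, Description) -> surrogate key
    fact_sales = []

    for row in rows:
        # setdefault returns the existing key, or stores the next integer.
        customer_key = dim_customer.setdefault(
            (row["CustomerID"], row["Country"]), len(dim_customer) + 1
        )
        product_key = dim_product.setdefault(
            (row["StockCode"], row["Description"]), len(dim_product) + 1
        )
        # The fact row keeps measures plus references to the dimensions.
        fact_sales.append({
            "invoice_no": row["InvoiceNo"],
            "invoice_date": row["InvoiceDate"],
            "customer_key": customer_key,
            "product_key": product_key,
            "quantity": row["Quantity"],
            "unit_price": row["UnitPrice"],
        })
    return dim_customer, dim_product, fact_sales


# Made-up rows, only to show the shape of the result.
sample_rows = [
    {"InvoiceNo": "100001", "StockCode": "11111", "Description": "SAMPLE GIFT A",
     "Quantity": 2, "InvoiceDate": "01/12/2010 08:26", "UnitPrice": 2.5,
     "CustomerID": "10001", "Country": "United Kingdom"},
    {"InvoiceNo": "100001", "StockCode": "22222", "Description": "SAMPLE GIFT B",
     "Quantity": 1, "InvoiceDate": "01/12/2010 08:26", "UnitPrice": 4.0,
     "CustomerID": "10001", "Country": "United Kingdom"},
    {"InvoiceNo": "100002", "StockCode": "11111", "Description": "SAMPLE GIFT A",
     "Quantity": 3, "InvoiceDate": "02/12/2010 10:00", "UnitPrice": 2.5,
     "CustomerID": "10002", "Country": "France"},
]

dim_customer, dim_product, fact_sales = build_star_schema(sample_rows)
```

Keeping descriptive attributes in the dimensions and only keys and measures in the fact table is what lets the dashboard queries group by country or product with simple joins.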
- 💸 Total Revenue by Country
  - The UK 🇬🇧 generated most of the company's revenue, with over 1.8M, followed by France with 182.4K.
- 📈 Revenue by Month
  - The month with the highest revenue is July, with more than 220K.
  - The month with the lowest revenue is December, with 100K.
We can observe significant revenue increases in January (New Year), July (Wimbledon Finals), and November (Bonfire Night).
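
The figures above come from the Looker Studio dashboard. As a hedged sketch of the underlying aggregation, revenue per group can be computed as the sum of `Quantity * UnitPrice`. The `revenue_by` helper and the sample rows are hypothetical, and whether the dashboard excludes cancellations is not stated, so this version simply sums every row it is given.

```python
from collections import defaultdict
from datetime import datetime


def revenue_by(rows, key_fn):
    """Sum Quantity * UnitPrice per group, highest total first."""
    totals = defaultdict(float)
    for row in rows:
        totals[key_fn(row)] += row["Quantity"] * row["UnitPrice"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows, not taken from the real data set.
sample_rows = [
    {"Country": "United Kingdom", "InvoiceDate": datetime(2011, 7, 4, 10, 0),
     "Quantity": 4, "UnitPrice": 5.0},
    {"Country": "United Kingdom", "InvoiceDate": datetime(2011, 11, 5, 9, 30),
     "Quantity": 2, "UnitPrice": 10.0},
    {"Country": "France", "InvoiceDate": datetime(2011, 7, 10, 14, 15),
     "Quantity": 1, "UnitPrice": 12.5},
]

by_country = revenue_by(sample_rows, lambda r: r["Country"])
by_month = revenue_by(sample_rows, lambda r: r["InvoiceDate"].month)
```

Passing the grouping as a function keeps one helper usable for both charts: country for the first, calendar month for the second.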
- This project is inspired by this video from the YouTube channel Darshil Parmar.
LinkedIn • Website • Gmail: chahiri.abderrahmane.eng@gmail.com






