Aly-azzam/air-quality-data-engineering

🌍 Air Quality Data Engineering Pipeline — Paris

A Complete ETL + FastAPI + Docker Pipeline for Real-Time Air Quality Analytics
Developed by Ali Azzam


Python · FastAPI · Docker · ETL · SQLite


📘 Project Overview

This project implements a complete Data Engineering pipeline using real air quality data from the World Air Quality Index (WAQI) API for Paris, France.

It demonstrates:

  • Automated data ingestion
  • Cleaning and preprocessing with Pandas
  • Storage in SQLite
  • REST API built with FastAPI
  • Dockerized deployment
  • Full documentation with screenshots

🏗️ Architecture Diagram

Architecture


🚀 1. Data Ingestion

Script:

python script/ingest.py

Fetches Paris AQI data
Saves: data/raw_air_quality.csv
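
As a rough illustration of this step, the sketch below requests the WAQI city feed and flattens the response into one CSV row. The helper names, the token placeholder, and the column layout are assumptions made for illustration and are not taken from script/ingest.py. The response shape used here (`status`, `data.aqi`, `data.iaqi.<pollutant>.v`, `data.time.s`) follows the public WAQI feed format.

```python
import csv
import json
import urllib.request
from pathlib import Path

# Public WAQI city feed endpoint; the token is issued by WAQI on request
WAQI_URL = "https://api.waqi.info/feed/{city}/?token={token}"


def fetch_feed(city: str, token: str) -> dict:
    """Download the raw JSON feed for one city (needs a valid WAQI token)."""
    url = WAQI_URL.format(city=city, token=token)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def flatten_feed(payload: dict) -> dict:
    """Turn the nested feed into one flat row (column names are hypothetical)."""
    if payload.get("status") != "ok":
        raise ValueError(f"WAQI returned status {payload.get('status')!r}")
    data = payload["data"]
    row = {"timestamp": data["time"]["s"], "aqi": data["aqi"]}
    # iaqi maps each pollutant name to {"v": value}; keep only the value
    for name, reading in data.get("iaqi", {}).items():
        row[name] = reading["v"]
    return row


def append_row(row: dict, path: str) -> None:
    """Append one row to the CSV, writing the header on first use.

    Assumes every call yields the same set of columns.
    """
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    is_new = not target.exists()
    with target.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(row))
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

With a real token, one ingestion run would then be `append_row(flatten_feed(fetch_feed("paris", "YOUR_TOKEN")), "data/raw_air_quality.csv")`.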

Screenshot

Ingestion


🧹 2. Data Cleaning

Script:

python script/clean.py

Loads raw CSV
Cleans and restructures data
Saves: data/clean_air_quality.csv
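
The README does not spell out the cleaning rules, so the following is only a plausible sketch of what a Pandas cleaning pass could look like: parse timestamps, coerce measurements to numbers, drop unusable rows, and de-duplicate. The function name and the `timestamp` / `aqi` column names are assumptions, not the contents of script/clean.py.

```python
import pandas as pd


def clean_air_quality(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a tidy copy of the raw readings (column names are assumed)."""
    df = raw.copy()
    # Parse timestamps; unparseable values become NaT and are dropped below
    df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
    # Coerce every measurement column to numeric, turning junk such as "-" into NaN
    value_cols = [c for c in df.columns if c != "timestamp"]
    df[value_cols] = df[value_cols].apply(pd.to_numeric, errors="coerce")
    # A row is only useful if it has both a timestamp and an AQI value
    df = df.dropna(subset=["timestamp", "aqi"])
    # Keep the most recent reading when the same timestamp appears twice
    df = df.drop_duplicates(subset="timestamp", keep="last")
    return df.sort_values("timestamp").reset_index(drop=True)
```

A script built on this could read `data/raw_air_quality.csv` with `pd.read_csv`, pass it through `clean_air_quality`, and write the result with `to_csv("data/clean_air_quality.csv", index=False)`.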

Screenshot

Cleaning


🗄️ 3. SQL Database Storage

Script:

python script/database.py

Creates SQLite database
Inserts cleaned records
Outputs: data/air_quality.db
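
A minimal sketch of the loading step, using only the standard library: it reads the cleaned CSV, creates a table whose columns mirror the CSV header, and inserts the rows in one transaction. The table name `measurements`, the column types, and the function name are assumptions for illustration; script/database.py may be organised differently.

```python
import csv
import sqlite3


def load_csv_into_sqlite(csv_path: str, db_path: str, table: str = "measurements") -> int:
    """Create the table if needed and insert every CSV row; returns the row count."""
    with open(csv_path, newline="") as fh:
        reader = csv.DictReader(fh)
        columns = reader.fieldnames or []
        # Empty CSV cells become NULL instead of empty strings
        rows = [tuple(r[c] or None for c in columns) for r in reader]

    # Identifiers cannot be bound as parameters, so they are quoted explicitly
    col_defs = ", ".join(
        f'"{c}" TEXT' if c == "timestamp" else f'"{c}" REAL' for c in columns
    )
    placeholders = ", ".join("?" for _ in columns)

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
            conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    finally:
        conn.close()
    return len(rows)
```

Calling `load_csv_into_sqlite("data/clean_air_quality.csv", "data/air_quality.db")` would then produce the database file named above. Numeric strings from the CSV are stored as numbers because of SQLite's REAL column affinity.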

Screenshot

SQL


🌐 4. FastAPI Backend

Run locally:

uvicorn api.main:app --reload

Endpoints:

Endpoint    Description
/           API status
/city       Returns "Paris"
/latest     Latest AQI measurement
/stats      Pollutant statistics

Swagger UI

Swagger

Latest AQI Example

Latest


🐳 5. Docker Deployment

Build image:

docker build -t paris-air-api .

Run container:

docker compose up

API available at:
http://localhost:8000
http://localhost:8000/docs
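
Once the container is up, a quick way to confirm the API is reachable is to call each documented endpoint once. The sketch below does that with the standard library; the function names are hypothetical, and it assumes every endpoint answers with a JSON body.

```python
import json
import urllib.request
from typing import Callable

ENDPOINTS = ["/", "/city", "/latest", "/stats"]  # routes listed in this README


def http_get_json(url: str):
    """Fetch a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def smoke_test(base_url: str = "http://localhost:8000", get: Callable = http_get_json) -> dict:
    """Call every endpoint once; map each path to True if it answered with JSON."""
    results = {}
    for path in ENDPOINTS:
        try:
            get(base_url + path)
            results[path] = True
        except (OSError, ValueError):
            # URLError and HTTPError subclass OSError; invalid JSON raises ValueError
            results[path] = False
    return results
```

Running `smoke_test()` against the default base URL returns a dict such as `{"/": True, ...}`, with `False` for any route that could not be reached.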

Docker Running Screenshot

Docker Running


📂 Project Structure

Project Data/
│
├── api/
│   └── main.py
│
├── script/
│   ├── ingest.py
│   ├── clean.py
│   └── database.py
│
├── data/
│   ├── raw_air_quality.csv
│   ├── clean_air_quality.csv
│   └── air_quality.db
│
├── screenshots/
│   ├── architecture.png
│   ├── ingestion.png
│   ├── cleaning.png
│   ├── sql_creation.png
│   ├── swagger_ui.png
│   ├── latest_endpoint.png
│   └── docker_running.png
│
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
└── README.md

🎯 Key Highlights

  • Real ETL pipeline
  • Clean modular code
  • FastAPI backend
  • Dockerized deployment
  • Professional documentation
  • Architecture diagram included

👨‍💻 Author

Ali Azzam
Computer & Communication Engineering (CCE)
Université Saint-Joseph (USJ), Lebanon


📜 License

MIT License.
