A Complete ETL + FastAPI + Docker Pipeline for Real-Time Air Quality Analytics
Developed by Ali Azzam
This project implements a complete Data Engineering pipeline using real air quality data from the World Air Quality Index (WAQI) API for Paris, France.
It demonstrates:
- Automated data ingestion
- Cleaning and preprocessing with Pandas
- Storage in SQLite
- REST API built with FastAPI
- Dockerized deployment
- Full documentation with screenshots
Script:
python script/ingest.py
Fetches Paris AQI data
Saves: data/raw_air_quality.csv
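As a rough illustration of this ingestion step, the sketch below fetches one reading and appends it to the raw CSV. It is not the project's actual `ingest.py`. The WAQI city-feed URL pattern, the payload shape (`status`, `data.aqi`, `data.iaqi`, `data.time.s`, `data.city.name`), the `FIELDS` column list, and the helper names are all assumptions made for the example.

```python
import csv
import json
import urllib.request
from pathlib import Path

# Assumption: WAQI city-feed URL pattern; a personal API token is required.
WAQI_URL = "https://api.waqi.info/feed/{city}/?token={token}"

# Hypothetical column set for the raw CSV; the real script may keep other fields.
FIELDS = ["timestamp", "city", "aqi", "pm25", "pm10", "no2", "o3"]


def fetch_feed(city: str, token: str) -> dict:
    """Download the JSON feed for one city."""
    url = WAQI_URL.format(city=city, token=token)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def flatten_feed(payload: dict) -> dict:
    """Turn the nested feed payload into one flat row."""
    if payload.get("status") != "ok":
        raise ValueError(f"unexpected feed status: {payload.get('status')!r}")
    data = payload["data"]
    row = {
        "timestamp": data["time"]["s"],
        "city": data["city"]["name"],
        "aqi": data["aqi"],
    }
    # Each pollutant is assumed to be reported as {"v": value}.
    for pollutant, reading in data.get("iaqi", {}).items():
        row[pollutant] = reading.get("v")
    return row


def append_row(row: dict, path: str) -> None:
    """Append one row to the CSV, writing the header only for a new file."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    is_new = not target.exists()
    with target.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f, fieldnames=FIELDS, restval="", extrasaction="ignore"
        )
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

Under these assumptions, a single run would amount to `append_row(flatten_feed(fetch_feed("paris", token)), "data/raw_air_quality.csv")`, with `token` supplied by the user. Fixing the column list up front keeps rows aligned even when a reading omits a pollutant.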
Script:
python script/clean.py
Loads raw CSV
Cleans and restructures data
Saves: data/clean_air_quality.csv
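The cleaning step could look roughly like the following Pandas sketch. The column names (`timestamp`, `city`, `aqi`) and the specific rules (coerce non-numeric readings to missing, drop rows without a timestamp or AQI, de-duplicate by timestamp) are assumptions for illustration and may differ from what `clean.py` actually does.

```python
import pandas as pd


def clean_air_quality(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a tidy copy of the raw readings (hypothetical cleaning rules)."""
    df = raw.copy()
    # Normalise header spelling so later steps can rely on lowercase names.
    df.columns = [c.strip().lower() for c in df.columns]
    # Unparseable timestamps become NaT instead of raising.
    df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
    # Everything except the identifying columns is treated as a measurement;
    # placeholder values such as "-" become NaN.
    numeric_cols = [c for c in df.columns if c not in ("timestamp", "city")]
    df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")
    # A row without a timestamp or an AQI value is not usable downstream.
    df = df.dropna(subset=["timestamp", "aqi"])
    # Repeated ingestion runs can fetch the same reading twice.
    df = df.drop_duplicates(subset=["timestamp"]).sort_values("timestamp")
    return df.reset_index(drop=True)
```

With this sketch, the script body would reduce to reading `data/raw_air_quality.csv` with `pd.read_csv`, passing the frame through `clean_air_quality`, and writing the result with `to_csv("data/clean_air_quality.csv", index=False)`.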
Script:
python script/database.py
Creates SQLite database
Inserts cleaned records
Outputs: data/air_quality.db
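A minimal sketch of the loading step, using only the standard library, is shown below. The table name `air_quality`, its schema, and the choice of `timestamp` as primary key are assumptions for the example, not the confirmed layout of `data/air_quality.db`.

```python
import csv
import sqlite3

# Hypothetical schema: one row per timestamp, pollutants as nullable reals.
SCHEMA = """
CREATE TABLE IF NOT EXISTS air_quality (
    timestamp TEXT PRIMARY KEY,
    city      TEXT NOT NULL,
    aqi       REAL,
    pm25      REAL,
    pm10      REAL
)
"""


def to_number(value):
    """Convert a CSV cell to float, mapping empty cells to NULL."""
    return float(value) if value not in (None, "") else None


def load_clean_csv(csv_path: str, db_path: str) -> int:
    """Insert every CSV row into SQLite and return the number of rows read."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [
            (
                r["timestamp"],
                r["city"],
                to_number(r.get("aqi")),
                to_number(r.get("pm25")),
                to_number(r.get("pm10")),
            )
            for r in csv.DictReader(f)
        ]
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(SCHEMA)
            # OR REPLACE makes re-running the script idempotent.
            conn.executemany(
                "INSERT OR REPLACE INTO air_quality VALUES (?, ?, ?, ?, ?)",
                rows,
            )
    finally:
        conn.close()
    return len(rows)
```

Keying the table on the timestamp means the script can be run again after each cleaning pass without duplicating records.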
Run locally:
uvicorn api.main:app --reload

| Endpoint | Description |
|---|---|
| `/` | API status |
| `/city` | Returns "Paris" |
| `/latest` | Latest AQI measurement |
| `/stats` | Pollutant statistics |
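The `/latest` and `/stats` endpoints presumably wrap simple SQLite queries. The sketch below shows what that query logic might look like as plain functions that a FastAPI route could return directly. The table name `air_quality`, its columns, and both function names are hypothetical.

```python
import sqlite3
from typing import Optional


def latest_measurement(conn: sqlite3.Connection) -> Optional[dict]:
    """Return the most recent row as a dict, or None if the table is empty."""
    conn.row_factory = sqlite3.Row
    row = conn.execute(
        "SELECT * FROM air_quality ORDER BY timestamp DESC LIMIT 1"
    ).fetchone()
    return dict(row) if row is not None else None


def pollutant_stats(
    conn: sqlite3.Connection, columns=("aqi", "pm25", "pm10")
) -> dict:
    """Return min, max and average for each measurement column."""
    stats = {}
    # Column names come from a fixed tuple, never from request input,
    # so interpolating them into the SQL text is safe here.
    for col in columns:
        row = conn.execute(
            f"SELECT MIN({col}), MAX({col}), AVG({col}) FROM air_quality"
        ).fetchone()
        stats[col] = {"min": row[0], "max": row[1], "avg": row[2]}
    return stats
```

Sorting on the text timestamp works only if it is stored in a lexicographically ordered format such as `YYYY-MM-DD HH:MM:SS`. SQLite aggregates skip NULL values, so a pollutant missing from some readings still yields statistics over the rows that have it.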
docker build -t paris-air-api .
docker compose up
API available at:
http://localhost:8000 (API root)
http://localhost:8000/docs (interactive Swagger UI)
Project Data/
│
├── api/
│ └── main.py
│
├── script/
│ ├── ingest.py
│ ├── clean.py
│ └── database.py
│
├── data/
│ ├── raw_air_quality.csv
│ ├── clean_air_quality.csv
│ └── air_quality.db
│
├── screenshots/
│ ├── architecture.png
│ ├── ingestion.png
│ ├── cleaning.png
│ ├── sql_creation.png
│ ├── swagger_ui.png
│ ├── latest_endpoint.png
│ └── docker_running.png
│
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
└── README.md
- Real ETL pipeline
- Clean modular code
- FastAPI backend
- Dockerized deployment
- Professional documentation
- Architecture diagram included
Ali Azzam
Computer & Communication Engineering (CCE)
Université Saint-Joseph (USJ), Lebanon
MIT License.






