data-ingestion-cli

Python CLI tool for ingesting data from public APIs into PostgreSQL. Built with clean architecture: extractor → transformer → loader.
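The extractor → transformer → loader flow can be pictured with a minimal sketch. Everything in it is hypothetical and not taken from the project: `WeatherRecord`, `InMemoryLoader`, and `run_pipeline` are illustrative names, a plain dataclass stands in for the Pydantic models, and an in-memory list stands in for PostgreSQL.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class WeatherRecord:
    """Hypothetical typed record; the real project uses Pydantic models."""
    city: str
    temperature_c: float


def extract() -> list[dict]:
    # Stand-in for an HTTP call to a public API; returns raw payloads.
    return [{"city": "Florianopolis", "temp": "22.5"}]


def transform(raw: Iterable[dict]) -> list[WeatherRecord]:
    # Turn raw payloads into typed records, coercing strings to floats.
    return [WeatherRecord(city=r["city"], temperature_c=float(r["temp"])) for r in raw]


class InMemoryLoader:
    """Stand-in for the PostgreSQL loader; it only collects rows in a list."""

    def __init__(self) -> None:
        self.rows: list[WeatherRecord] = []

    def load(self, records: Iterable[WeatherRecord]) -> int:
        # Return how many rows this call added.
        before = len(self.rows)
        self.rows.extend(records)
        return len(self.rows) - before


def run_pipeline(
    extract_fn: Callable[[], list[dict]],
    transform_fn: Callable[[Iterable[dict]], list[WeatherRecord]],
    loader: InMemoryLoader,
) -> int:
    # Each stage only sees the output of the previous one.
    return loader.load(transform_fn(extract_fn()))
```

Keeping the stages as separate callables means a source can be tested without a database by swapping in a loader like the one above.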

Features

  • Fetch current weather data for any Brazilian city (Open-Meteo API — no key needed)
  • Fetch Brazilian municipality population data (IBGE API — no key needed)
  • Extensible: add new sources by implementing BaseExtractor
  • CI with GitHub Actions (lint + tests on every push)

Installation

git clone https://github.com/your-username/data-ingestion-cli
cd data-ingestion-cli
pip install -e .

Usage

List available sources

ingest list
┌──────────┬──────────────────────────────────────────────────────┐
│ Source   │ Description                                          │
├──────────┼──────────────────────────────────────────────────────┤
│ weather  │ Open-Meteo weather API (current conditions by city)  │
│ ibge     │ IBGE Brazilian municipality population data          │
└──────────┴──────────────────────────────────────────────────────┘

Ingest weather data

# Uses DATABASE_URL from environment
export DATABASE_URL="postgresql://user:pass@localhost:5432/db"
ingest run --source weather --city "Florianopolis"
ingest run --source weather --city "São Paulo"
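As a rough illustration of reading `DATABASE_URL` from the environment, the sketch below fails early when the variable is missing and extracts the connection parts without exposing the password. It uses only the standard library; the project itself loads settings through pydantic-settings, and the helper names `get_database_url` and `describe` are assumptions made for this example.

```python
import os
from urllib.parse import urlsplit


def get_database_url(env=None) -> str:
    """Return DATABASE_URL from the given mapping (default: os.environ)."""
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "")
    if not url:
        raise RuntimeError("DATABASE_URL is not set")
    # Accept "postgresql" and driver variants such as "postgresql+psycopg2".
    if urlsplit(url).scheme.split("+")[0] != "postgresql":
        raise RuntimeError("DATABASE_URL must use the postgresql:// scheme")
    return url


def describe(url: str) -> dict:
    """Return host, port, and database name, leaving credentials out."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }
```

Passing a plain dict as `env` makes the check easy to exercise in tests without touching the real environment.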

Ingest IBGE population

ingest run --source ibge --state SC
ingest run --source ibge --state SP

Check status

ingest status

Adding a New Source

  1. Create src/ingestion/extractors/your_source.py extending BaseExtractor
  2. Add a Pydantic model in src/ingestion/models/schemas.py
  3. Add a loader method in src/ingestion/loaders/postgres.py
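  4. Register the source in src/ingestion/cli.py

# Import paths are inferred from the project layout; adjust them to your install.
from ingestion.extractors.base import BaseExtractor
from ingestion.models.schemas import MyModel  # the Pydantic model from step 2


class MyExtractor(BaseExtractor):
    source_name = "my_source"

    def extract(self, **kwargs) -> list[MyModel]:
        # fetch and return data
        ...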
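The steps above could fit together roughly as follows. This is a self-contained sketch under stated assumptions, not the project's actual code: the `BaseExtractor` shown here is a guess at what an abstract base might look like, `REGISTRY` and `register` are hypothetical stand-ins for however `cli.py` registers sources, and a dataclass replaces the Pydantic model to keep the example dependency-free.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class BaseExtractor(ABC):
    """Assumed shape of the abstract base: a name plus an extract() method."""

    source_name: str

    @abstractmethod
    def extract(self, **kwargs) -> list:
        ...


# Hypothetical registry mapping source names to extractor classes.
REGISTRY: dict[str, type[BaseExtractor]] = {}


def register(cls: type[BaseExtractor]) -> type[BaseExtractor]:
    # Store the class under its source_name so a CLI can look it up by --source.
    REGISTRY[cls.source_name] = cls
    return cls


@dataclass
class MyModel:
    """Dataclass stand-in for the Pydantic model from step 2."""
    name: str
    value: float


@register
class MyExtractor(BaseExtractor):
    source_name = "my_source"

    def extract(self, **kwargs) -> list[MyModel]:
        # A real extractor would call an API here; this returns a fixed row.
        return [MyModel(name=kwargs.get("name", "example"), value=1.0)]
```

With a registry like this, a command such as `ingest run --source my_source` only needs to look up the class by name and call `extract()` with the remaining options.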

Project Structure

src/ingestion/
├── cli.py                    # Typer CLI (run, list, status)
├── config.py                 # Settings via pydantic-settings
├── extractors/
│   ├── base.py               # Abstract base class
│   ├── weather.py            # Open-Meteo extractor
│   └── ibge.py               # IBGE extractor
├── loaders/
│   └── postgres.py           # SQLAlchemy loader
└── models/
    └── schemas.py            # Pydantic models

Running Tests

pip install -e ".[dev]"
pytest tests/ -v --cov=src/ingestion