|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +Plombery is a Python task scheduler with a built-in web UI and REST API. Users define **Pipelines** (collections of **Tasks**) in pure Python, attach APScheduler-based **Triggers** for scheduling, and Plombery runs them, stores results, and streams real-time logs via WebSocket. |
| 8 | + |
| 9 | +Stack: FastAPI + APScheduler + SQLAlchemy (SQLite default) + Socket.IO on the backend; React + TypeScript + Vite + Tailwind + Tremor on the frontend. |
| 10 | + |
| 11 | +## Commands |
| 12 | + |
| 13 | +### Backend (Python) |
| 14 | + |
| 15 | +```sh |
| 16 | +# Install with dev dependencies |
| 17 | +uv sync --dev |
| 18 | + |
| 19 | +# Run all tests |
| 20 | +pytest |
| 21 | + |
| 22 | +# Run a single test file or test |
| 23 | +pytest tests/test_api.py |
| 24 | +pytest tests/test_api.py::test_api_list_pipelines |
| 25 | + |
| 26 | +# Run with coverage |
| 27 | +coverage run -m pytest |
| 28 | +coverage report -m |
| 29 | + |
| 30 | +# Lint / format |
| 31 | +flake8 |
| 32 | +black . |
| 33 | +``` |
| 34 | + |
| 35 | +Tests automatically use an in-memory SQLite database (`DATABASE_URL=sqlite:///:memory:`) — no setup needed. |
| 36 | + |
| 37 | +### Frontend (React/TypeScript) |
| 38 | + |
| 39 | +The frontend uses **pnpm** as the package manager. |
| 40 | + |
| 41 | +```sh |
| 42 | +cd frontend/ |
| 43 | + |
| 44 | +# Install dependencies |
| 45 | +pnpm install # or just: pnpm |
| 46 | + |
| 47 | +# Development server (hot-reload, proxies API to localhost:8000) |
| 48 | +pnpm dev |
| 49 | + |
| 50 | +# Production build (outputs to frontend/dist/, embedded into the Python package) |
| 51 | +pnpm build |
| 52 | +``` |
| 53 | + |
| 54 | +### Running the Example App |
| 55 | + |
| 56 | +```sh |
| 57 | +cd examples/ |
| 58 | +python -m venv .venv && source .venv/bin/activate |
| 59 | +pip install -r requirements.txt |
| 60 | +./run.sh # or ./run.ps1 on Windows |
| 61 | +``` |
| 62 | + |
| 63 | +The example app runs with `--reload` pointing at the parent directory, so changes to the `plombery` package are picked up live. |
| 64 | + |
| 65 | +### Documentation |
| 66 | + |
| 67 | +```sh |
| 68 | +mkdocs serve |
| 69 | +``` |
| 70 | + |
| 71 | +## Architecture |
| 72 | + |
| 73 | +### Core Python Package (`src/plombery/`) |
| 74 | + |
| 75 | +The main entry point is `src/plombery/__init__.py`, which exposes `register_pipeline`, `get_app`, `task`, `Task`, `Pipeline`, `Trigger`, and `PipelineRunStatus`. |
| 76 | + |
| 77 | +**Execution flow:** |
| 78 | +1. User calls `register_pipeline(...)` → stored in `orchestrator._all_pipelines` |
| 79 | +2. `_Orchestrator` (APScheduler `AsyncIOScheduler`) schedules jobs for each non-paused `Trigger` |
| 80 | +3. When a trigger fires (or a manual run is requested via API), `executor.run()` is called |
| 81 | +4. `executor.run()` iterates over `pipeline.tasks`, calls each task function, stores output as JSON in `.data/runs/run_<id>/`, and emits `run-update` WebSocket events |
| 82 | +5. Python `contextvars` (`pipeline_context`, `task_context`, `run_context` in `pipeline/context.py`) carry pipeline/run state into task functions and the logger |
| 83 | + |
| 84 | +**Key modules:** |
| 85 | +- `pipeline/pipeline.py`, `pipeline/task.py`, `pipeline/trigger.py` — Pydantic models for user-facing API |
| 86 | +- `orchestrator/__init__.py` — wraps APScheduler, holds pipeline registry, exposes `run_pipeline_now()` |
| 87 | +- `orchestrator/executor.py` — async `run()` function that executes pipelines task-by-task |
| 88 | +- `orchestrator/data_storage.py` — reads/writes task output JSON and run log files under `.data/` |
| 89 | +- `api/__init__.py` — FastAPI app wiring: mounts Socket.IO at `/ws`, adds routers under `/api`, serves SPA from root |
| 90 | +- `api/routers/pipelines.py` — REST endpoints: list/get pipelines, get input schema, trigger manual run |
| 91 | +- `api/authentication.py` — OAuth2 via Authlib; `NeedsAuth = Depends(_needs_auth)` is used on all API routers; auth is entirely optional (controlled by `settings.auth`) |
| 92 | +- `logger/__init__.py` — `get_logger()` returns a per-run `LoggerAdapter` that writes JSONL to disk and streams via Socket.IO |
| 93 | +- `notifications/__init__.py` — `NotificationManager` uses Apprise to send alerts based on `NotificationRule` objects |
| 94 | +- `database/` — SQLAlchemy models (`PipelineRun`), Alembic migrations, and repository functions; `SessionLocal` is a context-manager session factory |
| 95 | + |
| 96 | +**Configuration** (`config/model.py`): loaded via `pydantic-settings` from env vars, `.env` file, and a YAML settings file. Key settings: `database_url`, `data_path`, `auth`, `notifications`, `allowed_origins`, `frontend_url`. |
| 97 | + |
| 98 | +### Frontend (`frontend/src/`) |
| 99 | + |
| 100 | +File-system based routing: `frontend/src/Router.tsx` uses `import.meta.glob` to auto-discover all `pages/**/*.tsx` files and maps `[param]` folder segments to `:param` route params. |
| 101 | + |
| 102 | +Pages follow the hierarchy: `/` → pipeline list; `/pipelines/:pipelineId` → pipeline detail with trigger list; `/pipelines/:pipelineId/triggers/:triggerId` → trigger detail with runs list; `/pipelines/:pipelineId/triggers/:triggerId/runs/:runId` → run detail with logs and task output. |
| 103 | + |
| 104 | +Real-time updates arrive via `socket.io-client` (`contexts/WebSocketContext.tsx`) on the `run-update` event. Data fetching uses `@tanstack/react-query`. Forms for pipeline input params are auto-generated from the JSON Schema returned by `GET /api/pipelines/:id/input-schema` (`components/JsonSchemaForm.tsx`). |
| 105 | + |
| 106 | +### Pipeline Definition Pattern |
| 107 | + |
| 108 | +Users write pipelines using the public API: |
| 109 | + |
| 110 | +```python |
| 111 | +from plombery import get_app, register_pipeline, task, Task, Trigger |
| 112 | +from apscheduler.triggers.interval import IntervalTrigger |
| 113 | + |
| 114 | +@task |
| 115 | +async def my_task(params): |
| 116 | + return {"result": 42} |
| 117 | + |
| 118 | +register_pipeline( |
| 119 | + id="my_pipeline", |
| 120 | + tasks=[my_task], |
| 121 | + triggers=[Trigger(id="every_hour", name="Every hour", schedule=IntervalTrigger(hours=1))], |
| 122 | +) |
| 123 | + |
| 124 | +app = get_app() # pass to uvicorn |
| 125 | +``` |
| 126 | + |
| 127 | +Tasks receive the previous task's return value as the first positional argument (data flowing), and the pipeline's Pydantic params model via a `params` keyword argument. Either or both can be omitted. |
| 128 | + |
| 129 | +### Database Migrations |
| 130 | + |
| 131 | +Alembic is configured in `src/plombery/alembic/`. When `setup_database()` is called on startup, it applies any pending migrations automatically. New migrations go in `src/plombery/alembic/versions/`. |
| 132 | + |
| 133 | +### Frontend Build Integration |
| 134 | + |
| 135 | +`pnpm build` outputs to `frontend/dist/`. The `SPAStaticFiles` middleware in `api/middlewares.py` serves these static files and falls back to `index.html` for client-side routing. The built assets are included in the Python package via `MANIFEST.in`. |
0 commit comments