Restructure CI/CD pipeline: consolidate workflows, reorganize tests, add queue-based Modal architecture #666

@anth-volk

Description

Problem

The CI/CD pipeline has accumulated technical debt:

  • 9+ workflow files, many deprecated or broken (manual testing, local area promote/publish, district validation)
  • PR CI takes ~3 hours because it rebuilds all datasets from scratch on Modal just to run 53 integration tests (16% of the test suite)
  • Versioning workflow was broken — the PAT (POLICYENGINE_GITHUB) expired, preventing the "Update package version" commit from triggering PyPI publish
  • No test organization — unit tests (synthetic data, mocks) and integration tests (require built H5 datasets) were mixed together
  • H5 pipeline uses inefficient architecture — 8 workers with 4 CPU each, processing items serially within each worker, with sequential phases (states → districts → cities)

Solution

1. Fix PyPI publishing

Migrate versioning workflow from expired PAT to GitHub App token (APP_ID + APP_PRIVATE_KEY), matching the pattern in policyengine-api-v2-alpha.
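A hedged sketch of what the App-token step could look like (the step names are illustrative; only the `APP_ID` / `APP_PRIVATE_KEY` secret names come from this PR). The key property is that commits pushed with an App token can trigger downstream workflows, unlike the default `GITHUB_TOKEN`:

```yaml
# Illustrative fragment, not the actual versioning.yaml.
- name: Generate GitHub App token
  id: app-token
  uses: actions/create-github-app-token@v1
  with:
    app-id: ${{ secrets.APP_ID }}
    private-key: ${{ secrets.APP_PRIVATE_KEY }}

- name: Checkout
  uses: actions/checkout@v4
  with:
    # Commits pushed with this token trigger push.yaml,
    # which the expired-PAT setup failed to do.
    token: ${{ steps.app-token.outputs.token }}
```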

2. Reorganize tests

Split tests into unit/ (270 tests; synthetic data and mocks; runs in seconds) and integration/ (53 tests; require built H5 datasets). Unit sub-folders use no test_ prefix (unit/datasets/, unit/calibration/); integration tests are named per-dataset (test_cps.py, test_enhanced_cps.py).

3. Consolidate workflows

Replace 9 workflow files with 4:

  • pr.yaml — Fork check, lint, uv.lock check, changelog check, unit tests with Codecov, smoke test (~2-3 min)
  • push.yaml — Per-dataset Modal build with integration tests after each stage → manual approval gate → pipeline dispatch. Version bump commits go straight to PyPI publish.
  • pipeline.yaml — Dispatch-only. Queue-based H5 generation with scope filtering (all/national/state/congressional/local/test)
  • versioning.yaml — Auto version bump (unchanged, App token fix applied)
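The push.yaml routing for version-bump commits could look roughly like this (a hedged sketch; the real condition in push.yaml may differ, only the "Update package version" commit message comes from this PR):

```yaml
# Illustrative fragment of push.yaml job routing.
jobs:
  build-datasets:
    if: "!startsWith(github.event.head_commit.message, 'Update package version')"
    runs-on: ubuntu-latest
    steps: [...]
  publish:
    # Version-bump commits skip the dataset builds entirely.
    if: startsWith(github.event.head_commit.message, 'Update package version')
    runs-on: ubuntu-latest
    steps: [...]
```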

4. Queue-based Modal architecture

Replace partition-based N-worker model with:

  • generate_work_items(scope, db_path) — auto-generates work item list filtered by scope
  • build_single_area() — 1 CPU, 16GB per container, processes exactly one work item
  • queue_coordinator() — spawns up to 50 single-item workers, no multi-threading
  • Test scope (national + NY + NV-01) for fast validation runs
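The coordinator pattern above can be sketched in plain Python. The function names come from this PR, but the bodies are illustrative stand-ins (the real workers run in Modal containers and read work items from a database, neither of which is shown here):

```python
"""Minimal sketch of the queue-based worker model, with stand-in data."""
from concurrent.futures import ThreadPoolExecutor

# Hypothetical area catalogue standing in for the db_path the real
# generate_work_items reads from.
AREAS = {
    "national": ["us"],
    "state": ["ny", "nv"],
    "congressional": ["ny-01", "nv-01"],
    "local": ["nyc"],
}


def generate_work_items(scope: str) -> list[str]:
    """Return the work items matching a scope filter."""
    if scope == "all":
        return [a for areas in AREAS.values() for a in areas]
    if scope == "test":
        # Small fixed set (national + NY + NV-01) for fast validation runs.
        return ["us", "ny", "nv-01"]
    return list(AREAS.get(scope, []))


def build_single_area(area: str) -> str:
    """Stand-in for the 1-CPU container that builds exactly one H5 file."""
    return f"{area}.h5"


def queue_coordinator(scope: str, max_workers: int = 50) -> list[str]:
    """Fan work items out to up to `max_workers` single-item workers."""
    items = generate_work_items(scope)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(build_single_area, items))
```

Because each worker handles exactly one item, a slow area no longer blocks the rest of its partition, and the 50-worker cap bounds total concurrency.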

5. Observability

  • Codecov integration (informational, non-blocking)
  • $GITHUB_STEP_SUMMARY timing tables for per-dataset build durations
  • Modal dashboard link in pipeline summary
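The timing tables use `$GITHUB_STEP_SUMMARY`, a file path the Actions runner renders as Markdown. A hedged sketch of how a build step could emit one (function name and dataset timings are invented for illustration):

```python
"""Illustrative helper for writing a per-dataset timing table."""
import os


def write_timing_summary(timings: dict[str, float]) -> str:
    """Render build durations (seconds) as a Markdown table and append it
    to the step summary file when running inside GitHub Actions."""
    lines = ["| Dataset | Build time |", "| --- | --- |"]
    for name, seconds in timings.items():
        lines.append(f"| {name} | {seconds / 60:.1f} min |")
    table = "\n".join(lines) + "\n"
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if summary_path:  # only set on an Actions runner
        with open(summary_path, "a") as f:
            f.write(table)
    return table
```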

Prerequisites (manual setup after merge)

  • Create pipeline-approval GitHub environment with required reviewers
  • Add CODECOV_TOKEN secret from codecov.io
  • Verify H5 datasets are published on HuggingFace
