Restructure CI/CD pipeline: consolidate workflows, reorganize tests, add queue-based Modal architecture #666

@anth-volk

Description

Problem

The CI/CD pipeline has accumulated technical debt:

  • 9+ workflow files, many deprecated or broken (manual testing, local area promote/publish, district validation)
  • PR CI takes ~3 hours because it rebuilds all datasets from scratch on Modal just to run 53 integration tests (16% of the test suite)
  • Versioning workflow was broken — the PAT (POLICYENGINE_GITHUB) expired, preventing the "Update package version" commit from triggering PyPI publish
  • No test organization — unit tests (synthetic data, mocks) and integration tests (require built H5 datasets) were mixed together
  • H5 pipeline uses inefficient architecture — 8 workers with 4 CPU each, processing items serially within each worker, with sequential phases (states → districts → cities)

Solution

1. Fix PyPI publishing

Migrate versioning workflow from expired PAT to GitHub App token (APP_ID + APP_PRIVATE_KEY), matching the pattern in policyengine-api-v2-alpha.
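A hedged sketch of what the App-token step could look like (the step names are illustrative; only the `APP_ID` / `APP_PRIVATE_KEY` secret names come from this PR). The key property is that commits pushed with an App token can trigger downstream workflows, unlike the default `GITHUB_TOKEN`:

```yaml
# Illustrative fragment, not the actual versioning.yaml.
- name: Generate GitHub App token
  id: app-token
  uses: actions/create-github-app-token@v1
  with:
    app-id: ${{ secrets.APP_ID }}
    private-key: ${{ secrets.APP_PRIVATE_KEY }}

- name: Checkout
  uses: actions/checkout@v4
  with:
    # Commits pushed with this token trigger push.yaml,
    # which the expired-PAT setup failed to do.
    token: ${{ steps.app-token.outputs.token }}
```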

2. Reorganize tests

Split tests into unit/ (270 tests; synthetic data and mocks; runs in seconds) and integration/ (53 tests; require built H5 datasets). Unit sub-folders use no test_ prefix (unit/datasets/, unit/calibration/); integration tests are named per-dataset (test_cps.py, test_enhanced_cps.py).

3. Consolidate workflows

Replace 9 workflow files with 4:

  • pr.yaml — Fork check, lint, uv.lock check, changelog check, unit tests with Codecov, smoke test (~2-3 min)
  • push.yaml — Per-dataset Modal build with integration tests after each stage → manual approval gate → pipeline dispatch. Version bump commits go straight to PyPI publish.
  • pipeline.yaml — Dispatch-only. Queue-based H5 generation with scope filtering (all/national/state/congressional/local/test)
  • versioning.yaml — Auto version bump (unchanged, App token fix applied)
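The push.yaml routing for version-bump commits could look roughly like this (a hedged sketch; the real condition in push.yaml may differ, only the "Update package version" commit message comes from this PR):

```yaml
# Illustrative fragment of push.yaml job routing.
jobs:
  build-datasets:
    if: "!startsWith(github.event.head_commit.message, 'Update package version')"
    runs-on: ubuntu-latest
    steps: [...]
  publish:
    # Version-bump commits skip the dataset builds entirely.
    if: startsWith(github.event.head_commit.message, 'Update package version')
    runs-on: ubuntu-latest
    steps: [...]
```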

4. Queue-based Modal architecture

Replace partition-based N-worker model with:

  • generate_work_items(scope, db_path) — auto-generates work item list filtered by scope
  • build_single_area() — 1 CPU, 16GB per container, processes exactly one work item
  • queue_coordinator() — spawns up to 50 single-item workers, no multi-threading
  • Test scope (national + NY + NV-01) for fast validation runs
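The coordinator pattern above can be sketched in plain Python. The function names come from this PR, but the bodies are illustrative stand-ins (the real workers run in Modal containers and read work items from a database, neither of which is shown here):

```python
"""Minimal sketch of the queue-based worker model, with stand-in data."""
from concurrent.futures import ThreadPoolExecutor

# Hypothetical area catalogue standing in for the db_path the real
# generate_work_items reads from.
AREAS = {
    "national": ["us"],
    "state": ["ny", "nv"],
    "congressional": ["ny-01", "nv-01"],
    "local": ["nyc"],
}


def generate_work_items(scope: str) -> list[str]:
    """Return the work items matching a scope filter."""
    if scope == "all":
        return [a for areas in AREAS.values() for a in areas]
    if scope == "test":
        # Small fixed set (national + NY + NV-01) for fast validation runs.
        return ["us", "ny", "nv-01"]
    return list(AREAS.get(scope, []))


def build_single_area(area: str) -> str:
    """Stand-in for the 1-CPU container that builds exactly one H5 file."""
    return f"{area}.h5"


def queue_coordinator(scope: str, max_workers: int = 50) -> list[str]:
    """Fan work items out to up to `max_workers` single-item workers."""
    items = generate_work_items(scope)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(build_single_area, items))
```

Because each worker handles exactly one item, a slow area no longer blocks the rest of its partition, and the 50-worker cap bounds total concurrency.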

5. Observability

  • Codecov integration (informational, non-blocking)
  • $GITHUB_STEP_SUMMARY timing tables for per-dataset build durations
  • Modal dashboard link in pipeline summary
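The timing tables use `$GITHUB_STEP_SUMMARY`, a file path the Actions runner renders as Markdown. A hedged sketch of how a build step could emit one (function name and dataset timings are invented for illustration):

```python
"""Illustrative helper for writing a per-dataset timing table."""
import os


def write_timing_summary(timings: dict[str, float]) -> str:
    """Render build durations (seconds) as a Markdown table and append it
    to the step summary file when running inside GitHub Actions."""
    lines = ["| Dataset | Build time |", "| --- | --- |"]
    for name, seconds in timings.items():
        lines.append(f"| {name} | {seconds / 60:.1f} min |")
    table = "\n".join(lines) + "\n"
    summary_path = os.environ.get("GITHUB_STEP_SUMMARY")
    if summary_path:  # only set on an Actions runner
        with open(summary_path, "a") as f:
            f.write(table)
    return table
```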

Prerequisites (manual setup after merge)

  • Create pipeline-approval GitHub environment with required reviewers
  • Add CODECOV_TOKEN secret from codecov.io
  • Verify H5 datasets are published on HuggingFace
