142 changes: 142 additions & 0 deletions .agents/projects/workflow_scripts/branch-protection-rollout.md
@@ -0,0 +1,142 @@
# Branch protection rollout for the workflow rename PR

The rename PR changes 11 required job ids. Branch protection on `main`
must be updated before the PR can merge; the `gh api` call is in Step 2 below.

## Current required contexts

Captured 2026-05-01:

```text
lint-and-format
build-docs
marin-tests
levanter-tests
levanter-entry-tests
levanter-torch-tests
haliax-tests
iris-tests
zephyr-tests
fray-tests
marin-itest
```

All 11 are emitted by `app_id: 15368` (GitHub Actions).

## Mapping

| Old | New |
| --- | --- |
| `lint-and-format` | `marin-lint` |
| `build-docs` | `marin-docs` |
| `marin-tests` | `marin-unit` |
| `levanter-tests` | `levanter-unit` |
| `levanter-entry-tests` | `levanter-entry` |
| `levanter-torch-tests` | `levanter-torch` |
| `haliax-tests` | `haliax-unit` |
| `iris-tests` | `iris-unit` |
| `zephyr-tests` | `zephyr-unit` |
| `fray-tests` | `fray-unit` |
| `marin-itest` | `marin-integration` |
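
The mapping is mechanical enough to keep as data and generate the PATCH body from it. A minimal Python sketch, assuming the table above is the complete mapping; `RENAMES`, `GITHUB_ACTIONS_APP_ID`, and `build_payload` are illustrative names, not anything that exists in the repo:

```python
import json

# Old -> new required contexts, copied from the mapping table above.
RENAMES = {
    "lint-and-format": "marin-lint",
    "build-docs": "marin-docs",
    "marin-tests": "marin-unit",
    "levanter-tests": "levanter-unit",
    "levanter-entry-tests": "levanter-entry",
    "levanter-torch-tests": "levanter-torch",
    "haliax-tests": "haliax-unit",
    "iris-tests": "iris-unit",
    "zephyr-tests": "zephyr-unit",
    "fray-tests": "fray-unit",
    "marin-itest": "marin-integration",
}

# app_id recorded in this document for GitHub Actions.
GITHUB_ACTIONS_APP_ID = 15368


def build_payload(contexts, strict=False):
    """Return a JSON-serializable body for the required_status_checks PATCH."""
    return {
        "strict": strict,
        "checks": [
            {"context": context, "app_id": GITHUB_ACTIONS_APP_ID}
            for context in contexts
        ],
    }


if __name__ == "__main__":
    # New names for the swap; pass RENAMES.keys() instead for a rollback body.
    print(json.dumps(build_payload(RENAMES.values()), indent=2))
```

Generating both bodies from one dict keeps the swap and rollback payloads from drifting apart if a name changes during review.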

## Rollout sequence

The rename PR cannot merge while the old contexts (`lint-and-format`,
`marin-tests`, etc.) are required: the PR branch emits only the new names,
so the old required checks never report a status.

### Step 1 — verify the renamed checks are green on the PR branch

```bash
gh pr checks <PR-NUMBER>
```

Confirm each of the 11 new contexts (`marin-lint`, `marin-docs`,
`marin-unit`, `levanter-unit`, `levanter-entry`, `levanter-torch`,
`haliax-unit`, `iris-unit`, `zephyr-unit`, `fray-unit`,
`marin-integration`) shows as `pass`. If any fail, fix the workflow
in the PR before continuing.
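
Checking 11 names by eye is easy to get wrong, so the confirmation can be scripted. A rough sketch, assuming a `gh` release recent enough to support `gh pr checks --json` with the `name` and `bucket` fields, and assuming each check name equals its branch-protection context; `unready_contexts` and `fetch_checks` are hypothetical helpers:

```python
import json
import subprocess

# The 11 new contexts from the mapping table.
EXPECTED_CONTEXTS = [
    "marin-lint", "marin-docs", "marin-unit", "levanter-unit",
    "levanter-entry", "levanter-torch", "haliax-unit", "iris-unit",
    "zephyr-unit", "fray-unit", "marin-integration",
]


def unready_contexts(checks, expected=EXPECTED_CONTEXTS):
    """Map each expected context that is not passing to its bucket.

    Contexts that do not appear in `checks` at all are reported as "missing".
    """
    buckets = {check["name"]: check["bucket"] for check in checks}
    return {
        name: buckets.get(name, "missing")
        for name in expected
        if buckets.get(name) != "pass"
    }


def fetch_checks(pr_number):
    """Run `gh pr checks` and parse its JSON output (requires gh and auth)."""
    # gh exits non-zero when checks are failing or pending, so the exit
    # code is not treated as an error here.
    result = subprocess.run(
        ["gh", "pr", "checks", str(pr_number), "--json", "name,bucket"],
        capture_output=True,
        text=True,
        check=False,
    )
    return json.loads(result.stdout)
```

An empty dict from `unready_contexts(fetch_checks(<PR-NUMBER>))` would mean all 11 contexts are present and passing.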

### Step 2 — swap required contexts

A single PATCH replaces the entire required-checks list.
The previous list is preserved under "Current required contexts" above,
and the matching rollback PATCH is in the Rollback section.

```bash
gh api \
  --method PATCH \
  -H "Accept: application/vnd.github+json" \
  /repos/marin-community/marin/branches/main/protection/required_status_checks \
  --input - <<'EOF'
{
  "strict": false,
  "checks": [
    {"context": "marin-lint", "app_id": 15368},
    {"context": "marin-docs", "app_id": 15368},
    {"context": "marin-unit", "app_id": 15368},
    {"context": "levanter-unit", "app_id": 15368},
    {"context": "levanter-entry", "app_id": 15368},
    {"context": "levanter-torch", "app_id": 15368},
    {"context": "haliax-unit", "app_id": 15368},
    {"context": "iris-unit", "app_id": 15368},
    {"context": "zephyr-unit", "app_id": 15368},
    {"context": "fray-unit", "app_id": 15368},
    {"context": "marin-integration", "app_id": 15368}
  ]
}
EOF
```

### Step 3 — merge

With the new contexts required and passing, the PR is mergeable.

### Step 4 — verify

```bash
gh api repos/marin-community/marin/branches/main/protection/required_status_checks --jq '.checks'
```

This should print the 11 new contexts and none of the old ones.
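
The comparison can also be done mechanically. A small sketch, assuming the command output is a JSON array of objects with a `context` key (the same shape as the PATCH body); `diff_contexts` is a hypothetical helper:

```python
import json


def diff_contexts(checks, expected):
    """Return (missing, unexpected) context names, each sorted."""
    actual = {check["context"] for check in checks}
    wanted = set(expected)
    return sorted(wanted - actual), sorted(actual - wanted)


# Shortened example; in practice `checks` would be json.loads(...) of the
# verification command's output and `expected` the full list of 11 names.
sample = json.loads(
    '[{"context": "marin-lint", "app_id": 15368},'
    ' {"context": "build-docs", "app_id": 15368}]'
)
missing, unexpected = diff_contexts(sample, ["marin-lint", "marin-docs"])
print(missing, unexpected)  # ['marin-docs'] ['build-docs']
```

Both lists being empty means the required contexts match the expected set exactly.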

## Rollback

If the renamed checks fail after the merge and the rename has to be
reverted, restore the old required contexts:

```bash
gh api \
  --method PATCH \
  -H "Accept: application/vnd.github+json" \
  /repos/marin-community/marin/branches/main/protection/required_status_checks \
  --input - <<'EOF'
{
  "strict": false,
  "checks": [
    {"context": "lint-and-format", "app_id": 15368},
    {"context": "build-docs", "app_id": 15368},
    {"context": "marin-tests", "app_id": 15368},
    {"context": "levanter-tests", "app_id": 15368},
    {"context": "levanter-entry-tests", "app_id": 15368},
    {"context": "levanter-torch-tests", "app_id": 15368},
    {"context": "haliax-tests", "app_id": 15368},
    {"context": "iris-tests", "app_id": 15368},
    {"context": "zephyr-tests", "app_id": 15368},
    {"context": "fray-tests", "app_id": 15368},
    {"context": "marin-itest", "app_id": 15368}
  ]
}
EOF
```

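Then revert the rename commit on `main`. Restore the contexts first: the
revert PR emits the old names, so it cannot pass while the new contexts
are still required.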

## Active rulesets

`gh api repos/marin-community/marin/rulesets` returns the `protect main`
ruleset. As of 2026-05-01 the required status checks live in the classic
branch protection captured above, not in a ruleset, so the PATCH above
is sufficient. If a future change moves required checks into the ruleset,
the same sequence applies through the rulesets API
(`/repos/{owner}/{repo}/rulesets/{ruleset_id}`) instead of branch
protection.
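
To check whether a ruleset has started carrying required checks, its rules can be inspected directly. A sketch under the assumption that the single-ruleset endpoint returns a `rules` array whose `required_status_checks` entries hold their contexts under `parameters.required_status_checks`; `ruleset_required_contexts` is a hypothetical helper, and the response shape should be confirmed against the rulesets API reference before relying on it:

```python
def ruleset_required_contexts(ruleset):
    """Return the status-check contexts required by one ruleset.

    `ruleset` is the parsed JSON of a single ruleset. The key names used
    here are an assumption about the response shape.
    """
    contexts = []
    for rule in ruleset.get("rules", []):
        if rule.get("type") != "required_status_checks":
            continue
        parameters = rule.get("parameters", {})
        for check in parameters.get("required_status_checks", []):
            contexts.append(check["context"])
    return contexts
```

An empty result for the `protect main` ruleset would be consistent with the note above that required checks still live in classic branch protection.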
12 changes: 7 additions & 5 deletions .agents/projects/workflow_scripts/design.md
@@ -21,7 +21,9 @@ The second challenge is naming. Issue #5067 proposes `type-domain-test`; prior t

Create `scripts/workflows/` for Python CLIs that implement behavior currently embedded in `.github/workflows/*.yaml`. Workflow YAML remains responsible for triggers, permissions, runner selection, matrices, secrets binding, artifact upload/download, and calling scripts. Long `run:` blocks should become one-line invocations unless the block is only shell glue that GitHub Actions itself owns.

The default execution model is repo-local:
Prefer the `gh` CLI for GitHub state changes (creating PRs, opening issues, posting comments, editing labels, cutting releases, querying branch protection). `gh` is pre-installed and pre-authenticated on GitHub-hosted runners, works the same way locally, and removes a class of Marin-owned wrapper code. A short documented `git + gh pr create/edit` snippet replaces `peter-evans/create-pull-request` without introducing a Python helper. Reach for a Python script only when the behavior involves real logic that benefits from tests: path/glob matching across many groups, YAML auditing, subprocess polling with timeout and JSON parsing, multi-step provider-specific diagnostics. The bar is "would this be ugly or untestable in 10–15 lines of `gh` + shell." If the answer is no, keep it in YAML and call `gh` directly.

The default execution model for the Python scripts is repo-local:

```bash
uv run python scripts/workflows/<domain>_<command>.py <subcommand> ...
@@ -54,16 +56,16 @@ Job names should follow the same shape when they create branch-protection contex

Third-party actions are allowed in two tiers. A small explicit allowlist of trusted primitives may remain tag-pinned when that is already the project norm: checkout, setup, cache, artifact upload/download, CodeQL, GitHub app token creation, uv setup, Google auth/setup-gcloud, and Docker setup/login/build actions. All other third-party actions should be pinned to a full commit SHA. This follows GitHub's security guidance that SHA pinning is the only immutable reference for third-party actions and that tag pinning is a trust decision.
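
The two-tier pinning rule lends itself to an automated audit. A minimal sketch, assuming workflow files use the common unquoted `uses: owner/repo@ref` form; `TRUSTED_PREFIXES` is a placeholder and not the project's actual allowlist, and `unpinned_actions` is a hypothetical helper:

```python
import re

# Matches "uses: owner/repo[/path]@ref", with or without a leading "- ".
USES_RE = re.compile(r"^[ \t]*-?[ \t]*uses:[ \t]*([^\s@#]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")

# Placeholder allowlist of trusted primitives (an assumption for illustration).
TRUSTED_PREFIXES = ("actions/", "github/codeql-action")


def unpinned_actions(workflow_text, trusted_prefixes=TRUSTED_PREFIXES):
    """Return (action, ref) pairs that are neither trusted nor SHA-pinned."""
    offenders = []
    for action, ref in USES_RE.findall(workflow_text):
        if action.startswith("docker://"):
            continue  # container image references follow different rules
        if action.startswith(trusted_prefixes):
            continue  # tag pinning is accepted for the trusted tier
        if not FULL_SHA_RE.match(ref):
            offenders.append((action, ref))
    return offenders
```

Local actions (`uses: ./path`) carry no `@ref` and are skipped by the pattern. A text-level regex avoids a YAML dependency, at the cost of missing quoted or unusually formatted `uses:` values.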

The largest duplicated behavior is Iris job orchestration: submit/wait/status inspection and failure diagnostics are embedded across canary, smoke, and integration workflows. The first operational extraction should be behavior-level CLIs such as `iris_job.py wait` and `iris_diagnostics.py collect`, not a reusable workflow. Before that operational PR lands, the foundation PR creates the script/audit surface and converts lower-risk workflows so the live-infrastructure extraction has a stable pattern to follow.
The largest duplicated behavior is Iris job orchestration: submit/wait/status inspection and failure diagnostics are embedded across canary, smoke, and integration workflows. The first operational extraction is a single `scripts/workflows/iris_monitor.py` with `status`, `wait`, and `collect` subcommands, not a reusable workflow. Wait and diagnostics share Iris CLI plumbing and are always invoked as a pair (wait, then collect on failure), so one module is the right shape. Before that operational PR lands, the foundation PR creates the script/audit surface and converts lower-risk workflows so the live-infrastructure extraction has a stable pattern to follow.
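
One possible shape for the single-module CLI is sketched below. Only the three subcommand names come from the text above; every flag (`--job-id`, `--timeout`, `--output-dir`) and every default is hypothetical and shown purely to illustrate how the subcommands could share one entry point:

```python
import argparse


def build_parser():
    """Build an argparse parser with status, wait, and collect subcommands."""
    parser = argparse.ArgumentParser(prog="iris_monitor.py")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Hypothetical flags: the real interface is not specified in this document.
    status = subcommands.add_parser("status", help="print the current job state")
    status.add_argument("--job-id", required=True)

    wait = subcommands.add_parser("wait", help="block until the job finishes")
    wait.add_argument("--job-id", required=True)
    wait.add_argument("--timeout", type=int, default=3600, help="seconds")

    collect = subcommands.add_parser("collect", help="gather failure diagnostics")
    collect.add_argument("--job-id", required=True)
    collect.add_argument("--output-dir", default="diagnostics")

    return parser
```

With this layout a workflow step would run `wait` and then, only on failure, `collect` against the same job id.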

`scripts/workflows/**` should be included in ruff and pyrefly even though broader `scripts/**` remains excluded today. These scripts are CI infrastructure, not one-off local utilities, and breakage affects the development loop for everyone.

Do not create a shared workflow package on day one. Start with concrete importable modules such as `scripts/workflows/iris_job.py`, `scripts/workflows/iris_diagnostics.py`, and `scripts/workflows/pull_request.py`. Add `scripts/workflows/lib/` only after at least two modules need the same non-trivial helper and the helper has a stable contract. The repo currently has `scripts/__init__.py`, but not a meaningful scripts package hierarchy; premature packaging would add more structure than the first extraction needs.
Do not create a shared workflow package on day one. Start with concrete importable modules such as `scripts/workflows/iris_monitor.py`, `scripts/workflows/changes.py`, and `scripts/workflows/github_actions.py`. Add `scripts/workflows/lib/` only after at least two modules need the same non-trivial helper and the helper has a stable contract. The repo currently has `scripts/__init__.py`, but not a meaningful scripts package hierarchy; premature packaging would add more structure than the first extraction needs.

The full migration should land as a small number of large PRs, not a long tail of tiny workflow edits:

1. **Foundation and low-risk normalization.** Add `scripts/workflows/`, `.github/workflows/README.md`, the workflow inventory/audit script, path-change filtering, pull-request creation, and SHA pinning for non-trusted third-party actions. Bring `scripts/workflows/**` under ruff and pyrefly. Convert unit/docs/lint/release workflows that mostly run tests or packaging scripts, because they do not launch live infrastructure.
2. **Iris and ferry behavior extraction.** Add the Iris job wait/status and diagnostics scripts, then migrate `iris-cloud-smoke-gcp`, `iris-coreweave-ci`, `marin-canary-ferry*`, `marin-datakit-*`, and `zephyr-shuffle-itest` to call those scripts. This is one large operational PR because these workflows share the same failure modes and should be reviewed together.
1. **Foundation and low-risk normalization.** Add `scripts/workflows/`, `.github/workflows/README.md` (including the canonical `git + gh pr create/edit` recipe), the workflow inventory/audit script, path-change filtering, and SHA pinning for non-trusted third-party actions. Replace `peter-evans/create-pull-request@v7` in `dupekit-wheels.yaml` with an inline `gh` snippet rather than a Python wrapper. Delete `marin-metrics.yaml` (dead conda-based weekly job) instead of migrating it; this also eliminates the only `conda-incubator/setup-miniconda` pin from the repo. Bring `scripts/workflows/**` under ruff and pyrefly. Convert unit/docs/lint/release workflows that mostly run tests or packaging scripts, because they do not launch live infrastructure.
2. **Iris and ferry behavior extraction.** Add `iris_monitor.py` with `status`, `wait`, and `collect` subcommands, then migrate `iris-cloud-smoke-gcp`, `iris-coreweave-ci`, `marin-canary-ferry*`, `marin-datakit-*`, and `zephyr-shuffle-itest` to call it. This is one large operational PR because these workflows share the same failure modes and should be reviewed together.
3. **Names, file renames, and consolidation.** Rename workflow files and displayed workflow/job names to `domain-type[-variant]`, update branch-protection contexts with `gh api` where required, and consolidate workflows only where the scripts have made the provider/domain differences parameterizable. This is the right point to decide whether `iris-smoke-gcp` and `iris-smoke-coreweave` become one `iris-smoke` workflow.

## Testing