fix: Docker sandbox agent execution + role template rendering + --model propagation for non-Claude CLIs #1084
| name: "Contract Drift Autofix" | |
| # Refs: #1273 - CI hardening, bucket CI-E. | |
| # | |
| # Why this workflow exists | |
| # ------------------------ | |
| # Three contract tests in tests/unit/ act as drift detectors: | |
| # | |
| # * test_readme_api_coverage.py::test_all_cli_commands_are_documented | |
| # fails when a new top-level CLI command is registered but not added to | |
| # DOCUMENTED_COMMANDS. | |
| # * test_api_v1_routing.py::TestVersionedRoutesParity::test_every_root_route_has_v1_counterpart | |
| # fails when a new root-mounted FastAPI route is added without either a | |
| # /api/v1/* mirror or an entry in _INFRASTRUCTURE_PATHS. | |
| # * test_cli_run_params.py::test_run_params_match_cli_call | |
| # fails when run() gains a new parameter that the cli() click callback | |
| # does not forward. | |
| # | |
| # Every one of these is a 1-2 line allow-list / forward-arg edit. The fix is | |
| # mechanical and well-suited for an autofix bot. | |
| # | |
| # Strategy (modernized, refs operator audit 2026-05-18) | |
| # ----------------------------------------------------- | |
| # Inline-fix model: when drift is detected on a same-repo PR, commit the | |
| # regen DIRECTLY to the source PR's head ref instead of opening a separate | |
| # bot PR. This eliminates the entire class of orphaned bot-PR churn we | |
| # previously saw (e.g. #1436, #1440, #1441, #1445), because: | |
| # | |
| # * No separate PR means no separate review/merge step. | |
| # * The regen rides into main with the source PR, atomically. | |
| # * When the source PR is auto-deleted on merge, there is no orphaned | |
| # bot branch left re-targeting to main with a giant diff. | |
| # | |
| # Fallback paths (preserved for safety): | |
| # * Fork PRs (no push to head ref): post the patch as an idempotent PR | |
| # comment with copy-paste-able apply instructions. | |
| # * Unhandled drift / LOC-cap trip: open a tracking issue. | |
| # * Push to head ref fails (branch protection, lease conflict): post | |
| # the patch as a PR comment so the operator can apply locally. | |
| # | |
| # Recursion guards | |
| # ---------------- | |
| # * Skips when the PR author is the bot itself (no infinite loops). | |
| # * Skips PRs whose head ref starts with `bot/contract-drift-` | |
| # (legacy bot branches; ignored so they don't re-trigger). | |
| # * Diff size capped at 30 LOC by the regen script; larger diffs imply | |
| # a real change and are escalated by opening an issue instead. | |
| # * No `actions: write` permission is granted. When BOT_PAT is set, the | |
| # inline commit does re-trigger workflows on the source PR (that is | |
| # the point: GITHUB_TOKEN pushes are muted by GitHub's recursion | |
| # protection). The re-triggered run of this file finds no drift once | |
| # the regen commit has landed, so it exits without pushing again. | |
| # | |
| # Secret requirements | |
| # ------------------- | |
| # * BOT_PAT (optional, recommended): a fine-grained PAT with `contents: | |
| # write` + `pull-requests: write` on this repo. When present, the | |
| # inline push triggers CI runs on the source PR (GITHUB_TOKEN-authored | |
| # commits are otherwise muted by GitHub's recursion protection). | |
| # When absent, the push falls back to GITHUB_TOKEN and does not | |
| # re-trigger CI; if the push itself fails, the PR-comment path fires | |
| # and the operator applies the regen manually. | |
| on: | |
| pull_request: | |
| types: [opened, synchronize] | |
| concurrency: | |
| group: contract-drift-${{ github.event.pull_request.number }} | |
| cancel-in-progress: true | |
| # Workflow-scope permissions: contents:write for the inline regen push to | |
| # the source PR head ref, pull-requests:write for the comment-fallback | |
| # path, issues:write for the tracking-issue fallback when regen cannot | |
| # produce a clean patch. The autofix job re-asserts the same set at job | |
| # scope (Scorecard token-permissions). | |
| permissions: | |
| contents: write | |
| pull-requests: write | |
| issues: write | |
| jobs: | |
| autofix: | |
| name: Detect and patch contract drift | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| permissions: | |
| contents: write | |
| pull-requests: write | |
| issues: write | |
| # Recursion guard: don't run on PRs the bot itself opened, and don't | |
| # run on legacy bot/contract-drift-* branches (kept for safety while | |
| # any old open bot PRs drain). | |
| if: > | |
| github.event.pull_request.user.login != 'github-actions[bot]' && | |
| github.event.pull_request.user.login != 'bernstein[bot]' && | |
| github.event.pull_request.user.login != 'bernstein-orchestrator[bot]' && | |
| !startsWith(github.event.pull_request.head.ref, 'bot/contract-drift-') | |
| steps: | |
| - name: Harden runner (audit mode) | |
| uses: step-security/harden-runner@9af89fc71515a100421586dfdb3dc9c984fbf411 # v2.19.4 | |
| with: | |
| egress-policy: audit | |
| - name: Detect fork PR | |
| id: forkcheck | |
| env: | |
| HEAD_REPO: ${{ github.event.pull_request.head.repo.full_name }} | |
| BASE_REPO: ${{ github.repository }} | |
| run: | | |
| # Fork PRs have head.repo != base repo. We cannot push to a fork's | |
| # head ref, so we fall back to the PR-comment path for those. | |
| if [ "$HEAD_REPO" != "$BASE_REPO" ]; then | |
| echo "is_fork=true" >> "$GITHUB_OUTPUT" | |
| echo "::notice::PR is from a fork ($HEAD_REPO); will use PR-comment fallback if drift detected." | |
| else | |
| echo "is_fork=false" >> "$GITHUB_OUTPUT" | |
| fi | |
| # persist-credentials kept: this job pushes the regen commit back to | |
| # the source PR's head ref (same-repo PRs only). | |
| - name: Checkout PR head | |
| uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7 # zizmor: ignore[artipacked] | |
| with: | |
| ref: ${{ github.event.pull_request.head.sha }} | |
| fetch-depth: 0 | |
| # Use BOT_PAT if present so the inline push triggers CI on the | |
| # source PR. GITHUB_TOKEN-authored pushes are muted by GitHub. | |
| token: ${{ secrets.BOT_PAT || secrets.GITHUB_TOKEN }} | |
| - uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # v8.2.0 | |
| - name: Sync project (dev group) | |
| run: uv sync --group dev | |
| - name: Run drift tests (capture failures, don't fail the job) | |
| id: drift | |
| run: | | |
| set +e | |
| uv run pytest \ | |
| tests/unit/test_readme_api_coverage.py::test_all_cli_commands_are_documented \ | |
| tests/unit/test_api_v1_routing.py::TestVersionedRoutesParity::test_every_root_route_has_v1_counterpart \ | |
| tests/unit/test_cli_run_params.py::test_run_params_match_cli_call \ | |
| --tb=short | |
| status=$? | |
| echo "status=$status" >> "$GITHUB_OUTPUT" | |
| if [ "$status" -eq 0 ]; then | |
| echo "drift=false" >> "$GITHUB_OUTPUT" | |
| echo "::notice::No contract drift detected." | |
| else | |
| echo "drift=true" >> "$GITHUB_OUTPUT" | |
| echo "::notice::Contract drift detected - running regen." | |
| fi | |
| exit 0 | |
| - name: Regenerate drift fixtures | |
| if: steps.drift.outputs.drift == 'true' | |
| id: regen | |
| run: | | |
| set +e | |
| uv run python scripts/regen_contract_drift.py --fixture all | |
| # Capture the exit status immediately, before any other command runs. | |
| regen_status=$? | |
| # The regen script returns 0 only when at least one fixture was | |
| # patched. Any non-zero exit means nothing was patched - either | |
| # because the drift wasn't one of the three known patterns or | |
| # because the diff exceeded the 30-LOC safety cap. | |
| echo "regen_status=$regen_status" >> "$GITHUB_OUTPUT" | |
| exit 0 | |
| - name: Verify regen produced something | |
| if: steps.drift.outputs.drift == 'true' | |
| id: verify | |
| run: | | |
| if git diff --quiet; then | |
| echo "changed=false" >> "$GITHUB_OUTPUT" | |
| echo "::warning::Drift was detected but regen produced no diff. Likely an unhandled drift pattern or LOC-cap trip - opening tracking issue." | |
| else | |
| echo "changed=true" >> "$GITHUB_OUTPUT" | |
| # Compute a coarse summary of what was regenerated. | |
| CHANGED_FILES=$(git diff --name-only | sort -u) | |
| { | |
| echo "changed_files<<EOF" | |
| echo "$CHANGED_FILES" | |
| echo "EOF" | |
| } >> "$GITHUB_OUTPUT" | |
| # Compute LOC delta (added+removed) for the safety report. | |
| STAT=$(git diff --shortstat) | |
| INS=$(echo "$STAT" | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0") | |
| DEL=$(echo "$STAT" | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0") | |
| LOC=$((INS + DEL)) | |
| echo "loc_delta=${LOC}" >> "$GITHUB_OUTPUT" | |
| echo "::notice::Regen produced ${LOC:-0} LOC of changes across:" | |
| while IFS= read -r f; do | |
| [ -z "$f" ] && continue | |
| echo " $f" | |
| done <<< "$CHANGED_FILES" | |
| # Persist the diff for the comment-fallback path. | |
| git diff > "${RUNNER_TEMP}/contract-drift.patch" | |
| fi | |
| - name: Re-run drift tests to confirm regen fixed them | |
| if: steps.drift.outputs.drift == 'true' && steps.verify.outputs.changed == 'true' | |
| id: reverify | |
| run: | | |
| set +e | |
| uv run pytest \ | |
| tests/unit/test_readme_api_coverage.py::test_all_cli_commands_are_documented \ | |
| tests/unit/test_api_v1_routing.py::TestVersionedRoutesParity::test_every_root_route_has_v1_counterpart \ | |
| tests/unit/test_cli_run_params.py::test_run_params_match_cli_call \ | |
| --tb=short | |
| status=$? | |
| echo "status=$status" >> "$GITHUB_OUTPUT" | |
| if [ "$status" -ne 0 ]; then | |
| echo "::warning::Regen ran but drift tests still fail. Bot will open a tracking issue." | |
| fi | |
| exit 0 | |
| - name: Commit regen inline to source PR head ref | |
| # Primary path. Same-repo PRs receive the regen patch directly on | |
| # their head branch via git push, so the autofix rides into main | |
| # with the source PR. No separate bot PR is opened. | |
| if: > | |
| steps.drift.outputs.drift == 'true' && | |
| steps.verify.outputs.changed == 'true' && | |
| steps.reverify.outputs.status == '0' && | |
| steps.forkcheck.outputs.is_fork == 'false' | |
| id: inline_push | |
| continue-on-error: true | |
| env: | |
| PR_HEAD_REF: ${{ github.event.pull_request.head.ref }} | |
| PR_NUMBER: ${{ github.event.pull_request.number }} | |
| RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }} | |
| run: | | |
| set -euo pipefail | |
| # Configure a stable bot identity for the commit. The identity is | |
| # intentionally distinct from any real user; the commit author | |
| # uses the GitHub-actions noreply email so commits attribute to | |
| # the bot, not the workflow's PAT owner. | |
| git config user.name "github-actions[bot]" | |
| git config user.email "41898282+github-actions[bot]@users.noreply.github.com" | |
| git add -A | |
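| # One -m per paragraph, so the subject line is separated from the | |
| # body by a blank line regardless of how this script is indented. | |
| git commit \ | |
| -m "chore(ci): regenerate contract drift allow-lists" \ | |
| -m "Auto-applied by contract-drift-autofix.yml on PR #${PR_NUMBER}. | |
| Regenerated via scripts/regen_contract_drift.py. Refs #1273." \ | |
| -m "Source CI run: ${RUN_URL}" | |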
| # --force-with-lease is the safe push variant: it rejects if the | |
| # remote moved since we checked out, so a concurrent push from | |
| # the PR author isn't clobbered. If the lease check fails the | |
| # comment-fallback path will fire to surface the patch. | |
| git push --force-with-lease origin "HEAD:${PR_HEAD_REF}" | |
| echo "pushed=true" >> "$GITHUB_OUTPUT" | |
| - name: Post drift patch as PR comment (fallback) | |
| # Fallback path. Fires when: | |
| # * the PR is from a fork (no push access to head ref), OR | |
| # * the inline push step failed (lease conflict, branch protection, | |
| # transient API outage). | |
| # Body is idempotent on a hidden marker so re-runs edit the same | |
| # comment rather than spamming the PR. | |
| if: > | |
| steps.drift.outputs.drift == 'true' && | |
| steps.verify.outputs.changed == 'true' && | |
| steps.reverify.outputs.status == '0' && ( | |
| steps.forkcheck.outputs.is_fork == 'true' || | |
| steps.inline_push.outcome == 'failure' || | |
| steps.inline_push.outputs.pushed != 'true' | |
| ) | |
| id: comment | |
| uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0 | |
| env: | |
| PR_NUMBER: ${{ github.event.pull_request.number }} | |
| CHANGED_FILES: ${{ steps.verify.outputs.changed_files }} | |
| LOC_DELTA: ${{ steps.verify.outputs.loc_delta }} | |
| RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }} | |
| IS_FORK: ${{ steps.forkcheck.outputs.is_fork }} | |
| INLINE_OUTCOME: ${{ steps.inline_push.outcome }} | |
| with: | |
| script: | | |
| const fs = require('fs'); | |
| const path = require('path'); | |
| const marker = '<!-- contract-drift-autofix:patch -->'; | |
| const prNumber = Number(process.env.PR_NUMBER); | |
| const changedFiles = (process.env.CHANGED_FILES || '').trim(); | |
| const locDelta = process.env.LOC_DELTA || '?'; | |
| const runUrl = process.env.RUN_URL; | |
| const isFork = process.env.IS_FORK === 'true'; | |
| const inlineOutcome = process.env.INLINE_OUTCOME || ''; | |
| const reason = isFork | |
| ? 'PR is from a fork; the bot cannot push to fork head refs, so apply the patch below manually.' | |
| : `Inline autofix push failed (\`${inlineOutcome || 'skipped'}\`). Apply the patch below manually.`; | |
| const patchPath = path.join(process.env.RUNNER_TEMP, 'contract-drift.patch'); | |
| let patch = fs.readFileSync(patchPath, 'utf-8'); | |
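| // Defensive cap: GitHub comment body limit is 65536 chars, and | |
| // the patch is embedded twice below (apply snippet + full diff), | |
| // so the cap must stay under half of that limit. | |
| // The regen script caps the diff at 30 LOC so we are nowhere | |
| // near this, but keep a guard in case the cap is ever raised. | |
| const PATCH_CAP = 30000; | |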
| let truncated = false; | |
| if (patch.length > PATCH_CAP) { | |
| patch = patch.slice(0, PATCH_CAP); | |
| truncated = true; | |
| } | |
| const body = [ | |
| marker, | |
| '## Contract drift detected - proposed patch', | |
| '', | |
| reason, | |
| '', | |
| 'Three contract tests act as drift detectors against the public CLI / API surface:', | |
| '', | |
| '- `tests/unit/test_readme_api_coverage.py::test_all_cli_commands_are_documented`', | |
| '- `tests/unit/test_api_v1_routing.py::TestVersionedRoutesParity::test_every_root_route_has_v1_counterpart`', | |
| '- `tests/unit/test_cli_run_params.py::test_run_params_match_cli_call`', | |
| '', | |
| `One or more failed on this PR. \`scripts/regen_contract_drift.py\` produced the patch below (${locDelta} LOC, cap: 30).`, | |
| '', | |
| '**Files changed:**', | |
| '```', | |
| changedFiles || '(none)', | |
| '```', | |
| '', | |
| '### How to apply', | |
| '', | |
| 'Either run the regen script locally:', | |
| '', | |
| '```bash', | |
| 'uv run python scripts/regen_contract_drift.py --fixture all', | |
| 'git add -A && git commit -m "chore(ci): regenerate contract drift allow-lists"', | |
| 'git push', | |
| '```', | |
| '', | |
| 'Or apply the patch directly:', | |
| '', | |
| '```bash', | |
| `gh pr checkout ${prNumber}`, | |
| 'git apply <<\'PATCH\'', | |
| patch.trimEnd(), | |
| 'PATCH', | |
| 'git add -A && git commit -m "chore(ci): regenerate contract drift allow-lists"', | |
| 'git push', | |
| '```', | |
| '', | |
| truncated ? '_Patch truncated to fit comment limit; re-run regen locally for the full diff._' : '', | |
| '', | |
| '<details><summary>Full diff</summary>', | |
| '', | |
| '```diff', | |
| patch.trimEnd(), | |
| '```', | |
| '', | |
| '</details>', | |
| '', | |
| `**Source CI run:** ${runUrl}`, | |
| '', | |
| '_Refs #1273._', | |
| ].join('\n'); | |
| // Idempotency: edit the existing autofix comment in place | |
| // rather than spamming a new one on every push. | |
| // Paginate so the marker is still found on PRs with more than | |
| // 100 comments. | |
| const comments = await github.paginate(github.rest.issues.listComments, { | |
| owner: context.repo.owner, | |
| repo: context.repo.repo, | |
| issue_number: prNumber, | |
| per_page: 100, | |
| }); | |
| const existing = comments.find((c) => c.body && c.body.includes(marker)); | |
| if (existing) { | |
| await github.rest.issues.updateComment({ | |
| owner: context.repo.owner, | |
| repo: context.repo.repo, | |
| comment_id: existing.id, | |
| body, | |
| }); | |
| core.info(`Updated existing drift-patch comment #${existing.id}.`); | |
| } else { | |
| const { data: created } = await github.rest.issues.createComment({ | |
| owner: context.repo.owner, | |
| repo: context.repo.repo, | |
| issue_number: prNumber, | |
| body, | |
| }); | |
| core.info(`Created drift-patch comment #${created.id}.`); | |
| } | |
| - name: Fall back to issue when regen could not fix the drift | |
| if: > | |
| steps.drift.outputs.drift == 'true' && ( | |
| steps.verify.outputs.changed != 'true' || | |
| steps.reverify.outputs.status != '0' | |
| ) | |
| env: | |
| GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} | |
| run: | | |
| # Title is idempotent on PR number so re-runs don't spam issues. | |
| TITLE="Contract drift on PR #${{ github.event.pull_request.number }} could not be auto-patched" | |
| EXISTING=$(gh issue list --search "$TITLE in:title" --state open --json number --jq '.[0].number // ""') | |
| if [ -n "$EXISTING" ]; then | |
| echo "::notice::Issue #$EXISTING already tracks this drift - skipping." | |
| exit 0 | |
| fi | |
| gh issue create \ | |
| --title "$TITLE" \ | |
| --label ci,contract-drift,bot \ | |
| --body "Contract drift was detected on PR #${{ github.event.pull_request.number }} but \`scripts/regen_contract_drift.py\` could not produce a clean patch. | |
| Possible reasons: | |
| - Drift exceeded the 30-LOC safety cap (real semantic change, not drift). | |
| - Drift pattern is not one of the three the script handles. | |
| - Regen ran but the targeted tests still fail (regen bug). | |
| See workflow run ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }} for details. Refs #1273." |
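The `if:` conditions on the last three steps together decide which path a run takes. As a rough sketch, the same routing can be written as a single shell function. The function name `route_drift` and its argument order are hypothetical and not part of the workflow; the inputs correspond to the `drift`, `changed`, `status` (from the re-verify step), `is_fork`, and `pushed` step outputs used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mirrors the workflow's step conditions in one place.
# Arguments (strings, as they would appear in step outputs):
#   $1 drift     "true" when the drift tests failed
#   $2 changed   "true" when regen produced a diff
#   $3 reverify  exit status of the re-run drift tests ("0" on success)
#   $4 is_fork   "true" when the PR head repo differs from the base repo
#   $5 pushed    "true" when the inline push succeeded
route_drift() {
  local drift=$1 changed=$2 reverify=$3 is_fork=$4 pushed=$5

  if [ "$drift" != "true" ]; then
    # No drift detected: every later step is skipped.
    echo "noop"
  elif [ "$changed" != "true" ] || [ "$reverify" != "0" ]; then
    # Regen produced nothing, or the tests still fail: tracking issue.
    echo "issue"
  elif [ "$is_fork" = "false" ] && [ "$pushed" = "true" ]; then
    # Same-repo PR and the push landed: nothing else to do.
    echo "inline-push"
  else
    # Fork PR, or the inline push failed: post the patch as a comment.
    echo "comment"
  fi
}
```

For example, `route_drift true true 0 true ""` prints `comment`, because a fork PR never reaches the inline push. Keeping the three outcomes mutually exclusive like this is what prevents a run from both commenting and opening an issue.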