fix: Docker sandbox agent execution + role template rendering + --model propagation for non-Claude CLIs #1084

name: "Contract Drift Autofix"
# Refs: #1273 - CI hardening, bucket CI-E.
#
# Why this workflow exists
# ------------------------
# Three contract tests in tests/unit/ act as drift detectors:
#
#   * test_readme_api_coverage.py::test_all_cli_commands_are_documented
#     fails when a new top-level CLI command is registered but not added to
#     DOCUMENTED_COMMANDS.
#   * test_api_v1_routing.py::TestVersionedRoutesParity::test_every_root_route_has_v1_counterpart
#     fails when a new root-mounted FastAPI route is added without either a
#     /api/v1/* mirror or an entry in _INFRASTRUCTURE_PATHS.
#   * test_cli_run_params.py::test_run_params_match_cli_call
#     fails when run() gains a new parameter that the cli() click callback
#     does not forward.
#
# Every one of these is a 1-2 line allow-list / forward-arg edit. The fix is
# mechanical and well-suited for an autofix bot.
#
# Strategy (modernized, refs operator audit 2026-05-18)
# -----------------------------------------------------
# Inline-fix model: when drift is detected on a same-repo PR, commit the
# regen DIRECTLY to the source PR's head ref instead of opening a separate
# bot PR. This eliminates the class of orphaned bot-PR churn we previously
# saw (e.g. #1436, #1440, #1441, #1445), because:
#
#   * No separate PR means no separate review/merge step.
#   * The regen rides into main with the source PR, atomically.
#   * When the source PR's head branch is auto-deleted on merge, there is
#     no orphaned bot PR left re-targeting to main with a giant diff.
#
# Fallback paths (preserved for safety):
#   * Fork PRs (no push to head ref): post the patch as an idempotent PR
#     comment with copy-paste-able apply instructions.
#   * Unhandled drift / LOC-cap trip: open a tracking issue.
#   * Push to head ref fails (branch protection, lease conflict): post
#     the patch as a PR comment so the operator can apply locally.
#
# Recursion guards
# ----------------
#   * Skips when the PR author is the bot itself (no infinite loops).
#   * Skips PRs whose head ref starts with `bot/contract-drift-`
#     (legacy bot branches; ignored so they don't re-trigger).
#   * Diff size capped at 30 LOC by the regen script; larger diffs imply
#     a real change and are escalated by opening an issue instead.
#   * No `actions: write` permission is granted. The inline commit is
#     pushed with BOT_PAT to bypass the GITHUB_TOKEN recursion mute, so it
#     re-triggers the source PR's workflows, including this one. That
#     re-run does not loop: the commit is pushed only after the drift
#     tests pass on the regenerated tree, so the next run finds no drift
#     and makes no further commit.
#
# Secret requirements
# -------------------
#   * BOT_PAT (optional, recommended): a fine-grained PAT with
#     `contents: write` + `pull-requests: write` on this repo. When
#     present, the inline push triggers CI runs on the source PR
#     (GITHUB_TOKEN-authored commits are otherwise muted by GitHub's
#     recursion protection). When absent, the checkout token falls back
#     to GITHUB_TOKEN: the inline push then does not re-trigger CI, and
#     the PR-comment path remains the fallback if the push fails (the
#     operator applies the regen manually).
on:
  pull_request:
    types: [opened, synchronize]
concurrency:
  group: contract-drift-${{ github.event.pull_request.number }}
  cancel-in-progress: true
# Workflow-scope permissions: contents:write for the inline regen push to
# the source PR head ref, pull-requests:write for the comment-fallback
# path, issues:write for the tracking-issue fallback when regen cannot
# produce a clean patch. The autofix job re-asserts the same set at job
# scope (Scorecard token-permissions).
permissions:
  contents: write
  pull-requests: write
  issues: write
jobs:
  autofix:
    name: Detect and patch contract drift
    runs-on: ubuntu-latest
    timeout-minutes: 10
    permissions:
      contents: write
      pull-requests: write
      issues: write
    # Recursion guard: don't run on PRs the bot itself opened, and don't
    # run on legacy bot/contract-drift-* branches (kept for safety while
    # any old open bot PRs drain).
    if: >
      github.event.pull_request.user.login != 'github-actions[bot]' &&
      github.event.pull_request.user.login != 'bernstein[bot]' &&
      github.event.pull_request.user.login != 'bernstein-orchestrator[bot]' &&
      !startsWith(github.event.pull_request.head.ref, 'bot/contract-drift-')
    steps:
      - name: Harden runner (audit mode)
        uses: step-security/harden-runner@9af89fc71515a100421586dfdb3dc9c984fbf411 # v2.19.4
        with:
          egress-policy: audit
      - name: Detect fork PR
        id: forkcheck
        env:
          HEAD_REPO: ${{ github.event.pull_request.head.repo.full_name }}
          BASE_REPO: ${{ github.repository }}
        run: |
          # Fork PRs have head.repo != base repo. We cannot push to a fork's
          # head ref, so we fall back to the PR-comment path for those.
          if [ "$HEAD_REPO" != "$BASE_REPO" ]; then
            echo "is_fork=true" >> "$GITHUB_OUTPUT"
            echo "::notice::PR is from a fork ($HEAD_REPO); will use PR-comment fallback if drift detected."
          else
            echo "is_fork=false" >> "$GITHUB_OUTPUT"
          fi
      # persist-credentials kept: this job pushes the regen commit back to
      # the source PR's head ref (same-repo PRs only).
      - name: Checkout PR head
        uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7 # zizmor: ignore[artipacked]
        with:
          ref: ${{ github.event.pull_request.head.sha }}
          fetch-depth: 0
          # Use BOT_PAT if present so the inline push triggers CI on the
          # source PR. GITHUB_TOKEN-authored pushes are muted by GitHub.
          token: ${{ secrets.BOT_PAT || secrets.GITHUB_TOKEN }}
      - uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # v8.2.0
      - name: Sync project (dev group)
        run: uv sync --group dev
      - name: Run drift tests (capture failures, don't fail the job)
        id: drift
        run: |
          set +e
          uv run pytest \
            tests/unit/test_readme_api_coverage.py::test_all_cli_commands_are_documented \
            tests/unit/test_api_v1_routing.py::TestVersionedRoutesParity::test_every_root_route_has_v1_counterpart \
            tests/unit/test_cli_run_params.py::test_run_params_match_cli_call \
            --tb=short
          status=$?
          echo "status=$status" >> "$GITHUB_OUTPUT"
          if [ "$status" -eq 0 ]; then
            echo "drift=false" >> "$GITHUB_OUTPUT"
            echo "::notice::No contract drift detected."
          else
            echo "drift=true" >> "$GITHUB_OUTPUT"
            echo "::notice::Contract drift detected - running regen."
          fi
          exit 0
      - name: Regenerate drift fixtures
        if: steps.drift.outputs.drift == 'true'
        id: regen
        run: |
          set +e
          # The regen script returns 0 only when at least one fixture was
          # patched. Any non-zero exit means nothing was patched - either
          # because the drift wasn't one of the three known patterns or
          # because the diff exceeded the 30-LOC safety cap.
          uv run python scripts/regen_contract_drift.py --fixture all
          regen_status=$?
          echo "regen_status=$regen_status" >> "$GITHUB_OUTPUT"
          exit 0
      - name: Verify regen produced something
        if: steps.drift.outputs.drift == 'true'
        id: verify
        run: |
          if git diff --quiet; then
            echo "changed=false" >> "$GITHUB_OUTPUT"
            echo "::warning::Drift was detected but regen produced no diff. Likely an unhandled drift pattern or LOC-cap trip - opening tracking issue."
          else
            echo "changed=true" >> "$GITHUB_OUTPUT"
            # Compute a coarse summary of what was regenerated.
            CHANGED_FILES=$(git diff --name-only | sort -u)
            {
              echo "changed_files<<EOF"
              echo "$CHANGED_FILES"
              echo "EOF"
            } >> "$GITHUB_OUTPUT"
            # Compute LOC delta (added+removed) for the safety report.
            # `git diff --shortstat` prints a line such as
            # " 2 files changed, 3 insertions(+), 1 deletion(-)" and omits a
            # side that has no changes, hence the `|| echo "0"` fallbacks.
            STAT=$(git diff --shortstat)
            INS=$(echo "$STAT" | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
            DEL=$(echo "$STAT" | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0")
            LOC=$((INS + DEL))
            echo "loc_delta=${LOC}" >> "$GITHUB_OUTPUT"
            echo "::notice::Regen produced ${LOC:-0} LOC of changes across:"
            while IFS= read -r f; do
              [ -z "$f" ] && continue
              echo " $f"
            done <<< "$CHANGED_FILES"
            # Persist the diff for the comment-fallback path.
            git diff > "${RUNNER_TEMP}/contract-drift.patch"
          fi
      - name: Re-run drift tests to confirm regen fixed them
        if: steps.drift.outputs.drift == 'true' && steps.verify.outputs.changed == 'true'
        id: reverify
        run: |
          set +e
          uv run pytest \
            tests/unit/test_readme_api_coverage.py::test_all_cli_commands_are_documented \
            tests/unit/test_api_v1_routing.py::TestVersionedRoutesParity::test_every_root_route_has_v1_counterpart \
            tests/unit/test_cli_run_params.py::test_run_params_match_cli_call \
            --tb=short
          status=$?
          echo "status=$status" >> "$GITHUB_OUTPUT"
          if [ "$status" -ne 0 ]; then
            echo "::warning::Regen ran but drift tests still fail. Bot will open a tracking issue."
          fi
          exit 0
      - name: Commit regen inline to source PR head ref
        # Primary path. Same-repo PRs receive the regen patch directly on
        # their head branch via git push, so the autofix rides into main
        # with the source PR. No separate bot PR is opened.
        if: >
          steps.drift.outputs.drift == 'true' &&
          steps.verify.outputs.changed == 'true' &&
          steps.reverify.outputs.status == '0' &&
          steps.forkcheck.outputs.is_fork == 'false'
        id: inline_push
        continue-on-error: true
        env:
          PR_HEAD_REF: ${{ github.event.pull_request.head.ref }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
        run: |
          set -euo pipefail
          # Configure a stable bot identity for the commit. The identity is
          # intentionally distinct from any real user; the commit author
          # uses the GitHub-actions noreply email so commits attribute to
          # the bot, not the workflow's PAT owner.
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add -A
          git commit -m "chore(ci): regenerate contract drift allow-lists

          Auto-applied by contract-drift-autofix.yml on PR #${PR_NUMBER}.
          Regenerated via scripts/regen_contract_drift.py. Refs #1273.
          Source CI run: ${RUN_URL}"
          # --force-with-lease is the safe push variant. The lease is pinned
          # to the commit this job checked out (now HEAD~1): without an
          # explicit expected value git compares against the remote-tracking
          # ref, which reflects the fetch rather than the checked-out SHA, so
          # a push that landed before the fetch could be clobbered. With the
          # explicit value the push is rejected whenever the remote head is
          # no longer that commit, so a concurrent push from the PR author
          # isn't clobbered. If the lease check fails the comment-fallback
          # path will fire to surface the patch.
          git push --force-with-lease="${PR_HEAD_REF}:$(git rev-parse HEAD~1)" origin "HEAD:${PR_HEAD_REF}"
          echo "pushed=true" >> "$GITHUB_OUTPUT"
      - name: Post drift patch as PR comment (fallback)
        # Fallback path. Fires when:
        #   * the PR is from a fork (no push access to head ref), OR
        #   * the inline push step failed (lease conflict, branch protection,
        #     transient API outage).
        # Body is idempotent on a hidden marker so re-runs edit the same
        # comment rather than spamming the PR.
        if: >
          steps.drift.outputs.drift == 'true' &&
          steps.verify.outputs.changed == 'true' &&
          steps.reverify.outputs.status == '0' && (
          steps.forkcheck.outputs.is_fork == 'true' ||
          steps.inline_push.outcome == 'failure' ||
          steps.inline_push.outputs.pushed != 'true'
          )
        id: comment
        uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
        env:
          PR_NUMBER: ${{ github.event.pull_request.number }}
          CHANGED_FILES: ${{ steps.verify.outputs.changed_files }}
          LOC_DELTA: ${{ steps.verify.outputs.loc_delta }}
          RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
          IS_FORK: ${{ steps.forkcheck.outputs.is_fork }}
          INLINE_OUTCOME: ${{ steps.inline_push.outcome }}
        with:
          script: |
            const fs = require('fs');
            const path = require('path');
            const marker = '<!-- contract-drift-autofix:patch -->';
            const prNumber = Number(process.env.PR_NUMBER);
            const changedFiles = (process.env.CHANGED_FILES || '').trim();
            const locDelta = process.env.LOC_DELTA || '?';
            const runUrl = process.env.RUN_URL;
            const isFork = process.env.IS_FORK === 'true';
            const inlineOutcome = process.env.INLINE_OUTCOME || '';
            const reason = isFork
              ? 'PR is from a fork; the bot cannot push to fork head refs, so apply the patch below manually.'
              : `Inline autofix push failed (\`${inlineOutcome || 'skipped'}\`). Apply the patch below manually.`;
            const patchPath = path.join(process.env.RUNNER_TEMP, 'contract-drift.patch');
            let patch = fs.readFileSync(patchPath, 'utf-8');
            // Defensive cap: GitHub comment body limit is 65536 chars.
            // The patch is embedded twice in the body (apply snippet and
            // full diff), so each copy is capped at 30000 chars. The regen
            // script caps the diff at 30 LOC so we are nowhere near this,
            // but keep a guard in case the cap is ever raised.
            const PATCH_CAP = 30000;
            let truncated = false;
            if (patch.length > PATCH_CAP) {
              patch = patch.slice(0, PATCH_CAP);
              truncated = true;
            }
            const body = [
              marker,
              '## Contract drift detected - proposed patch',
              '',
              reason,
              '',
              'Three contract tests act as drift detectors against the public CLI / API surface:',
              '',
              '- `tests/unit/test_readme_api_coverage.py::test_all_cli_commands_are_documented`',
              '- `tests/unit/test_api_v1_routing.py::TestVersionedRoutesParity::test_every_root_route_has_v1_counterpart`',
              '- `tests/unit/test_cli_run_params.py::test_run_params_match_cli_call`',
              '',
              `One or more failed on this PR. \`scripts/regen_contract_drift.py\` produced the patch below (${locDelta} LOC, cap: 30).`,
              '',
              '**Files changed:**',
              '```',
              changedFiles || '(none)',
              '```',
              '',
              '### How to apply',
              '',
              'Either run the regen script locally:',
              '',
              '```bash',
              'uv run python scripts/regen_contract_drift.py --fixture all',
              'git add -A && git commit -m "chore(ci): regenerate contract drift allow-lists"',
              'git push',
              '```',
              '',
              'Or apply the patch directly:',
              '',
              '```bash',
              `gh pr checkout ${prNumber}`,
              'git apply <<\'PATCH\'',
              patch.trimEnd(),
              'PATCH',
              'git add -A && git commit -m "chore(ci): regenerate contract drift allow-lists"',
              'git push',
              '```',
              '',
              truncated ? '_Patch truncated to fit comment limit; re-run regen locally for the full diff._' : '',
              '',
              '<details><summary>Full diff</summary>',
              '',
              '```diff',
              patch.trimEnd(),
              '```',
              '',
              '</details>',
              '',
              `**Source CI run:** ${runUrl}`,
              '',
              '_Refs #1273._',
            ].join('\n');
            // Idempotency: edit the existing autofix comment in place
            // rather than spamming a new one on every push. Paginate so
            // the marker is still found on PRs with more than 100 comments.
            const comments = await github.paginate(github.rest.issues.listComments, {
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: prNumber,
              per_page: 100,
            });
            const existing = comments.find((c) => c.body && c.body.includes(marker));
            if (existing) {
              await github.rest.issues.updateComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                comment_id: existing.id,
                body,
              });
              core.info(`Updated existing drift-patch comment #${existing.id}.`);
            } else {
              const { data: created } = await github.rest.issues.createComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                issue_number: prNumber,
                body,
              });
              core.info(`Created drift-patch comment #${created.id}.`);
            }
      - name: Fall back to issue when regen could not fix the drift
        if: >
          steps.drift.outputs.drift == 'true' && (
          steps.verify.outputs.changed != 'true' ||
          steps.reverify.outputs.status != '0'
          )
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          # Title is idempotent on PR number so re-runs don't spam issues.
          TITLE="Contract drift on PR #${{ github.event.pull_request.number }} could not be auto-patched"
          EXISTING=$(gh issue list --search "$TITLE in:title" --state open --json number --jq '.[0].number // ""')
          if [ -n "$EXISTING" ]; then
            echo "::notice::Issue #$EXISTING already tracks this drift - skipping."
            exit 0
          fi
          gh issue create \
            --title "$TITLE" \
            --label ci,contract-drift,bot \
            --body "Contract drift was detected on PR #${{ github.event.pull_request.number }} but \`scripts/regen_contract_drift.py\` could not produce a clean patch.

          Possible reasons:

          - Drift exceeded the 30-LOC safety cap (real semantic change, not drift).
          - Drift pattern is not one of the three the script handles.
          - Regen ran but the targeted tests still fail (regen bug).

          See workflow run ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }} for details. Refs #1273."