Skip to content

Commit ad56b89

Browse files
jpayne3506Copilot
andcommitted
Address copilot-pull-request-reviewer findings
Four issues raised by the GitHub Copilot bot review on PR #4440: 1. Per-workflow run-ID disambiguation. Discovery now binds GOVULNCHECK_RUN_ID / GOVULNCHECK_RUN_URL / GOVULNCHECK_SOURCE_SHA / GOVULNCHECK_SOURCE_BRANCH and BASEIMAGES_* separately, plus PRIMARY_SOURCE_SHA / PRIMARY_RUN_URL for downstream sections that need a single canonical reference (fix branch name, Fix-PR body). The previous single RUN_ID made multi-workflow invocations ambiguous and could feed the wrong log into the govulncheck version-banner parser. 2. Govulncheck version detection now reads GOVULNCHECK_RUN_ID explicitly, so it parses the right run's log even when both workflows failed in the same invocation. 3. Drop the 'gh pr view --json headRefOid' fallback in Fix-mode setup step 3. PRIMARY_SOURCE_SHA from Discovery (which comes from the workflow run's head_sha field) is the failing run's exact head SHA. The PR-view fallback returned the PR's CURRENT head, which can differ from the failing run after a force-push - directly violating the section's own 'exact head SHA' guarantee. 4. Govulncheck playbook: replace 'git add -A' with explicit allowlist-only staging per touched module. The BPF setup step (make bpf-lib + go generate ./...) can regenerate bpf2go output files (*_bpfel.go, *_bpfeb.go) under the module package; under git add -A those generated files would land in the commit alongside go.mod/go.sum, silently breaching the govulncheck allowlist. Also added 'git checkout -- . && git clean -fd' after the commit so the baseimages playbook starts from a clean tree (its first_diff check would otherwise observe leftover BPF artifacts as drift). Baseimages playbook now asserts a clean tree at start as a defensive precondition. Track FIXED_MODULES across the per-module loop so commit-time staging knows which paths to add. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent fd28f13 commit ad56b89

1 file changed

Lines changed: 120 additions & 35 deletions

File tree

.github/agents/ci-mx.md

Lines changed: 120 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -115,26 +115,56 @@ The flow is: (1) find the workflow's most recent failure regardless of
115115
branch, then (2) decide whether that failure applies to `$target_branch`
116116
by reading the target's actual contents.
117117

118-
### 1. Workflow-scoped query
118+
### 1. Workflow-scoped query (per workflow)
119119

120120
For each in-scope workflow, fetch the most recent failed run regardless
121-
of which branch it ran on:
121+
of which branch it ran on. **Bind separate variables per workflow**
122+
both can fail in the same invocation, and downstream steps need to
123+
distinguish them (e.g., the govulncheck version banner is in the
124+
govulncheck run's log, not the baseimages run's).
122125

123126
```bash
124-
# Govulncheck (repeat for baseimages.yaml).
125-
read RUN_ID run_url SOURCE_SHA SOURCE_BRANCH < <(
127+
# Govulncheck.
128+
read GOVULNCHECK_RUN_ID GOVULNCHECK_RUN_URL \
129+
GOVULNCHECK_SOURCE_SHA GOVULNCHECK_SOURCE_BRANCH < <(
126130
gh api "/repos/$GH_OWNER/$GH_REPO/actions/workflows/govulncheck.yaml/runs?per_page=20&status=failure" \
127131
--jq '.workflow_runs[0] | "\(.id) \(.html_url) \(.head_sha) \(.head_branch)"'
128132
)
129-
# When the user supplied an explicit run URL or ID, use it directly
130-
# (preserving its head_sha and head_branch).
133+
134+
# Baseimages.
135+
read BASEIMAGES_RUN_ID BASEIMAGES_RUN_URL \
136+
BASEIMAGES_SOURCE_SHA BASEIMAGES_SOURCE_BRANCH < <(
137+
gh api "/repos/$GH_OWNER/$GH_REPO/actions/workflows/baseimages.yaml/runs?per_page=20&status=failure" \
138+
--jq '.workflow_runs[0] | "\(.id) \(.html_url) \(.head_sha) \(.head_branch)"'
139+
)
140+
141+
# When the user supplied an explicit run URL or ID, set the matching
142+
# pair from `gh run view <id> --json databaseId,url,headSha,headBranch`
143+
# and clear the other pair if it was incidental.
131144
```
132145

133-
`$SOURCE_BRANCH` / `$SOURCE_SHA` describe **where** the failure was
146+
`$*_SOURCE_BRANCH` / `$*_SOURCE_SHA` describe **where** each failure was
134147
first observed — usually a feature branch or a merge-queue ref. They
135148
are NOT the target the agent fixes; they're the evidence we reason
136149
from.
137150

151+
For downstream sections that need a single canonical reference (the
152+
fix-branch name and the Fix-PR body), define:
153+
154+
```bash
155+
# Most-recent failure SHA across both workflows. The fix is applied at
156+
# this SHA so the fix PR cleanly applies to the most current known
157+
# broken state.
158+
if [ -n "$GOVULNCHECK_SOURCE_SHA" ] && [ -n "$BASEIMAGES_SOURCE_SHA" ]; then
159+
# Prefer whichever run is newer (compare via createdAt from above).
160+
PRIMARY_SOURCE_SHA="$GOVULNCHECK_SOURCE_SHA" # or BASEIMAGES_SOURCE_SHA
161+
PRIMARY_RUN_URL="$GOVULNCHECK_RUN_URL" # or BASEIMAGES_RUN_URL
162+
else
163+
PRIMARY_SOURCE_SHA="${GOVULNCHECK_SOURCE_SHA:-$BASEIMAGES_SOURCE_SHA}"
164+
PRIMARY_RUN_URL="${GOVULNCHECK_RUN_URL:-$BASEIMAGES_RUN_URL}"
165+
fi
166+
```
167+
138168
### 2. Tiebreaker: prefer a newer target-scoped signal if it exists
139169

140170
A run scoped directly to `$target_branch` that's **newer** than the
@@ -146,10 +176,14 @@ read TARGET_RUN_ID target_run_conclusion < <(
146176
--limit 1 --json databaseId,conclusion,createdAt \
147177
--jq '.[0] | "\(.databaseId) \(.conclusion)"'
148178
)
149-
# If TARGET_RUN_ID exists and its createdAt > SOURCE workflow run's createdAt:
150-
# - conclusion=success → DEFINITIVE NEGATIVE: workflow is healthy on target.
151-
# - conclusion=failure → DEFINITIVE POSITIVE: re-anchor RUN_ID, run_url,
152-
# SOURCE_SHA, SOURCE_BRANCH to the target run; proceed.
179+
# If TARGET_RUN_ID exists and its createdAt > the matching workflow run's
180+
# createdAt:
181+
# - conclusion=success → DEFINITIVE NEGATIVE: workflow is healthy on
182+
# target.
183+
# - conclusion=failure → DEFINITIVE POSITIVE: re-anchor that
184+
# workflow's *_RUN_ID, *_RUN_URL, *_SOURCE_SHA, *_SOURCE_BRANCH
185+
# (and PRIMARY_* if applicable) to the target run; proceed.
186+
# Repeat for baseimages.yaml.
153187
```
154188

155189
This also handles fork PRs: when `--branch` returns nothing (fork heads
@@ -176,16 +210,16 @@ reads (base64-decoded). The conclusion bucket per failure is one of:
176210

177211
| Signal on `$target_branch` | Bucket |
178212
|---|---|
179-
| Target-scoped run newer than `$RUN_ID` and passed | `does-not-apply` |
213+
| Target-scoped run newer than the matching `*_RUN_ID` and passed | `does-not-apply` |
180214
| Target-scoped run newer and also failed | `fixable` (re-anchor to target run) |
181-
| No newer target-scoped run; target's render-input files (`build/images.mk`, every `*/Dockerfile.tmpl`, every `*/manifests/*` referenced by the renderkit) are **byte-identical** to those on `$SOURCE_SHA` (compare SHAs via `gh api /git/trees/{sha}?recursive=1` or per-file blob SHAs) | `fixable` (inferred positive — same templates + same external images = same render diff) |
182-
| Render-input files differ between target and `$SOURCE_SHA` | `needs-probe` (drift may or may not produce a diff on target — only an actual render can tell) |
215+
| No newer target-scoped run; target's render-input files (`build/images.mk`, every `*/Dockerfile.tmpl`, every `*/manifests/*` referenced by the renderkit) are **byte-identical** to those on `$BASEIMAGES_SOURCE_SHA` (compare SHAs via `gh api /git/trees/{sha}?recursive=1` or per-file blob SHAs) | `fixable` (inferred positive — same templates + same external images = same render diff) |
216+
| Render-input files differ between target and `$BASEIMAGES_SOURCE_SHA` | `needs-probe` (drift may or may not produce a diff on target — only an actual render can tell) |
183217

184218
#### Govulncheck applicability table (per finding)
185219

186220
Each finding identifies a vulnerable module path, vulnerable version
187221
range, fixed version, package, and **proven call-graph reachability on
188-
`$SOURCE_SHA`**. For each finding, read `$target_branch`'s
222+
`$GOVULNCHECK_SOURCE_SHA`**. For each finding, read `$target_branch`'s
189223
`<matrix-module>/go.mod` and `<matrix-module>/go.sum`:
190224

191225
| Signal on `$target_branch` | Bucket |
@@ -194,7 +228,7 @@ range, fixed version, package, and **proven call-graph reachability on
194228
| Target-scoped run newer and reports the same finding | `fixable` (re-anchor) |
195229
| Vulnerable module path is **not** required by target's `go.mod` and is **absent** from target's `go.sum` | `does-not-apply` |
196230
| Target's `go.sum` resolves the vulnerable module to a version **outside** the vulnerable range | `does-not-apply` |
197-
| Target's `go.sum` resolves the vulnerable module to a version **inside** the vulnerable range AND the diff between `$SOURCE_SHA` and target's HEAD does **not** touch the affected packages | `fixable` (reachability proven on source carries over to target) |
231+
| Target's `go.sum` resolves the vulnerable module to a version **inside** the vulnerable range AND the diff between `$GOVULNCHECK_SOURCE_SHA` and target's HEAD does **not** touch the affected packages | `fixable` (reachability proven on source carries over to target) |
198232
| Same as above, but the source↔target diff **does** touch the affected packages | `needs-probe` (reachability may have changed; fix mode's post-bump re-run will prove or disprove) |
199233

200234
The remaining classification rules still apply to whatever survives as
@@ -220,7 +254,7 @@ version in the first few log lines:
220254

221255
```bash
222256
GOVULNCHECK_VERSION=$(
223-
gh run view "$RUN_ID" --log-failed \
257+
gh run view "$GOVULNCHECK_RUN_ID" --log-failed \
224258
| grep -oE 'govulncheck@v[0-9][0-9.]*' | head -1 \
225259
| sed 's/^govulncheck@//'
226260
)
@@ -234,7 +268,7 @@ isn't present.
234268

235269
Read-only. For every in-scope workflow with a current failure, emit:
236270

237-
- Workflow name, run URL, head SHA (`$SOURCE_BRANCH` + `$SOURCE_SHA`).
271+
- Workflow name, run URL, head SHA (`$*_SOURCE_BRANCH` + `$*_SOURCE_SHA`).
238272
- Per failing job, the **applicability bucket** from Discovery:
239273
- `fixable` — include the exact `go get <module>@<fixed>` command(s)
240274
or the `make dockerfiles` action that fix mode would run, plus a
@@ -282,15 +316,20 @@ if [ -n "$source_pr_number" ]; then
282316
fi
283317
fi
284318

285-
# 3. Require the run's exact head SHA. Do NOT fall back to the branch tip:
286-
# the branch may have been force-pushed since the failing run.
287-
if [ -z "${RUN_HEAD_SHA:-}" ] && [ -n "$source_pr_number" ]; then
288-
RUN_HEAD_SHA=$(gh pr view "$source_pr_number" --json headRefOid \
289-
-q .headRefOid)
319+
# 3. Use the failing run's exact head SHA. Discovery already bound
320+
# PRIMARY_SOURCE_SHA from the workflow run's head_sha — that is the
321+
# exact commit the workflow ran against. Do NOT fall back to
322+
# `gh pr view --json headRefOid` (the PR's CURRENT head), because if
323+
# the PR was force-pushed after the failing run, those two SHAs
324+
# differ and we would fix the wrong commit.
325+
target_head_sha="${PRIMARY_SOURCE_SHA:-}"
326+
if [ -z "$target_head_sha" ] && [ -n "${USER_PROVIDED_RUN_ID:-}" ]; then
327+
# User supplied a run URL/ID without Discovery; resolve directly.
328+
target_head_sha=$(gh run view "$USER_PROVIDED_RUN_ID" \
329+
--json headSha -q .headSha)
290330
fi
291-
[ -n "${RUN_HEAD_SHA:-}" ] || \
292-
{ echo "stop:input-invalid (no run head SHA)"; exit 1; }
293-
target_head_sha="$RUN_HEAD_SHA"
331+
[ -n "$target_head_sha" ] || \
332+
{ echo "stop:input-invalid (could not resolve failing run's head SHA)"; exit 1; }
294333

295334
# 4. Fetch the ref so the SHA is present locally.
296335
if [ -n "$source_pr_number" ]; then
@@ -301,15 +340,15 @@ fi
301340
git cat-file -e "$target_head_sha^{commit}" \
302341
|| { echo "stop:input-invalid (could not fetch $target_head_sha)"; exit 1; }
303342

304-
# 5. Names. Include RUN_ID for collision-avoidance.
343+
# 5. Names. Include the primary run ID for collision-avoidance.
305344
short_sha=${target_head_sha:0:8}
306-
fix_branch="ci-mx/fix-${source_pr_number:-$short_sha}-${RUN_ID:-$$}"
345+
fix_branch="ci-mx/fix-${source_pr_number:-$short_sha}-${GOVULNCHECK_RUN_ID:-${BASEIMAGES_RUN_ID:-$$}}"
307346

308347
# 6. Capture main_repo BEFORE entering the worktree, so teardown is reliable.
309348
main_repo="$(git rev-parse --show-toplevel)"
310349

311350
# 7. Worktree as sibling of the repo (not inside .git/).
312-
work_dir="$(dirname "$main_repo")/ci-mx-work-${RUN_ID:-$$}"
351+
work_dir="$(dirname "$main_repo")/ci-mx-work-${GOVULNCHECK_RUN_ID:-${BASEIMAGES_RUN_ID:-$$}}"
313352

314353
git worktree add --detach "$work_dir" "$target_head_sha"
315354
cd "$work_dir"
@@ -348,10 +387,20 @@ Run only for jobs Discovery classified as `fixable`. If **any** job in the
348387
failing matrix is `stop:*`, do not start the playbook — report all
349388
blockers and exit. No partial fixes.
350389

390+
Track every module the playbook actually touches in a shell array so
391+
the commit step can stage allowlisted paths explicitly:
392+
393+
```bash
394+
FIXED_MODULES=() # populated per successful module below
395+
```
396+
351397
For each `fixable` matrix module (in matrix order):
352398

353399
1. `cd "$work_dir/<module>"` (`.` means `$work_dir`).
354-
2. If the module is BPF, mirror the workflow setup:
400+
2. If the module is BPF, mirror the workflow setup so `govulncheck`
401+
can load the package. These regenerated artifacts (e.g.
402+
`*_bpfel.go`, `*_bpfeb.go`) are build-verification side-effects,
403+
**not** allowlisted edits — they must not be committed:
355404
```bash
356405
( cd "$work_dir" && make bpf-lib )
357406
go generate ./...
@@ -396,20 +445,53 @@ For each `fixable` matrix module (in matrix order):
396445
```
397446
Findings remain → run Cleanup snippet, then
398447
`stop:unfixable (post-bump findings)`.
399-
400-
After all `fixable` modules complete:
448+
10. Module succeeded; record it for staging:
449+
```bash
450+
module_path="$(realpath --relative-to="$work_dir" .)"
451+
FIXED_MODULES+=("$module_path")
452+
```
453+
454+
After all `fixable` modules complete, commit **only the allowlisted
455+
paths** for each touched module. `git add -A` is **forbidden** here — it
456+
would sweep in BPF regen output (`*_bpfel.go`, `*_bpfeb.go`) and any
457+
other build-verification artifacts, violating the govulncheck
458+
allowlist:
401459

402460
```bash
403461
cd "$work_dir"
404-
git add -A
462+
for mod in "${FIXED_MODULES[@]}"; do
463+
git add "$mod/go.mod" "$mod/go.sum"
464+
[ -d "$mod/vendor" ] && git add "$mod/vendor"
465+
done
405466
git commit -m "fix(deps): resolve govulncheck findings"
467+
468+
# Reset every other tracked path the BPF setup may have touched, and
469+
# remove any untracked artifacts it produced, so the baseimages
470+
# playbook starts from a clean tree.
471+
git checkout -- .
472+
git clean -fd
406473
```
407474

408475
## Baseimages playbook (only in `op_mode=fix`)
409476

410477
Assumes Fix-mode setup is done. Runs in the same `$work_dir`; if both
411478
playbooks run, commits stack on the same `$fix_branch`.
412479

480+
If the govulncheck playbook ran first, it already left the tree clean
481+
(`git checkout -- .` + `git clean -fd` after its commit), so the
482+
`first_diff` check below faithfully reflects only `make dockerfiles`'s
483+
output. **If you skip govulncheck**, assert a clean tree before step
484+
2 — anything left over from a prior step would corrupt `first_diff`:
485+
486+
```bash
487+
cd "$work_dir"
488+
if [ -n "$(git status --porcelain)" ]; then
489+
echo "stop:env-broken (baseimages playbook requires a clean tree at start)"
490+
# Run the Cleanup snippet, then:
491+
exit 1
492+
fi
493+
```
494+
413495
1. Preflight: confirm `go` and `skopeo` on `PATH`. Missing → run Cleanup
414496
snippet, then `stop:env-broken (missing tooling: <name>)`.
415497
2. From `$work_dir`:
@@ -430,7 +512,10 @@ playbooks run, commits stack on the same `$fix_branch`.
430512
exit 1
431513
fi
432514
```
433-
5. Commit:
515+
5. Commit. `git add -A` is acceptable here because `make dockerfiles`
516+
only writes rendered Dockerfile outputs into known paths, and the
517+
pre-step tree-clean assertion above guarantees nothing else is in
518+
play:
434519
```bash
435520
git add -A
436521
git commit -m "chore(images): re-render Dockerfiles"
@@ -451,7 +536,7 @@ fix_pr_title="ci-mx: fix CI failures on $target_branch"
451536
fix_pr_body=$(cat <<EOF
452537
Automated fix from \`ci-mx\` for CI failures on \`$target_branch\` at \`$target_head_sha\`.
453538
454-
- Failing run: $run_url
539+
- Failing run: $PRIMARY_RUN_URL
455540
- Verified by re-running the failing CI check locally per the ci-mx contract.
456541
457542
Scope: govulncheck dependency bumps and/or \`make dockerfiles\` re-render

0 commit comments

Comments
 (0)