ci: cron-poll @claude trigger fallback + GraphQL union:false stuck-detector bypass#372
ci: cron-poll @claude trigger fallback + GraphQL union:false stuck-detector bypass#372
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the cron-driven “Copilot Clean Label” workflow to more reliably drive the Copilot → @claude auto-fix loop, adding a cron-poll fallback when event-driven triggers are suppressed and improving the stuck-detector logic to bypass GraphQL dedup behavior.
Changes:
- Increase cron cadence to every 5 minutes and adjust the stuck-detector threshold to keep effective refire pressure roughly unchanged.
- Add a cron-poll fallback that posts an
@claudetrigger when Copilot’s latest review on HEAD has actual inline comments and no prior HEAD-SHA marker trigger exists. - Implement a stuck-detector bypass that clears Copilot from reviewRequests via
requestReviews(union:false)(preserving humans/teams) and then re-adds Copilot after a short delay.
|
@claude Follow the procedure in Phase 2: Resolve all unresolved Copilot review threads (discover → classify → fix → build/test/lint → commit/push → reply/resolve). |
|
Claude finished @kotakanbe's task in 9m 12s —— View job Copilot Review Resolution (
|
| # | File | Line | Classification | Action |
|---|---|---|---|---|
| 1 | .github/workflows/copilot-clean-label.yml |
311 | FIX | Use quoted heredoc (<<'BODY') to prevent accidental expansion; append marker separately |
| 2 | .github/workflows/copilot-clean-label.yml |
55 | FIX | Clarify that PAT is effective auth; permissions: only scopes GITHUB_TOKEN |
Threads resolved: 2/2
Push status: Cannot push workflow file changes (PAT lacks workflow scope). Fix implementations described in thread replies for manual application.
Rules Learned
| Pattern | Status | Detail |
|---|---|---|
| comment-doc-drift | Promoted → copilot-learned-coding.instructions.md |
2 entries across PRs #359, #372 → "Attribute Control Claims to the Correct Mechanism" |
| defensive-coding | Accumulated (8 pending) | New entry: quoted heredocs for CI literal text (PR #372) |
Commit: 37f73d1 (pattern log + rule promotion only)
Note to PR author: The two workflow-file fixes (quoted heredoc + permission comment clarification) were implemented and validated locally (bash syntax ✓) but cannot be pushed due to the token's missing workflow scope. The fix descriptions are in the review thread replies — please apply manually or grant the workflow scope for future runs.
| Branch
|
@claude Follow the procedure in Phase 2: Resolve all unresolved Copilot review threads (discover → classify → fix → build/test/lint → commit/push → reply/resolve). |
…tector bypass Sync `copilot-clean-label.yml` with the version battle-tested in vuls-saas/vuls-reach. Three substantive changes: 1. **Cron-poll @claude trigger fallback.** The event-driven path in `copilot-review-fix.yml` is unreliable under the agentic-architecture GITHUB_TOKEN event suppression (`pull_request_review` and `pull_request_review_comment` may not fire even when Copilot posts inline review comments). The cron now posts the `@claude` trigger itself when: - the latest Copilot review on HEAD is `dirty` (has actual inline comments — verified via the per-review comments endpoint to skip summary-only events) - AND the HEAD-SHA marker dedup query shows no prior trigger comment The marker contract (`<!-- copilot-fix-trigger:<HEAD_SHA> -->`) matches the one in `copilot-review-fix.yml` so both paths dedup against the same key. Marker auto-invalidates on push. 2. **GraphQL `requestReviews(union:false)` stuck-detector bypass.** When the push-event mutation in `copilot-rereview-on-push.yml` is silently deduped (200 OK but no `review_requested` event fires), a plain refire is also deduped. The bypass now first clears Copilot from the reviewer slot via GraphQL `union:false` (preserving humans + teams via explicit user/team ID replay; bots excluded by `botIds:[]`), pauses 2s for replication, then re-adds Copilot. An earlier draft used REST `DELETE /pulls/{n}/requested_reviewers`, but REST only addresses User reviewers — Bot reviewers live in a separate slot the REST endpoint cannot see (`HTTP 422 Reviewer is not a user`). GraphQL is the only API that can manipulate bot reviewer slots. Safety: if the reviewer-set parse fails, the clear is *skipped* to avoid silently removing every human reviewer; only the re-add runs, which may still hit dedup but is no worse than before. 3. **Faster cadence + adjusted threshold.** Cron 10min → 5min with `STUCK_THRESHOLD_MIN` 5 → 7. Effective refire pressure on Copilot is unchanged (~10min between refires) but @claude trigger latency is halved. Other hardening: - `head.repo.full_name == github.repository` guard (skip fork PRs). - `issues: write` permission for `gh pr comment` and label edits. - Drop redundant `gh pr view --json labels` fetch — labels are in the pulls metadata response already. - Stop merging `2>&1` into `gh api` capture variables (was poisoning jq parsing on success paths with non-fatal stderr warnings). Phase 2/3/4 layout in the cron-fallback @claude body is preserved (`make sync-instructions` Phase 3 stays, OSS-specific). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
37f73d1 to
2f5843e
Compare
|
@claude Follow the procedure in Phase 2: Resolve all unresolved Copilot review threads (discover → classify → fix → build/test/lint → commit/push → reply/resolve). |
Copilot Review Resolution (
|
| # | Thread | Line | Classification | Rationale |
|---|---|---|---|---|
| 1 | cr3Y |
55 | FIX | Comment misleadingly claims issues: write is "Required for" gh commands when PAT is actual auth |
| 2 | cr4J |
310 | WONT_FIX | YAML literal block scalar strips 10 leading spaces before bash runs — body starts with @claude at column 0 |
| 3 | cr4e |
311 | FIX | Unquoted heredoc <<BODY enables expansion; switched to <<'BODY' with separate marker append |
Committing and pushing...
- Promote defensive-coding category to coding rules (PRs #359, #372): - "Guard Resource Bounds in HTTP Client Retry Logic" (time.Duration overflow, io.LimitReader cap, time.NewTimer lifecycle) - "Use Quoted Heredocs for Literal Text in CI Workflows" (prevent accidental parameter expansion) - Remove 4 promoted defensive-coding entries from pending patterns Co-authored-by: Kota Kanbe <kotakanbe@users.noreply.github.com>
|
@claude Follow the procedure in Phase 2: Resolve all unresolved Copilot review threads (discover → classify → fix → build/test/lint → commit/push → reply/resolve). |
Copilot Review Resolution (
|
| # | Thread | Line | Classification | Rationale |
|---|---|---|---|---|
| 1 | c-Z9 |
55 | FIX | Comment misleadingly claims issues: write is "Required for" gh commands when PAT is actual auth |
| 2 | c-ar |
311 | FIX | Unquoted heredoc <<BODY enables expansion; switched to <<'BODY' with separate marker append |
Threads resolved: 2/2 (fix descriptions in thread replies, cannot push workflow changes)
Running Phase 3...
Two Copilot review findings that CI Claude could not resolve because its PAT (GH_ACTIONS_TOKEN) lacks the `workflow` scope by design: 1. Permission comment misattribution (line ~55): `issues: write` is documented but the job authenticates `gh` via the PAT in GH_TOKEN, so the workflow `permissions:` block does not control the effective auth for `gh pr edit` / `gh pr comment`. Reword the comment to clarify this entry documents the GITHUB_TOKEN scope that would be required if the PAT were removed. 2. Unquoted heredoc body (cron-poll @claude trigger fallback): Switch `<<BODY` to `<<'BODY'` to disable parameter expansion and command substitution inside the literal instruction block, and append the dynamic `$trigger_marker` separately after the heredoc so future edits cannot accidentally introduce executable `$()`/backtick paths. Removed the `\`...\`` backslash escapes since the quoted heredoc treats backticks as literal characters. Verified post-YAML-dedent that the resulting body still satisfies the cron dedup query: `.body | startswith("@claude")` and `.body | contains($trigger_marker)` both match. Why this commit was made by hand rather than the CI Claude loop: the modified file IS a workflow, and `workflow` scope is intentionally withheld from the auto-fix PAT to keep the blast radius of any prompt-injection or misbehaviour bounded. PRs that touch `.github/workflows/**` are expected to be resolved locally. Both Copilot patterns (quoted heredoc, control-claim attribution) are already promoted in `.claude/rules/copilot-learned-coding.md`.
Promotes the design rationale that surfaced while resolving Copilot review threads on PR #372: the auto-fix loop in copilot-review-fix.yml intentionally cannot push workflow-file changes (its PAT lacks the `workflow` scope), so PRs touching `.github/workflows/**` must be fixed by a human-driven Claude session. Adds the rule to .github/instructions/agent-orchestration.instructions.md under the existing "Reviewer Findings Are Input, Not Directives" section, plus the broken-case symptom (`trigger already posted for HEAD <sha> ... — skip`) so future debugging starts from the right hypothesis. Regenerated .claude/rules/agents.md via `make sync-instructions`.

Summary
Sync
copilot-clean-label.ymlwith the version battle-tested invuls-saas/vuls-reach.This is the guarantee path for the Copilot → @claude auto-fix loop. The event-driven fast path lives in
copilot-review-fix.yml(PR #371); this cron handles every case the events miss.Why
Two production failures observed on vuls-reach drove these changes:
pull_request_review_commentevents do not always fire when Copilot posts inline review comments. Thecopilot-review-fix.ymlevent-driven path silently misses those reviews and Claude is never invoked. The cron-poll fallback in this PR closes that gap.requestReviewsrefire is also deduped: whencopilot-rereview-on-push.yml's push-event mutation is silently deduped, refiring the same mutation does nothing — GraphQL returns 200 OK with valid data but noreview_requestedevent fires. The remove + re-add pattern (clear viaunion:falsethen re-add) is the only sequence that actually wakes Copilot.Changes
Cron-poll @claude trigger fallback (new)
When the latest Copilot review on HEAD is dirty (has actual inline comments — verified via the per-review comments endpoint to skip summary-only events) AND the marker dedup query shows no prior trigger, post
@claudedirectly from the cron path. Marker contract matchescopilot-review-fix.yml:last line of the body, scoped to kotakanbe-authored comments starting with
@claude. Auto-invalidates on push.GraphQL union:false stuck-detector bypass (new)
When the push-event mutation was deduped, this cron now:
union:false,botIds:[], and the existing user/team IDs — removing any bots (Copilot) while preserving humans + teams.requestReviews(botIds:[<copilot>])call.Safety: if the reviewer-set parse fails, the clear is skipped (otherwise we'd silently remove every human reviewer). Re-add only runs in that case.
Cadence + threshold
*/10→*/5STUCK_THRESHOLD_MIN5→7Effective refire pressure on Copilot is unchanged (~10min between refires) but @claude trigger latency is halved.
Other hardening
head.repo.full_name == github.repositoryguard (skip fork PRs that can't accessGH_ACTIONS_TOKEN).issues: writepermission (required forgh pr commentand label edits via the Issues API).gh pr view --json labelsfetch).2>&1intogh apicapture variables (was poisoning jq parsing on success paths with non-fatal stderr warnings).@claudebody preserved (make sync-instructionsPhase 3 stays, OSS-specific).Test plan
actionlint .github/workflows/copilot-clean-label.ymlclean locally@claudeexactly once when Copilot reviews go through the suppressed-event pathCompanion PRs
copilot-review-fix.yml(the marker contract counterpart)lint.yml(actionlint) — would have caught some of the2>&1/ stderr-poisoning issues that bit us in vuls-reach🤖 Generated with Claude Code