Skip to content

Commit 8fa1875

Browse files
committed
Add agent label automation and docs
Introduce automated agent labeling for PR reviews: add a new shared labeler script (.github/scripts/shared/Update-AgentLabels.ps1) and wire it into Review-PR.ps1 as Phase 4 (Apply Labels). The labeler parses phase content.md files (gate/try-fix/report) to determine outcome, gate and fix signal labels, ensures labels exist, and applies/removes mutually-exclusive outcome/signal labels plus a tracking label (s/agent-reviewed). Add comprehensive docs (.github/docs/agent-labels.md) and update the PR agent SHARED-RULES.md to describe label meanings and expectations. Operations are idempotent and non-fatal; Review-PR.ps1 attempts a targeted recovery if the helper is missing.
1 parent 4384c00 commit 8fa1875

4 files changed

Lines changed: 692 additions & 0 deletions

File tree

.github/agents/pr/SHARED-RULES.md

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -119,6 +119,43 @@ EOF
119119

120120
---
121121

122+
## Agent Labels (Automated by Review-PR.ps1)
123+
124+
After all phases complete, `Review-PR.ps1` automatically applies GitHub labels based on phase outcomes. The agent does NOT need to apply labels — just write accurate `content.md` files.
125+
126+
### Label Categories
127+
128+
**Outcome labels** (mutually exclusive — exactly one per PR):
129+
| Label | When Applied |
130+
|-------|-------------|
131+
| `s/agent-approved` | Report recommends APPROVE |
132+
| `s/agent-changes-requested` | Report recommends REQUEST CHANGES |
133+
| `s/agent-review-incomplete` | Agent didn't complete all phases |
134+
135+
**Signal labels** (additive — multiple can coexist):
136+
| Label | When Applied |
137+
|-------|-------------|
138+
| `s/agent-gate-passed` | Gate phase passes |
139+
| `s/agent-gate-failed` | Gate phase fails |
140+
| `s/agent-fix-win` | Agent found a better alternative fix than the PR |
141+
| `s/agent-fix-pr-picked` | PR's fix is the best — agent couldn't beat it |
142+
143+
**Tracking label** (always applied):
144+
| Label | When Applied |
145+
|-------|-------------|
146+
| `s/agent-reviewed` | Every completed agent run |
147+
148+
### How Labels Are Determined
149+
150+
Labels are parsed from `content.md` files:
151+
- **Outcome**: from `report/content.md` — looks for `Final Recommendation: APPROVE` or `REQUEST CHANGES`
152+
- **Gate**: from `gate/content.md` — looks for `PASSED` or `FAILED`
153+
- **Fix**: from `try-fix/content.md` — looks for alternative selected (win = agent beat PR) vs `Selected Fix: PR` (lose = PR was best)
154+
155+
**Agent responsibility**: Write clear, parseable `content.md` with standard markers (`✅ PASSED`, `❌ FAILED`, `Selected Fix: PR`, `Final Recommendation: APPROVE`).
156+
157+
---
158+
122159
## No Direct Git Commands
123160

124161
**Never run git commands that change branch or file state.**

.github/docs/agent-labels.md

Lines changed: 171 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,171 @@
1+
# Agent Workflow Labels
2+
3+
GitHub labels for tracking outcomes of the AI agent PR review workflow (`Review-PR.ps1`).
4+
5+
All labels use the **`s/agent-*`** prefix for easy querying on GitHub.
6+
7+
---
8+
9+
## Label Categories
10+
11+
### Outcome Labels
12+
13+
Mutually exclusive — exactly **one** is applied per PR review run.
14+
15+
| Label | Color | Description | Applied When |
16+
|-------|-------|-------------|--------------|
17+
| `s/agent-approved` | 🟢 `#2E7D32` | AI agent recommends approval — PR fix is correct and optimal | Report phase recommends APPROVE |
18+
| `s/agent-changes-requested` | 🟠 `#E65100` | AI agent recommends changes — found a better alternative or issues | Report phase recommends REQUEST CHANGES |
19+
| `s/agent-review-incomplete` | 🔴 `#B71C1C` | AI agent could not complete all phases (blocker, timeout, error) | Agent exits without completing all phases |
20+
21+
When a new outcome label is applied, any previously applied outcome label is automatically removed.
22+
23+
### Signal Labels
24+
25+
Additive — **multiple** can coexist on a single PR.
26+
27+
| Label | Color | Description | Applied When |
28+
|-------|-------|-------------|--------------|
29+
| `s/agent-gate-passed` | 🟢 `#4CAF50` | AI verified tests catch the bug (fail without fix, pass with fix) | Gate phase passes |
30+
| `s/agent-gate-failed` | 🟠 `#FF9800` | AI could not verify tests catch the bug | Gate phase fails |
31+
| `s/agent-fix-win` | 🟢 `#66BB6A` | AI found a better alternative fix than the PR | Fix phase: alternative selected over PR's fix |
32+
| `s/agent-fix-pr-picked` | 🟠 `#FF7043` | AI could not beat the PR fix — PR is the best among all candidates | Fix phase: PR selected as best after comparison |
33+
34+
Gate labels (`gate-passed`/`gate-failed`) are mutually exclusive with each other. Fix labels (`fix-win`/`fix-lose`) are mutually exclusive with each other.
35+
36+
### Tracking Label
37+
38+
Always applied on every completed agent run.
39+
40+
| Label | Color | Description | Applied When |
41+
|-------|-------|-------------|--------------|
42+
| `s/agent-reviewed` | 🔵 `#1565C0` | PR was reviewed by AI agent workflow (full 4-phase review) | Every completed agent run |
43+
44+
### Manual Label
45+
46+
Applied by MAUI maintainers, not by automation.
47+
48+
| Label | Color | Description | Applied When |
49+
|-------|-------|-------------|--------------|
50+
| `s/agent-fix-implemented` | 🟣 `#7B1FA2` | PR author implemented the agent's suggested fix | Maintainer applies when PR author adopts agent's recommendation |
51+
52+
---
53+
54+
## How It Works
55+
56+
### Architecture
57+
58+
```
59+
Review-PR.ps1
60+
├── Phase 1: PR Agent Review (Copilot CLI)
61+
│ ├── Pre-Flight → writes content.md
62+
│ ├── Gate → writes content.md
63+
│ ├── Fix → writes content.md
64+
│ └── Report → writes content.md
65+
├── Phase 2: PR Finalize (optional)
66+
├── Phase 3: Post Comments (optional)
67+
└── Phase 4: Apply Labels ← labels are applied here
68+
├── Parse content.md files
69+
├── Determine outcome + signal labels
70+
├── Apply via GitHub REST API
71+
└── Non-fatal: errors warn but don't fail the workflow
72+
```
73+
74+
Labels are applied exclusively from `Review-PR.ps1` Phase 4. No other script applies agent labels. This single-source design avoids label conflicts and simplifies debugging.
75+
76+
### How Labels Are Parsed
77+
78+
The `Parse-PhaseOutcomes` function in `Update-AgentLabels.ps1` reads `content.md` files from each phase directory:
79+
80+
| Source File | What's Parsed | Resulting Label |
81+
|-------------|---------------|-----------------|
82+
| `gate/content.md` | `**Result:** ✅ PASSED` | `s/agent-gate-passed` |
83+
| `gate/content.md` | `**Result:** ❌ FAILED` | `s/agent-gate-failed` |
84+
| `try-fix/content.md` | `**Selected Fix:** Candidate ...` | `s/agent-fix-win` |
85+
| `try-fix/content.md` | `**Selected Fix:** PR ...` | `s/agent-fix-pr-picked` |
86+
| `report/content.md` | `Final Recommendation: APPROVE` | `s/agent-approved` |
87+
| `report/content.md` | `Final Recommendation: REQUEST CHANGES` | `s/agent-changes-requested` |
88+
| *(missing report)* | No report file exists | `s/agent-review-incomplete` |
89+
90+
### Self-Bootstrapping
91+
92+
Labels are created automatically on first use via `Ensure-LabelExists`. No manual setup required. If a label already exists but has a stale description or color, it is updated.
93+
94+
---
95+
96+
## Querying Labels
97+
98+
All labels use the `s/agent-*` prefix, making them easy to filter on GitHub.
99+
100+
### Common Queries
101+
102+
```
103+
# PRs the agent approved
104+
is:pr label:s/agent-approved
105+
106+
# PRs where agent found a better fix
107+
is:pr label:s/agent-fix-pr-picked
108+
109+
# PRs where agent found better fix AND author implemented it
110+
is:pr label:s/agent-changes-requested label:s/agent-fix-implemented
111+
112+
# PRs where tests don't catch the bug
113+
is:pr label:s/agent-gate-failed
114+
115+
# Agent-reviewed PRs that are still open
116+
is:pr is:open label:s/agent-reviewed
117+
118+
# All agent-reviewed PRs (total count)
119+
is:pr label:s/agent-reviewed
120+
```
121+
122+
### Metrics You Can Derive
123+
124+
| Metric | Query |
125+
|--------|-------|
126+
| Total agent reviews | `is:pr label:s/agent-reviewed` |
127+
| Approval rate | Compare `label:s/agent-approved` vs `label:s/agent-changes-requested` counts |
128+
| Gate pass rate | Compare `label:s/agent-gate-passed` vs `label:s/agent-gate-failed` counts |
129+
| Fix win rate | Compare `label:s/agent-fix-win` vs `label:s/agent-fix-pr-picked` counts |
130+
| Agent adoption rate | `label:s/agent-fix-implemented` / `label:s/agent-changes-requested` |
131+
| Incomplete review rate | `label:s/agent-review-incomplete` / `label:s/agent-reviewed` |
132+
133+
---
134+
135+
## Implementation Details
136+
137+
### Files
138+
139+
| File | Purpose |
140+
|------|---------|
141+
| `.github/scripts/shared/Update-AgentLabels.ps1` | Label helper module (all label logic) |
142+
| `.github/scripts/Review-PR.ps1` | Orchestrator that calls `Apply-AgentLabels` in Phase 4 |
143+
| `.github/agents/pr/SHARED-RULES.md` | Documents label system for the PR agent |
144+
145+
### Key Functions
146+
147+
| Function | Description |
148+
|----------|-------------|
149+
| `Apply-AgentLabels` | Main entry point — parses phases and applies all labels |
150+
| `Parse-PhaseOutcomes` | Reads `content.md` files, returns outcome/gate/fix results |
151+
| `Update-AgentOutcomeLabel` | Applies one outcome label, removes conflicting ones |
152+
| `Update-AgentSignalLabels` | Adds/removes gate and fix signal labels |
153+
| `Update-AgentReviewedLabel` | Ensures tracking label is present |
154+
| `Ensure-LabelExists` | Creates or updates a label in the repository |
155+
156+
### Design Principles
157+
158+
- **Idempotent**: Safe to re-run — checks before add/remove, GitHub ignores duplicate adds
159+
- **Non-fatal**: Label failures emit warnings but never fail the overall workflow
160+
- **Single source**: All labels applied from `Review-PR.ps1` only — no other scripts touch labels
161+
- **Self-bootstrapping**: Labels are created on first use via GitHub API
162+
- **Mutual exclusivity enforced**: Outcome labels and same-category signal labels automatically remove their counterpart
163+
164+
---
165+
166+
## Migrated From
167+
168+
The following old infrastructure was removed as part of this implementation:
169+
170+
- **`Update-VerificationLabels`** function in `verify-tests-fail.ps1` — removed (labels now come from `Review-PR.ps1` only)
171+
- **`s/ai-reproduction-confirmed`** / **`s/ai-reproduction-failed`** labels — superseded by `s/agent-gate-passed` / `s/agent-gate-failed`

.github/scripts/Review-PR.ps1

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -531,6 +531,30 @@ if ($DryRun) {
531531
}
532532
}
533533

534+
# Phase 4: Apply Labels
535+
Write-Host ""
536+
Write-Host "╔═══════════════════════════════════════════════════════════╗" -ForegroundColor Blue
537+
Write-Host "║ PHASE 4: APPLY LABELS ║" -ForegroundColor Blue
538+
Write-Host "╚═══════════════════════════════════════════════════════════╝" -ForegroundColor Blue
539+
Write-Host ""
540+
541+
$labelHelperPath = Join-Path $RepoRoot ".github/scripts/shared/Update-AgentLabels.ps1"
542+
if (-not (Test-Path $labelHelperPath)) {
543+
Write-Host "⚠️ Label helper missing, attempting targeted recovery..." -ForegroundColor Yellow
544+
git checkout $savedHead -- $labelHelperPath 2>&1 | Out-Null
545+
}
546+
547+
if (Test-Path $labelHelperPath) {
548+
try {
549+
. $labelHelperPath
550+
Apply-AgentLabels -PRNumber $PRNumber -RepoRoot $RepoRoot
551+
}
552+
catch {
553+
Write-Host "⚠️ Label application failed (non-fatal): $_" -ForegroundColor Yellow
554+
}
555+
} else {
556+
Write-Host "⚠️ Label helper not found at: $labelHelperPath — skipping labels" -ForegroundColor Yellow
557+
}
534558
}
535559
}
536560

0 commit comments

Comments
 (0)