| 1 | +# Agent Workflow Labels |
| 2 | + |
| 3 | +GitHub labels for tracking outcomes of the AI agent PR review workflow (`Review-PR.ps1`). |
| 4 | + |
| 5 | +All labels use the **`s/agent-*`** prefix for easy querying on GitHub. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Label Categories |
| 10 | + |
| 11 | +### Outcome Labels |
| 12 | + |
| 13 | +Mutually exclusive — exactly **one** is applied per PR review run. |
| 14 | + |
| 15 | +| Label | Color | Description | Applied When | |
| 16 | +|-------|-------|-------------|--------------| |
| 17 | +| `s/agent-approved` | 🟢 `#2E7D32` | AI agent recommends approval — PR fix is correct and optimal | Report phase recommends APPROVE | |
| 18 | +| `s/agent-changes-requested` | 🟠 `#E65100` | AI agent recommends changes — found a better alternative or issues | Report phase recommends REQUEST CHANGES | |
| 19 | +| `s/agent-review-incomplete` | 🔴 `#B71C1C` | AI agent could not complete all phases (blocker, timeout, error) | Agent exits without completing all phases | |
| 20 | + |
| 21 | +When a new outcome label is applied, any previously applied outcome label is automatically removed. |
| 22 | + |
| 23 | +### Signal Labels |
| 24 | + |
| 25 | +Additive — **multiple** can coexist on a single PR. |
| 26 | + |
| 27 | +| Label | Color | Description | Applied When | |
| 28 | +|-------|-------|-------------|--------------| |
| 29 | +| `s/agent-gate-passed` | 🟢 `#4CAF50` | AI verified tests catch the bug (fail without fix, pass with fix) | Gate phase passes | |
| 30 | +| `s/agent-gate-failed` | 🟠 `#FF9800` | AI could not verify tests catch the bug | Gate phase fails | |
| 31 | +| `s/agent-fix-win` | 🟢 `#66BB6A` | AI found a better alternative fix than the PR | Fix phase: alternative selected over PR's fix | |
| 32 | +| `s/agent-fix-pr-picked` | 🟠 `#FF7043` | AI could not beat the PR fix — PR is the best among all candidates | Fix phase: PR selected as best after comparison | |
| 33 | + |
| 34 | +Gate labels (`gate-passed`/`gate-failed`) are mutually exclusive with each other. Fix labels (`fix-win`/`fix-pr-picked`) are mutually exclusive with each other. |
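
The exclusivity rules above can be sketched as a small planning step. This is an illustrative Python sketch, not the PowerShell implementation; the function name `plan_label_changes` and the `EXCLUSIVE_GROUPS` constant are hypothetical, and only the label names come from the tables above.

```python
from __future__ import annotations

# Groups of labels that must never coexist on one PR (taken from the tables above).
EXCLUSIVE_GROUPS = [
    {"s/agent-approved", "s/agent-changes-requested", "s/agent-review-incomplete"},
    {"s/agent-gate-passed", "s/agent-gate-failed"},
    {"s/agent-fix-win", "s/agent-fix-pr-picked"},
]


def plan_label_changes(current: set[str], new_label: str) -> tuple[set[str], set[str]]:
    """Return (labels_to_add, labels_to_remove) for applying new_label.

    Hypothetical helper: it only plans the change and makes no API calls.
    """
    # Skip the add when the label is already present, so re-runs are no-ops.
    to_add = set() if new_label in current else {new_label}
    to_remove: set[str] = set()
    for group in EXCLUSIVE_GROUPS:
        if new_label in group:
            # Remove every other member of the same group that is currently applied.
            to_remove |= (current & group) - {new_label}
    return to_add, to_remove
```

Separating the plan from the API calls keeps re-runs idempotent: applying a label that is already present yields two empty sets.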
| 35 | + |
| 36 | +### Tracking Label |
| 37 | + |
| 38 | +Applied on every completed agent run. |
| 39 | + |
| 40 | +| Label | Color | Description | Applied When | |
| 41 | +|-------|-------|-------------|--------------| |
| 42 | +| `s/agent-reviewed` | 🔵 `#1565C0` | PR was reviewed by AI agent workflow (full 4-phase review) | Every completed agent run | |
| 43 | + |
| 44 | +### Manual Label |
| 45 | + |
| 46 | +Applied by MAUI maintainers, not by automation. |
| 47 | + |
| 48 | +| Label | Color | Description | Applied When | |
| 49 | +|-------|-------|-------------|--------------| |
| 50 | +| `s/agent-fix-implemented` | 🟣 `#7B1FA2` | PR author implemented the agent's suggested fix | Maintainer applies when PR author adopts agent's recommendation | |
| 51 | + |
| 52 | +--- |
| 53 | + |
| 54 | +## How It Works |
| 55 | + |
| 56 | +### Architecture |
| 57 | + |
| 58 | +``` |
| 59 | +Review-PR.ps1 |
| 60 | +├── Phase 1: PR Agent Review (Copilot CLI) |
| 61 | +│   ├── Pre-Flight → writes content.md |
| 62 | +│   ├── Gate → writes content.md |
| 63 | +│   ├── Fix → writes content.md |
| 64 | +│   └── Report → writes content.md |
| 65 | +├── Phase 2: PR Finalize (optional) |
| 66 | +├── Phase 3: Post Comments (optional) |
| 67 | +└── Phase 4: Apply Labels ← labels are applied here |
| 68 | +    ├── Parse content.md files |
| 69 | +    ├── Determine outcome + signal labels |
| 70 | +    ├── Apply via GitHub REST API |
| 71 | +    └── Non-fatal: errors warn but don't fail the workflow |
| 72 | +``` |
| 73 | + |
| 74 | +Labels are applied exclusively from `Review-PR.ps1` Phase 4. No other script applies agent labels. This single-source design avoids label conflicts and simplifies debugging. |
| 75 | + |
| 76 | +### How Labels Are Parsed |
| 77 | + |
| 78 | +The `Parse-PhaseOutcomes` function in `Update-AgentLabels.ps1` reads `content.md` files from each phase directory: |
| 79 | + |
| 80 | +| Source File | What's Parsed | Resulting Label | |
| 81 | +|-------------|---------------|-----------------| |
| 82 | +| `gate/content.md` | `**Result:** ✅ PASSED` | `s/agent-gate-passed` | |
| 83 | +| `gate/content.md` | `**Result:** ❌ FAILED` | `s/agent-gate-failed` | |
| 84 | +| `try-fix/content.md` | `**Selected Fix:** Candidate ...` | `s/agent-fix-win` | |
| 85 | +| `try-fix/content.md` | `**Selected Fix:** PR ...` | `s/agent-fix-pr-picked` | |
| 86 | +| `report/content.md` | `Final Recommendation: APPROVE` | `s/agent-approved` | |
| 87 | +| `report/content.md` | `Final Recommendation: REQUEST CHANGES` | `s/agent-changes-requested` | |
| 88 | +| *(missing report)* | No report file exists | `s/agent-review-incomplete` | |
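
The mapping in the table above can be sketched as pattern matching over the three files' contents. This is an illustrative Python sketch under stated assumptions, not the actual `Parse-PhaseOutcomes` code: the names `labels_for_run` and `first_match` are hypothetical, and treating a report with no recognizable recommendation as incomplete is an assumption the table does not confirm.

```python
from __future__ import annotations

import re
from typing import Optional

# (pattern, label) pairs mirroring the table rows; the regexes are an assumption
# about how strictly the markers are matched.
GATE_RULES = [
    (re.compile(r"\*\*Result:\*\*\s*✅\s*PASSED"), "s/agent-gate-passed"),
    (re.compile(r"\*\*Result:\*\*\s*❌\s*FAILED"), "s/agent-gate-failed"),
]
FIX_RULES = [
    (re.compile(r"\*\*Selected Fix:\*\*\s*Candidate"), "s/agent-fix-win"),
    (re.compile(r"\*\*Selected Fix:\*\*\s*PR"), "s/agent-fix-pr-picked"),
]
REPORT_RULES = [
    (re.compile(r"Final Recommendation:\s*APPROVE"), "s/agent-approved"),
    (re.compile(r"Final Recommendation:\s*REQUEST CHANGES"), "s/agent-changes-requested"),
]


def first_match(text: Optional[str], rules) -> Optional[str]:
    """Return the label of the first rule whose pattern occurs in text, else None."""
    if text is None:  # the file does not exist
        return None
    for pattern, label in rules:
        if pattern.search(text):
            return label
    return None


def labels_for_run(gate: Optional[str], fix: Optional[str], report: Optional[str]) -> list[str]:
    """Map the contents of gate/, try-fix/ and report/ content.md to labels."""
    labels = [
        label
        for label in (first_match(gate, GATE_RULES), first_match(fix, FIX_RULES))
        if label is not None
    ]
    # Assumption: a missing or unparseable report yields the incomplete outcome.
    labels.append(first_match(report, REPORT_RULES) or "s/agent-review-incomplete")
    return labels
```

Each argument is the text of one `content.md` file, or `None` when that file is absent, so the function can be exercised without touching the file system.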
| 89 | + |
| 90 | +### Self-Bootstrapping |
| 91 | + |
| 92 | +Labels are created automatically on first use via `Ensure-LabelExists`. No manual setup required. If a label already exists but has a stale description or color, it is updated. |
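
The create-or-update decision described above can be sketched as follows. This is an illustrative Python sketch, not the `Ensure-LabelExists` implementation; the function name `plan_label_sync` and the dictionary shape (`color`, `description` keys) are assumptions chosen for the example.

```python
from __future__ import annotations

from typing import Optional


def plan_label_sync(existing: Optional[dict], desired: dict) -> str:
    """Decide whether a label must be created, updated, or left alone.

    existing: the label as currently stored in the repository, or None if absent.
    desired:  the label definition from this document (color and description).
    """
    if existing is None:
        return "create"
    # Compare colors without a leading '#' and case-insensitively (assumption:
    # the stored color may differ from the documented one only in those respects).
    same_color = existing["color"].lstrip("#").lower() == desired["color"].lstrip("#").lower()
    same_description = (existing.get("description") or "") == desired["description"]
    return "noop" if same_color and same_description else "update"
```

Returning a plan rather than performing the call makes the three cases (absent, stale, current) easy to check in isolation.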
| 93 | + |
| 94 | +--- |
| 95 | + |
| 96 | +## Querying Labels |
| 97 | + |
| 98 | +All labels use the `s/agent-*` prefix, making them easy to filter on GitHub. |
| 99 | + |
| 100 | +### Common Queries |
| 101 | + |
| 102 | +``` |
| 103 | +# PRs the agent approved |
| 104 | +is:pr label:s/agent-approved |
| 105 | + |
| 106 | +# PRs where agent found a better fix |
| 107 | +is:pr label:s/agent-fix-win |
| 108 | + |
| 109 | +# PRs where agent requested changes AND author implemented the suggested fix |
| 110 | +is:pr label:s/agent-changes-requested label:s/agent-fix-implemented |
| 111 | + |
| 112 | +# PRs where tests don't catch the bug |
| 113 | +is:pr label:s/agent-gate-failed |
| 114 | + |
| 115 | +# Agent-reviewed PRs that are still open |
| 116 | +is:pr is:open label:s/agent-reviewed |
| 117 | + |
| 118 | +# All agent-reviewed PRs (total count) |
| 119 | +is:pr label:s/agent-reviewed |
| 120 | +``` |
| 121 | + |
| 122 | +### Metrics You Can Derive |
| 123 | + |
| 124 | +| Metric | Query | |
| 125 | +|--------|-------| |
| 126 | +| Total agent reviews | `is:pr label:s/agent-reviewed` | |
| 127 | +| Approval rate | Compare `label:s/agent-approved` vs `label:s/agent-changes-requested` counts | |
| 128 | +| Gate pass rate | Compare `label:s/agent-gate-passed` vs `label:s/agent-gate-failed` counts | |
| 129 | +| Fix win rate | Compare `label:s/agent-fix-win` vs `label:s/agent-fix-pr-picked` counts | |
| 130 | +| Agent adoption rate | `label:s/agent-fix-implemented` / `label:s/agent-changes-requested` | |
| 131 | +| Incomplete review rate | `label:s/agent-review-incomplete` / `label:s/agent-reviewed` | |
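
Once the per-label counts are known, the rates in the table above reduce to simple ratios. This is an illustrative Python sketch; the function names and the `counts` dictionary are hypothetical, and reading each "compare X vs Y" row as X / (X + Y) is an assumption about what the table intends.

```python
from __future__ import annotations

from typing import Optional


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def review_metrics(counts: dict[str, int]) -> dict[str, Optional[float]]:
    """Derive the table's rates from a mapping of label name to PR count."""

    def n(suffix: str) -> int:
        # Labels with no matching PRs are treated as a count of zero.
        return counts.get(f"s/agent-{suffix}", 0)

    return {
        "approval_rate": ratio(n("approved"), n("approved") + n("changes-requested")),
        "gate_pass_rate": ratio(n("gate-passed"), n("gate-passed") + n("gate-failed")),
        "fix_win_rate": ratio(n("fix-win"), n("fix-win") + n("fix-pr-picked")),
        "adoption_rate": ratio(n("fix-implemented"), n("changes-requested")),
        "incomplete_rate": ratio(n("review-incomplete"), n("reviewed")),
    }
```

The counts would come from the result totals of the queries listed above; a rate is `None` whenever its denominator is zero, which avoids reporting a misleading 0%.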
| 132 | + |
| 133 | +--- |
| 134 | + |
| 135 | +## Implementation Details |
| 136 | + |
| 137 | +### Files |
| 138 | + |
| 139 | +| File | Purpose | |
| 140 | +|------|---------| |
| 141 | +| `.github/scripts/shared/Update-AgentLabels.ps1` | Label helper module (all label logic) | |
| 142 | +| `.github/scripts/Review-PR.ps1` | Orchestrator that calls `Apply-AgentLabels` in Phase 4 | |
| 143 | +| `.github/agents/pr/SHARED-RULES.md` | Documents label system for the PR agent | |
| 144 | + |
| 145 | +### Key Functions |
| 146 | + |
| 147 | +| Function | Description | |
| 148 | +|----------|-------------| |
| 149 | +| `Apply-AgentLabels` | Main entry point — parses phases and applies all labels | |
| 150 | +| `Parse-PhaseOutcomes` | Reads `content.md` files, returns outcome/gate/fix results | |
| 151 | +| `Update-AgentOutcomeLabel` | Applies one outcome label, removes conflicting ones | |
| 152 | +| `Update-AgentSignalLabels` | Adds/removes gate and fix signal labels | |
| 153 | +| `Update-AgentReviewedLabel` | Ensures tracking label is present | |
| 154 | +| `Ensure-LabelExists` | Creates or updates a label in the repository | |
| 155 | + |
| 156 | +### Design Principles |
| 157 | + |
| 158 | +- **Idempotent**: Safe to re-run — checks before add/remove, GitHub ignores duplicate adds |
| 159 | +- **Non-fatal**: Label failures emit warnings but never fail the overall workflow |
| 160 | +- **Single source**: All labels applied from `Review-PR.ps1` only — no other scripts touch labels |
| 161 | +- **Self-bootstrapping**: Labels are created on first use via GitHub API |
| 162 | +- **Mutual exclusivity enforced**: Outcome labels and same-category signal labels automatically remove their counterpart |
| 163 | + |
| 164 | +--- |
| 165 | + |
| 166 | +## Migrated From |
| 167 | + |
| 168 | +The following old infrastructure was removed as part of this implementation: |
| 169 | + |
| 170 | +- **`Update-VerificationLabels`** function in `verify-tests-fail.ps1` — removed (labels now come from `Review-PR.ps1` only) |
| 171 | +- **`s/ai-reproduction-confirmed`** / **`s/ai-reproduction-failed`** labels — superseded by `s/agent-gate-passed` / `s/agent-gate-failed` |