Add agent label automation and docs

kubaflo · kubaflo · commit 8fa1875fa799 · 2026-03-03T13:08:28.000+01:00
Introduce automated agent labeling for PR reviews: add a new shared labeler script (.github/scripts/shared/Update-AgentLabels.ps1) and wire it into Review-PR.ps1 as Phase 4 (Apply Labels). The labeler parses phase content.md files (gate/try-fix/report) to determine outcome, gate and fix signal labels, ensures labels exist, and applies/removes mutually-exclusive outcome/signal labels plus a tracking label (s/agent-reviewed). Add comprehensive docs (.github/docs/agent-labels.md) and update the PR agent SHARED-RULES.md to describe label meanings and expectations. Operations are idempotent and non-fatal; Review-PR.ps1 attempts a targeted recovery if the helper is missing.
diff --git a/.github/agents/pr/SHARED-RULES.md b/.github/agents/pr/SHARED-RULES.md
@@ -119,6 +119,43 @@ EOF
 
 ---
 
+## Agent Labels (Automated by Review-PR.ps1)
+
+After all phases complete, `Review-PR.ps1` automatically applies GitHub labels based on phase outcomes. The agent does NOT need to apply labels — just write accurate `content.md` files.
+
+### Label Categories
+
+**Outcome labels** (mutually exclusive — exactly one per PR):
+| Label | When Applied |
+|-------|-------------|
+| `s/agent-approved` | Report recommends APPROVE |
+| `s/agent-changes-requested` | Report recommends REQUEST CHANGES |
+| `s/agent-review-incomplete` | Agent didn't complete all phases |
+
+**Signal labels** (additive — multiple can coexist):
+| Label | When Applied |
+|-------|-------------|
+| `s/agent-gate-passed` | Gate phase passes |
+| `s/agent-gate-failed` | Gate phase fails |
+| `s/agent-fix-win` | Agent found a better alternative fix than the PR |
+| `s/agent-fix-pr-picked` | PR's fix is the best — agent couldn't beat it |
+
+**Tracking label** (always applied):
+| Label | When Applied |
+|-------|-------------|
+| `s/agent-reviewed` | Every completed agent run |
+
+### How Labels Are Determined
+
+Labels are parsed from `content.md` files:
+- **Outcome**: from `report/content.md` — looks for `Final Recommendation: APPROVE` or `REQUEST CHANGES`
+- **Gate**: from `gate/content.md` — looks for `PASSED` or `FAILED`
+- **Fix**: from `try-fix/content.md` — looks for alternative selected (win = agent beat PR) vs `Selected Fix: PR` (lose = PR was best)
+
+**Agent responsibility**: Write clear, parseable `content.md` with standard markers (`✅ PASSED`, `❌ FAILED`, `Selected Fix: PR`, `Final Recommendation: APPROVE`).
+
+---
+
 ## No Direct Git Commands
 
 **Never run git commands that change branch or file state.**
diff --git a/.github/docs/agent-labels.md b/.github/docs/agent-labels.md
@@ -0,0 +1,171 @@
+# Agent Workflow Labels
+
+GitHub labels for tracking outcomes of the AI agent PR review workflow (`Review-PR.ps1`).
+
+All labels use the **`s/agent-*`** prefix for easy querying on GitHub.
+
+---
+
+## Label Categories
+
+### Outcome Labels
+
+Mutually exclusive — exactly **one** is applied per PR review run.
+
+| Label | Color | Description | Applied When |
+|-------|-------|-------------|--------------|
+| `s/agent-approved` | 🟢 `#2E7D32` | AI agent recommends approval — PR fix is correct and optimal | Report phase recommends APPROVE |
+| `s/agent-changes-requested` | 🟠 `#E65100` | AI agent recommends changes — found a better alternative or issues | Report phase recommends REQUEST CHANGES |
+| `s/agent-review-incomplete` | 🔴 `#B71C1C` | AI agent could not complete all phases (blocker, timeout, error) | Agent exits without completing all phases |
+
+When a new outcome label is applied, any previously applied outcome label is automatically removed.
+
+### Signal Labels
+
+Additive — **multiple** can coexist on a single PR.
+
+| Label | Color | Description | Applied When |
+|-------|-------|-------------|--------------|
+| `s/agent-gate-passed` | 🟢 `#4CAF50` | AI verified tests catch the bug (fail without fix, pass with fix) | Gate phase passes |
+| `s/agent-gate-failed` | 🟠 `#FF9800` | AI could not verify tests catch the bug | Gate phase fails |
+| `s/agent-fix-win` | 🟢 `#66BB6A` | AI found a better alternative fix than the PR | Fix phase: alternative selected over PR's fix |
+| `s/agent-fix-pr-picked` | 🟠 `#FF7043` | AI could not beat the PR fix — PR is the best among all candidates | Fix phase: PR selected as best after comparison |
+
+Gate labels (`gate-passed`/`gate-failed`) are mutually exclusive with each other. Fix labels (`fix-win`/`fix-lose`) are mutually exclusive with each other.
+
+### Tracking Label
+
+Always applied on every completed agent run.
+
+| Label | Color | Description | Applied When |
+|-------|-------|-------------|--------------|
+| `s/agent-reviewed` | 🔵 `#1565C0` | PR was reviewed by AI agent workflow (full 4-phase review) | Every completed agent run |
+
+### Manual Label
+
+Applied by MAUI maintainers, not by automation.
+
+| Label | Color | Description | Applied When |
+|-------|-------|-------------|--------------|
+| `s/agent-fix-implemented` | 🟣 `#7B1FA2` | PR author implemented the agent's suggested fix | Maintainer applies when PR author adopts agent's recommendation |
+
+---
+
+## How It Works
+
+### Architecture
+
+```
+Review-PR.ps1
+├── Phase 1: PR Agent Review (Copilot CLI)
+│   ├── Pre-Flight → writes content.md
+│   ├── Gate       → writes content.md
+│   ├── Fix        → writes content.md
+│   └── Report     → writes content.md
+├── Phase 2: PR Finalize (optional)
+├── Phase 3: Post Comments (optional)
+└── Phase 4: Apply Labels  ← labels are applied here
+    ├── Parse content.md files
+    ├── Determine outcome + signal labels
+    ├── Apply via GitHub REST API
+    └── Non-fatal: errors warn but don't fail the workflow
+```
+
+Labels are applied exclusively from `Review-PR.ps1` Phase 4. No other script applies agent labels. This single-source design avoids label conflicts and simplifies debugging.
+
+### How Labels Are Parsed
+
+The `Parse-PhaseOutcomes` function in `Update-AgentLabels.ps1` reads `content.md` files from each phase directory:
+
+| Source File | What's Parsed | Resulting Label |
+|-------------|---------------|-----------------|
+| `gate/content.md` | `**Result:** ✅ PASSED` | `s/agent-gate-passed` |
+| `gate/content.md` | `**Result:** ❌ FAILED` | `s/agent-gate-failed` |
+| `try-fix/content.md` | `**Selected Fix:** Candidate ...` | `s/agent-fix-win` |
+| `try-fix/content.md` | `**Selected Fix:** PR ...` | `s/agent-fix-pr-picked` |
+| `report/content.md` | `Final Recommendation: APPROVE` | `s/agent-approved` |
+| `report/content.md` | `Final Recommendation: REQUEST CHANGES` | `s/agent-changes-requested` |
+| *(missing report)* | No report file exists | `s/agent-review-incomplete` |
+
+### Self-Bootstrapping
+
+Labels are created automatically on first use via `Ensure-LabelExists`. No manual setup required. If a label already exists but has a stale description or color, it is updated.
+
+---
+
+## Querying Labels
+
+All labels use the `s/agent-*` prefix, making them easy to filter on GitHub.
+
+### Common Queries
+
+```
+# PRs the agent approved
+is:pr label:s/agent-approved
+
+# PRs where agent found a better fix
+is:pr label:s/agent-fix-pr-picked
+
+# PRs where agent found better fix AND author implemented it
+is:pr label:s/agent-changes-requested label:s/agent-fix-implemented
+
+# PRs where tests don't catch the bug
+is:pr label:s/agent-gate-failed
+
+# Agent-reviewed PRs that are still open
+is:pr is:open label:s/agent-reviewed
+
+# All agent-reviewed PRs (total count)
+is:pr label:s/agent-reviewed
+```
+
+### Metrics You Can Derive
+
+| Metric | Query |
+|--------|-------|
+| Total agent reviews | `is:pr label:s/agent-reviewed` |
+| Approval rate | Compare `label:s/agent-approved` vs `label:s/agent-changes-requested` counts |
+| Gate pass rate | Compare `label:s/agent-gate-passed` vs `label:s/agent-gate-failed` counts |
+| Fix win rate | Compare `label:s/agent-fix-win` vs `label:s/agent-fix-pr-picked` counts |
+| Agent adoption rate | `label:s/agent-fix-implemented` / `label:s/agent-changes-requested` |
+| Incomplete review rate | `label:s/agent-review-incomplete` / `label:s/agent-reviewed` |
+
+---
+
+## Implementation Details
+
+### Files
+
+| File | Purpose |
+|------|---------|
+| `.github/scripts/shared/Update-AgentLabels.ps1` | Label helper module (all label logic) |
+| `.github/scripts/Review-PR.ps1` | Orchestrator that calls `Apply-AgentLabels` in Phase 4 |
+| `.github/agents/pr/SHARED-RULES.md` | Documents label system for the PR agent |
+
+### Key Functions
+
+| Function | Description |
+|----------|-------------|
+| `Apply-AgentLabels` | Main entry point — parses phases and applies all labels |
+| `Parse-PhaseOutcomes` | Reads `content.md` files, returns outcome/gate/fix results |
+| `Update-AgentOutcomeLabel` | Applies one outcome label, removes conflicting ones |
+| `Update-AgentSignalLabels` | Adds/removes gate and fix signal labels |
+| `Update-AgentReviewedLabel` | Ensures tracking label is present |
+| `Ensure-LabelExists` | Creates or updates a label in the repository |
+
+### Design Principles
+
+- **Idempotent**: Safe to re-run — checks before add/remove, GitHub ignores duplicate adds
+- **Non-fatal**: Label failures emit warnings but never fail the overall workflow
+- **Single source**: All labels applied from `Review-PR.ps1` only — no other scripts touch labels
+- **Self-bootstrapping**: Labels are created on first use via GitHub API
+- **Mutual exclusivity enforced**: Outcome labels and same-category signal labels automatically remove their counterpart
+
+---
+
+## Migrated From
+
+The following old infrastructure was removed as part of this implementation:
+
+- **`Update-VerificationLabels`** function in `verify-tests-fail.ps1` — removed (labels now come from `Review-PR.ps1` only)
+- **`s/ai-reproduction-confirmed`** / **`s/ai-reproduction-failed`** labels — superseded by `s/agent-gate-passed` / `s/agent-gate-failed`
diff --git a/.github/scripts/Review-PR.ps1 b/.github/scripts/Review-PR.ps1
@@ -531,6 +531,30 @@ if ($DryRun) {
             }
         }
 
+        # Phase 4: Apply Labels
+        Write-Host ""
+        Write-Host "╔═══════════════════════════════════════════════════════════╗" -ForegroundColor Blue
+        Write-Host "║  PHASE 4: APPLY LABELS                                    ║" -ForegroundColor Blue
+        Write-Host "╚═══════════════════════════════════════════════════════════╝" -ForegroundColor Blue
+        Write-Host ""
+
+        $labelHelperPath = Join-Path $RepoRoot ".github/scripts/shared/Update-AgentLabels.ps1"
+        if (-not (Test-Path $labelHelperPath)) {
+            Write-Host "⚠️ Label helper missing, attempting targeted recovery..." -ForegroundColor Yellow
+            git checkout $savedHead -- $labelHelperPath 2>&1 | Out-Null
+        }
+
+        if (Test-Path $labelHelperPath) {
+            try {
+                . $labelHelperPath
+                Apply-AgentLabels -PRNumber $PRNumber -RepoRoot $RepoRoot
+            }
+            catch {
+                Write-Host "⚠️ Label application failed (non-fatal): $_" -ForegroundColor Yellow
+            }
+        } else {
+            Write-Host "⚠️ Label helper not found at: $labelHelperPath — skipping labels" -ForegroundColor Yellow
+        }
     }
 }
 
diff --git a/.github/scripts/shared/Update-AgentLabels.ps1 b/.github/scripts/shared/Update-AgentLabels.ps1