Skip to content

Commit d69a772

Browse files
authored
fix(review): queue reviewers when subagent slots fill (#716)
1 parent 48e83d9 commit d69a772

5 files changed

Lines changed: 25 additions & 11 deletions

File tree

plugins/compound-engineering/AGENTS.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -198,7 +198,7 @@ Design rules for blocking question menus (`AskUserQuestion` / `request_user_inpu
198198
### Cross-Platform Sub-Agent Dispatch
199199

200200
- [ ] When a skill dispatches sub-agents, instruct use of the platform's subagent primitive and name the known equivalents (`Agent`/`Task` in Claude Code, `spawn_agent` in Codex, `subagent` in Pi via the `pi-subagents` extension)
201-
- [ ] Prefer parallel execution but include a sequential fallback for platforms that do not support parallel dispatch
201+
- [ ] Prefer bounded parallel execution: respect platform active-subagent limits, queue overflow work, and treat limit-related spawn errors as backpressure. Include a sequential fallback for platforms that do not support parallel dispatch
202202
- [ ] Prefer sub-agents shipped with this plugin (`ce-*`) over platform built-ins. Built-ins have different names on each target (e.g., Claude Code's `Explore` is `explorer` on Codex via `spawn_agent`'s `agent_type`, `scout` on Pi via `pi-subagents`) — using our own avoids the enumeration tax. Exception: when a built-in offers a meaningful benefit worth keeping, enumerate the per-platform equivalents inline at the call site so the model can route correctly on each target.
203203

204204
### Script Path References in Skills

plugins/compound-engineering/skills/ce-code-review/SKILL.md

Lines changed: 8 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -425,7 +425,9 @@ Omit the `mode` parameter when dispatching sub-agents so the user's configured p
425425

426426
**Model override at dispatch time.** Pass the platform's mid-tier model on every dispatch except `ce-correctness-reviewer`, `ce-security-reviewer`, and `ce-adversarial-reviewer`, which inherit the session model (per the Model tiering subsection above). In Claude Code, add `model: "sonnet"` to the `Agent` tool call. In Codex, pass the equivalent mid-tier on `spawn_agent` (e.g., `gpt-5.4-mini` as of April 2026). In Pi, pass the equivalent on `subagent` via the `pi-subagents` extension. On platforms where the dispatch primitive has no model-override parameter or the available model names are unknown, omit the override — a working review on the parent model beats a broken dispatch on an unrecognized name. Check this on every Agent / `spawn_agent` / `subagent` call in the parallel dispatch; omitting it on Opus sessions silently 3-4x's the cost of a review.
427427

428-
Spawn each selected persona reviewer as a parallel sub-agent using the subagent template included below. Each persona sub-agent receives:
428+
**Bounded parallel dispatch.** Respect the current harness's active-subagent limit. Queue selected reviewers, dispatch only as many as the harness accepts, and fill freed slots as reviewers complete. Treat active-agent/thread/concurrency-limit spawn errors as backpressure, not reviewer failure: leave the reviewer queued and retry after a slot frees. Record a reviewer as failed only after a successful dispatch times out/fails, or when dispatch fails for a non-capacity reason.
429+
430+
Spawn each selected persona reviewer using the subagent template included below. Each persona sub-agent receives:
429431

430432
1. Their persona file content (identity, failure modes, calibration, suppress conditions)
431433
2. Shared diff-scope rules from the diff-scope reference included below
@@ -465,9 +467,9 @@ Each persona sub-agent writes full JSON (all schema fields) to `/tmp/compound-en
465467

466468
Detail-tier fields (`why_it_matters`, `evidence`) are in the artifact file only. `suggested_fix` is optional in both tiers -- included in compact returns when present so the orchestrator has fix context for auto-apply decisions. If the file write fails, the compact return still provides everything the merge needs.
467469

468-
**CE always-on agents** (ce-agent-native-reviewer, ce-learnings-researcher) are dispatched as standard Agent calls in parallel with the persona agents. Give them the same review context bundle the personas receive: entry mode, any PR metadata gathered in Stage 1, intent summary, review base branch name when known, `BASE:` marker, file list, diff, and `UNTRACKED:` scope notes. Do not invoke them with a generic "review this" prompt. Their output is unstructured and synthesized separately in Stage 6.
470+
**CE always-on agents** (ce-agent-native-reviewer, ce-learnings-researcher) are dispatched as standard Agent calls through the same bounded parallel scheduler as the persona agents. Give them the same review context bundle the personas receive: entry mode, any PR metadata gathered in Stage 1, intent summary, review base branch name when known, `BASE:` marker, file list, diff, and `UNTRACKED:` scope notes. Do not invoke them with a generic "review this" prompt. Their output is unstructured and synthesized separately in Stage 6.
469471

470-
**CE conditional agents** (ce-schema-drift-detector, ce-deployment-verification-agent) are also dispatched as standard Agent calls when applicable. Pass the same review context bundle plus the applicability reason (for example, which migration files triggered the agent). For ce-schema-drift-detector specifically, pass the resolved review base branch explicitly so it never assumes `main`. Their output is unstructured and must be preserved for Stage 6 synthesis just like the CE always-on agents.
472+
**CE conditional agents** (ce-schema-drift-detector, ce-deployment-verification-agent) are also dispatched as standard Agent calls through the same bounded parallel scheduler when applicable. Pass the same review context bundle plus the applicability reason (for example, which migration files triggered the agent). For ce-schema-drift-detector specifically, pass the resolved review base branch explicitly so it never assumes `main`. Their output is unstructured and must be preserved for Stage 6 synthesis just like the CE always-on agents.
471473

472474
### Stage 5: Merge findings
473475

@@ -554,7 +556,7 @@ When Stage 5b does not run, the merged finding set from Stage 5 flows through to
554556
- **headless/autofix:** All survivors of Stage 5.
555557
- **interactive File-tickets (option C):** All pending findings regardless of recommended action. Option C externalizes every finding as a ticket, so every finding needs validation.
556558
2. **Apply dispatch budget cap.** If the selected set exceeds 15 findings, validate the highest-severity 15 (P0 first, then P1, then P2, then P3, breaking ties by anchor descending). Drop the remainder and record the over-budget count for the Coverage section. The blunt drop is intentional; a review producing 15+ surviving findings is already in territory where a second wave would not change the user's triage approach.
557-
3. **Spawn validators in parallel.** One sub-agent per finding, dispatched concurrently using the validator template. Each validator receives:
559+
3. **Spawn validators with bounded parallelism.** One sub-agent per finding, dispatched independently using the validator template and the same bounded scheduler from Stage 4. Each validator receives:
558560
- The finding's title, severity, file, line, suggested_fix, original reviewer name, and confidence anchor
559561
- `why_it_matters` when available — loaded from the per-agent artifact file at `/tmp/compound-engineering/ce-code-review/{run_id}/{reviewer_name}.json`; omit when the file is absent or the artifact write failed. The validator proceeds without it, using the diff and cited code directly.
560562
- The full diff
@@ -566,7 +568,7 @@ When Stage 5b does not run, the merged finding set from Stage 5 flows through to
566568
5. **Use mid-tier model for validators.** Same model class (sonnet) the persona reviewers use. Validators are read-only — same constraints as persona reviewers. They may use non-mutating inspection commands (Read, Grep, Glob, git blame, gh).
567569
6. **Record metrics for Coverage.** Total dispatched, validated true count, validated false count (with reasons), failures, and over-budget drops.
568570

569-
**Why per-finding parallel dispatch (not batched):** Independence is the point. A single batched validator looking at all findings together pattern-matches across them and recreates the persona-bias problem. Per-finding parallel dispatch preserves fresh context per call. Per-file batching is a plausible future optimization for reviews with many findings clustered in few files; not implemented today.
571+
**Why per-finding bounded dispatch (not batched):** Independence is the point. A single batched validator looking at all findings together pattern-matches across them and recreates the persona-bias problem. Per-finding dispatch preserves fresh context while the scheduler respects harness limits. Per-file batching is a plausible future optimization for reviews with many findings clustered in few files; not implemented today.
570572

571573
### Stage 6: Synthesize and present
572574

@@ -851,7 +853,7 @@ If "Push fixes": push the branch with `git push` to update the existing PR.
851853

852854
## Fallback
853855

854-
If the platform doesn't support parallel sub-agents, run reviewers sequentially. Everything else (stages, output format, merge pipeline) stays the same.
856+
If the platform doesn't support parallel sub-agents, run reviewers sequentially. If the platform supports sub-agents but caps active concurrency, use the bounded queueing rules in Stage 4 rather than treating cap-related spawn failures as reviewer failures. Everything else (stages, output format, merge pipeline) stays the same.
855857

856858
---
857859

plugins/compound-engineering/skills/ce-doc-review/SKILL.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -121,7 +121,9 @@ Add activated conditional personas:
121121

122122
### Dispatch
123123

124-
Dispatch all agents in **parallel** using the platform's subagent primitive (e.g., `Agent` in Claude Code, `spawn_agent` in Codex, `subagent` in Pi via the `pi-subagents` extension). Omit the `mode` parameter so the user's configured permission settings apply. Each agent receives the prompt built from the subagent template included below with these variables filled:
124+
Dispatch agents using **bounded parallelism** with the platform's subagent primitive (e.g., `Agent` in Claude Code, `spawn_agent` in Codex, `subagent` in Pi via the `pi-subagents` extension). Omit the `mode` parameter so the user's configured permission settings apply. Respect the current harness's active-subagent limit: queue selected reviewers, dispatch only as many as the harness accepts, and fill freed slots as reviewers complete. Treat active-agent/thread/concurrency-limit spawn errors as backpressure, not reviewer failure: leave the reviewer queued and retry after a slot frees. Record a reviewer as failed only after a successful dispatch times out/fails, or when dispatch fails for a non-capacity reason.
125+
126+
Each agent receives the prompt built from the subagent template included below with these variables filled:
125127

126128
| Variable | Value |
127129
|----------|-------|
@@ -173,7 +175,7 @@ Cross-session persistence is out of scope. A new invocation of ce-doc-review on
173175

174176
**Error handling:** If an agent fails or times out, proceed with findings from agents that completed. Note the failed agent in the Coverage section. Do not block the entire review on a single agent failure.
175177

176-
**Dispatch limit:** Even at maximum (7 agents), use parallel dispatch. These are document reviewers with bounded scope reading a single document -- parallel is safe and fast.
178+
**Dispatch limit:** Even at maximum (7 agents), use bounded parallel dispatch. If the harness cap is lower than the selected team size, queue the remainder and launch them as active reviewers complete.
177179

178180
## Phases 3-5: Synthesis, Presentation, and Next Action
179181

tests/pipeline-review-contract.test.ts

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -521,6 +521,10 @@ describe("ce-doc-review contract", () => {
521521
expect(content).toContain("AskUserQuestion")
522522
expect(content).toContain("ToolSearch")
523523
expect(content).toContain("numbered-list fallback")
524+
expect(content).toContain("bounded parallelism")
525+
expect(content).toContain("active-subagent limit")
526+
expect(content).toContain("spawn errors as backpressure, not reviewer failure")
527+
expect(content).toContain("queue the remainder")
524528

525529
// Decision primer variable in the dispatch table
526530
expect(content).toContain("{decision_primer}")

tests/review-skill-contract.test.ts

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -301,6 +301,10 @@ describe("ce-code-review contract", () => {
301301
expect(spawning).toContain("Agent")
302302
expect(spawning).toContain("spawn_agent")
303303
expect(spawning).toContain("subagent")
304+
expect(spawning).toMatch(/Bounded parallel dispatch/)
305+
expect(spawning).toMatch(/active-subagent limit/)
306+
expect(spawning).toMatch(/spawn errors as backpressure, not reviewer failure/)
307+
expect(spawning).toMatch(/fill freed slots/)
304308
// Exceptions are restated at point of action so the agent does not have to recall them
305309
// from the Model tiering subsection above during a 12-agent parallel dispatch.
306310
expect(spawning).toContain("ce-correctness-reviewer")
@@ -354,9 +358,11 @@ describe("ce-code-review contract", () => {
354358
expect(content).toMatch(/best-judgment-the-rest handoff \| No --/)
355359
expect(content).toMatch(/best-judgment path skips Stage 5b deliberately/i)
356360

357-
// Per-finding parallel dispatch (not batched)
358-
expect(content).toMatch(/per.finding parallel dispatch/i)
361+
// Per-finding bounded dispatch (not batched)
362+
expect(content).toMatch(/per.finding bounded dispatch/i)
359363
expect(content).toMatch(/Independence is the point/i)
364+
expect(content).toMatch(/same bounded scheduler from Stage 4/i)
365+
expect(content).toMatch(/active-subagent limit/i)
360366

361367
// Budget cap of 15
362368
expect(content).toMatch(/exceeds 15 findings/i)

0 commit comments

Comments
 (0)