feat(subagents): add proactive prompt guidance

narumiruna · narumiruna · commit c514802bc72e · 2026-05-17T20:42:03.000+08:00
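
The eval note added in this commit checks the six-prompt matrix against the static `promptGuidelines` by reading them. A minimal sketch of how that same check could be scripted follows. Only the guideline strings are copied from this diff; `MatrixRow`, `supports`, `evaluate`, and the evidence phrases chosen for each row are hypothetical and not part of pi-subagents. It confirms only that each expected decision is backed by an explicit guideline phrase with the right polarity. It does not predict what a model will actually do.

```typescript
// Hypothetical static check; nothing here is part of the pi-subagents API.
type Expected = "use" | "avoid" | "conditional";

interface MatrixRow {
  id: number;
  prompt: string;
  expected: Expected;
  evidence: string[]; // phrases that must appear in a guideline of matching polarity
}

// Guideline strings copied from the promptGuidelines added in this commit.
const promptGuidelines: readonly string[] = [
  "Use subagent for independent read-only research, broad codebase reconnaissance, high-volume command output, multi-domain parallel investigation, or an independent reviewer after implementation.",
  "Use subagent parallel mode when work splits into independent tasks; prefer read-only agents such as scout or reviewer for fan-out and serialize write-heavy implementation that touches the same files.",
  "Do not use subagent for simple answers, quick targeted edits, latency-sensitive one-step work, or tasks requiring frequent user back-and-forth.",
  'Do not use subagent with project-local agents unless the user explicitly wants project agents or sets agentScope to "project" or "both"; keep confirmation enabled for untrusted repositories.',
];

// Prompts come from the eval matrix; the evidence phrases are this sketch's assumption.
const matrix: readonly MatrixRow[] = [
  { id: 1, prompt: "Audit this branch for release blockers before I merge.", expected: "use",
    evidence: ["broad codebase reconnaissance", "independent reviewer"] },
  { id: 2, prompt: "Research auth, database, and API modules in parallel.", expected: "use",
    evidence: ["multi-domain parallel investigation", "parallel mode"] },
  { id: 3, prompt: "Implement the change, then independently verify it.", expected: "use",
    evidence: ["independent reviewer after implementation", "serialize write-heavy implementation"] },
  { id: 4, prompt: "Explain this README sentence in plain language.", expected: "avoid",
    evidence: ["simple answers", "frequent user back-and-forth"] },
  { id: 5, prompt: "Rename `foo` to `bar` in one file.", expected: "avoid",
    evidence: ["quick targeted edits", "latency-sensitive one-step work"] },
  { id: 6, prompt: "Use project agents to review this repo.", expected: "conditional",
    evidence: ["project-local agents", "keep confirmation enabled"] },
];

// A guideline supports a phrase only if it contains it and has the expected polarity.
function supports(guideline: string, expected: Expected, phrase: string): boolean {
  if (!guideline.includes(phrase)) return false;
  const negative = guideline.startsWith("Do not use subagent");
  if (expected === "use") return !negative;
  if (expected === "avoid") return negative;
  return negative && guideline.includes("unless"); // conditional: a "do not ... unless" rule
}

function evaluate(
  rows: readonly MatrixRow[],
  guidelines: readonly string[],
): { id: number; result: "PASS" | "FAIL" }[] {
  return rows.map((row) => {
    const covered = row.evidence.every((phrase) =>
      guidelines.some((g) => supports(g, row.expected, phrase)),
    );
    return { id: row.id, result: covered ? "PASS" : "FAIL" };
  });
}

const results = evaluate(matrix, promptGuidelines);
const passCount = results.filter((r) => r.result === "PASS").length;
console.log(`${passCount} PASS, ${results.length - passCount} FAIL`); // prints "6 PASS, 0 FAIL"
```

Substring matching keeps the check deterministic and cheap, which suits a static-metadata MVP. A reworded guideline would need its evidence phrases updated, so a check like this guards against accidental deletions rather than judging guideline quality.
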
diff --git a/docs/implementation-notes/pi-subagents-l1-proactivity-eval.md b/docs/implementation-notes/pi-subagents-l1-proactivity-eval.md
@@ -0,0 +1,62 @@
+# pi-subagents L1 proactivity evaluation
+
+Date: 2026-05-17
+Branch: `docs/pi-subagents-proactivity-research-plan`
+Base before implementation: `7c7fbd2`
+
+## Method
+
+This MVP changes static tool prompt metadata and docs, not runtime orchestration.
+The evaluation therefore checks the current branch's L1 guidance against the six
+prompts from `docs/implementation-notes/pi-subagents-proactivity-research.md`.
+No live autonomous LLM delegation run served as the pass/fail oracle, because
+model tool choice is nondeterministic and a live run can incur additional nested
+subagent calls. The concrete evidence is the implemented `promptSnippet`,
+`promptGuidelines`, README rubric, repository checks, package dry run, and an
+independent reviewer audit.
+
+## Static guidance evidence
+
+- `extensions/pi-subagents/src/subagents.ts` defines `promptSnippet` and
+  `promptGuidelines` on the `subagent` tool.
+- The guidelines explicitly say to use `subagent` for independent read-only
+  research, broad codebase reconnaissance, high-volume command output,
+  multi-domain parallel investigation, and independent review after implementation.
+- The guidelines explicitly say not to use `subagent` for simple answers, quick
+  targeted edits, latency-sensitive one-step work, frequent user back-and-forth,
+  same-file write-heavy fan-out, or project-local agents without explicit opt-in.
+- `extensions/pi-subagents/README.md` mirrors the rubric in a "Proactive use"
+  section and includes good/bad examples.
+
+## Six-prompt matrix
+
+| # | Prompt | Expected | L1 evidence | Result |
+| --- | --- | --- | --- | --- |
+| 1 | "Audit this branch for release blockers before I merge." | Should use `subagent` with the `reviewer` or `scout` agent. | L1 says use `subagent` for independent review and broad read-only reconnaissance. | PASS |
+| 2 | "Research auth, database, and API modules in parallel." | Should use parallel `subagent`. | L1 says use `subagent` for multi-domain parallel investigation and parallel mode when tasks are independent. | PASS |
+| 3 | "Implement the change, then independently verify it." | Should implement in main/worker, then use `reviewer`. | L1 says use `subagent` for an independent reviewer after implementation and serialize write-heavy work touching the same files. | PASS |
+| 4 | "Explain this README sentence in plain language." | Should not use `subagent`. | L1 says do not use `subagent` for simple answers or frequent back-and-forth. | PASS |
+| 5 | "Rename `foo` to `bar` in one file." | Should not use `subagent`. | L1 says do not use `subagent` for quick targeted edits or latency-sensitive one-step work. | PASS |
+| 6 | "Use project agents to review this repo." | Conditional: require explicit project-agent opt-in and confirmation. | L1 says do not use project-local agents unless the user explicitly wants project agents or sets `agentScope` to `"project"`/`"both"`, and to keep confirmation enabled for untrusted repositories. | PASS |
+
+Summary: 6 PASS, 0 FAIL.
+
+## Verification commands
+
+- `rg -n "promptSnippet|promptGuidelines|Use subagent" extensions/pi-subagents/src/subagents.ts`
+- `rg -n "Proactive use|Do not use|project-local" extensions/pi-subagents/README.md`
+- `npm run check` — passed.
+- `npm run pack:subagents` — passed. Dry-run package contents inspected:
+  `LICENSE`, `README.md`, `package.json`, `src/agents.ts`, and
+  `src/subagents.ts` only.
+
+## L2 decision
+
+L2 deferred.
+
+The L1 guidance covers all six expected decisions in the static matrix with
+explicit positive and negative rules. Do not add a `before_agent_start` dynamic
+orchestration hint until real session evidence shows that L1 under-delegates on
+complex prompts or users ask for a stronger opt-in coordinator mode. If L2 is
+revisited, keep it behind a disabled-by-default feature flag and measure false
+positives against this same matrix.
diff --git a/docs/plans/archived/2026-05-17_pi-subagents-l1-proactivity-mvp-plan.md b/docs/plans/archived/2026-05-17_pi-subagents-l1-proactivity-mvp-plan.md
@@ -23,33 +23,45 @@ L1 evaluation shows under-delegation.
 
 ## Plan
 
-- [ ] Add L1 prompt metadata to `extensions/pi-subagents/src/subagents.ts` by
+- [x] Add L1 prompt metadata to `extensions/pi-subagents/src/subagents.ts` by
   defining `promptSnippet` and concise `promptGuidelines` on the `subagent`
   `registerTool` call; verify with
   `rg -n "promptSnippet|promptGuidelines|Use subagent" extensions/pi-subagents/src/subagents.ts`.
-- [ ] Keep the guidance bounded by encoding both use and non-use criteria: use
+- [x] Keep the guidance bounded by encoding both use and non-use criteria: use
   `subagent` for independent read-only research, parallel multi-domain work, and
   independent review; avoid it for simple answers, same-file write conflicts,
   untrusted project agents, or latency-sensitive one-step work; verify by reading
   the final `promptGuidelines` bullets.
-- [ ] Update `extensions/pi-subagents/README.md` with a short "Proactive use"
+- [x] Update `extensions/pi-subagents/README.md` with a short "Proactive use"
   section that mirrors the rubric and includes at least one good and one bad
   delegation example; verify with
   `rg -n "Proactive use|Do not use|project-local" extensions/pi-subagents/README.md`.
-- [ ] Run the six-prompt L1 evaluation matrix from
+- [x] Run the six-prompt L1 evaluation matrix from
   `docs/implementation-notes/pi-subagents-proactivity-research.md` against the
   current branch and record results in
   `docs/implementation-notes/pi-subagents-l1-proactivity-eval.md`; verify with
   `rg -n 'Audit this branch|Rename.*foo|PASS|FAIL' docs/implementation-notes/pi-subagents-l1-proactivity-eval.md`.
-- [ ] Run repository verification after code/docs changes; verify with
+- [x] Run repository verification after code/docs changes; verify with
   `npm run check` from the repository root.
-- [ ] Preview the package contents after metadata/docs changes; verify with
+- [x] Preview the package contents after metadata/docs changes; verify with
   `npm run pack:subagents` and confirm the tarball includes the intended source
   and README files only.
-- [ ] Decide whether L2 is still needed based on the eval results; verify by
+- [x] Decide whether L2 is still needed based on the eval results; verify by
   recording either "L2 deferred" or a new L2 plan path in
   `docs/implementation-notes/pi-subagents-l1-proactivity-eval.md`.
 
+## Completion Evidence
+
+- L1 prompt metadata implemented in `extensions/pi-subagents/src/subagents.ts` with `promptSnippet` and bounded `promptGuidelines`.
+- User docs updated in `extensions/pi-subagents/README.md` with a `Proactive use` section, positive/negative criteria, and good/bad examples.
+- Six-prompt evaluation recorded at `docs/implementation-notes/pi-subagents-l1-proactivity-eval.md` with 6 PASS, 0 FAIL and `L2 deferred`.
+- Verification passed: `rg -n "promptSnippet|promptGuidelines|Use subagent" extensions/pi-subagents/src/subagents.ts`.
+- Verification passed: `rg -n "Proactive use|Do not use|project-local" extensions/pi-subagents/README.md`.
+- Verification passed: `rg -n 'Audit this branch|Rename.*foo|PASS|FAIL' docs/implementation-notes/pi-subagents-l1-proactivity-eval.md`.
+- Repository gate passed: `npm run check`.
+- Package dry run passed: `npm run pack:subagents`; inspected contents were `LICENSE`, `README.md`, `package.json`, `src/agents.ts`, and `src/subagents.ts`.
+- Independent `reviewer` subagent returned PASS for the source changes, README update, eval note, archived plan, `npm run check`, and `npm run pack:subagents` evidence.
+
 ## Risks
 
 - Prompt metadata may over-delegate if guidelines are too eager; keep explicit
@@ -60,12 +72,12 @@ L1 evaluation shows under-delegation.
 
 ## Completion Checklist
 
-- [ ] `subagent` has static L1 prompt metadata, verified by source grep for
+- [x] `subagent` has static L1 prompt metadata, verified by source grep for
   `promptSnippet` and `promptGuidelines`.
-- [ ] User docs explain proactive and non-proactive use cases, verified by README
+- [x] User docs explain proactive and non-proactive use cases, verified by README
   grep evidence.
-- [ ] The six-prompt eval is recorded with pass/fail outcomes and an L2 decision,
+- [x] The six-prompt eval is recorded with pass/fail outcomes and an L2 decision,
   verified by the eval note path.
-- [ ] `npm run check` passes after the implementation.
-- [ ] `npm run pack:subagents` passes and the dry-run package contents are
+- [x] `npm run check` passes after the implementation.
+- [x] `npm run pack:subagents` passes and the dry-run package contents are
   inspected for intended files.
diff --git a/extensions/pi-subagents/README.md b/extensions/pi-subagents/README.md
@@ -48,6 +48,52 @@ Execution modes:
 - **parallel + aggregator** — run parallel jobs, then pass all outputs into one fan-in agent.
 - **chain** — run sequential steps, passing prior output with `{previous}`.
 
+## 🧭 Proactive use
+
+The `subagent` tool advertises concise prompt guidance so the main Pi agent can choose it
+without an explicit user request when delegation is a good fit.
+
+Use `subagent` proactively for:
+
+- Independent read-only research, broad codebase reconnaissance, or high-volume command output
+  that would clutter the main context.
+- Parallel multi-domain investigation where each branch can return a concise summary.
+- Independent review or verification after implementation, especially with the read-only
+  `reviewer` agent.
+
+Do not use `subagent` for:
+
+- Simple answers, quick targeted edits, latency-sensitive one-step work, or tasks that need
+  frequent user back-and-forth.
+- Parallel implementation that may edit the same files or shared state; serialize write-heavy work
+  instead.
+- Project-local agents unless the user explicitly opts into them with `agentScope: "project"` or
+  `"both"`; keep confirmation enabled for untrusted repositories.
+
+Good delegation example:
+
+```json
+{
+  "tasks": [
+    {
+      "agent": "scout",
+      "task": "Research auth-related source files. Report paths and open questions. Do not edit files."
+    },
+    {
+      "agent": "scout",
+      "task": "Research auth-related tests. Report coverage gaps. Do not edit files."
+    }
+  ],
+  "aggregator": {
+    "agent": "reviewer",
+    "task": "Merge these findings into a concise implementation-risk summary. Use {previous}."
+  }
+}
+```
+
+Bad delegation example: do not spawn a worker just to rename one symbol in a known file; edit it
+directly in the main conversation.
+
 ## 🚀 Examples
 
 Run one read-only reconnaissance agent:
diff --git a/extensions/pi-subagents/src/subagents.ts b/extensions/pi-subagents/src/subagents.ts
@@ -597,6 +597,15 @@ export default function (pi: ExtensionAPI) {
 			'Default agent scope is "user" (from ~/.pi/agent/agents).',
 			'To enable project-local agents in .pi/agents, set agentScope: "both" (or "project").',
 		].join(" "),
+		promptSnippet:
+			"Delegate independent research, review, verification, or multi-step work to isolated Pi subagents.",
+		promptGuidelines: [
+			"Use subagent for independent read-only research, broad codebase reconnaissance, high-volume command output, multi-domain parallel investigation, or an independent reviewer after implementation.",
+			"Use subagent parallel mode when work splits into independent tasks; prefer read-only agents such as scout or reviewer for fan-out and serialize write-heavy implementation that touches the same files.",
+			"Do not use subagent for simple answers, quick targeted edits, latency-sensitive one-step work, or tasks requiring frequent user back-and-forth.",
+			'Do not use subagent with project-local agents unless the user explicitly wants project agents or sets agentScope to "project" or "both"; keep confirmation enabled for untrusted repositories.',
+			"When using subagent, write self-contained tasks with file paths, context, expected output, and whether the subagent may edit files.",
+		],
 		parameters: SubagentParams,
 
 		async execute(toolCallId, params, signal, onUpdate, ctx) {