Commit c514802

feat(subagents): add proactive prompt guidance
1 parent 7c7fbd2 commit c514802

4 files changed

Lines changed: 141 additions & 12 deletions

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+# pi-subagents L1 proactivity evaluation
+
+Date: 2026-05-17
+Branch: `docs/pi-subagents-proactivity-research-plan`
+Base before implementation: `7c7fbd2`
+
+## Method
+
+This MVP changes static tool prompt metadata and docs, not runtime orchestration.
+The evaluation therefore checks the current branch's L1 guidance against the six
+prompts from `docs/implementation-notes/pi-subagents-proactivity-research.md`.
+No live autonomous LLM delegation run was used as the pass/fail oracle because
+model tool-choice behavior is nondeterministic and can incur additional nested
+subagent calls. The concrete evidence is the implemented `promptSnippet`,
+`promptGuidelines`, README rubric, repository checks, package dry run, and an
+independent reviewer audit.
+
+## Static guidance evidence
+
+- `extensions/pi-subagents/src/subagents.ts` defines `promptSnippet` and
+  `promptGuidelines` on the `subagent` tool.
+- The guidelines explicitly say to use `subagent` for independent read-only
+  research, high-volume output, multi-domain parallel investigation, and
+  independent review after implementation.
+- The guidelines explicitly say not to use `subagent` for simple answers, quick
+  targeted edits, latency-sensitive one-step work, frequent user back-and-forth,
+  same-file write-heavy fan-out, or project-local agents without explicit opt-in.
+- `extensions/pi-subagents/README.md` mirrors the rubric in a "Proactive use"
+  section and includes good/bad examples.
+
+## Six-prompt matrix
+
+| # | Prompt | Expected | L1 evidence | Result |
+| --- | --- | --- | --- | --- |
+| 1 | "Audit this branch for release blockers before I merge." | Should use `subagent` `reviewer`/`scout`. | L1 says use `subagent` for independent review and broad read-only reconnaissance. | PASS |
+| 2 | "Research auth, database, and API modules in parallel." | Should use parallel `subagent`. | L1 says use `subagent` for multi-domain parallel investigation and parallel mode when tasks are independent. | PASS |
+| 3 | "Implement the change, then independently verify it." | Should implement in main/worker, then use `reviewer`. | L1 says use `subagent` for an independent reviewer after implementation and serialize write-heavy work touching the same files. | PASS |
+| 4 | "Explain this README sentence in plain language." | Should not use `subagent`. | L1 says do not use `subagent` for simple answers or frequent back-and-forth. | PASS |
+| 5 | "Rename `foo` to `bar` in one file." | Should not use `subagent`. | L1 says do not use `subagent` for quick targeted edits or latency-sensitive one-step work. | PASS |
+| 6 | "Use project agents to review this repo." | Conditional: require explicit project-agent opt-in and confirmation. | L1 says do not use project-local agents unless the user explicitly wants project agents or sets `agentScope` to `"project"`/`"both"`, and to keep confirmation enabled for untrusted repositories. | PASS |
+
+Summary: 6 PASS, 0 FAIL.
+
+## Verification commands
+
+- `rg -n "promptSnippet|promptGuidelines|Use subagent" extensions/pi-subagents/src/subagents.ts`
+- `rg -n "Proactive use|Do not use|project-local" extensions/pi-subagents/README.md`
+- `npm run check` — passed.
+- `npm run pack:subagents` — passed. Dry-run package contents inspected:
+  `LICENSE`, `README.md`, `package.json`, `src/agents.ts`, and
+  `src/subagents.ts` only.
+
+## L2 decision
+
+L2 deferred.
+
+The L1 guidance covers all six static expected decisions with explicit positive
+and negative rules. Do not add a `before_agent_start` dynamic orchestration hint
+until real session evidence shows L1 under-delegates on complex prompts or users
+ask for a stronger opt-in coordinator mode. If L2 is revisited, keep it behind a
+disabled-by-default feature flag and measure false positives against this same
+matrix.
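
The six-prompt matrix in the evaluation note could also be kept as data, so that the PASS/FAIL summary and any later L2 false-positive measurement are computed rather than counted by hand. The TypeScript sketch below is illustrative only: `MatrixRow`, `summarize`, and `falsePositiveRate` are hypothetical names that do not appear in this commit, and the `expected` labels simplify the table's "Expected" column.

```typescript
// Hypothetical encoding of the six-prompt matrix; not part of the commit.
type Expected = "delegate" | "no-delegate" | "conditional";
type Result = "PASS" | "FAIL";

interface MatrixRow {
  id: number;
  prompt: string;
  expected: Expected;
  result: Result;
}

const matrix: MatrixRow[] = [
  { id: 1, prompt: "Audit this branch for release blockers before I merge.", expected: "delegate", result: "PASS" },
  { id: 2, prompt: "Research auth, database, and API modules in parallel.", expected: "delegate", result: "PASS" },
  { id: 3, prompt: "Implement the change, then independently verify it.", expected: "delegate", result: "PASS" },
  { id: 4, prompt: "Explain this README sentence in plain language.", expected: "no-delegate", result: "PASS" },
  { id: 5, prompt: "Rename `foo` to `bar` in one file.", expected: "no-delegate", result: "PASS" },
  { id: 6, prompt: "Use project agents to review this repo.", expected: "conditional", result: "PASS" },
];

// Count PASS and FAIL rows, mirroring the note's "Summary" line.
function summarize(rows: MatrixRow[]): { pass: number; fail: number } {
  let pass = 0;
  let fail = 0;
  for (const row of rows) {
    if (row.result === "PASS") {
      pass += 1;
    } else {
      fail += 1;
    }
  }
  return { pass, fail };
}

// A false positive here means: the tool was invoked for a prompt whose
// expected decision is "no-delegate". `delegatedIds` would come from
// observed sessions, which this commit does not collect.
function falsePositiveRate(rows: MatrixRow[], delegatedIds: number[]): number {
  const negatives = rows.filter((r) => r.expected === "no-delegate");
  if (negatives.length === 0) {
    return 0;
  }
  const hits = negatives.filter((r) => delegatedIds.indexOf(r.id) !== -1).length;
  return hits / negatives.length;
}

const summary = summarize(matrix);
```

If a future L2 experiment delegated on prompt 4 but not prompt 5, `falsePositiveRate(matrix, [4])` would report half of the negative prompts as false positives, which is the kind of figure the L2 decision asks to be measured.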

docs/plans/2026-05-17_pi-subagents-l1-proactivity-mvp-plan.md renamed to docs/plans/archived/2026-05-17_pi-subagents-l1-proactivity-mvp-plan.md

Lines changed: 24 additions & 12 deletions
@@ -23,33 +23,45 @@ L1 evaluation shows under-delegation.

 ## Plan

-- [ ] Add L1 prompt metadata to `extensions/pi-subagents/src/subagents.ts` by
+- [x] Add L1 prompt metadata to `extensions/pi-subagents/src/subagents.ts` by
   defining `promptSnippet` and concise `promptGuidelines` on the `subagent`
   `registerTool` call; verify with
   `rg -n "promptSnippet|promptGuidelines|Use subagent" extensions/pi-subagents/src/subagents.ts`.
-- [ ] Keep the guidance bounded by encoding both use and non-use criteria: use
+- [x] Keep the guidance bounded by encoding both use and non-use criteria: use
   `subagent` for independent read-only research, parallel multi-domain work, and
   independent review; avoid it for simple answers, same-file write conflicts,
   untrusted project agents, or latency-sensitive one-step work; verify by reading
   the final `promptGuidelines` bullets.
-- [ ] Update `extensions/pi-subagents/README.md` with a short "Proactive use"
+- [x] Update `extensions/pi-subagents/README.md` with a short "Proactive use"
   section that mirrors the rubric and includes at least one good and one bad
   delegation example; verify with
   `rg -n "Proactive use|Do not use|project-local" extensions/pi-subagents/README.md`.
-- [ ] Run the six-prompt L1 evaluation matrix from
+- [x] Run the six-prompt L1 evaluation matrix from
   `docs/implementation-notes/pi-subagents-proactivity-research.md` against the
   current branch and record results in
   `docs/implementation-notes/pi-subagents-l1-proactivity-eval.md`; verify with
   `rg -n 'Audit this branch|Rename.*foo|PASS|FAIL' docs/implementation-notes/pi-subagents-l1-proactivity-eval.md`.
-- [ ] Run repository verification after code/docs changes; verify with
+- [x] Run repository verification after code/docs changes; verify with
   `npm run check` from the repository root.
-- [ ] Preview the package contents after metadata/docs changes; verify with
+- [x] Preview the package contents after metadata/docs changes; verify with
   `npm run pack:subagents` and confirm the tarball includes the intended source
   and README files only.
-- [ ] Decide whether L2 is still needed based on the eval results; verify by
+- [x] Decide whether L2 is still needed based on the eval results; verify by
   recording either "L2 deferred" or a new L2 plan path in
   `docs/implementation-notes/pi-subagents-l1-proactivity-eval.md`.

+## Completion Evidence
+
+- L1 prompt metadata implemented in `extensions/pi-subagents/src/subagents.ts` with `promptSnippet` and bounded `promptGuidelines`.
+- User docs updated in `extensions/pi-subagents/README.md` with a `Proactive use` section, positive/negative criteria, and good/bad examples.
+- Six-prompt evaluation recorded at `docs/implementation-notes/pi-subagents-l1-proactivity-eval.md` with 6 PASS, 0 FAIL and `L2 deferred`.
+- Verification passed: `rg -n "promptSnippet|promptGuidelines|Use subagent" extensions/pi-subagents/src/subagents.ts`.
+- Verification passed: `rg -n "Proactive use|Do not use|project-local" extensions/pi-subagents/README.md`.
+- Verification passed: `rg -n 'Audit this branch|Rename.*foo|PASS|FAIL' docs/implementation-notes/pi-subagents-l1-proactivity-eval.md`.
+- Repository gate passed: `npm run check`.
+- Package dry run passed: `npm run pack:subagents`; inspected contents were `LICENSE`, `README.md`, `package.json`, `src/agents.ts`, and `src/subagents.ts`.
+- Independent `reviewer` subagent returned PASS for the source changes, README update, eval note, archived plan, `npm run check`, and `npm run pack:subagents` evidence.
+
 ## Risks

 - Prompt metadata may over-delegate if guidelines are too eager; keep explicit
@@ -60,12 +72,12 @@ L1 evaluation shows under-delegation.

 ## Completion Checklist

-- [ ] `subagent` has static L1 prompt metadata, verified by source grep for
+- [x] `subagent` has static L1 prompt metadata, verified by source grep for
   `promptSnippet` and `promptGuidelines`.
-- [ ] User docs explain proactive and non-proactive use cases, verified by README
+- [x] User docs explain proactive and non-proactive use cases, verified by README
   grep evidence.
-- [ ] The six-prompt eval is recorded with pass/fail outcomes and an L2 decision,
+- [x] The six-prompt eval is recorded with pass/fail outcomes and an L2 decision,
   verified by the eval note path.
-- [ ] `npm run check` passes after the implementation.
-- [ ] `npm run pack:subagents` passes and the dry-run package contents are
+- [x] `npm run check` passes after the implementation.
+- [x] `npm run pack:subagents` passes and the dry-run package contents are
   inspected for intended files.
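
The plan's `rg -n` verification steps are pattern-presence checks, and the same idea can be approximated in a test with plain regular expressions. The TypeScript sketch below is an assumption-laden illustration: `Check`, `failedChecks`, and the `sampleNote` text are hypothetical, and unlike the single `rg` alternation (which succeeds when any one alternative matches) it requires every pattern to match.

```typescript
// Hypothetical helper; the commit itself only runs `rg` from the shell.
interface Check {
  label: string;
  pattern: RegExp; // no "g" flag, so RegExp.test stays stateless
}

// Return the labels of all checks whose pattern does not occur in `text`.
function failedChecks(text: string, checks: Check[]): string[] {
  return checks
    .filter((check) => !check.pattern.test(text))
    .map((check) => check.label);
}

// Patterns taken from the plan's eval-note verification command.
const evalNoteChecks: Check[] = [
  { label: "audit prompt", pattern: /Audit this branch/ },
  { label: "rename prompt", pattern: /Rename.*foo/ },
  { label: "recorded outcome", pattern: /PASS|FAIL/ },
];

// Stand-in for the eval note's content (assumed, abbreviated).
const sampleNote = [
  '| 1 | "Audit this branch for release blockers before I merge." | PASS |',
  '| 5 | "Rename `foo` to `bar` in one file." | PASS |',
].join("\n");

const failures = failedChecks(sampleNote, evalNoteChecks);
```

An empty `failures` array corresponds to all greps finding a match. Requiring each pattern separately is a stricter design choice than the original command, made here so that a missing matrix row is not masked by the many `PASS` matches.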

extensions/pi-subagents/README.md

Lines changed: 46 additions & 0 deletions
@@ -48,6 +48,52 @@ Execution modes:
 - **parallel + aggregator** — run parallel jobs, then pass all outputs into one fan-in agent.
 - **chain** — run sequential steps, passing prior output with `{previous}`.

+## 🧭 Proactive use
+
+The `subagent` tool now advertises concise prompt guidance so the main Pi agent can choose it
+without an explicit user request when delegation is a good fit.
+
+Use `subagent` proactively for:
+
+- Independent read-only research, broad codebase reconnaissance, or high-volume command output
+  that would clutter the main context.
+- Parallel multi-domain investigation where each branch can return a concise summary.
+- Independent review or verification after implementation, especially with the read-only
+  `reviewer` agent.
+
+Do not use `subagent` for:
+
+- Simple answers, quick targeted edits, latency-sensitive one-step work, or tasks that need
+  frequent user back-and-forth.
+- Parallel implementation that may edit the same files or shared state; serialize write-heavy work
+  instead.
+- Project-local agents unless the user explicitly opts into them with `agentScope: "project"` or
+  `"both"`; keep confirmation enabled for untrusted repositories.
+
+Good delegation example:
+
+```json
+{
+  "tasks": [
+    {
+      "agent": "scout",
+      "task": "Research auth-related source files. Report paths and open questions. Do not edit files."
+    },
+    {
+      "agent": "scout",
+      "task": "Research auth-related tests. Report coverage gaps. Do not edit files."
+    }
+  ],
+  "aggregator": {
+    "agent": "reviewer",
+    "task": "Merge these findings into a concise implementation-risk summary. Use {previous}."
+  }
+}
+```
+
+Bad delegation example: do not spawn a worker just to rename one symbol in a known file; edit it
+directly in the main conversation.
+
 ## 🚀 Examples

 Run one read-only reconnaissance agent:
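
The "Proactive use" rubric added to the README in this diff reads like a small decision procedure, and writing it out as code can make the precedence between rules easier to see. The TypeScript sketch below is a toy model under stated assumptions: in the actual extension the model chooses from prompt text, not from a function like this, and `TaskTraits`, `Decision`, and `shouldDelegate` are hypothetical names. Letting every negative rule win over the positive one is also an assumption, since the README does not state an order.

```typescript
// Hypothetical traits of a task, loosely following the README rubric.
interface TaskTraits {
  simpleOrOneStep: boolean; // simple answer, quick targeted edit, latency-sensitive step
  needsUserBackAndForth: boolean;
  writesSharedFilesInParallel: boolean;
  usesProjectAgents: boolean;
  projectAgentsOptedIn: boolean; // user chose agentScope "project" or "both"
  independentReadOnlyOrReview: boolean; // research, reconnaissance, noisy output, review
}

interface Decision {
  delegate: boolean;
  reason: string;
}

// Negative criteria are checked first (an assumed precedence).
function shouldDelegate(t: TaskTraits): Decision {
  if (t.simpleOrOneStep) {
    return { delegate: false, reason: "simple or one-step work" };
  }
  if (t.needsUserBackAndForth) {
    return { delegate: false, reason: "needs frequent user back-and-forth" };
  }
  if (t.writesSharedFilesInParallel) {
    return { delegate: false, reason: "serialize write-heavy work instead" };
  }
  if (t.usesProjectAgents && !t.projectAgentsOptedIn) {
    return { delegate: false, reason: "project agents need explicit opt-in" };
  }
  if (t.independentReadOnlyOrReview) {
    return { delegate: true, reason: "independent read-only research or review" };
  }
  return { delegate: false, reason: "no positive criterion matched" };
}

const baseline: TaskTraits = {
  simpleOrOneStep: false,
  needsUserBackAndForth: false,
  writesSharedFilesInParallel: false,
  usesProjectAgents: false,
  projectAgentsOptedIn: false,
  independentReadOnlyOrReview: false,
};

// The README's good example: parallel read-only auth research.
const authResearch = shouldDelegate({ ...baseline, independentReadOnlyOrReview: true });
// The README's bad example: renaming one symbol in a known file.
const renameSymbol = shouldDelegate({ ...baseline, simpleOrOneStep: true });
```

Under these assumptions `authResearch.delegate` is `true` and `renameSymbol.delegate` is `false`, matching the good and bad delegation examples in the README.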

extensions/pi-subagents/src/subagents.ts

Lines changed: 9 additions & 0 deletions
@@ -597,6 +597,15 @@ export default function (pi: ExtensionAPI) {
       'Default agent scope is "user" (from ~/.pi/agent/agents).',
       'To enable project-local agents in .pi/agents, set agentScope: "both" (or "project").',
     ].join(" "),
+    promptSnippet:
+      "Delegate independent research, review, verification, or multi-step work to isolated Pi subagents.",
+    promptGuidelines: [
+      "Use subagent for independent read-only research, broad codebase reconnaissance, high-volume command output, multi-domain parallel investigation, or an independent reviewer after implementation.",
+      "Use subagent parallel mode when work splits into independent tasks; prefer read-only agents such as scout or reviewer for fan-out and serialize write-heavy implementation that touches the same files.",
+      "Do not use subagent for simple answers, quick targeted edits, latency-sensitive one-step work, or tasks requiring frequent user back-and-forth.",
+      'Do not use subagent with project-local agents unless the user explicitly wants project agents or sets agentScope to "project" or "both"; keep confirmation enabled for untrusted repositories.',
+      "When using subagent, write self-contained tasks with file paths, context, expected output, and whether the subagent may edit files.",
+    ],
     parameters: SubagentParams,

     async execute(toolCallId, params, signal, onUpdate, ctx) {
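
To make the "bounded guidance" goal from the plan concrete, the added `promptSnippet` and `promptGuidelines` values can be treated as data and checked for a balance of positive and negative rules. The TypeScript sketch below is not the Pi extension API: `ToolPromptMetadata`, `renderToolGuidance`, and `countByPolarity` are hypothetical, and how the host actually injects these fields into the prompt is an assumption that this diff does not show.

```typescript
// Hypothetical shape for the two fields added in this commit.
interface ToolPromptMetadata {
  promptSnippet: string;
  promptGuidelines: string[];
}

// Values copied from the diff of subagents.ts.
const subagentMeta: ToolPromptMetadata = {
  promptSnippet:
    "Delegate independent research, review, verification, or multi-step work to isolated Pi subagents.",
  promptGuidelines: [
    "Use subagent for independent read-only research, broad codebase reconnaissance, high-volume command output, multi-domain parallel investigation, or an independent reviewer after implementation.",
    "Use subagent parallel mode when work splits into independent tasks; prefer read-only agents such as scout or reviewer for fan-out and serialize write-heavy implementation that touches the same files.",
    "Do not use subagent for simple answers, quick targeted edits, latency-sensitive one-step work, or tasks requiring frequent user back-and-forth.",
    'Do not use subagent with project-local agents unless the user explicitly wants project agents or sets agentScope to "project" or "both"; keep confirmation enabled for untrusted repositories.',
    "When using subagent, write self-contained tasks with file paths, context, expected output, and whether the subagent may edit files.",
  ],
};

// Assumed rendering: one summary line followed by one bullet per guideline.
function renderToolGuidance(toolName: string, meta: ToolPromptMetadata): string {
  const bullets = meta.promptGuidelines.map((g) => "- " + g);
  return [toolName + ": " + meta.promptSnippet].concat(bullets).join("\n");
}

// Classify guidelines by their leading phrase to confirm both use and
// non-use criteria are present.
function countByPolarity(guidelines: string[]): { positive: number; negative: number; other: number } {
  const counts = { positive: 0, negative: 0, other: 0 };
  for (const g of guidelines) {
    if (g.indexOf("Do not use ") === 0) {
      counts.negative += 1;
    } else if (g.indexOf("Use ") === 0) {
      counts.positive += 1;
    } else {
      counts.other += 1;
    }
  }
  return counts;
}

const rendered = renderToolGuidance("subagent", subagentMeta);
const polarity = countByPolarity(subagentMeta.promptGuidelines);
```

With the strings from this commit, the check finds two positive rules, two negative rules, and one task-writing rule, which is consistent with the plan's requirement to encode both use and non-use criteria.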
