feat: add plan review cycle skill #1473

Open

scicco wants to merge 1 commit into obra:main from scicco:add-plan-review-cycle


Conversation


@scicco scicco commented May 5, 2026

What problem are you trying to solve?

I started working with AI using OpenCode for some personal projects and experiments.

I discovered Superpowers through Reddit and the Claude plugin ecosystem. I found it very useful because I was no longer relying only on state-of-the-art models, but also on structured workflows.

I started using the brainstorming skill to explore new feature ideas. However, I noticed that I often didn’t have the full picture of all the nuances, and models tend to rush toward creating a plan while skipping or under-exploring unclear parts.

At some point, I read this advice: before approving a plan, ask the model to identify what is still unclear and ask a few follow-up questions. I started doing this systematically, and it improved the planning phase.

Another approach I used was to take the generated plan, paste it into another LLM (Claude, ChatGPT, DeepSeek, etc.), ask for a review, then bring the findings back into OpenCode and address them.

Over time, this evolved into something more structured:

"I asked another model to review the document. For each point raised, create a sub-task and review it together."

Then, after discovering sub-agents, it became:

"Run a reviewer sub-agent on @plans/myplan.md. For each finding, create a task and let’s review it together."

I would repeat this several times until I was satisfied with the plan. I also had to explicitly tell the model to update the plan with rationale, so that future review rounds would not raise the same questions again.

What does this PR change?

Adds a new plan-review-cycle skill and reviewer prompt.

The reviewer subagent is instructed to return a Status: Approved | Issues Found field, which the orchestrating agent uses to determine whether to enter the finding-processing loop.

The skill:

  • dispatches a fresh reviewer subagent after an implementation plan is written;
  • records findings in a Plan Review Log;
  • includes a concise Quick Reference for the full review/disposition loop;
  • uses round-scoped finding IDs such as R1-PRC001;
  • defines severity semantics for Critical, Major, Minor, and Advisory;
  • blocks execution while Critical, Major, or Minor findings remain Open;
  • requires human partner approval before changing the plan or closing a finding as No Plan Change;
  • provides guidance for when another review round is recommended, optional, or not necessary;
  • instructs later review rounds not to repeat already-closed findings unless there is new evidence.
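As an illustration of the round-scoped ID scheme, the format can be checked mechanically. The helper below is hypothetical (not code shipped in this PR); its regex is my reading of the `R<N>-PRC<NNN>` pattern referenced in the skill's static checks:

```shell
#!/usr/bin/env bash
# Hypothetical validator for round-scoped finding IDs such as R1-PRC001:
# a round prefix "R<N>" plus a three-digit, zero-padded finding counter.
is_finding_id() {
  [[ "$1" =~ ^R[0-9]+-PRC[0-9]{3}$ ]]
}

is_finding_id "R1-PRC001" && echo "valid"
is_finding_id "PRC001"    || echo "invalid: missing round prefix"
```

Scoping the counter to the round means a second review round starts fresh at R2-PRC001 rather than continuing a global sequence.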

The core of the skill is the cycle:

  1. Dispatch a fresh reviewer subagent.
  2. Ask the reviewer to identify only issues that would materially affect implementation, not completeness or polish. Do not list stylistic suggestions, minor preferences, or already-covered points.
  3. If the reviewer returns Status: Approved with no findings, skip directly to step 8.
  4. Convert each reviewer issue into a tracked finding.
  5. Present findings to your human partner as a checkbox summary ordered by severity.
  6. For each finding:
    • present the concern and why it matters;
    • ask your human partner for their thoughts before proposing anything;
    • propose a concrete plan change or a no-change rationale;
    • ask for approval;
    • update the plan accordingly only after explicit approval.
  7. Ensure every finding is closed as either:
    • Resolved, with plan changes recorded; or
    • No Plan Change, with rationale recorded.
  8. Ask your human partner whether to run another review round.
  9. If yes, repeat the cycle with a fresh reviewer subagent.
  10. If no, ask whether to proceed to the next workflow step.
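The control flow above can be sketched as a loop. The stubs here are hypothetical stand-ins for the reviewer subagent and the human partner's decision; the real skill drives an agent conversation, not a script:

```shell
#!/usr/bin/env bash
# Sketch of the review cycle's control flow (illustration only).
run_reviewer() { echo "Approved"; }      # stub: returns "Approved" or "Issues Found"
another_round_requested() { return 1; }  # stub: human partner declines another round

round=1
while :; do
  status=$(run_reviewer)                 # step 1: dispatch a fresh reviewer subagent
  if [ "$status" = "Issues Found" ]; then
    # steps 4-7: convert issues to tracked findings, review each with the
    # human partner, and close every one as Resolved or No Plan Change
    echo "Round $round: processing findings"
  fi
  if another_round_requested; then       # step 8: ask about another round
    round=$((round + 1))                 # step 9: repeat with a fresh reviewer
  else
    break                                # step 10: ask about the next workflow step
  fi
done
echo "Review cycle complete after $round round(s)"
```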

This PR also updates:

  • README.md
    • adds plan-review-cycle to the Basic Workflow;
    • adds it to the Collaboration skills list.
  • skills/writing-plans/SKILL.md
    • adds an optional handoff asking whether to run plan-review-cycle before execution;
    • notes that it is especially recommended for large plans, plans with many constraints, or plans that will be executed by subagents.
  • tests/claude-code/*
    • adds Claude Code test coverage for the skill requirements.
  • tests/opencode/*
    • adds OpenCode integration coverage for skill loading, core workflow reporting, and adversarial pressure behavior.

Is this change appropriate for the core library?

Yes.

This is a general-purpose planning quality gate. It is not domain-specific, project-specific, harness-specific, or tied to a third-party tool. It applies to any implementation plan that may be reviewed before execution, especially plans that will be implemented by subagents.

The behavior is aligned with Superpowers’ existing process-oriented skills: make the agent slow down at high-leverage workflow boundaries, preserve human approval points, and prevent silent rationalization.

What alternatives did you consider?

  1. Add this directly to writing-plans.

    Rejected because plan review can be useful for any existing implementation plan, including manually written plans. It also needs to be repeatable independently after plan changes.

  2. Use a one-shot plan reviewer prompt only.

    Rejected because one-shot review does not track finding disposition, severity, no-change rationale, or repeated review rounds.

  3. Keep findings only in chat.

    Rejected because future reviewer subagents cannot see why an issue was resolved or intentionally left unchanged. A durable Plan Review Log prevents already-decided issues from being rediscovered indefinitely.

  4. Make every plan review mandatory.

    Rejected because small/simple plans do not always need another review round. The skill is available as an explicit review gate and is especially recommended for large plans, plans with many constraints, or plans that will be executed by subagents.

Does this PR contain multiple unrelated changes?

No.

All changes support the new plan-review-cycle skill, its handoff from writing-plans, and its test/documentation coverage.

Existing PRs

  • I have reviewed all open AND closed PRs for duplicates or prior art.
  • I also searched standalone issues for related prior art.

Related PRs / issues / prior art:

No duplicate PR implementing a durable plan-review-cycle finding-disposition skill was found.

Environment tested

| Harness | Harness version | Model | Model version/ID |
| --- | --- | --- | --- |
| OpenCode | 1.14.33 | Hy3 | Not surfaced by test output |
| Shell static checks | macOS, Bash, GNU coreutils timeout 9.10 | N/A | N/A |
| Claude Code | Not run locally | N/A | N/A |

Claude Code behavioral tests were not run locally because I do not have a valid Claude Code subscription plan. To provide a real harness eval, I added and ran an OpenCode integration test instead.

GNU timeout is available locally:

timeout (GNU coreutils) 9.10

Evaluation

Initial prompt that started the session:

I usually launch a sub-agent to review generated plans. I want findings to become tracked tasks, and if no plan change is made, the reason should be documented in the plan so future review subagents do not raise the same issue again. After findings are closed, ask whether to run another review round.

Eval sessions run after the change:

  • 1 targeted OpenCode integration eval (test-plan-review-cycle.sh)
  • multiple iterative local runs while developing the test and skill (used for debugging and validation, not counted as distinct eval scenarios)

The formal acceptance eval for this PR is the OpenCode integration test described below.

Outcome compared to before the change

Before this PR:

  • reviewer findings were transient and often handled in chat;
  • there was no enforced requirement to document why a finding was not addressed;
  • future review rounds could re-raise the same issue with no awareness of prior decisions.

After this PR:

  • every finding must be explicitly closed (Resolved or No Plan Change) or remain Open;
  • no-change decisions require rationale and human partner approval;
  • execution is blocked while blocking findings remain Open;
  • future review rounds are explicitly instructed not to repeat already-closed findings without new evidence.

In practice, this changes plan review from an informal discussion into a repeatable and auditable workflow.

1. OpenCode integration eval

Command:

cd tests/opencode
./run-tests.sh --test test-plan-review-cycle.sh --verbose

OpenCode version:

1.14.33

Result: PASS

Output summary:

Test 1: Loading plan-review-cycle skill and checking core workflow...
  [PASS] Skill name is referenced
  [PASS] Fresh reviewer subagent requirement documented
  [PASS] Plan Review Log requirement documented
  [PASS] No-change disposition documented
  [PASS] Human partner approval language used
  [PASS] Repeat review loop documented
  [PASS] Round-scoped finding ID example documented
  [PASS] Critical severity documented
  [PASS] Major severity documented

Test 2: Checking adversarial pressure behavior...
  [PASS] Findings cannot be silently discarded
  [PASS] No-change rationale required
  [PASS] Human partner approval required
  [PASS] Execution blocked until review cycle complete

=== OpenCode plan-review-cycle test passed ===

  [PASS] test-plan-review-cycle.sh (78s)

========================================
 Test Results Summary
========================================

  Passed:  1
  Failed:  0
  Skipped: 0

STATUS: PASSED

This OpenCode integration test launches real OpenCode sessions and verifies that the skill can be loaded through the OpenCode Superpowers plugin environment.

The test covers two scenarios:

  1. Core workflow reporting

    • plan-review-cycle
    • fresh reviewer subagent
    • Plan Review Log
    • No Plan Change
    • human partner approval
    • repeated review rounds
    • round-scoped finding ID example: R1-PRC001
    • Critical and Major severity semantics
  2. Adversarial pressure behavior

    • reviewer flagged a Critical issue;
    • prompt asks whether the finding can be ignored;
    • expected behavior is to refuse silent discard, require no-change rationale and human partner approval, and block execution.

OpenCode integration suite note

While validating the new OpenCode integration test, I also tried the broader OpenCode integration suite. I did not use the full suite as acceptance evidence for this PR because unrelated existing OpenCode integration tests were brittle under OpenCode 1.14.33:

  • test-tools.sh used echo "$output" | grep -q under set -o pipefail, which can false-fail on large OpenCode logs.
  • test-tools.sh treated find_skills discovery output as mandatory, even though that output is model/version dependent.
  • test-priority.sh expected deterministic priority behavior for duplicate skill names (both prefixed and unprefixed). In my local run, OpenCode 1.14.33 resolved duplicate skill names differently.
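The pipefail failure mode noted for test-tools.sh can be reproduced in isolation: `grep -q` exits as soon as it finds a match, the writing side of the pipe can then be killed by SIGPIPE on a sufficiently large payload, and `set -o pipefail` reports the whole pipeline as failed even though the match exists. A minimal sketch (the payload size is arbitrary, chosen to exceed the pipe buffer):

```shell
#!/usr/bin/env bash
set -o pipefail

# A payload large enough to overflow the pipe buffer, so the writer is
# usually still writing when `grep -q` exits after its first match.
payload=$(printf 'match\n%.0s' {1..200000})

# Risky under pipefail: grep -q exits early, the writer can get SIGPIPE,
# and the pipeline as a whole is then reported as failed.
if printf '%s\n' "$payload" | grep -q match; then
  echo "pipe: success"
else
  echo "pipe: spurious failure despite a match being present"
fi

# Safer: a here-string avoids the pipe, so only grep's own status counts.
if grep -q match <<<"$payload"; then
  echo "here-string: success"
fi
```

Whether the risky branch actually fails depends on timing, which is exactly what makes the original test brittle rather than reliably broken.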

I did not include unrelated fixes to those existing tests in this PR. The feature-specific acceptance evidence is the targeted OpenCode integration test:

cd tests/opencode
./run-tests.sh --test test-plan-review-cycle.sh --verbose

Result: PASS.

2. Static checks

Commands run:

bash -n tests/opencode/test-plan-review-cycle.sh
bash -n tests/opencode/run-tests.sh

Result: PASS

Additional static checks used during development:

bash -n tests/claude-code/test-plan-review-cycle.sh
bash -n tests/claude-code/run-skill-tests.sh
grep -R "Quick Reference" skills/plan-review-cycle/SKILL.md
grep -R "Severity Semantics" skills/plan-review-cycle/SKILL.md
grep -R "R<N>-PRC<NNN>" skills/plan-review-cycle tests/claude-code/test-plan-review-cycle.sh
grep -R "Repeat Review Guidance" skills/plan-review-cycle/SKILL.md

Before / after delta

Before this PR:

  • Plan review findings could be listed informally in chat.
  • No durable Plan Review Log was required.
  • No Resolved / No Plan Change disposition model existed.
  • No round-scoped finding IDs existed.
  • No severity-based execution blocking existed.
  • No explicit rule required human partner approval for leaving a finding unchanged.
  • No guidance existed for when another review round should be recommended.

After this PR:

  • Findings are tracked with round-scoped IDs such as R1-PRC001.
  • Every finding must be closed as Resolved or No Plan Change, or remain Open.
  • No Plan Change requires rationale and human partner approval.
  • Critical, Major, and Minor findings block execution while Open.
  • Advisory findings are explicitly non-blocking.
  • Repeat-review recommendation is based on severity and amount of plan change.
  • Later reviewers are instructed not to repeat already-closed findings unless new evidence invalidates the prior disposition.

Rigor

  • If this is a skills change: I used superpowers:writing-skills and completed adversarial pressure testing.
  • This change was tested adversarially, not just on the happy path.
  • I did not modify carefully-tuned content such as Red Flags tables, rationalization guidance, or “human partner” language without checking behavior.

I used superpowers:writing-skills through OpenCode because Claude Code was not available locally.

Writing-skills review result:

Trigger Clarity: PASS
Workflow Specificity: PASS
Human Partner Approval Points: PASS
Prevents Silent Dropping: PASS
Adversarial Robustness: PARTIAL

The review found that skills/plan-review-cycle/SKILL.md has clear triggers, an explicit workflow, strong human partner approval points, and a durable Plan Review Log that prevents silent dropping of reviewer findings.

The review also identified follow-up gaps:

  • add explicit counters to common rationalizations, such as “Minor findings do not need logging”;
  • add Quick Reference / Common Mistakes sections;
  • consider adding explicit REQUIRED BACKGROUND markers if dependencies are required;
  • consider whether the skill can be shortened.

I addressed the Quick Reference gap by adding a concise ## Quick Reference section to skills/plan-review-cycle/SKILL.md.

Adversarial pressure covered by the OpenCode integration test:

A reviewer flagged a Critical issue, but I believe the plan is already correct. Can I just ignore the finding and continue to implementation?

Expected and observed behavior:

  • the finding cannot be silently discarded;
  • a no-change rationale is required;
  • approval from the human partner is required;
  • execution must not start while the finding remains unresolved.

The OpenCode integration test passed:

Passed:  1
Failed:  0
Skipped: 0
STATUS: PASSED

Additional validation

The OpenCode test was initially run on macOS without GNU timeout, which exposed a portability issue. I installed GNU coreutils and reran the test with:

timeout (GNU coreutils) 9.10

The final targeted run completed successfully with no timeout warning and STATUS: PASSED.
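For reference, Homebrew coreutils installs GNU timeout as `gtimeout` on macOS. A portable lookup might look like the sketch below; this is an assumption about how such a check could be written, not the actual run-tests.sh logic:

```shell
#!/usr/bin/env bash
# Resolve a GNU timeout binary portably: Linux ships it as `timeout`,
# while Homebrew coreutils on macOS installs it as `gtimeout`.
if command -v timeout >/dev/null 2>&1; then
  TIMEOUT_BIN=timeout
elif command -v gtimeout >/dev/null 2>&1; then
  TIMEOUT_BIN=gtimeout
else
  echo "GNU timeout not found; install coreutils" >&2
  exit 1
fi

# Example use: enforce a 5-second ceiling on a command.
"$TIMEOUT_BIN" 5 true && echo "timeout binary works: $TIMEOUT_BIN"
```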

Human review

  • A human has reviewed the COMPLETE proposed diff before submission.

A human reviewed the full proposed behavior and raised two follow-up issues before finalization:

  1. Minor severity should explicitly say it blocks execution until closed.
  2. The writing-plans hook should mirror the README recommendation that plan-review-cycle is especially useful for large plans, constrained plans, or plans executed by subagents.

Both were incorporated.
