Add refine-plan annotated plan refinement workflow #46
ZenusZhang merged 24 commits into PolyArch:dev
Implements a new planning-only command that processes annotated plans containing CMT:/ENDCMT comment blocks and produces a refined plan plus QA document.

Core features:
- Comment extraction with stable per-run IDs (CMT-1, CMT-2, ...)
- Classification into question/change_request/research_request
- Plan refinement preserving gen-plan schema structure
- QA document generation with comment ledger and outcomes
- Language variant support via alternative_plan_language config
- Atomic write transactions for in-place mode safety

Implementation includes:
- scripts/validate-refine-plan-io.sh with exit codes 0-7
- commands/refine-plan.md with multi-phase workflow (592 lines)
- prompt-template/plan/refine-plan-qa-template.md
- skills/humanize-refine-plan/SKILL.md with user-invocable: false
- tests/test-refine-plan.sh with comprehensive test coverage
- Updated install scripts and documentation for all platforms
- Version bump to 1.16.0

All 7 acceptance criteria met with full test coverage.
Round 1 fixes based on Codex review findings:

1. Replace grep-based CMT: counting in validate-refine-plan-io.sh with stateful awk scanner that correctly handles:
   - Markers inside HTML comments (ignored)
   - Markers inside fenced code blocks (ignored)
   - Empty CMT blocks (not counted as valid)
   - Unterminated blocks (exit code 3 with context)
   - Nested CMT blocks (exit code 3 with context)
2. Fix docs/install-for-kimi.md manual install path to strip user-invocable flag (matching installer behavior) and add humanize-refine-plan to uninstall section.
3. Expand tests/test-refine-plan.sh with AC-7 wiring assertions:
   - SKILL.md frontmatter verification
   - install-skill.sh wiring check
   - Install docs entries for claude/codex/kimi
   - Version parity across plugin.json/marketplace.json/README.md
4. Add regression tests for comment validation edge cases:
   - HTML-comment-only markers, code-fence markers, empty blocks
   - Unterminated and nested CMT blocks
   - Mixed valid/ignored/empty marker scenarios

All 178 tests pass.
Record lesson learned from Round 0/1: grep-based CMT: counting does not handle document structure (code fences, HTML comments, malformed blocks). Fix was to use a stateful awk scanner.
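The grep-versus-scanner lesson above can be illustrated with a small stateful scan. This is a minimal sketch, not the project's actual `scan_cmt_blocks`: the function name `count_cmt_blocks`, the error wording, and the simplified marker rules (any `CMT:` opens a block, any `ENDCMT` closes it) are assumptions made for illustration. It prints the number of non-empty blocks outside fenced code and HTML comments, and returns 3 for unterminated or nested blocks.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the real validator): count non-empty CMT:/ENDCMT
# blocks in file $1, ignoring markers in fenced code and HTML comments.
count_cmt_blocks() {
  awk '
    BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }   # three backticks
    {
      t = $0; sub(/^[ \t]+/, "", t)
      # A fence line toggles the ignored region (unless inside an HTML comment).
      if (!in_html && (substr(t, 1, 3) == fence || substr(t, 1, 3) == "~~~")) {
        in_fence = !in_fence; next
      }
      if (in_fence) next

      # Strip HTML comments, carrying the open/closed state across lines.
      line = $0; out = ""
      while (length(line) > 0) {
        if (in_html) {
          i = index(line, "-->")
          if (i == 0) { line = "" } else { line = substr(line, i + 3); in_html = 0 }
        } else {
          i = index(line, "<!--")
          if (i == 0) { out = out line; line = "" }
          else { out = out substr(line, 1, i - 1); line = substr(line, i + 4); in_html = 1 }
        }
      }

      # Track block state over the visible text that remains.
      rest = out
      while (rest != "") {
        if (!in_cmt) {
          i = index(rest, "CMT:")
          if (i == 0) break
          in_cmt = 1; start = NR; body = ""
          rest = substr(rest, i + 4)
        } else {
          e = index(rest, "ENDCMT"); o = index(rest, "CMT:")
          if (o > 0 && (e == 0 || o < e)) {
            printf "nested CMT block at line %d (outer opened at line %d)\n", NR, start > "/dev/stderr"
            bad = 1; exit
          }
          if (e == 0) { body = body rest; break }
          body = body substr(rest, 1, e - 1)
          if (body ~ /[^ \t]/) count++        # empty blocks are not counted
          in_cmt = 0
          rest = substr(rest, e + 6)
        }
      }
    }
    END {
      if (bad) exit 3
      if (in_cmt) {
        printf "unterminated CMT block opened at line %d\n", start > "/dev/stderr"
        exit 3
      }
      print count + 0
    }
  ' "$1"
}
```

A plain `grep -c 'CMT:'` would count markers in all of these regions alike, which is the failure mode the lesson describes.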
Round 2 fixes for remaining Codex review blockers:

1. Replace grep-based required-section check in validator with scan_sections awk function that reuses the same fenced-code and HTML-comment tracking as scan_cmt_blocks. Headings inside ignored regions no longer falsely satisfy exit-code 4 preflight.
2. Add context excerpt to missing-ENDCMT error messages so the user sees the opening-line content alongside line/column/heading.
3. Align docs/install-for-kimi.md manual install to copy all six runtime bundle directories (scripts, hooks, prompt-template, templates, config, agents) matching install-skill.sh behavior.
4. Fix docs/install-for-codex.md runtime bundle listing to include templates/, config/, and agents/ directories.
5. Add regression tests for sections inside fenced code (exit 4), sections inside HTML comments (exit 4), unterminated CMT context excerpt, and real sections outside ignored regions (pass).

All 182 tests pass.
Extended scope to cover both CMT counting (exit 3) and required section detection (exit 4) -- same grep-vs-scanner pattern.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…, bitlesson notes

- [P2] Add output directory writability check to validate-refine-plan-io.sh
  Phase 1 now checks if output dir is writable, not just if it exists. Prevents late failures in Phase 7 when creating temp files. Updated docs and added regression test for 0555 directory.
- [P3] Fix RLCR statusline to resolve from repo root
  statusline.sh now uses git rev-parse --show-toplevel to find repo root before probing state directory, so RLCR status shows correctly when Claude starts from a subdirectory. Added regression test.
- [P3] Validate BitLesson notes for add/update actions
  bitlesson-validate-delta.sh now enforces Notes field presence and rejects placeholder text for Action: add/update. Added template and regression tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
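The writability preflight pairs with the atomic write that happens later: the temp file has to be created in the same directory as the target so the final rename stays on one filesystem. A minimal sketch of that pairing follows, assuming a hypothetical helper name `atomic_write`, a hypothetical temp-file prefix, and a generic return code of 1 (the validator's real exit-code mapping is not shown in this conversation).

```shell
#!/usr/bin/env bash
# Hypothetical helper: write stdin to "$1" atomically via a temp file created
# in the same directory, after checking that the directory is writable.
atomic_write() {
  local target=$1 dir tmp
  dir=$(dirname -- "$target")

  # Preflight: fail early instead of failing late while creating the temp file.
  if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
    echo "output directory not writable: $dir" >&2
    return 1
  fi

  # Same-directory temp file, so the mv below is a rename on one filesystem.
  tmp=$(mktemp "$dir/.refine-plan.XXXXXX") || return 1
  if cat > "$tmp" && mv -f -- "$tmp" "$target"; then
    return 0
  fi

  # On any failure, leave the original target untouched and clean up.
  rm -f -- "$tmp"
  return 1
}
```

Usage would look like `printf '%s\n' "$refined_plan" | atomic_write plan.md`, where `refined_plan` is a placeholder variable.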
…, agent_teams

- [P2] Ignore legacy .humanize-* dirs in dirty-tree checks
  setup-rlcr-loop.sh now filters all .humanize* patterns (not just .humanize/) so repositories with legacy folders like .humanize-old/ don't fail clean-tree gate.
- [P2] Load merged config for gen-plan command
  commands/gen-plan.md Phase 0.5 now uses load_merged_config from config-loader.sh instead of reading only .humanize/config.json. Settings like gen_plan_mode and alternative_plan_language now respect user/default layers.
- [P3] Honor agent_teams config key in loop bootstrap
  hooks/lib/loop-common.sh now loads agent_teams from merged config. setup-rlcr-loop.sh initializes AGENT_TEAMS from DEFAULT_AGENT_TEAMS. Project-level agent_teams setting now works without --agent-teams flag. Added regression test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- [P2] Validate new runtime bundle directories before mutating installs
  install-skill.sh validate_repo() now checks templates/, config/, agents/ directories before starting sync. Prevents partial updates when one of the new runtime bundle dirs is missing.
- [P3] Align gen-plan template config contract with config-loader
  gen-plan-template.md now documents actual merged config behavior: alternative_plan_language resolves from default/user/project layers, no auto-creation of .humanize/config.json. Matches commands/gen-plan.md Phase 0.5 implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tability, wrapped hooks

- [P1] Stop pre-creating round summaries with placeholder content
  setup-rlcr-loop.sh no longer pre-creates round-0-summary.md. The stop hook will properly block if Claude exits before writing a real summary.
- [P1] Drop extra `-` from Claude selector invocation
  bitlesson-select.sh removed trailing `-` from codex exec and claude --print invocations. Both commands now consume the assembled BitLesson prompt only from stdin, not as a literal argument.
- [P2] Check directory writability for in-place refine-plan output
  validate-refine-plan-io.sh now checks INPUT_DIR writability for in-place mode. Phase 7 always creates temp files in the same directory for atomic writes. Added regression test for read-only input directory.
- [P3] Match wrapped hook invocations in direct-execution block
  loop-bash-validator.sh extended regex to match wrapped invocations (env, command, VAR=value prefixes). Closes bypass with wrappers like `env FOO=1 bash hooks/...`. Added regression tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add codex_model/codex_effort validation at config-load time in loop-common.sh to catch misconfigured project configs early instead of failing at runtime. Fix statusline to show session-unaware RLCR loops instead of hiding them when session-aware loops exist. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…wrapper regex Validate codex_model against known Codex prefixes (gpt-/o[0-9]) in addition to shell-safety. Remove finalize-summary.md pre-creation that bypassed the missing-summary guard. Extend hook direct-execution guard to block timeout/nice/nohup/strace/ltrace wrappers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
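The two-step check described in this commit (shell-safety first, then a known-prefix match) might look roughly like the sketch below. The function name `is_valid_codex_model` and the exact safe-character set are assumptions; only the `gpt-` and `o[0-9]` prefixes come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if "$1" looks like an acceptable codex_model.
is_valid_codex_model() {
  local model=$1

  # Shell-safety: allow only letters, digits, dot, underscore, and hyphen
  # (assumed character set; the real validator may differ).
  [[ $model =~ ^[A-Za-z0-9._-]+$ ]] || return 1

  # Known Codex prefixes named in the commit message: "gpt-" or "o" + digit.
  [[ $model =~ ^(gpt-|o[0-9]) ]]
}
```

A caller could fall back to its default when the check fails, for example `is_valid_codex_model "$model" || model=$DEFAULT_MODEL`, where both variable names are placeholders.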
…bitlesson_required Skip fenced code blocks and HTML comments when extracting BitLesson Delta sections. Broaden stop-hook .humanize filter to match legacy .humanize-* directories. Remove auto-enable of bitlesson_required for legacy loops that lack the field in state. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ast mode Tighten .humanize dirty-tree filter from \.humanize to \.humanize[-/] so unrelated files like .humanizeconfig are not silently dropped. Add zsh to blocked shell list in loop-bash-validator. Check project-local .claude/settings.json before global for fast mode in statusline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
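The effect of tightening the filter can be seen on a few `git status --porcelain` style lines. This is an illustrative sketch with hypothetical function names; the two patterns are the ones quoted in the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: drop untracked entries that belong to plugin state.

# Old, loose filter: also swallows unrelated names such as .humanizeconfig.
drop_humanize_loose() {
  grep -Ev '^\?\? \.humanize' || true
}

# Tightened filter: the name must continue with "-" or "/".
drop_humanize_tight() {
  grep -Ev '^\?\? \.humanize[-/]' || true
}

# Sample porcelain lines (illustrative paths only).
sample_status() {
  printf '%s\n' \
    '?? .humanize/' \
    '?? .humanize-old/' \
    '?? .humanizeconfig' \
    '?? src/new.c'
}
```

Running `sample_status | drop_humanize_tight` keeps `.humanizeconfig` and `src/new.c` as dirty entries, while `sample_status | drop_humanize_loose` keeps only `src/new.c`.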
Pull request overview
This PR extends the refine-plan workflow and related tooling/tests, while tightening loop behavior around legacy .humanize* paths, BitLesson Delta validation, and config-driven defaults (including agent_teams and translated-plan language).
Changes:
- Added `refine-plan` command documentation, QA template, IO validator script, skill packaging, and a comprehensive test suite.
- Improved RLCR/stop-hook robustness: legacy `.humanize-*` dirs ignored for git-dirty checks, BitLesson Delta parsing hardened, and stricter hook wrapper blocking.
- Updated configuration semantics and docs: move to `alternative_plan_language`, validate Codex config values, and allow `agent_teams` default via merged config.
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test-unified-codex-config.sh | Adds assertions for caller-presets and invalid config warnings/fallbacks. |
| tests/test-stop-hook-legacy-compat.sh | New regression tests for legacy .humanize-* handling and bitlesson_required behavior. |
| tests/test-refine-plan.sh | New comprehensive tests for refine-plan docs, validator exit codes, and reference behaviors. |
| tests/test-bitlesson-validate-delta.sh | New tests for BitLesson Delta validator rules (Notes required, ignore fenced/HTML blocks). |
| tests/test-bitlesson-select-routing.sh | Updates fixtures to use .humanize/bitlesson.md path. |
| tests/test-allowlist-validators.sh | Adds coverage for blocking wrapper-based direct hook execution. |
| tests/test-agent-teams.sh | Adds test asserting agent_teams can be enabled via project config. |
| tests/run-all-tests.sh | Wires in new test suites. |
| tests/robustness/test-setup-scripts-robustness.sh | Adds robustness check that .humanizeconfig still counts as dirty. |
| skills/humanize-refine-plan/SKILL.md | Introduces refine-plan flow skill definition and guarantees. |
| scripts/validate-refine-plan-io.sh | New validator for refine-plan inputs/outputs and mode flags. |
| scripts/setup-rlcr-loop.sh | Adds DEFAULT_AGENT_TEAMS usage; expands ignored untracked patterns; removes summary template creation block. |
| scripts/install-skill.sh | Adds refine-plan skill and ensures runtime bundle includes templates/config/agents. |
| scripts/bitlesson-validate-delta.sh | Extracts BitLesson Delta outside fences/HTML comments; enforces Notes for add/update. |
| scripts/bitlesson-select.sh | Adjusts provider invocation for Codex/Claude selector execution. |
| prompt-template/plan/refine-plan-qa-template.md | Adds QA ledger template for refine-plan outputs. |
| prompt-template/plan/gen-plan-template.md | Updates translated-variant/config semantics documentation. |
| prompt-template/block/bitlesson-delta-missing-notes.md | Adds block template for missing Notes in BitLesson Delta. |
| hooks/loop-codex-stop-hook.sh | Aligns dirty-check ignores with legacy variants; removes legacy bitlesson_required fallback; adjusts finalize scaffolding. |
| hooks/loop-bash-validator.sh | Strengthens detection of wrapped direct hook execution (env/timeout/nice/nohup/trace/etc.). |
| hooks/lib/loop-common.sh | Validates codex_model/effort; loads DEFAULT_AGENT_TEAMS from merged config. |
| docs/usage.md | Documents refine-plan flow and typical gen-plan -> refine-plan -> start-rlcr-loop usage. |
| docs/install-for-kimi.md | Adds refine-plan skill and runtime bundle components to Kimi install steps. |
| docs/install-for-codex.md | Adds refine-plan skill and runtime bundle components to Codex install steps. |
| docs/install-for-claude.md | Adds refine-plan command to post-install command list. |
| config/default_config.json | Replaces chinese_plan with alternative_plan_language. |
| commands/refine-plan.md | Adds the refine-plan command spec and phased workflow requirements. |
| commands/gen-plan.md | Updates config loading docs to merged config-loader semantics and legacy fallback rules. |
| agents/bitlesson-selector.md | Updates agent doc to reference configured BitLesson path (normally .humanize/bitlesson.md). |
| README.md | Adds refine-plan to quick start workflow. |
      CODEX_EXEC_ARGS+=("$CODEX_AUTO_FLAG" "-C" "$CODEX_PROJECT_ROOT")

    - RAW_OUTPUT="$(printf '%s' "$PROMPT" | run_with_timeout "$SELECTOR_TIMEOUT" codex exec "${CODEX_EXEC_ARGS[@]}" -)" || CODEX_EXIT_CODE=$?
    + RAW_OUTPUT="$(printf '%s' "$PROMPT" | run_with_timeout "$SELECTOR_TIMEOUT" codex exec "${CODEX_EXEC_ARGS[@]}")" || CODEX_EXIT_CODE=$?
Restored to the origin/dev invocation form in d7d7bc6.
For the reviewed change, I rolled the selector invocation back to the prior stdin contract instead of keeping the altered form:
- `codex exec ... -`
- `claude --print ... -`
I kept the additional regression coverage from 6e55a6a so the Codex selector path now explicitly proves that it requires the trailing - and consumes non-empty prompt content from stdin.
Validated with tests/test-bitlesson-select-routing.sh.
    SUMMARY_PATH="$LOOP_DIR/round-0-summary.md"

    # Create the round-0 summary template with BitLesson Delta section
    if [[ "$SKIP_IMPL" != "true" ]]; then
        write_summary_template "$SUMMARY_PATH"
    fi

    if [[ "$SKIP_IMPL" == "true" ]]; then
        # Skip-impl mode: create a prompt for code review only
        cat > "$LOOP_DIR/round-0-prompt.md" << EOF
Fixed in 9c577bf.
I restored the round-0 summary scaffold creation during setup by calling write_summary_template "$SUMMARY_PATH" before branching into normal vs --skip-impl mode, so both paths now start with a valid round-0-summary.md.
That gives the stop hook an existing per-round summary target immediately, and it ensures the default ## BitLesson Delta section is present from the start.
Added regression coverage for both paths:
- `tests/robustness/test-setup-scripts-robustness.sh` checks `--skip-impl` creates `round-0-summary.md` with BitLesson Delta defaults
- `tests/test-unified-codex-config.sh` checks normal setup execution creates the same scaffold
Validated with:
- `bash -n scripts/setup-rlcr-loop.sh`
- `tests/robustness/test-setup-scripts-robustness.sh`
- `tests/test-unified-codex-config.sh`
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c7f9a02682
      "bitlesson_model": "haiku",
      "agent_teams": false,
    - "chinese_plan": false,
    + "alternative_plan_language": "",
Preserve legacy chinese_plan fallback behavior
Adding "alternative_plan_language": "" to the required default config means load_merged_config will always emit that key, so the HAS_ALT_LANGUAGE_KEY=false branch documented in commands/gen-plan.md (the only path that honors legacy chinese_plan: true) becomes unreachable. In practice, projects still using only chinese_plan are now forced into the "explicit empty alt language" path and lose translated-plan generation, which is a backward-compatibility regression.
    # variants from the dirty determination because local plugin state under
    # .humanize/ (.humanize/bitlesson.md, config.json, rlcr/) is intentionally
    # untracked.
    HUMANIZE_UNTRACKED_PATTERN='^\?\? \.humanize[-/]'
Restrict .humanize ignore regex to directory paths
The pattern ^\?\? \.humanize[-/] matches any untracked name beginning with .humanize-, including regular files (for example .humanize-old.txt), but this block is intended to ignore only runtime directories. Because these matches are removed from dirty-tree checks, real untracked files can now be silently excluded and the loop can continue on a non-clean worktree.
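One way to address this would be to require a path separator after the optional legacy suffix, relying on the fact that `git status --porcelain` prints untracked directories with a trailing `/`. The pattern and function name below are a sketch of that idea, not the fix that was adopted in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical filter: ignore only .humanize/ and .humanize-<suffix>/ paths.
# Regular files such as .humanize-old.txt have no "/" after the name and so
# still count as dirty.
drop_humanize_dirs() {
  grep -Ev '^\?\? \.humanize(-[^/]*)?/' || true
}

# Sample porcelain lines (illustrative paths only).
sample_untracked() {
  printf '%s\n' \
    '?? .humanize/' \
    '?? .humanize-old/' \
    '?? .humanize-old.txt' \
    '?? .humanizeconfig'
}
```

With this pattern, `sample_untracked | drop_humanize_dirs` leaves `.humanize-old.txt` and `.humanizeconfig` in the dirty list, whereas `\.humanize[-/]` would also drop `.humanize-old.txt`.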
Summary
This PR adds the new `/humanize:refine-plan` workflow for turning annotated `gen-plan` output into a cleaned implementation plan plus a QA ledger, and includes the supporting validation, docs, skills, and regression coverage needed to ship it.

1. refine-plan Command
- Adds the `/humanize:refine-plan` slash command in commands/refine-plan.md for planning-only refinement of annotated plans.
- Validates `CMT:`/`ENDCMT` comment structure, in-place vs new-file writes, and QA output paths with explicit exit codes.
- Adds the `humanize-refine-plan` skill in skills/humanize-refine-plan/SKILL.md and wires it into Codex/Kimi installation docs and installer scripts.
- Updates `gen-plan` and user-facing docs so the intended flow is now `gen-plan -> refine-plan -> start-rlcr-loop` when a plan has reviewer annotations.

2. Other Changes In This Branch
- Merges the `origin/dev` plan-understanding-quiz documentation into docs/usage.md without losing the new `refine-plan` workflow guidance.

Validation
- `bash -n scripts/validate-refine-plan-io.sh`
- `tests/test-refine-plan.sh`
- `tests/test-unified-codex-config.sh`