Add refine-plan annotated plan refinement workflow by ZenusZhang · Pull Request #46 · PolyArch/humanize

ZenusZhang · 2026-03-15T08:26:42Z

Summary

This PR adds the new /humanize:refine-plan workflow for turning annotated gen-plan output into a cleaned implementation plan plus a QA ledger, and includes the supporting validation, docs, skills, and regression coverage needed to ship it.

1. refine-plan Command

Adds a new /humanize:refine-plan slash command in commands/refine-plan.md for planning-only refinement of annotated plans.
Adds scripts/validate-refine-plan-io.sh to validate CLI arguments, required plan sections, CMT: / ENDCMT comment structure, in-place vs new-file writes, and QA output paths with explicit exit codes.
Adds prompt-template/plan/refine-plan-qa-template.md so each refinement run emits a structured QA ledger covering comment handling, answers, research, plan edits, remaining decisions, and metadata.
Adds the humanize-refine-plan skill in skills/humanize-refine-plan/SKILL.md and wires it into Codex/Kimi installation docs and installer scripts.
Extends gen-plan and user-facing docs so the intended flow is now gen-plan -> refine-plan -> start-rlcr-loop when a plan has reviewer annotations.
Adds broad regression coverage in tests/test-refine-plan.sh, plus supporting install/setup assertions for the new command and skill.

2. Other Changes In This Branch

Unifies Codex config handling across RLCR, PR loop, and ask-codex paths, removes stale reviewer-model config usage, and adds regression coverage in tests/test-unified-codex-config.sh.
Tightens hook and setup behavior around session-aware loop lookup, validator allowlists, stop-hook legacy compatibility, and agent-teams coverage.
Improves BitLesson selector and delta validation behavior, updates related notes/templates, and adds regression tests.
Updates statusline behavior to better match session-aware RLCR loop selection and refreshes README/usage documentation accordingly.
Merges the latest origin/dev plan-understanding-quiz documentation into docs/usage.md without losing the new refine-plan workflow guidance.

Validation

bash -n scripts/validate-refine-plan-io.sh
tests/test-refine-plan.sh
tests/test-unified-codex-config.sh

Implements a new planning-only command that processes annotated plans containing CMT:/ENDCMT comment blocks and produces a refined plan plus QA document.

Core features:
- Comment extraction with stable per-run IDs (CMT-1, CMT-2, ...)
- Classification into question/change_request/research_request
- Plan refinement preserving gen-plan schema structure
- QA document generation with comment ledger and outcomes
- Language variant support via alternative_plan_language config
- Atomic write transactions for in-place mode safety

Implementation includes:
- scripts/validate-refine-plan-io.sh with exit codes 0-7
- commands/refine-plan.md with multi-phase workflow (592 lines)
- prompt-template/plan/refine-plan-qa-template.md
- skills/humanize-refine-plan/SKILL.md with user-invocable: false
- tests/test-refine-plan.sh with comprehensive test coverage
- Updated install scripts and documentation for all platforms
- Version bump to 1.16.0

All 7 acceptance criteria met with full test coverage.
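The "atomic write" idea for in-place mode can be sketched as follows. This is a minimal illustration, not the code in commands/refine-plan.md: the helper name atomic_write and the temp-file prefix are hypothetical. It relies only on the general technique of writing to a temp file in the same directory as the target and then renaming it over the target.

```bash
# Hypothetical helper: replace the file "$1" with the contents of stdin.
# The temp file lives in the target's own directory so that mv is a rename
# on the same filesystem rather than a copy across filesystems.
atomic_write() {
    target=$1
    dir=$(dirname "$target")
    tmp=$(mktemp "$dir/.refine-plan.XXXXXX") || return 1
    if cat > "$tmp" && mv -f "$tmp" "$target"; then
        return 0
    fi
    # Something failed: drop the partial temp file, leave the target untouched.
    rm -f "$tmp"
    return 1
}
```

A caller would pipe the refined plan into it, for example `printf '%s\n' "$REFINED_PLAN" | atomic_write plan.md`, where REFINED_PLAN is a placeholder variable. Because the temp file is created next to the target, the directory itself must be writable, which is why a writability preflight matters for in-place mode.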

Round 1 fixes based on Codex review findings:

1. Replace grep-based CMT: counting in validate-refine-plan-io.sh with stateful awk scanner that correctly handles:
   - Markers inside HTML comments (ignored)
   - Markers inside fenced code blocks (ignored)
   - Empty CMT blocks (not counted as valid)
   - Unterminated blocks (exit code 3 with context)
   - Nested CMT blocks (exit code 3 with context)
2. Fix docs/install-for-kimi.md manual install path to strip user-invocable flag (matching installer behavior) and add humanize-refine-plan to uninstall section.
3. Expand tests/test-refine-plan.sh with AC-7 wiring assertions:
   - SKILL.md frontmatter verification
   - install-skill.sh wiring check
   - Install docs entries for claude/codex/kimi
   - Version parity across plugin.json/marketplace.json/README.md
4. Add regression tests for comment validation edge cases:
   - HTML-comment-only markers, code-fence markers, empty blocks
   - Unterminated and nested CMT blocks
   - Mixed valid/ignored/empty marker scenarios

All 178 tests pass.
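To make the grep-versus-scanner difference concrete, here is a rough sketch of a stateful scanner. It is an illustration under simplifying assumptions, not the scanner in validate-refine-plan-io.sh: it assumes CMT: and ENDCMT each begin a line, the function name count_cmt_blocks is hypothetical, and a stray ENDCMT with no opener is silently ignored.

```bash
# Illustrative sketch only. Reads a plan on stdin, prints the number of
# non-empty CMT blocks, and exits 3 on nested or unterminated blocks.
count_cmt_blocks() {
    # Build the fence marker from octal escapes so this sketch contains no
    # literal triple backtick.
    fence=$(printf '\140\140\140')
    awk -v fence="$fence" '
        { line = $0; sub(/^[[:space:]]+/, "", line) }

        # Fenced code: toggle on every fence line, skip everything inside.
        index(line, fence) == 1 { in_fence = !in_fence; next }
        in_fence { next }

        # HTML comments, single-line or multi-line, are skipped entirely.
        in_html { if ($0 ~ /-->/) in_html = 0; next }
        /<!--/  { if ($0 !~ /-->/) in_html = 1; next }

        /^CMT:/ {
            if (in_cmt) {
                printf "nested CMT: at line %d\n", NR > "/dev/stderr"
                bad = 1; exit
            }
            in_cmt = 1; start = NR
            body = $0; sub(/^CMT:[[:space:]]*/, "", body)
            has_text = (body != "")
            next
        }
        /^ENDCMT/ {
            if (in_cmt && has_text) count++
            in_cmt = 0
            next
        }
        in_cmt && /[^[:space:]]/ { has_text = 1 }

        END {
            if (bad) exit 3
            if (in_cmt) {
                printf "CMT: opened at line %d has no ENDCMT\n", start > "/dev/stderr"
                exit 3
            }
            print count + 0
        }
    '
}
```

A plain `grep -c 'CMT:'` would count the marker inside `<!-- CMT: ignored -->` and inside a code fence; the scanner above skips both because it tracks which region each line belongs to.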

Record lesson learned from Round 0/1: grep-based CMT: counting does not handle document structure (code fences, HTML comments, malformed blocks). Fix was to use a stateful awk scanner.

Round 2 fixes for remaining Codex review blockers:

1. Replace grep-based required-section check in validator with scan_sections awk function that reuses the same fenced-code and HTML-comment tracking as scan_cmt_blocks. Headings inside ignored regions no longer falsely satisfy exit-code 4 preflight.
2. Add context excerpt to missing-ENDCMT error messages so the user sees the opening-line content alongside line/column/heading.
3. Align docs/install-for-kimi.md manual install to copy all six runtime bundle directories (scripts, hooks, prompt-template, templates, config, agents) matching install-skill.sh behavior.
4. Fix docs/install-for-codex.md runtime bundle listing to include templates/, config/, and agents/ directories.
5. Add regression tests for sections inside fenced code (exit 4), sections inside HTML comments (exit 4), unterminated CMT context excerpt, and real sections outside ignored regions (pass).

All 182 tests pass.

Extended scope to cover both CMT counting (exit 3) and required section detection (exit 4) -- same grep-vs-scanner pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…, bitlesson notes

- [P2] Add output directory writability check to validate-refine-plan-io.sh
  Phase 1 now checks if output dir is writable, not just if it exists. Prevents late failures in Phase 7 when creating temp files. Updated docs and added regression test for 0555 directory.
- [P3] Fix RLCR statusline to resolve from repo root
  statusline.sh now uses git rev-parse --show-toplevel to find repo root before probing state directory, so RLCR status shows correctly when Claude starts from a subdirectory. Added regression test.
- [P3] Validate BitLesson notes for add/update actions
  bitlesson-validate-delta.sh now enforces Notes field presence and rejects placeholder text for Action: add/update. Added template and regression tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
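The difference between "exists" and "is writable" can be shown with a small preflight helper. This is a sketch under assumptions: the function name check_dir_writable, the messages, and the return value 1 are placeholders, since the validator's actual exit code for this case is not stated in the commit message.

```bash
# Hypothetical preflight helper; the real validator's messages and exit
# codes are not shown here, so the return value 1 is a placeholder.
check_dir_writable() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "error: directory does not exist: $dir" >&2
        return 1
    fi
    # -w tests write permission for the current user (root usually passes).
    if [ ! -w "$dir" ]; then
        echo "error: directory is not writable: $dir" >&2
        return 1
    fi
    return 0
}
```

Running the check in the first phase means a directory with mode 0555 is reported before any work is done, rather than when a temp file creation fails much later.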

…, agent_teams

- [P2] Ignore legacy .humanize-* dirs in dirty-tree checks
  setup-rlcr-loop.sh now filters all .humanize* patterns (not just .humanize/) so repositories with legacy folders like .humanize-old/ don't fail clean-tree gate.
- [P2] Load merged config for gen-plan command
  commands/gen-plan.md Phase 0.5 now uses load_merged_config from config-loader.sh instead of reading only .humanize/config.json. Settings like gen_plan_mode and alternative_plan_language now respect user/default layers.
- [P3] Honor agent_teams config key in loop bootstrap
  hooks/lib/loop-common.sh now loads agent_teams from merged config. setup-rlcr-loop.sh initializes AGENT_TEAMS from DEFAULT_AGENT_TEAMS. Project-level agent_teams setting now works without --agent-teams flag. Added regression test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- [P2] Validate new runtime bundle directories before mutating installs
  install-skill.sh validate_repo() now checks templates/, config/, agents/ directories before starting sync. Prevents partial updates when one of the new runtime bundle dirs is missing.
- [P3] Align gen-plan template config contract with config-loader
  gen-plan-template.md now documents actual merged config behavior: alternative_plan_language resolves from default/user/project layers, no auto-creation of .humanize/config.json. Matches commands/gen-plan.md Phase 0.5 implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tability, wrapped hooks

- [P1] Stop pre-creating round summaries with placeholder content
  setup-rlcr-loop.sh no longer pre-creates round-0-summary.md. The stop hook will properly block if Claude exits before writing a real summary.
- [P1] Drop extra `-` from Claude selector invocation
  bitlesson-select.sh removed trailing `-` from codex exec and claude --print invocations. Both commands now consume the assembled BitLesson prompt only from stdin, not as a literal argument.
- [P2] Check directory writability for in-place refine-plan output
  validate-refine-plan-io.sh now checks INPUT_DIR writability for in-place mode. Phase 7 always creates temp files in the same directory for atomic writes. Added regression test for read-only input directory.
- [P3] Match wrapped hook invocations in direct-execution block
  loop-bash-validator.sh extended regex to match wrapped invocations (env, command, VAR=value prefixes). Closes bypass with wrappers like `env FOO=1 bash hooks/...`.
  Added regression tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
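The wrapper bypass described above can be illustrated with a simplified pattern. This is not the regex from loop-bash-validator.sh: the function name is_wrapped_hook_exec and the pattern are assumptions that only cover the three prefix forms named in the commit message (env, command, and VAR=value assignments) in front of a bash or sh invocation of a path under hooks/.

```bash
# Hypothetical detector: returns 0 when the command line runs a script under
# hooks/ through bash or sh, optionally behind env/command/VAR=value prefixes.
is_wrapped_hook_exec() {
    printf '%s\n' "$1" | grep -Eq \
        '^[[:space:]]*((env|command)[[:space:]]+|[A-Za-z_][A-Za-z0-9_]*=[^[:space:]]*[[:space:]]+)*(bash|sh)[[:space:]]+([^[:space:]]*/)?hooks/'
}
```

With this sketch, `env FOO=1 bash hooks/loop-codex-stop-hook.sh` is detected just like the unwrapped `bash hooks/loop-codex-stop-hook.sh`, while a command that merely mentions the path, such as `echo hooks/loop-codex-stop-hook.sh`, is not.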

Add codex_model/codex_effort validation at config-load time in loop-common.sh to catch misconfigured project configs early instead of failing at runtime. Fix statusline to show session-unaware RLCR loops instead of hiding them when session-aware loops exist.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…wrapper regex

Validate codex_model against known Codex prefixes (gpt-/o[0-9]) in addition to shell-safety. Remove finalize-summary.md pre-creation that bypassed the missing-summary guard. Extend hook direct-execution guard to block timeout/nice/nohup/strace/ltrace wrappers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
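A two-step model check of this kind might look like the sketch below. The prefixes gpt- and o[0-9] come from the commit message; everything else is an assumption, including the function name is_valid_codex_model and the exact character set treated as shell-safe.

```bash
# Hypothetical validator: returns 0 only for shell-safe names that start
# with one of the known Codex prefixes.
is_valid_codex_model() {
    # Step 1, shell-safety (assumed character set): reject empty values and
    # anything outside letters, digits, dot, underscore, and hyphen.
    case $1 in
        ''|*[!A-Za-z0-9._-]*) return 1 ;;
    esac
    # Step 2, prefix check: gpt-... or o followed by a digit.
    case $1 in
        gpt-*|o[0-9]*) return 0 ;;
    esac
    return 1
}
```

Checking at config-load time means a value such as `haiku` (valid for bitlesson_model in the default config, but not a Codex prefix) is rejected before any Codex call is attempted.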

…bitlesson_required

Skip fenced code blocks and HTML comments when extracting BitLesson Delta sections. Broaden stop-hook .humanize filter to match legacy .humanize-* directories. Remove auto-enable of bitlesson_required for legacy loops that lack the field in state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ast mode

Tighten .humanize dirty-tree filter from \.humanize to \.humanize[-/] so unrelated files like .humanizeconfig are not silently dropped. Add zsh to blocked shell list in loop-bash-validator. Check project-local .claude/settings.json before global for fast mode in statusline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR extends the refine-plan workflow and related tooling/tests, while tightening loop behavior around legacy .humanize* paths, BitLesson Delta validation, and config-driven defaults (including agent_teams and translated-plan language).

Changes:

Added refine-plan command documentation, QA template, IO validator script, skill packaging, and a comprehensive test suite.
Improved RLCR/stop-hook robustness: legacy .humanize-* dirs ignored for git-dirty checks, BitLesson Delta parsing hardened, and stricter hook wrapper blocking.
Updated configuration semantics and docs: move to alternative_plan_language, validate Codex config values, and allow agent_teams default via merged config.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.


File	Description
tests/test-unified-codex-config.sh	Adds assertions for caller-presets and invalid config warnings/fallbacks.
tests/test-stop-hook-legacy-compat.sh	New regression tests for legacy `.humanize-*` handling and bitlesson_required behavior.
tests/test-refine-plan.sh	New comprehensive tests for refine-plan docs, validator exit codes, and reference behaviors.
tests/test-bitlesson-validate-delta.sh	New tests for BitLesson Delta validator rules (Notes required, ignore fenced/HTML blocks).
tests/test-bitlesson-select-routing.sh	Updates fixtures to use `.humanize/bitlesson.md` path.
tests/test-allowlist-validators.sh	Adds coverage for blocking wrapper-based direct hook execution.
tests/test-agent-teams.sh	Adds test asserting `agent_teams` can be enabled via project config.
tests/run-all-tests.sh	Wires in new test suites.
tests/robustness/test-setup-scripts-robustness.sh	Adds robustness check that `.humanizeconfig` still counts as dirty.
skills/humanize-refine-plan/SKILL.md	Introduces refine-plan flow skill definition and guarantees.
scripts/validate-refine-plan-io.sh	New validator for refine-plan inputs/outputs and mode flags.
scripts/setup-rlcr-loop.sh	Adds DEFAULT_AGENT_TEAMS usage; expands ignored untracked patterns; removes summary template creation block.
scripts/install-skill.sh	Adds refine-plan skill and ensures runtime bundle includes templates/config/agents.
scripts/bitlesson-validate-delta.sh	Extracts BitLesson Delta outside fences/HTML comments; enforces Notes for add/update.
scripts/bitlesson-select.sh	Adjusts provider invocation for Codex/Claude selector execution.
prompt-template/plan/refine-plan-qa-template.md	Adds QA ledger template for refine-plan outputs.
prompt-template/plan/gen-plan-template.md	Updates translated-variant/config semantics documentation.
prompt-template/block/bitlesson-delta-missing-notes.md	Adds block template for missing Notes in BitLesson Delta.
hooks/loop-codex-stop-hook.sh	Aligns dirty-check ignores with legacy variants; removes legacy bitlesson_required fallback; adjusts finalize scaffolding.
hooks/loop-bash-validator.sh	Strengthens detection of wrapped direct hook execution (env/timeout/nice/nohup/trace/etc.).
hooks/lib/loop-common.sh	Validates codex_model/effort; loads DEFAULT_AGENT_TEAMS from merged config.
docs/usage.md	Documents refine-plan flow and typical `gen-plan -> refine-plan -> start-rlcr-loop` usage.
docs/install-for-kimi.md	Adds refine-plan skill and runtime bundle components to Kimi install steps.
docs/install-for-codex.md	Adds refine-plan skill and runtime bundle components to Codex install steps.
docs/install-for-claude.md	Adds refine-plan command to post-install command list.
config/default_config.json	Replaces `chinese_plan` with `alternative_plan_language`.
commands/refine-plan.md	Adds the refine-plan command spec and phased workflow requirements.
commands/gen-plan.md	Updates config loading docs to merged config-loader semantics and legacy fallback rules.
agents/bitlesson-selector.md	Updates agent doc to reference configured BitLesson path (normally `.humanize/bitlesson.md`).
README.md	Adds refine-plan to quick start workflow.


ZenusZhang · 2026-03-15T08:45:23Z

    CODEX_EXEC_ARGS+=("$CODEX_AUTO_FLAG" "-C" "$CODEX_PROJECT_ROOT")

-    RAW_OUTPUT="$(printf '%s' "$PROMPT" | run_with_timeout "$SELECTOR_TIMEOUT" codex exec "${CODEX_EXEC_ARGS[@]}" -)" || CODEX_EXIT_CODE=$?
+    RAW_OUTPUT="$(printf '%s' "$PROMPT" | run_with_timeout "$SELECTOR_TIMEOUT" codex exec "${CODEX_EXEC_ARGS[@]}")" || CODEX_EXIT_CODE=$?


Restored to the origin/dev invocation form in d7d7bc6.

For the reviewed change, I rolled the selector invocation back to the prior stdin contract instead of keeping the altered form:

codex exec ... -

claude --print ... -

I kept the additional regression coverage from 6e55a6a so the Codex selector path now explicitly proves that it requires the trailing - and consumes non-empty prompt content from stdin.

Validated with tests/test-bitlesson-select-routing.sh.

ZenusZhang · 2026-03-15T08:55:48Z

 SUMMARY_PATH="$LOOP_DIR/round-0-summary.md"

-# Create the round-0 summary template with BitLesson Delta section
-if [[ "$SKIP_IMPL" != "true" ]]; then
-    write_summary_template "$SUMMARY_PATH"
-fi
-
 if [[ "$SKIP_IMPL" == "true" ]]; then
    # Skip-impl mode: create a prompt for code review only
    cat > "$LOOP_DIR/round-0-prompt.md" << EOF


Fixed in 9c577bf.

I restored the round-0 summary scaffold creation during setup by calling write_summary_template "$SUMMARY_PATH" before branching into normal vs --skip-impl mode, so both paths now start with a valid round-0-summary.md.

That gives the stop hook an existing per-round summary target immediately, and it ensures the default ## BitLesson Delta section is present from the start.

Added regression coverage for both paths:

tests/robustness/test-setup-scripts-robustness.sh checks --skip-impl creates round-0-summary.md with BitLesson Delta defaults

tests/test-unified-codex-config.sh checks normal setup execution creates the same scaffold

Validated with:

bash -n scripts/setup-rlcr-loop.sh

tests/robustness/test-setup-scripts-robustness.sh

tests/test-unified-codex-config.sh

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c7f9a02682


chatgpt-codex-connector · 2026-03-15T08:33:53Z

  "bitlesson_model": "haiku",
  "agent_teams": false,
-  "chinese_plan": false,
+  "alternative_plan_language": "",


Preserve legacy chinese_plan fallback behavior

Adding "alternative_plan_language": "" to the required default config means load_merged_config will always emit that key, so the HAS_ALT_LANGUAGE_KEY=false branch documented in commands/gen-plan.md (the only path that honors legacy chinese_plan: true) becomes unreachable. In practice, projects still using only chinese_plan are now forced into the "explicit empty alt language" path and lose translated-plan generation, which is a backward-compatibility regression.


chatgpt-codex-connector · 2026-03-15T08:33:53Z

+    # variants from the dirty determination because local plugin state under
+    # .humanize/ (.humanize/bitlesson.md, config.json, rlcr/) is intentionally
+    # untracked.
+    HUMANIZE_UNTRACKED_PATTERN='^\?\? \.humanize[-/]'


Restrict .humanize ignore regex to directory paths

The pattern ^\?\? \.humanize[-/] matches any untracked name beginning with .humanize-, including regular files (for example .humanize-old.txt), but this block is intended to ignore only runtime directories. Because these matches are removed from dirty-tree checks, real untracked files can now be silently excluded and the loop can continue on a non-clean worktree.
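The reviewer's point can be checked directly against sample `git status --porcelain` lines. The first pattern below is the one from the diff; the stricter variant is a hypothetical alternative written for this illustration, not necessarily the fix adopted in the PR, and both helper names are made up. The sketch assumes the pattern is applied as an extended regular expression.

```bash
# Pattern from the reviewed diff.
HUMANIZE_UNTRACKED_PATTERN='^\?\? \.humanize[-/]'
# Hypothetical stricter variant: require a directory separator after the
# optional "-suffix", so plain files such as .humanize-old.txt do not match.
HUMANIZE_DIR_ONLY_PATTERN='^\?\? \.humanize(-[^/]*)?/'

# Return 0 when a porcelain status line would be dropped from the dirty check.
is_ignored() {
    printf '%s\n' "$1" | grep -Eq "$HUMANIZE_UNTRACKED_PATTERN"
}
is_ignored_strict() {
    printf '%s\n' "$1" | grep -Eq "$HUMANIZE_DIR_ONLY_PATTERN"
}
```

Both patterns ignore `?? .humanize/` and `?? .humanize-old/` and keep `?? .humanizeconfig` dirty. They differ on `?? .humanize-old.txt`: the diff's pattern ignores it, while the stricter variant still reports it as an untracked file.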


zenus and others added 19 commits March 13, 2026 18:09

Add BitLesson entry for validator grep-vs-scanner issue

bd958c5

Record lesson learned from Round 0/1: grep-based CMT: counting does not handle document structure (code fences, HTML comments, malformed blocks). Fix was to use a stateful awk scanner.

Update BitLesson entry for validator grep-vs-scanner pattern

09bea1d

Extended scope to cover both CMT counting (exit 3) and required section detection (exit 4) -- same grep-vs-scanner pattern.

Add Kimi install-doc regression assertions to PT-11

8b4607c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Tighten PT-11 Kimi install-doc assertions to exact contract

e3de875

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: finish bitlesson path migration cleanup

910aad6

chore: bump version to 1.15.0 and align alt plan language

b0e9780

Align statusline with session-aware RLCR lookup

1bacbec

docs: add refine-plan to README and usage

c7f9a02

Copilot AI review requested due to automatic review settings March 15, 2026 08:26

Copilot started reviewing on behalf of ZenusZhang March 15, 2026 08:27 View session

Copilot AI reviewed Mar 15, 2026


chatgpt-codex-connector Bot reviewed Mar 15, 2026


Merge origin/dev into refine-plan

86bc7d9

ZenusZhang changed the title ~~Align statusline RLCR lookup and document refine-plan flow~~ Add refine-plan annotated plan refinement workflow Mar 15, 2026

zenus added 4 commits March 15, 2026 16:44

Fix bitlesson codex stdin handling

6e55a6a

Restore bitlesson selector stdin invocation

d7d7bc6

Fix zsh parsing in bitlesson delta validator

7359345

Create round-0 summary scaffold during RLCR setup

9c577bf

ZenusZhang merged commit 4afc04a into PolyArch:dev Mar 16, 2026
4 checks passed

ZenusZhang merged 24 commits into PolyArch:dev from ZenusZhang:refine-plan
