Improve gen-plan convergence and task-tag routing #35
Cherry-picked from SHAs: c283a92 9c0eef7 5156a05 002308a 8ba3a57 437567b 3c8caf5 4a57429 821f225
- Revert pair 5156a05+002308a included (net-zero; `hooks/lib/loop-common.sh` not in diff)
- Version files kept at `origin/main` values; version bump deferred per runbook 4.5
- Fix: added missing `append_task_tag_routing_note` call in implementation-phase continuation
…mode Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lag tests
- Flip Convergence Log and Codex Team Workflow presence→absence assertions
- Add Convergence Status presence test
- Add `--discussion`, `--direct`, and mutual-exclusion tests for validate script
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ression test
- Fix Task Breakdown intro wording to match template exactly
- Add Output File Convention section to `gen-plan.md` Plan Structure block
- Add regression test verifying byte-for-byte sync between extracted block and template
- Simplify awk pattern in test (idiomatic form, drop `cat` subprocess)
- `tests/test-gen-plan.sh`: remove redundant inner file guard, remove unused EXIT captures, add per-test comments for mode-flag tests
- `commands/gen-plan.md`: clarify Phase 0 AUTO_START variable naming, move priority note to Phase 0.5 where resolution actually occurs
…ove bitlesson.md Reverts version to 1.13.2 across plugin.json, marketplace.json, and README.md. Adds "Hard Constraint: No Coding During Plan Generation" section to gen-plan.md and clarifies auto-start fallback behavior. Removes bitlesson.md from repository. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
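The byte-for-byte sync regression test mentioned in the commit notes above could look roughly like the sketch below. It is not the test from `tests/test-gen-plan.sh`: the function names `extract_block` and `block_matches_template` and the marker-line convention are assumptions made for illustration. The idea is to extract a delimited block with a single `awk` call (no `cat` subprocess) and compare it to the template with `cmp`.

```shell
# Hypothetical helper: print the lines strictly between the first pair of
# marker lines. Markers are matched as whole lines; they should not contain
# backslashes, because awk -v interprets escape sequences.
extract_block() {
    start="$1"; end="$2"; file="$3"
    awk -v s="$start" -v e="$end" '
        $0 == s && !inside { inside = 1; next }   # opening marker: start copying
        $0 == e && inside  { exit }               # closing marker: stop
        inside             { print }
    ' "$file"
}

# Hypothetical check: succeeds only when the extracted block and the
# template file are byte-identical ("-" tells cmp to read stdin).
block_matches_template() {
    extract_block "$1" "$2" "$3" | cmp -s - "$4"
}
```

A test would then call something like `block_matches_template '<!-- BEGIN -->' '<!-- END -->' doc.md template.md` and fail on a non-zero status; the marker strings here are placeholders, not the ones used in the repository.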
Pull request overview
This PR introduces a task-tag routing system for the humanize plugin's RLCR loop, adds Claude-Codex deliberation workflow to plan generation, extends the validation script with new options, bumps the version to 1.14.1, and adds corresponding tests and documentation. The PR title and description significantly understate the scope of changes.
Changes:
- Adds task-tag routing (`coding`/`analyze`) to RLCR setup prompts, goal tracker, and stop hook follow-up prompts, plus a new test file (`test-task-tag-routing.sh`) and expanded tests in `test-gen-plan.sh`
- Extends `gen-plan.md` with a multi-phase Claude-Codex deliberation workflow (Codex first-pass analysis, iterative convergence loop, disagreement resolution), adds `--discussion`/`--direct`/`--auto-start-rlcr-if-converged` options to `validate-gen-plan-io.sh`, and updates `gen-plan-template.md` with new sections
- Bumps version from 1.14.0 to 1.14.1 across `plugin.json`, `marketplace.json`, and `README.md`
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `tests/test-task-tag-routing.sh` | New test file validating task-tag routing in round-0 prompts, goal tracker, and stop hook follow-up prompts |
| `tests/test-gen-plan.sh` | Adds tests for deliberation workflow (PT-5b), validate script new flags, and template sync verification |
| `tests/run-all-tests.sh` | Registers `test-task-tag-routing.sh` in the parallel test suite |
| `scripts/validate-gen-plan-io.sh` | Adds `--auto-start-rlcr-if-converged`, `--discussion`, `--direct` flags with mutual exclusion check |
| `scripts/setup-rlcr-loop.sh` | Adds task-tag routing section to round-0 prompt and updates goal tracker with Tag/Owner columns |
| `hooks/loop-codex-stop-hook.sh` | Adds `append_task_tag_routing_note()` function and calls it in follow-up prompts |
| `commands/gen-plan.md` | Major overhaul: adds Claude-Codex deliberation phases, convergence loop, auto-start logic, Chinese variant, task breakdown |
| `commands/start-rlcr-loop.md` | Updates documentation to reflect task-tag routing in RLCR workflow |
| `prompt-template/plan/gen-plan-template.md` | Adds Task Breakdown, Claude-Codex Deliberation, Pending User Decisions, and Output File Convention sections |
| `docs/usage.md` | Updates gen-plan command docs with new options and workflow step |
| `README.md` | Version bump to 1.14.1 |
| `.claude-plugin/plugin.json` | Version bump to 1.14.1 |
| `.claude-plugin/marketplace.json` | Version bump to 1.14.1 |
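
To make the task-tag routing idea concrete, here is a minimal sketch of appending a routing note to a prompt file. It is not the `append_task_tag_routing_note()` from `hooks/loop-codex-stop-hook.sh`: the function names, the note wording, and especially the tag-to-owner mapping (`coding` to Claude, `analyze` to Codex) are assumptions for illustration. Only the two tag names come from the overview above.

```shell
# Hypothetical mapping from task tag to owner.
# ASSUMPTION: the owners below are illustrative, not confirmed by the PR.
owner_for_tag() {
    case "$1" in
        coding)  echo "claude" ;;
        analyze) echo "codex" ;;
        *)       return 1 ;;   # unknown tag
    esac
}

# Hypothetical sketch: append a short routing reminder to a prompt file.
append_routing_note_sketch() {
    prompt_file="$1"
    {
        echo ""
        echo "## Task Tag Routing"
        for tag in coding analyze; do
            echo "- \`$tag\` tasks are owned by $(owner_for_tag "$tag")"
        done
    } >> "$prompt_file"
}
```

Keeping the mapping in one small function means a setup prompt, a goal tracker, and a follow-up prompt could all derive the same owner for a given tag.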

    ---
    current_round: 0
    max_iterations: 10
    codex_model: reviewer-model-placeholder

The codex_model value reviewer-model-placeholder deviates from the established convention in other test files, which consistently use concrete model names such as gpt-5.4 (e.g., test-pr-loop-stophook.sh:37, test-finalize-phase.sh:194, test-session-id.sh:123), o3-mini (e.g., test-concurrent-state-robustness.sh:42), or o3 (e.g., test-monitor-e2e-real.sh:69). Using a placeholder value could cause subtle differences in behavior if any downstream code parses or validates the model name. Consider using gpt-5.4 for consistency with the majority of test files.
Suggested change:

    - codex_model: reviewer-model-placeholder
    + codex_model: gpt-5.4

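The concern about downstream parsing is easier to see with a reader for this kind of state file. The sketch below is an assumption about how such a value might be read, not code from the plugin: `read_front_matter_value` is a hypothetical name, and it assumes the fields sit between two `---` lines as simple `key: value` pairs.

```shell
# Hypothetical helper: print the value of a "key: value" line found between
# the first two "---" lines of a file. Prints nothing if the key is absent.
read_front_matter_value() {
    key="$1"; file="$2"
    awk -v k="$key" '
        $0 == "---" { fences++; if (fences == 2) exit; next }
        fences == 1 && index($0, k ": ") == 1 {
            print substr($0, length(k) + 3)   # skip "key: "
            exit
        }
    ' "$file"
}
```

Any caller that validates the returned string against known model names would treat a placeholder differently from `gpt-5.4`, which is the kind of divergence the comment warns about.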

    # Validate mutually exclusive flags
    if [[ "$GEN_PLAN_MODE_DISCUSSION" == "true" && "$GEN_PLAN_MODE_DIRECT" == "true" ]]; then
        echo "Error: --discussion and --direct are mutually exclusive"

The error message uses "Error: --discussion and --direct are mutually exclusive" with a capitalized Error, while all other error messages in this file use uppercase "ERROR:" prefix (e.g., lines 40, 48, 70, 84, 89). This is inconsistent with the file's own conventions. Consider changing to "ERROR: --discussion and --direct are mutually exclusive" for consistency.
Suggested change:

    - echo "Error: --discussion and --direct are mutually exclusive"
    + echo "ERROR: --discussion and --direct are mutually exclusive"

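For context, a self-contained sketch of the mutual-exclusion check with the suggested `ERROR:` prefix is shown here. The variable names `GEN_PLAN_MODE_DISCUSSION` and `GEN_PLAN_MODE_DIRECT` come from the reviewed snippet; the function name `parse_mode_flags`, the argument loop, writing to stderr, and the return code are assumptions. The real `validate-gen-plan-io.sh` handles more options and may behave differently. The test uses POSIX `[ ... ]` rather than `[[ ... ]]` only to keep the sketch portable.

```shell
# Hypothetical sketch of flag parsing plus the mutual-exclusion check.
parse_mode_flags() {
    GEN_PLAN_MODE_DISCUSSION="false"
    GEN_PLAN_MODE_DIRECT="false"
    for arg in "$@"; do
        case "$arg" in
            --discussion) GEN_PLAN_MODE_DISCUSSION="true" ;;
            --direct)     GEN_PLAN_MODE_DIRECT="true" ;;
        esac
    done

    # Validate mutually exclusive flags
    if [ "$GEN_PLAN_MODE_DISCUSSION" = "true" ] && [ "$GEN_PLAN_MODE_DIRECT" = "true" ]; then
        echo "ERROR: --discussion and --direct are mutually exclusive" >&2
        return 1   # ASSUMPTION: the real script's exit code may differ
    fi
    return 0
}
```

Calling `parse_mode_flags --discussion --direct` prints the error and returns non-zero, while either flag alone (or neither) succeeds.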

    # Humanize

    - **Current Version: 1.14.0**
    + **Current Version: 1.14.1**

The PR description only mentions removing bitlesson.md, setting a codex_model placeholder, and bumping the version to 1.14.1. However, the actual changes are far more extensive: they add an entire task-tag routing system (new sections in setup-rlcr-loop.sh, loop-codex-stop-hook.sh), a Claude-Codex deliberation workflow with multiple new phases in gen-plan.md, new options (--discussion, --direct, --auto-start-rlcr-if-converged) in validate-gen-plan-io.sh, template updates, documentation changes, and new/expanded test files. The PR description should be updated to accurately reflect the scope of these changes.
Add Step 1.5 in Phase 6 to consolidate pending user decisions before the manual review gate takes effect. This step runs unconditionally and transfers Phase 3 QUESTIONS_FOR_USER and Phase 5 needs_user_decision items into the Pending User Decisions section, ensuring the auto-start gate in Phase 8 correctly blocks when unresolved questions exist.
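
One way to picture the gate described in this commit is a check that refuses auto-start while the Pending User Decisions section still lists items. The sketch below is an illustration under stated assumptions, not the logic in `gen-plan.md`: the function names are hypothetical, the section is assumed to be a `## Pending User Decisions` heading, and open items are assumed to be `-` or `*` bullets. It covers only the pending-decisions condition, not the convergence check.

```shell
# Hypothetical check: succeeds when the plan file has at least one bullet
# under the "## Pending User Decisions" heading.
has_pending_decisions() {
    awk '
        /^## / { inside = ($0 == "## Pending User Decisions"); next }
        inside && /^[[:space:]]*[-*] / { found = 1; exit }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Hypothetical gate: fail closed when the plan file is missing, and block
# auto-start while unresolved decisions remain.
can_auto_start() {
    [ -f "$1" ] || return 1
    ! has_pending_decisions "$1"
}
```

Because the consolidation step runs unconditionally, a gate like this only has to inspect one section instead of re-reading the outputs of earlier phases.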
Summary
- `gen-plan` flow, prompts, and validation behavior
- `tests/test-gen-plan.sh` and `tests/test-task-tag-routing.sh`
- Remove `bitlesson.md`
- Bump version to `1.14.1`

Validation
- `bash tests/test-task-tag-routing.sh`