
Improve gen-plan convergence and task-tag routing #35

Merged
SihaoLiu merged 18 commits into PolyArch:dev from ZenusZhang:feat/gen-plan-convergence
Mar 6, 2026

Conversation

Contributor

@ZenusZhang ZenusZhang commented Mar 6, 2026

Summary

  • refine gen-plan flow, prompts, and validation behavior
  • update RLCR/task-tag routing prompts and stop-hook coverage
  • add regression coverage in tests/test-gen-plan.sh and tests/test-task-tag-routing.sh
  • remove bitlesson.md
  • bump plugin/README version to 1.14.1

Validation

  • bash tests/test-task-tag-routing.sh

zenus and others added 12 commits March 6, 2026 11:29
Cherry-picked from SHAs: c283a92 9c0eef7 5156a05 002308a 8ba3a57 437567b 3c8caf5 4a57429 821f225
Revert pair 5156a05+002308a included (net-zero; hooks/lib/loop-common.sh not in diff)
Version files kept at origin/main values; version bump deferred per runbook 4.5
Fix: added missing append_task_tag_routing_note call in implementation-phase continuation
…mode

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lag tests

- Flip Convergence Log and Codex Team Workflow presence→absence assertions
- Add Convergence Status presence test
- Add --discussion, --direct, and mutual-exclusion tests for validate script

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ression test

- Fix Task Breakdown intro wording to match template exactly
- Add Output File Convention section to gen-plan.md Plan Structure block
- Add regression test verifying byte-for-byte sync between extracted block and template
- Simplify awk pattern in test (idiomatic form, drop cat subprocess)
- tests/test-gen-plan.sh: remove redundant inner file guard, remove
  unused EXIT captures, add per-test comments for mode-flag tests
- commands/gen-plan.md: clarify Phase 0 AUTO_START variable naming,
  move priority note to Phase 0.5 where resolution actually occurs
…ove bitlesson.md

Reverts version to 1.13.2 across plugin.json, marketplace.json, and
README.md. Adds "Hard Constraint: No Coding During Plan Generation"
section to gen-plan.md and clarifies auto-start fallback behavior.
Removes bitlesson.md from repository.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 6, 2026 11:44
@ZenusZhang ZenusZhang changed the title from "Remove bitlesson and bump version to 1.14.1" to "Improve gen-plan convergence and task-tag routing" Mar 6, 2026

Copilot AI left a comment

Pull request overview

This PR introduces a task-tag routing system for the humanize plugin's RLCR loop, adds a Claude-Codex deliberation workflow to plan generation, extends the validation script with new options, bumps the version to 1.14.1, and adds corresponding tests and documentation. The PR title and description significantly understate the scope of the changes.

Changes:

  • Adds task-tag routing (coding/analyze) to RLCR setup prompts, goal tracker, and stop hook follow-up prompts, plus a new test file (test-task-tag-routing.sh) and expanded tests in test-gen-plan.sh
  • Extends gen-plan.md with a multi-phase Claude-Codex deliberation workflow (Codex first-pass analysis, iterative convergence loop, disagreement resolution), adds --discussion/--direct/--auto-start-rlcr-if-converged options to validate-gen-plan-io.sh, and updates gen-plan-template.md with new sections
  • Bumps version from 1.14.0 to 1.14.1 across plugin.json, marketplace.json, and README.md
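
To make the task-tag routing item above more concrete, here is a minimal sketch of tag-based routing in bash. The tag names `coding` and `analyze` and the Tag/Owner columns come from this PR; the function names `route_task_tag` and `print_tracker_row`, and the mapping of `coding` to Claude and `analyze` to Codex, are assumptions made for illustration and may not match the repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map a task tag to the agent that should own the task.
# ASSUMPTION: coding -> claude, analyze -> codex (not confirmed by the PR text).
route_task_tag() {
    local tag="$1"
    case "$tag" in
        coding)  echo "claude" ;;
        analyze) echo "codex" ;;
        *)
            echo "ERROR: unknown task tag: $tag" >&2
            return 1
            ;;
    esac
}

# Hypothetical helper: print one goal-tracker row with Tag and Owner columns.
print_tracker_row() {
    local task="$1" tag="$2" owner
    # Assign separately from 'local' so the failure status is not masked.
    owner="$(route_task_tag "$tag")" || return 1
    printf '| %s | %s | %s |\n' "$task" "$tag" "$owner"
}

print_tracker_row "Implement parser" coding
print_tracker_row "Investigate flaky test" analyze
```

Rejecting unknown tags with a non-zero status, rather than silently defaulting to one owner, keeps a mistyped tag from being routed to the wrong agent.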

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/test-task-tag-routing.sh | New test file validating task-tag routing in round-0 prompts, goal tracker, and stop hook follow-up prompts |
| tests/test-gen-plan.sh | Adds tests for deliberation workflow (PT-5b), validate script new flags, and template sync verification |
| tests/run-all-tests.sh | Registers test-task-tag-routing.sh in the parallel test suite |
| scripts/validate-gen-plan-io.sh | Adds --auto-start-rlcr-if-converged, --discussion, --direct flags with mutual exclusion check |
| scripts/setup-rlcr-loop.sh | Adds task-tag routing section to round-0 prompt and updates goal tracker with Tag/Owner columns |
| hooks/loop-codex-stop-hook.sh | Adds append_task_tag_routing_note() function and calls it in follow-up prompts |
| commands/gen-plan.md | Major overhaul: adds Claude-Codex deliberation phases, convergence loop, auto-start logic, Chinese variant, task breakdown |
| commands/start-rlcr-loop.md | Updates documentation to reflect task-tag routing in RLCR workflow |
| prompt-template/plan/gen-plan-template.md | Adds Task Breakdown, Claude-Codex Deliberation, Pending User Decisions, and Output File Convention sections |
| docs/usage.md | Updates gen-plan command docs with new options and workflow step |
| README.md | Version bump to 1.14.1 |
| .claude-plugin/plugin.json | Version bump to 1.14.1 |
| .claude-plugin/marketplace.json | Version bump to 1.14.1 |

---
current_round: 0
max_iterations: 10
codex_model: reviewer-model-placeholder
Copilot AI Mar 6, 2026

The codex_model value reviewer-model-placeholder deviates from the established convention in other test files, which consistently use concrete model names such as gpt-5.4 (e.g., test-pr-loop-stophook.sh:37, test-finalize-phase.sh:194, test-session-id.sh:123), o3-mini (e.g., test-concurrent-state-robustness.sh:42), or o3 (e.g., test-monitor-e2e-real.sh:69). Using a placeholder value could cause subtle differences in behavior if any downstream code parses or validates the model name. Consider using gpt-5.4 for consistency with the majority of test files.

Suggested change
-codex_model: reviewer-model-placeholder
+codex_model: gpt-5.4
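
The concern about downstream code parsing the model name can be illustrated with a small reader for the `---`-delimited header shown in the snippet under review. This is only a sketch: `read_frontmatter_value` is a hypothetical helper, and the closing `---` line and the temporary file are assumptions, since the review shows only the first four lines of the state file.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the value of one "key: value" line found inside
# the first '---'-delimited block of a file. Prints nothing if the key is absent.
read_frontmatter_value() {
    local key="$1" file="$2"
    awk -v key="$key" '
        /^---[[:space:]]*$/ { fences++; next }   # count delimiter lines
        fences == 1 {
            prefix = key ":"
            if (index($0, prefix) == 1) {        # line starts with "key:"
                value = substr($0, length(prefix) + 1)
                sub(/^[[:space:]]+/, "", value)  # trim leading whitespace
                sub(/[[:space:]]+$/, "", value)  # trim trailing whitespace
                print value
                exit
            }
        }
        fences >= 2 { exit }                     # stop after the header block
    ' "$file"
}

# Usage with a throwaway state file (layout assumed from the snippet above).
state_file="$(mktemp)"
printf '%s\n' '---' 'current_round: 0' 'max_iterations: 10' \
    'codex_model: gpt-5.4' '---' > "$state_file"
read_frontmatter_value codex_model "$state_file"
rm -f "$state_file"
```

A reader like this returns whatever string is stored, so a placeholder would pass through unchanged unless a later step validates it against a list of known model names.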


# Validate mutually exclusive flags
if [[ "$GEN_PLAN_MODE_DISCUSSION" == "true" && "$GEN_PLAN_MODE_DIRECT" == "true" ]]; then
echo "Error: --discussion and --direct are mutually exclusive"
Copilot AI Mar 6, 2026

The error message "Error: --discussion and --direct are mutually exclusive" uses a capitalized "Error:" prefix, while all other error messages in this file use an uppercase "ERROR:" prefix (e.g., lines 40, 48, 70, 84, 89). This is inconsistent with the file's own conventions. Consider changing it to "ERROR: --discussion and --direct are mutually exclusive".

Suggested change
-echo "Error: --discussion and --direct are mutually exclusive"
+echo "ERROR: --discussion and --direct are mutually exclusive"
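
For context, the check under review follows a common parse-then-validate pattern. The sketch below reuses the variable names and the suggested message from the diff; the wrapper function `parse_mode_flags`, the return codes, the unknown-option branch, and writing errors to stderr are assumptions rather than a copy of validate-gen-plan-io.sh.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper around the flag handling discussed above.
# ASSUMPTION: status 1 for conflicting flags, status 2 for unknown options.
parse_mode_flags() {
    local GEN_PLAN_MODE_DISCUSSION="false"
    local GEN_PLAN_MODE_DIRECT="false"

    # First pass: record which mode flags were given.
    while [[ $# -gt 0 ]]; do
        case "$1" in
            --discussion) GEN_PLAN_MODE_DISCUSSION="true" ;;
            --direct)     GEN_PLAN_MODE_DIRECT="true" ;;
            *)
                echo "ERROR: unknown option: $1" >&2
                return 2
                ;;
        esac
        shift
    done

    # Validate mutually exclusive flags only after all arguments are parsed,
    # so the result does not depend on the order of the flags.
    if [[ "$GEN_PLAN_MODE_DISCUSSION" == "true" && "$GEN_PLAN_MODE_DIRECT" == "true" ]]; then
        echo "ERROR: --discussion and --direct are mutually exclusive" >&2
        return 1
    fi

    echo "discussion=$GEN_PLAN_MODE_DISCUSSION direct=$GEN_PLAN_MODE_DIRECT"
}

parse_mode_flags --discussion
parse_mode_flags --direct --discussion || echo "rejected with status $?"
```

Keeping every failure message on the same "ERROR:" prefix also lets tests match failures with a single pattern such as `grep -q '^ERROR:'`.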

Comment thread README.md Outdated
# Humanize

-**Current Version: 1.14.0**
+**Current Version: 1.14.1**
Copilot AI Mar 6, 2026

The PR description only mentions removing bitlesson.md, setting a codex_model placeholder, and bumping the version to 1.14.1. However, the actual changes are far more extensive: they add an entire task-tag routing system (new sections in setup-rlcr-loop.sh, loop-codex-stop-hook.sh), a Claude-Codex deliberation workflow with multiple new phases in gen-plan.md, new options (--discussion, --direct, --auto-start-rlcr-if-converged) in validate-gen-plan-io.sh, template updates, documentation changes, and new/expanded test files. The PR description should be updated to accurately reflect the scope of these changes.

zenus and others added 6 commits March 6, 2026 12:04
Add Step 1.5 in Phase 6 to consolidate pending user decisions
before the manual review gate takes effect. This step runs
unconditionally and transfers Phase 3 QUESTIONS_FOR_USER and
Phase 5 needs_user_decision items into the Pending User Decisions
section, ensuring the auto-start gate in Phase 8 correctly blocks
when unresolved questions exist.
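
The gating behavior described in this commit message could be checked along the following lines. This is a sketch under stated assumptions: the section title "Pending User Decisions" comes from the PR, but the `## ` heading level, the bullet-item format, the `converged` argument, and the function names `has_pending_decisions` and `should_auto_start` are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical check: succeed (status 0) if the plan file has at least one
# bullet item under a "## Pending User Decisions" heading.
has_pending_decisions() {
    local plan_file="$1"
    awk '
        /^## / { in_section = ($0 == "## Pending User Decisions"); next }
        in_section && /^[[:space:]]*[-*] / { found = 1; exit }
        END { exit(found ? 0 : 1) }
    ' "$plan_file"
}

# Hypothetical gate: allow auto-start only when the plan converged AND
# no user decisions are still pending.
should_auto_start() {
    local plan_file="$1" converged="$2"
    [[ "$converged" == "true" ]] || return 1
    if has_pending_decisions "$plan_file"; then
        return 1
    fi
    return 0
}

# Usage with a throwaway plan file containing one unresolved question.
plan_file="$(mktemp)"
printf '%s\n' '# Plan' '' '## Pending User Decisions' \
    '- Which database should the service use?' > "$plan_file"
if should_auto_start "$plan_file" true; then
    echo "auto-start allowed"
else
    echo "auto-start blocked"
fi
rm -f "$plan_file"
```

Because the gate reads only the consolidated section, it relies on the consolidation step having already run, which is consistent with the commit making that step unconditional.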
@SihaoLiu SihaoLiu changed the base branch from main to dev March 6, 2026 19:34
@SihaoLiu SihaoLiu merged commit 2cb2e92 into PolyArch:dev Mar 6, 2026
6 checks passed

3 participants