Improve gen-plan convergence and task-tag routing (#35)

ZenusZhang · zenus · claude · web-flow · commit 2cb2e929bcba · 2026-03-06T11:45:11.000-08:00
* feat: extract F2 gen-plan-convergence changes

  Cherry-picked from SHAs: c283a92 9c0eef7 5156a05 002308a 8ba3a57 437567b 3c8caf5 4a57429 821f225
  Revert pair 5156a05+002308a included (net-zero; hooks/lib/loop-common.sh not in diff)
  Version files kept at origin/main values; version bump deferred per runbook 4.5
  Fix: added missing append_task_tag_routing_note call in implementation-phase continuation

* docs: add F2 gen-plan-convergence feature analysis

* refactor(gen-plan): remove redundant sections, add discussion/direct mode

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: delete f2 analysis docs from branch

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test(gen-plan): flip section assertions to absence checks, add mode flag tests

  - Flip Convergence Log and Codex Team Workflow presence→absence assertions
  - Add Convergence Status presence test
  - Add --discussion, --direct, and mutual-exclusion tests for validate script

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: initialize bitlesson.md for project knowledge base

* fix(gen-plan): sync Plan Structure block with template, add AC-12 regression test

  - Fix Task Breakdown intro wording to match template exactly
  - Add Output File Convention section to gen-plan.md Plan Structure block
  - Add regression test verifying byte-for-byte sync between extracted block and template
  - Simplify awk pattern in test (idiomatic form, drop cat subprocess)

* refactor: simplify code-simplifier improvements from finalize phase

  - tests/test-gen-plan.sh: remove redundant inner file guard, remove unused EXIT captures, add per-test comments for mode-flag tests
  - commands/gen-plan.md: clarify Phase 0 AUTO_START variable naming, move priority note to Phase 0.5 where resolution actually occurs

* chore: revert version to 1.13.2, add gen-plan hard constraint and remove bitlesson.md

  Reverts version to 1.13.2 across plugin.json, marketplace.json, and README.md.
  Adds "Hard Constraint: No Coding During Plan Generation" section to gen-plan.md and clarifies auto-start fallback behavior.
  Removes bitlesson.md from repository.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: restore bitlesson.md (auto-initialized by rlcr setup)

* fix(gen-plan): gate auto-start to discussion mode

* Remove bitlesson and bump version to 1.14.1

* Skip stophook CI test on forks

* Skip stophook test in GitHub Actions

* Enable Claude review on fork PRs

* Restore Claude review workflow from main

* Fix silent drop of user questions on auto-start path (v1.14.2)

  Add Step 1.5 in Phase 6 to consolidate pending user decisions before the manual review gate takes effect. This step runs unconditionally and transfers Phase 3 QUESTIONS_FOR_USER and Phase 5 needs_user_decision items into the Pending User Decisions section, ensuring the auto-start gate in Phase 8 correctly blocks when unresolved questions exist.

* Bump version to 1.15.0

---------

Co-authored-by: zenus <q18003877513@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Sihao Liu <sihao@cs.ucla.edu>
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -8,7 +8,7 @@
       "name": "humanize",
       "source": "./",
       "description": "Humanize - An iterative development plugin that uses Codex to review Claude's work. Creates a feedback loop where Claude implements plans and Codex independently reviews progress, ensuring quality through continuous refinement.",
-      "version": "1.14.0"
+      "version": "1.15.0"
     }
   ]
 }
diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
   "name": "humanize",
   "description": "Humanize - An iterative development plugin that uses Codex to review Claude's work. Creates a feedback loop where Claude implements plans and Codex independently reviews progress, ensuring quality through continuous refinement.",
-  "version": "1.14.0",
+  "version": "1.15.0",
   "author": {
     "name": "humania-org"
   },
diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
 # Humanize
 
-**Current Version: 1.14.0**
+**Current Version: 1.15.0**
 
 > Derived from the [GAAC (GitHub-as-a-Context)](https://github.com/SihaoLiu/gaac) project.
 
diff --git a/commands/gen-plan.md b/commands/gen-plan.md
diff --git a/commands/start-rlcr-loop.md b/commands/start-rlcr-loop.md
@@ -67,7 +67,9 @@ If the pre-check passed (or was skipped), execute the setup script to initialize
 
 This command starts an iterative development loop where:
 
-1. You work on the implementation plan provided
+1. You execute the implementation plan with task-tag routing
+   - `coding` tasks: Claude executes directly
+   - `analyze` tasks: execute via `/humanize:ask-codex`
 2. Write a summary of your work to the specified summary file
 3. When you try to exit, Codex reviews your summary
 4. If Codex finds issues, you receive feedback and continue
@@ -86,9 +88,11 @@ This loop uses a **Goal Tracker** to prevent goal drift across iterations:
 
 ### Key Features
 1. **Acceptance Criteria**: Each task maps to a specific AC - nothing can be "forgotten"
-2. **Plan Evolution Log**: If you discover the plan needs changes, document the change with justification
-3. **Explicit Deferrals**: Deferred tasks require strong justification and impact analysis
-4. **Full Alignment Checks**: At configurable intervals (default every 5 rounds: rounds 4, 9, 14, etc.), Codex conducts a comprehensive goal alignment audit. Use `--full-review-round N` to customize (min: 2)
+2. **Task Tag Routing**: Every task should carry `coding` or `analyze` tag from plan generation
+   - `coding -> Claude`, `analyze -> Codex`
+3. **Plan Evolution Log**: If you discover the plan needs changes, document the change with justification
+4. **Explicit Deferrals**: Deferred tasks require strong justification and impact analysis
+5. **Full Alignment Checks**: At configurable intervals (default every 5 rounds: rounds 4, 9, 14, etc.), Codex conducts a comprehensive goal alignment audit. Use `--full-review-round N` to customize (min: 2)
 
 ### How to Use
 1. **Round 0**: Initialize the Goal Tracker with Ultimate Goal and Acceptance Criteria
@@ -113,7 +117,7 @@ This loop uses a **Goal Tracker** to prevent goal drift across iterations:
 
 The RLCR loop has two phases within the active loop:
 
-1. **Implementation Phase**: Work on the plan, Codex reviews your summary
+1. **Implementation Phase**: Work by task tags (`coding -> Claude`, `analyze -> /humanize:ask-codex`), then Codex reviews your summary
 2. **Review Phase**: After COMPLETE, `codex review` checks code quality with `[P0-9]` severity markers
 
 The `--base-branch` option specifies the base branch for code review comparison. If not provided, it auto-detects from: remote default > local main > local master.
diff --git a/docs/usage.md b/docs/usage.md
@@ -56,11 +56,16 @@ OPTIONS:
 ### gen-plan
 
 ```
-/humanize:gen-plan --input <path/to/draft.md> --output <path/to/plan.md>
+/humanize:gen-plan --input <path/to/draft.md> --output <path/to/plan.md> [OPTIONS]
 
 OPTIONS:
   --input   Path to the input draft file (required)
   --output  Path to the output plan file (required)
+  --auto-start-rlcr-if-converged
+             Start the RLCR loop automatically when the plan is converged
+             (discussion mode only; ignored in --direct)
+  --discussion  Use discussion mode (iterative Claude/Codex convergence rounds)
+  --direct      Use direct mode (skip convergence rounds, proceed immediately to plan)
   -h, --help             Show help message
 
 The gen-plan command transforms rough draft documents into structured implementation plans.
@@ -71,6 +76,7 @@ Workflow:
 3. Analyzes draft for clarity, consistency, completeness, and functionality
 4. Engages user to resolve any issues found
 5. Generates a structured plan.md with acceptance criteria
+6. Optionally starts `/humanize:start-rlcr-loop` if `--auto-start-rlcr-if-converged` conditions are met
 ```
 
 ### start-pr-loop
diff --git a/hooks/loop-codex-stop-hook.sh b/hooks/loop-codex-stop-hook.sh
@@ -1163,6 +1163,22 @@ Focus on the code changes made during this RLCR session. Focus more on changes b
     exit 0
 }
 
+# Append task tag routing reminder to follow-up prompts.
+# Arguments: $1=prompt_file_path
+append_task_tag_routing_note() {
+    local prompt_file="$1"
+
+    cat >> "$prompt_file" << 'ROUTING_EOF'
+
+## Task Tag Routing Reminder
+
+Follow the plan's per-task routing tags strictly:
+- `coding` task -> Claude executes directly
+- `analyze` task -> execute via `/humanize:ask-codex`, then integrate the result
+- Keep Goal Tracker Active Tasks columns `Tag` and `Owner` aligned with execution
+ROUTING_EOF
+}
+
 # Continue review loop when issues are found
 # Arguments: $1=round_number, $2=review_content
 continue_review_loop_with_issues() {
@@ -1198,6 +1214,7 @@ You are in the **Review Phase** of the RLCR loop. Codex has performed a code rev
     load_and_render_safe "$TEMPLATE_DIR" "claude/review-phase-prompt.md" "$fallback" \
         "REVIEW_CONTENT=$review_content" \
         "SUMMARY_FILE=$next_summary_file" > "$next_prompt_file"
+    append_task_tag_routing_note "$next_prompt_file"
 
     jq -n \
         --arg reason "$(cat "$next_prompt_file")" \
@@ -1625,6 +1642,7 @@ FOOTER_FALLBACK="## Before Exiting
 Commit your changes and write summary to {{NEXT_SUMMARY_FILE}}"
 load_and_render_safe "$TEMPLATE_DIR" "claude/next-round-footer.md" "$FOOTER_FALLBACK" \
     "NEXT_SUMMARY_FILE=$NEXT_SUMMARY_FILE" >> "$NEXT_PROMPT_FILE"
+append_task_tag_routing_note "$NEXT_PROMPT_FILE"
 
 # Add push instruction only if push_every_round is true
 if [[ "$PUSH_EVERY_ROUND" == "true" ]]; then
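Reviewer note: the effect of the new `append_task_tag_routing_note` call can be tried in isolation. The sketch below copies the function body from the hunk above and applies it to a temporary file; the placeholder prompt text and the `rendered` variable are illustrative assumptions, not part of the hook.

```shell
#!/usr/bin/env bash
# Function body copied from the hooks/loop-codex-stop-hook.sh hunk above.
# Arguments: $1=prompt_file_path
append_task_tag_routing_note() {
    local prompt_file="$1"

    cat >> "$prompt_file" << 'ROUTING_EOF'

## Task Tag Routing Reminder

Follow the plan's per-task routing tags strictly:
- `coding` task -> Claude executes directly
- `analyze` task -> execute via `/humanize:ask-codex`, then integrate the result
- Keep Goal Tracker Active Tasks columns `Tag` and `Owner` aligned with execution
ROUTING_EOF
}

# Demo only: a throwaway prompt file with placeholder content (assumption).
prompt_file="$(mktemp)"
printf '# Round prompt (placeholder content)\n' > "$prompt_file"

# The note is appended, so the existing prompt text is preserved.
append_task_tag_routing_note "$prompt_file"

rendered="$(cat "$prompt_file")"
rm -f "$prompt_file"
printf '%s\n' "$rendered"
```

Because the function appends with `>>`, it has to run after the prompt file is rendered, which matches where both call sites sit in the diff.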
diff --git a/prompt-template/plan/gen-plan-template.md b/prompt-template/plan/gen-plan-template.md
@@ -66,9 +66,55 @@ Example: "The implementation includes core feature X with basic validation"
 
 <Describe relative dependencies between components, not time estimates>
 
+## Task Breakdown
+
+Each task must include exactly one routing tag:
+- `coding`: implemented by Claude
+- `analyze`: executed via Codex (`/humanize:ask-codex`)
+
+| Task ID | Description | Target AC | Tag (`coding`/`analyze`) | Depends On |
+|---------|-------------|-----------|----------------------------|------------|
+| task1 | <...> | AC-1 | coding | - |
+| task2 | <...> | AC-2 | analyze | task1 |
+
+## Claude-Codex Deliberation
+
+### Agreements
+- <Point both sides agree on>
+
+### Resolved Disagreements
+- <Topic>: Claude vs Codex summary, chosen resolution, and rationale
+
+### Convergence Status
+- Final Status: `converged` or `partially_converged`
+
+## Pending User Decisions
+
+- DEC-1: <Decision topic>
+  - Claude Position: <...>
+  - Codex Position: <...>
+  - Tradeoff Summary: <...>
+  - Decision Status: `PENDING` or `<User's final decision>`
+
 ## Implementation Notes
 
 ### Code Style Requirements
 - Implementation code and comments must NOT contain plan-specific terminology such as "AC-", "Milestone", "Step", "Phase", or similar workflow markers
 - These terms are for plan documentation only, not for the resulting codebase
 - Use descriptive, domain-appropriate naming in code instead
+
+## Output File Convention
+
+This template is used to produce the main output file (e.g., `plan.md`).
+
+### Chinese Variant (`_zh` file)
+
+When `chinese_plan=true` is set in `.humanize/config.json`, a `_zh` variant of the output file is also written after the main file. The `_zh` filename is constructed by inserting `_zh` immediately before the file extension:
+
+- `plan.md` becomes `plan_zh.md`
+- `docs/my-plan.md` becomes `docs/my-plan_zh.md`
+- `output` (no extension) becomes `output_zh`
+
+The `_zh` file contains a full Chinese translation of the English plan. All identifiers (`AC-*`, task IDs, file paths, API names, command flags) remain unchanged, as they are language-neutral.
+
+When `chinese_plan=false` (the default), or when `.humanize/config.json` does not exist, or when the `chinese_plan` field is absent, the `_zh` file is NOT written. A missing config file is not an error.
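Reviewer note: the `_zh` naming rule in the template can be sketched as a small shell helper. This is a minimal illustration under stated assumptions: the function name `zh_variant` is hypothetical, and treating names that start with a dot as having no extension is a choice made here, since the template does not cover that case.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): derive the `_zh` variant
# of an output path by inserting `_zh` immediately before the extension.
zh_variant() {
    local path="$1"
    local dir="" base="$path"

    # Split off the directory part so dots in directory names are ignored.
    if [[ "$path" == */* ]]; then
        dir="${path%/*}/"
        base="${path##*/}"
    fi

    if [[ "$base" == *.* && "$base" != .* ]]; then
        # Has an extension: insert _zh before the last dot.
        printf '%s%s_zh.%s\n' "$dir" "${base%.*}" "${base##*.}"
    else
        # No extension (or a dotfile, by assumption): append _zh.
        printf '%s%s_zh\n' "$dir" "$base"
    fi
}

zh_variant "plan.md"          # plan_zh.md
zh_variant "docs/my-plan.md"  # docs/my-plan_zh.md
zh_variant "output"           # output_zh
```

The three calls reproduce the three examples listed in the template.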
diff --git a/scripts/setup-rlcr-loop.sh b/scripts/setup-rlcr-loop.sh
@@ -99,7 +99,9 @@ DESCRIPTION:
   3. Has two phases: Implementation Phase and Review Phase
 
   The flow:
-  1. Claude works on the plan (Implementation Phase)
+  1. Claude executes plan tasks with tag-based routing (Implementation Phase)
+     - `coding` tasks: Claude implements directly
+     - `analyze` tasks: Claude delegates execution via `/humanize:ask-codex`
   2. Claude writes a summary to round-N-summary.md
   3. On exit attempt, Codex reviews the summary
   4. If Codex finds issues, it blocks exit and sends feedback
@@ -929,10 +931,10 @@ cat >> "$GOAL_TRACKER_FILE" << 'GOAL_TRACKER_EOF'
 | 0 | Initial plan | - | - |
 
 #### Active Tasks
-<!-- Map each task to its target Acceptance Criterion -->
-| Task | Target AC | Status | Notes |
-|------|-----------|--------|-------|
-| [To be populated by Claude based on plan] | - | pending | - |
+<!-- Map each task to its target Acceptance Criterion and routing tag -->
+| Task | Target AC | Status | Tag | Owner | Notes |
+|------|-----------|--------|-----|-------|-------|
+| [To be populated by Claude based on plan] | - | pending | coding or analyze | claude or codex | - |
 
 ### Completed and Verified
 <!-- Only move tasks here after Codex verification -->
@@ -1007,7 +1009,7 @@ Before starting implementation, you MUST initialize the Goal Tracker:
 1. Read @$GOAL_TRACKER_FILE
 2. If the "Ultimate Goal" section says "[To be extracted...]", extract a clear goal statement from the plan
 3. If the "Acceptance Criteria" section says "[To be defined...]", define 3-7 specific, testable criteria
-4. Populate the "Active Tasks" table with tasks from the plan, mapping each to an AC
+4. Populate the "Active Tasks" table with tasks from the plan, mapping each to an AC and filling Tag/Owner
 5. Write the updated goal-tracker.md
 
 **IMPORTANT**: The IMMUTABLE SECTION can only be modified in Round 0. After this round, it becomes read-only.
@@ -1019,6 +1021,15 @@ Before starting implementation, you MUST initialize the Goal Tracker:
 For all tasks that need to be completed, please use the Task system (TaskCreate, TaskUpdate, TaskList) to track each item in order of importance.
 You are strictly prohibited from only addressing the most important issues - you MUST create Tasks for ALL discovered issues and attempt to resolve each one.
 
+## Task Tag Routing (MUST FOLLOW)
+
+Each task must have one routing tag from the plan: \`coding\` or \`analyze\`.
+
+- Tag \`coding\`: Claude executes the task directly.
+- Tag \`analyze\`: Claude must execute via \`/humanize:ask-codex\`, then integrate Codex output.
+- Keep Goal Tracker "Active Tasks" columns **Tag** and **Owner** aligned with execution (\`coding -> claude\`, \`analyze -> codex\`).
+- If a task is missing a valid tag, do not guess silently; document it in Plan Evolution Log and block completion until clarified.
+
 EOF
 
 # Append plan content directly (avoids command substitution size limits for large files)
@@ -1057,6 +1068,7 @@ cat >> "$LOOP_DIR/round-0-prompt.md" << EOF
 Throughout your work, you MUST maintain the Goal Tracker:
 
 1. **Before starting a task**: Mark it as "in_progress" in Active Tasks
+   - Confirm Tag/Owner routing is correct before execution
 2. **After completing a task**: Move it to "Completed and Verified" with evidence (but mark as "pending verification")
 3. **If you discover the plan has errors**:
    - Do NOT silently change direction
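Reviewer note: the Tag/Owner alignment rule in the round-0 prompt (`coding -> claude`, `analyze -> codex`, no silent guessing for a missing tag) could be expressed as a lookup like the one below. The helper name `owner_for_tag` and its error text are hypothetical; the commit enforces this rule through prompt instructions, not through a script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): map a routing tag to the
# Goal Tracker Owner value, rejecting anything that is not a valid tag.
owner_for_tag() {
    case "${1:-}" in
        coding)  echo "claude" ;;
        analyze) echo "codex" ;;
        *)
            # Mirror the prompt's rule: do not guess, surface the problem.
            echo "ERROR: missing or invalid routing tag: '${1:-}'" >&2
            return 1
            ;;
    esac
}

owner_for_tag coding    # claude
owner_for_tag analyze   # codex
owner_for_tag review 2>/dev/null || echo "tag needs clarification"
```

Returning a non-zero status for unknown tags keeps the failure visible to the caller, which is the scripted analogue of blocking completion until the tag is clarified.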
diff --git a/scripts/validate-gen-plan-io.sh b/scripts/validate-gen-plan-io.sh
@@ -14,17 +14,23 @@
 set -e
 
 usage() {
-    echo "Usage: $0 --input <path/to/draft.md> --output <path/to/plan.md>"
+    echo "Usage: $0 --input <path/to/draft.md> --output <path/to/plan.md> [--auto-start-rlcr-if-converged] [--discussion|--direct]"
     echo ""
     echo "Options:"
     echo "  --input   Path to the input draft file (required)"
     echo "  --output  Path to the output plan file (required)"
+    echo "  --auto-start-rlcr-if-converged  Enable direct RLCR start after converged planning (discussion mode only)"
+    echo "  --discussion  Use discussion mode (iterative Claude/Codex convergence rounds)"
+    echo "  --direct      Use direct mode (skip convergence rounds, proceed immediately to plan)"
     echo "  -h, --help  Show this help message"
     exit 6
 }
 
 INPUT_FILE=""
 OUTPUT_FILE=""
+AUTO_START_RLCR_IF_CONVERGED="false"
+GEN_PLAN_MODE_DISCUSSION="false"
+GEN_PLAN_MODE_DIRECT="false"
 
 # Parse arguments
 while [[ $# -gt 0 ]]; do
@@ -45,6 +51,18 @@ while [[ $# -gt 0 ]]; do
             OUTPUT_FILE="$2"
             shift 2
             ;;
+        --auto-start-rlcr-if-converged)
+            AUTO_START_RLCR_IF_CONVERGED="true"
+            shift
+            ;;
+        --discussion)
+            GEN_PLAN_MODE_DISCUSSION="true"
+            shift
+            ;;
+        --direct)
+            GEN_PLAN_MODE_DIRECT="true"
+            shift
+            ;;
         -h|--help)
             usage
             ;;
@@ -55,6 +73,12 @@ while [[ $# -gt 0 ]]; do
     esac
 done
 
+# Validate mutually exclusive flags
+if [[ "$GEN_PLAN_MODE_DISCUSSION" == "true" && "$GEN_PLAN_MODE_DIRECT" == "true" ]]; then
+    echo "Error: --discussion and --direct are mutually exclusive"
+    exit 6
+fi
+
 # Validate required arguments
 if [[ -z "$INPUT_FILE" ]]; then
     echo "ERROR: --input is required"
@@ -66,6 +90,11 @@ if [[ -z "$OUTPUT_FILE" ]]; then
     usage
 fi
 
+# Note on auto-start behavior in direct mode
+if [[ "$GEN_PLAN_MODE_DIRECT" == "true" && "$AUTO_START_RLCR_IF_CONVERGED" == "true" ]]; then
+    echo "NOTE: --auto-start-rlcr-if-converged only triggers in --discussion mode; in --direct mode the plan is not considered converged and auto-start will be skipped."
+fi
+
 # Get absolute paths
 INPUT_FILE=$(realpath -m "$INPUT_FILE" 2>/dev/null || echo "$INPUT_FILE")
 OUTPUT_FILE=$(realpath -m "$OUTPUT_FILE" 2>/dev/null || echo "$OUTPUT_FILE")
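Reviewer note: the validation script only parses the flags and prints a note; the actual auto-start decision is described in the commit message and docs (discussion mode only, plan converged, no unresolved user questions). A rough sketch of that gate follows. The function `should_auto_start`, its argument order, and the assumption that `partially_converged` does not qualify are all illustrative; the real gate lives in `commands/gen-plan.md`, whose diff is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical gate (not part of the commit): succeed only when every
# documented auto-start condition holds.
# Arguments: $1=mode (discussion|direct), $2=auto-start flag (true|false),
#            $3=convergence status, $4=number of PENDING user decisions
should_auto_start() {
    local mode="$1" auto_start="$2" status="$3" pending="$4"

    [[ "$auto_start" == "true" ]] || return 1   # flag must be requested
    [[ "$mode" == "discussion" ]] || return 1   # ignored in --direct
    [[ "$status" == "converged" ]] || return 1  # assumption: partial convergence blocks
    [[ "$pending" -eq 0 ]] || return 1          # unresolved questions block
    return 0
}

if should_auto_start discussion true converged 0; then
    echo "auto-start eligible"
fi
if ! should_auto_start direct true converged 0; then
    echo "auto-start skipped in direct mode"
fi
```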
diff --git a/tests/run-all-tests.sh b/tests/run-all-tests.sh
@@ -51,6 +51,7 @@ TEST_SUITES=(
     "test-monitor-e2e-deletion.sh"
     "test-monitor-e2e-sigint.sh"
     "test-gen-plan.sh"
+    "test-task-tag-routing.sh"
     "test-pr-loop-1-scripts.sh"
     "test-pr-loop-2-hooks.sh"
     "test-pr-loop-3-stophook.sh"
diff --git a/tests/test-gen-plan.sh b/tests/test-gen-plan.sh
diff --git a/tests/test-pr-loop-3-stophook.sh b/tests/test-pr-loop-3-stophook.sh
diff --git a/tests/test-task-tag-routing.sh b/tests/test-task-tag-routing.sh
