Add Plan Understanding Quiz pre-flight check for start-rlcr-loop

SihaoLiu · SihaoLiu · commit 8e90877086d1 · 2026-03-12T21:08:01.000-07:00
Add an advisory quiz that runs before the RLCR loop starts, verifying
that the user understands the technical implementation details of their
plan. An opus-model agent analyzes the plan and generates 2
multiple-choice questions. If the user answers either question
incorrectly, the system shows a summary of the plan and the correct
answers, then offers to stop or proceed.
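
As an illustration only: in this commit the check is carried out by the
command prompt in commands/start-rlcr-loop.md, not by code. Under that
caveat, the parse-and-validate step for the agent's 13-field output could
be sketched in Python roughly as follows. The function names
`parse_quiz_output` and `grade` are hypothetical and do not appear in the
commit.

```python
import re

# Field names taken from the agent's documented output format.
FIELDS = (
    [f"QUESTION_{n}" for n in (1, 2)]
    + [f"OPTION_{n}{letter}" for n in (1, 2) for letter in "ABCD"]
    + [f"ANSWER_{n}" for n in (1, 2)]
    + ["PLAN_SUMMARY"]
)


def parse_quiz_output(text):
    """Return a dict of the 13 fields, or None if the output is malformed."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Z_0-9]+):\s*(.*)$", line.strip())
        if match and match.group(1) in FIELDS:
            fields[match.group(1)] = match.group(2).strip()
    # Malformed if any field is missing or empty.
    if any(not fields.get(name) for name in FIELDS):
        return None
    # Malformed if an answer is not exactly one of A, B, C, D.
    if any(fields[f"ANSWER_{n}"] not in ("A", "B", "C", "D") for n in (1, 2)):
        return None
    return fields


def grade(fields, choices):
    """Map each question number to PASS or WRONG.

    `choices` maps question number (1 or 2) to the letter the user picked.
    """
    return {
        n: "PASS" if choices.get(n) == fields[f"ANSWER_{n}"] else "WRONG"
        for n in (1, 2)
    }
```

When `parse_quiz_output` returns None, the documented behavior is to warn
that the quiz is unavailable and continue to setup, so a malformed agent
response never blocks the loop.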

New flags:
- --yolo: skip quiz and enable --claude-answer-codex (full automation)
- --skip-quiz: skip quiz only without other behavioral changes
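
A minimal sketch of how these flags relate to the quiz's skip conditions,
assuming the argument string is tokenized shell-style. The helper names
`should_skip_quiz` and `claude_answers_codex` are hypothetical. In the
commit itself the skip decision is made by the command prompt, and
setup-rlcr-loop.sh only consumes the flags.

```python
import shlex

# Flags that, per the command documentation, cause the quiz to be skipped.
SKIP_FLAGS = {"--skip-impl", "--yolo", "--skip-quiz", "-h", "--help"}


def should_skip_quiz(arguments, plan_content):
    """Return True when any documented skip condition holds."""
    tokens = set(shlex.split(arguments))
    # Skip on an explicit flag, or when no plan content is available.
    return bool(tokens & SKIP_FLAGS) or not plan_content


def claude_answers_codex(arguments):
    """--yolo implies --claude-answer-codex; --skip-quiz alone does not."""
    tokens = set(shlex.split(arguments))
    return "--yolo" in tokens or "--claude-answer-codex" in tokens
```

Matching whole tokens rather than substrings avoids treating a plan path
such as `notes-h.md` as a `-h` request.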

gen-plan auto-start now passes --skip-quiz since the user already
demonstrated understanding through the convergence discussion.
diff --git a/agents/plan-understanding-quiz.md b/agents/plan-understanding-quiz.md
@@ -0,0 +1,103 @@
+---
+name: plan-understanding-quiz
+description: Analyzes a plan and generates multiple-choice technical comprehension questions to verify user understanding before RLCR loop. Use when validating user readiness for start-rlcr-loop command.
+model: opus
+tools: Read, Glob, Grep
+---
+
+# Plan Understanding Quiz
+
+You are a specialized agent that analyzes an implementation plan and generates targeted multiple-choice technical comprehension questions. Your goal is to test whether the user genuinely understands HOW the plan will be implemented, not just what the plan title says.
+
+## Your Task
+
+When invoked, you will be given the content of a plan file. You need to:
+
+### Analyze the Plan
+
+1. **Read the plan thoroughly** to understand:
+   - What components, files, or systems are being modified
+   - What technical approach or mechanism is being used
+   - How different pieces of the implementation connect together
+   - What existing patterns or systems the plan builds upon
+
+2. **Explore the repository** to add context:
+   - Check README.md, CLAUDE.md, or other documentation files
+   - Look at the directory structure and key files referenced in the plan
+   - Understand the existing architecture that the plan interacts with
+
+### Generate Multiple-Choice Questions
+
+Create exactly 2 multiple-choice questions that test the user's understanding of the plan's **technical implementation details**. Each question must have exactly 4 options (A through D), with exactly 1 correct answer.
+
+- **QUESTION_1**: Should test whether the user knows what components/systems are being changed and how. Focus on the core technical mechanism or approach.
+- **QUESTION_2**: Should test whether the user understands how different parts of the implementation connect, what existing patterns are being followed, or what the key technical constraints are.
+
+**Good question characteristics:**
+- Derived from the plan's specific content, not generic templates
+- Test understanding of HOW things will be done, not just WHAT the plan describes
+- Not too low-level (no exact line numbers, exact syntax, or trivial details)
+- A user who has carefully read and understood the plan should pick the correct answer
+- A user who just skimmed the title or blindly accepted a generated plan would likely pick wrong
+- Wrong options should be plausible (not obviously absurd) but clearly incorrect to someone who read the plan
+
+**Example good questions:**
+- "How does this plan integrate the new validation step into the startup flow?" with options covering different integration approaches
+- "Which components need to change and why?" with options describing different component sets
+
+**Example bad questions (avoid these):**
+- "What is the plan about?" (too vague, tests nothing)
+- "What are the risks?" (generic, not about implementation)
+- "On which line does function X start?" (too low-level)
+
+### Generate Plan Summary
+
+Write a 2-3 sentence summary explaining what the plan does and how, suitable for educating a user who showed gaps in understanding. Focus on the technical approach, not just the goal.
+
+## Output Format
+
+You MUST output in this exact format, with each field on its own line:
+
+```
+QUESTION_1: <your first question>
+OPTION_1A: <option A text>
+OPTION_1B: <option B text>
+OPTION_1C: <option C text>
+OPTION_1D: <option D text>
+ANSWER_1: <A, B, C, or D>
+QUESTION_2: <your second question>
+OPTION_2A: <option A text>
+OPTION_2B: <option B text>
+OPTION_2C: <option C text>
+OPTION_2D: <option D text>
+ANSWER_2: <A, B, C, or D>
+PLAN_SUMMARY: <2-3 sentence technical summary>
+```
+
+## Important Notes
+
+- Always output all 13 fields - never skip any
+- Each ANSWER field (ANSWER_1, ANSWER_2) must be exactly one letter: A, B, C, or D
+- Randomize the position of the correct answer (do not always put it in A or D)
+- The plan may be written in any language - generate questions and options in the same language as the plan
+- Focus on substance over format
+- If the plan is very short or lacks technical detail, derive questions from whatever implementation hints are available
+- Questions should feel like a friendly knowledge check, not an adversarial interrogation
+
+## Example Output
+
+```
+QUESTION_1: How does this plan integrate the new validation step into the existing build pipeline?
+OPTION_1A: By replacing the existing lint step with a combined lint-and-validate step
+OPTION_1B: By adding a new PostToolUse hook that runs between the lint step and the compilation step
+OPTION_1C: By modifying the compilation step to include inline validation checks
+OPTION_1D: By creating a standalone pre-build script that runs before any other steps
+ANSWER_1: B
+QUESTION_2: Why does the plan require changes to both the CLI parser and the state file, rather than just the CLI?
+OPTION_2A: The state file stores the original CLI arguments for audit logging purposes
+OPTION_2B: The CLI parser is deprecated and the state file is the new configuration mechanism
+OPTION_2C: The CLI parser adds the flag, the state file persists it across loop iterations, and the stop hook reads it at exit time
+OPTION_2D: Both files share a common schema and must always be updated together
+ANSWER_2: C
+PLAN_SUMMARY: This plan adds a build output validation step by hooking into the PostToolUse lifecycle event. It modifies the hook configuration to insert a format checker between linting and compilation, and updates the state file schema to track validation results across RLCR rounds.
+```
diff --git a/commands/gen-plan.md b/commands/gen-plan.md
@@ -590,13 +590,15 @@ If all of the following are true:
 Then start work immediately by running:
 
 ```bash
-/humanize:start-rlcr-loop <output-plan-path>
+/humanize:start-rlcr-loop --skip-quiz <output-plan-path>
 ```
 
+The `--skip-quiz` flag is passed because the user has already demonstrated understanding of the plan through the gen-plan convergence discussion.
+
 If the command invocation is not available in this context, fall back to the setup script:
 
 ```bash
-"${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh" --plan-file <output-plan-path>
+"${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh" --skip-quiz --plan-file <output-plan-path>
 ```
 
 If the auto-start attempt fails, report the failure reason and provide the exact manual command for the user to run:
diff --git a/commands/start-rlcr-loop.md b/commands/start-rlcr-loop.md
@@ -1,10 +1,11 @@
 ---
 description: "Start iterative loop with Codex review"
-argument-hint: "[path/to/plan.md | --plan-file path/to/plan.md] [--max N] [--codex-model MODEL:EFFORT] [--codex-timeout SECONDS] [--track-plan-file] [--push-every-round] [--base-branch BRANCH] [--full-review-round N] [--skip-impl] [--claude-answer-codex] [--agent-teams]"
+argument-hint: "[path/to/plan.md | --plan-file path/to/plan.md] [--max N] [--codex-model MODEL:EFFORT] [--codex-timeout SECONDS] [--track-plan-file] [--push-every-round] [--base-branch BRANCH] [--full-review-round N] [--skip-impl] [--claude-answer-codex] [--agent-teams] [--yolo] [--skip-quiz]"
 allowed-tools:
   - "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh:*)"
   - "Read"
   - "Task"
+  - "AskUserQuestion"
 hide-from-slash-command-tool: "true"
 ---
 
@@ -57,9 +58,58 @@ If any condition fails, skip the pre-check and let the setup script handle path
 
 ---
 
+## Plan Understanding Quiz
+
+Before running the setup script, verify the user genuinely understands what the plan will do. This is an advisory check -- it never blocks the loop, but catches "wishful thinking" users who blindly accepted a generated plan without reading it.
+
+**Skip this entire quiz if** any of these conditions are true:
+- `$ARGUMENTS` contains `--skip-impl` (no plan to quiz about)
+- `$ARGUMENTS` contains `--yolo` (user explicitly opted out of the quiz in favor of full automation)
+- `$ARGUMENTS` contains `--skip-quiz` (user explicitly opted out of the quiz)
+- `$ARGUMENTS` contains `-h` or `--help` (just showing help)
+- No plan content is available (the compliance pre-check was skipped because no plan file path could be determined)
+
+### Run the quiz agent
+
+1. Reuse the plan content that was already read during the compliance pre-check above (do not re-read the file).
+
+2. Use the Task tool to invoke the `humanize:plan-understanding-quiz` agent (opus model):
+   ```
+   Task tool parameters:
+   - model: "opus"
+   - prompt: Include the plan file content and ask the agent to:
+     1. Explore the repository structure for context
+     2. Analyze the plan's technical implementation details
+     3. Generate 2 multiple-choice questions (4 options each) and a plan summary
+     4. Return in the structured format: QUESTION_1, OPTION_1A-D, ANSWER_1, QUESTION_2, OPTION_2A-D, ANSWER_2, PLAN_SUMMARY
+   ```
+
+3. **Parse the result**: Extract all 13 fields from the agent output (QUESTION_1, OPTION_1A through OPTION_1D, ANSWER_1, QUESTION_2, OPTION_2A through OPTION_2D, ANSWER_2, PLAN_SUMMARY). If the output is malformed (any field missing or ANSWER not A/B/C/D), warn: "Plan understanding quiz unavailable, continuing without it." and proceed to the Setup section below.
+
+### Ask questions and evaluate
+
+4. Use AskUserQuestion to present QUESTION_1 as a multiple-choice question with the 4 options (OPTION_1A through OPTION_1D). Compare the user's choice against ANSWER_1:
+   - If the user selected the correct answer, mark QUESTION_1 as **PASS**
+   - Otherwise, mark QUESTION_1 as **WRONG**
+
+5. Use AskUserQuestion to present QUESTION_2 as a multiple-choice question with the 4 options (OPTION_2A through OPTION_2D). Compare the user's choice against ANSWER_2 and mark QUESTION_2 as **PASS** or **WRONG** using the same criteria.
+
+### Decide whether to proceed
+
+6. **If both questions PASS**: Briefly acknowledge ("Your understanding of the plan looks solid. Proceeding with setup.") and continue to the Setup section below.
+
+7. **If one or both questions are WRONG**: Show the user the PLAN_SUMMARY, to help them understand what the plan does, together with the correct answers to the questions they missed. Then use AskUserQuestion with the question: "Would you like to proceed with the RLCR loop anyway, or stop and review the plan more carefully first?" with these choices:
+   - "Proceed with RLCR loop"
+   - "Stop and review the plan first"
+
+   - If the user chooses **"Proceed with RLCR loop"**: Continue to the Setup section below.
+   - If the user chooses **"Stop and review the plan first"**: Report "Stopping. Please review the plan file and re-run start-rlcr-loop when ready." and **stop the command**.
+
+---
+
 ## Setup
 
-If the pre-check passed (or was skipped), execute the setup script to initialize the loop:
+If the pre-check passed (or was skipped) and the quiz passed (or was skipped, or the user chose to proceed), execute the setup script to initialize the loop:
 
 ```bash
 "${CLAUDE_PLUGIN_ROOT}/scripts/setup-rlcr-loop.sh" $ARGUMENTS
diff --git a/docs/usage.md b/docs/usage.md
@@ -50,6 +50,9 @@ OPTIONS:
   --agent-teams          Enable Claude Code Agent Teams mode for parallel development.
                          Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 environment variable.
                          Claude acts as team leader, splitting tasks among team members.
+  --yolo                 Skip Plan Understanding Quiz and let Claude answer Codex Open
+                         Questions directly. Alias for --skip-quiz --claude-answer-codex.
+  --skip-quiz            Skip the Plan Understanding Quiz only (without other changes).
   -h, --help             Show help message
 ```
 
diff --git a/scripts/setup-rlcr-loop.sh b/scripts/setup-rlcr-loop.sh
@@ -89,6 +89,13 @@ OPTIONS:
   --agent-teams        Enable Claude Code Agent Teams mode for parallel development.
                        Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 environment variable.
                        Claude acts as team leader, splitting tasks among team members.
+  --yolo               Skip Plan Understanding Quiz and let Claude answer Codex Open
+                       Questions directly. Convenience alias for --skip-quiz
+                       --claude-answer-codex. Use when you trust the plan and want
+                       maximum automation.
+  --skip-quiz          Skip the Plan Understanding Quiz only (without other behavioral
+                       changes). The quiz is an advisory pre-flight check that verifies
+                       you understand the plan before committing to an RLCR loop.
   --allow-empty-bitlesson-none
                        Allow BitLesson delta with action:none even with no new entries (default)
   --require-bitlesson-entry-for-none
@@ -120,6 +127,8 @@ EXAMPLES:
   /humanize:start-rlcr-loop docs/impl.md --max 20
   /humanize:start-rlcr-loop plan.md --codex-model ${DEFAULT_CODEX_MODEL}:${DEFAULT_CODEX_EFFORT}
   /humanize:start-rlcr-loop plan.md --codex-timeout 7200  # 2 hour timeout
+  /humanize:start-rlcr-loop plan.md --yolo              # skip quiz, full automation
+  /humanize:start-rlcr-loop plan.md --skip-quiz          # skip quiz only
 
 STOPPING:
   - /humanize:cancel-rlcr-loop   Cancel the active loop
@@ -235,6 +244,14 @@ while [[ $# -gt 0 ]]; do
             AGENT_TEAMS="true"
             shift
             ;;
+        --yolo)
+            ASK_CODEX_QUESTION="false"
+            shift
+            ;;
+        --skip-quiz)
+            # No-op in setup script; quiz logic lives in command markdown
+            shift
+            ;;
         --allow-empty-bitlesson-none)
             BITLESSON_ALLOW_EMPTY_NONE="true"
             shift
diff --git a/skills/humanize-rlcr/SKILL.md b/skills/humanize-rlcr/SKILL.md
@@ -107,6 +107,8 @@ Pass these through `setup-rlcr-loop.sh`:
 | `--push-every-round` | Require push each round | false |
 | `--claude-answer-codex` | Let Claude answer open questions directly | false |
 | `--agent-teams` | Enable agent teams mode | false |
+| `--yolo` | Skip quiz and enable --claude-answer-codex | false |
+| `--skip-quiz` | Skip Plan Understanding Quiz (implicit in skill mode) | false |
 
 Review phase `codex review` runs with `gpt-5.4:high`.
 
diff --git a/skills/humanize/SKILL.md b/skills/humanize/SKILL.md
@@ -96,6 +96,8 @@ Transforms a rough draft document into a structured implementation plan with:
 - `--push-every-round` - Require git push after each round
 - `--claude-answer-codex` - Let Claude answer Codex Open Questions directly (default is AskUserQuestion)
 - `--agent-teams` - Enable Agent Teams mode
+- `--yolo` - Skip Plan Understanding Quiz and enable --claude-answer-codex
+- `--skip-quiz` - Skip the Plan Understanding Quiz only
 
 ### Cancel RLCR Loop