fix(tasks): mark unreviewed tasks.md with a file-level marker, cleared on review#136
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughStep 4 in ChangesTask Planning Workflow Fix
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
commands/tasks.md (1)
82-90:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove the pre-save interactive gate for the QA-agent fallback.
This
AskUserQuestionbranch still runs before Step 4, so a missing QA agent can block non-interactive runs beforetasks.mdexists. Make the no-agent path deterministic (for example, default togeneral-purposeand flag it in the Recommendations table) or defer this choice until after the file is saved.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@commands/tasks.md` around lines 82 - 90, Remove the pre-save interactive gate that invokes AskUserQuestion before tasks.md is persisted; instead make the no-QA-agent path deterministic by defaulting to the built-in "general-purpose" agent (mark the slice as "**[Agent: general-purpose]**") and add a clear entry in the Recommendations table indicating the fallback, or alternatively defer the AskUserQuestion until after tasks.md is saved so non-interactive runs are not blocked; update logic around the QA-agent selection (the block that currently offers /awos:hire, /awos:tasks, and SKIP_TESTS choices) to implement one of these deterministic behaviors and ensure the Feature Testing & Regression slice emission still uses the {qa-agent} substitution and the exact required wording.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@commands/tasks.md`:
- Around line 82-90: Remove the pre-save interactive gate that invokes
AskUserQuestion before tasks.md is persisted; instead make the no-QA-agent path
deterministic by defaulting to the built-in "general-purpose" agent (mark the
slice as "**[Agent: general-purpose]**") and add a clear entry in the
Recommendations table indicating the fallback, or alternatively defer the
AskUserQuestion until after tasks.md is saved so non-interactive runs are not
blocked; update logic around the QA-agent selection (the block that currently
offers /awos:hire, /awos:tasks, and SKIP_TESTS choices) to implement one of
these deterministic behaviors and ensure the Feature Testing & Regression slice
emission still uses the {qa-agent} substitution and the exact required wording.
6a864ce to
98d5cbf
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
commands/tasks.md (1)
83-87:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix the stale Step 3.4 cross-reference.
There is no Step 3.4 in this file, so the QA-agent selection instruction points to a non-existent location. Please change it to Step 3 or Step 3a so the prompt stays self-consistent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@commands/tasks.md` around lines 83 - 87, Update the stale cross-reference "Step 3.4" in the QA-agent selection paragraph to point to the correct existing section (use "Step 3" or "Step 3a" as appropriate) so the instruction refers to a real location; edit the sentence that reads "introspecting the `Agent` tool's description block from Step 3.4" and replace "Step 3.4" with the chosen valid section label, ensuring the rest of the sentence and the ordered selection rules remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@commands/tasks.md`:
- Around line 31-32: Change the blanket fallback sentence so only prompts that
have an explicitly documented default fall back when unanswered; keep Step 1
(spec selection) as a hard stop if unanswered/ambiguous. Update the paragraph
referencing AskUserQuestion and the fallback rule to clarify “A skipped or
unanswered question only falls back to the documented default if that specific
question includes a documented default; otherwise treat it as unanswered and
halt at Step 1 (do not auto-continue to write tasks.md).” Ensure references to
AskUserQuestion, Step 1, and writing tasks.md are updated so the behavior for
unattended runs is unambiguous.
---
Outside diff comments:
In `@commands/tasks.md`:
- Around line 83-87: Update the stale cross-reference "Step 3.4" in the QA-agent
selection paragraph to point to the correct existing section (use "Step 3" or
"Step 3a" as appropriate) so the instruction refers to a real location; edit the
sentence that reads "introspecting the `Agent` tool's description block from
Step 3.4" and replace "Step 3.4" with the chosen valid section label, ensuring
the rest of the sentence and the ordered selection rules remain unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
commands/tasks.md (1)
83-87:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix the stale Step 3.4 cross-reference.
There is no Step 3.4 in this file, so the QA-agent selection instruction points to a non-existent location. Please change it to Step 3 or Step 3a so the prompt stays self-consistent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@commands/tasks.md` around lines 83 - 87, Update the stale cross-reference "Step 3.4" in the QA-agent selection paragraph to point to the correct existing section (use "Step 3" or "Step 3a" as appropriate) so the instruction refers to a real location; edit the sentence that reads "introspecting the `Agent` tool's description block from Step 3.4" and replace "Step 3.4" with the chosen valid section label, ensuring the rest of the sentence and the ordered selection rules remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@commands/tasks.md`:
- Around line 31-32: Change the blanket fallback sentence so only prompts that
have an explicitly documented default fall back when unanswered; keep Step 1
(spec selection) as a hard stop if unanswered/ambiguous. Update the paragraph
referencing AskUserQuestion and the fallback rule to clarify “A skipped or
unanswered question only falls back to the documented default if that specific
question includes a documented default; otherwise treat it as unanswered and
halt at Step 1 (do not auto-continue to write tasks.md).” Ensure references to
AskUserQuestion, Step 1, and writing tasks.md are updated so the behavior for
unattended runs is unambiguous.
---
Outside diff comments:
In `@commands/tasks.md`:
- Around line 83-87: Update the stale cross-reference "Step 3.4" in the QA-agent
selection paragraph to point to the correct existing section (use "Step 3" or
"Step 3a" as appropriate) so the instruction refers to a real location; edit the
sentence that reads "introspecting the `Agent` tool's description block from
Step 3.4" and replace "Step 3.4" with the chosen valid section label, ensuring
the rest of the sentence and the ordered selection rules remain unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: bc659ece-27b3-4bc5-b542-cb7049d0d8f3
📒 Files selected for processing (1)
commands/tasks.md
🛑 Comments failed to post (1)
commands/tasks.md (1)
31-32:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winNarrow the fallback to questions with a documented default.
Line 32 turns every unanswered prompt into “continue and write
tasks.md,” but Step 1 explicitly says not to proceed until a valid spec is selected. In an unattended run with an ambiguous prompt, this can push the workflow forward without a chosen spec.Proposed fix
- - **A skipped or unanswered question — as happens in an unattended `claude -p` run — is never a stop signal. Fall back to the documented default for that question and continue through the remaining steps, including writing `tasks.md`.** + - **If a question has a documented default, use it when the answer is skipped or unanswered. Otherwise, stop and wait for a response.**🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@commands/tasks.md` around lines 31 - 32, Change the blanket fallback sentence so only prompts that have an explicitly documented default fall back when unanswered; keep Step 1 (spec selection) as a hard stop if unanswered/ambiguous. Update the paragraph referencing AskUserQuestion and the fallback rule to clarify “A skipped or unanswered question only falls back to the documented default if that specific question includes a documented default; otherwise treat it as unanswered and halt at Step 1 (do not auto-continue to write tasks.md).” Ensure references to AskUserQuestion, Step 1, and writing tasks.md are updated so the behavior for unattended runs is unambiguous.
AlexanderMakarov
left a comment
There was a problem hiding this comment.
This is well-scoped. #135's headline ask — write-then-review so the deliverable survives a headless turn — already landed in #132, and this PR correctly limits itself to the residual gap: a saved tasks.md that nobody reviewed is indistinguishable from one that was. The tasks.md saved as drafted — not user-reviewed marker plus the "ask only through AskUserQuestion" rule cover that, and it sits cleanly under the INTERACTION guard (an unanswered review question falls back to "keep as saved + emit marker" rather than stalling). The Step 4/5 renumber is correct and nothing else references the old item numbers.
One thing I'd like resolved before merge: you've introduced a pinned, machine-detected marker string but no Layer 1 test pins it — the directly analogous <!-- skip-tests: true --> marker right below it is locked by a lint test, and CLAUDE.md says a grep-catchable contract ships its test in the same PR. Detail inline, plus a small robustness note on the marker glyph.
I considered CodeRabbit's two outside-diff notes and left both: the "remove the pre-save QA gate" (Step 3a) and "Step 3.4 is stale" points target pre-existing lines this PR doesn't touch, and the QA-gate concern is already handled by the post-#132 INTERACTION fallback (a skipped hire question defaults to general-purpose), so neither is in scope here.
| 2. If any tasks were assigned to `general-purpose` (because no specialist exists) or verification cannot be performed (missing MCPs/services), surface a table: | ||
| 1. Report the saved path and present the slice/task plan for review. | ||
| 2. Ask for review feedback strictly via the `AskUserQuestion` tool (e.g. options "Looks good — keep it as saved" / "I want changes") — never in plain text, which would end a non-interactive turn before the review outcome can be reported. If the user requests changes (adjust, split, merge slices or tasks, or reassign subagents), apply them and re-save; otherwise they can revise later by re-running `/awos:tasks`. | ||
| 3. If the review question goes unanswered (dismissed, or the run is non-interactive), keep the saved file as-is and include this exact line in your final report: `tasks.md saved as drafted — not user-reviewed`. Keep the wording — downstream automations depend on it to tell a reviewed plan from a draft. |
There was a problem hiding this comment.
You've added a marker that downstream automation greps for (tasks.md saved as drafted — not user-reviewed), and the PR body says the wording is load-bearing for awos-qa. Two things:
-
No Layer 1 test guards it. The
<!-- skip-tests: true -->marker a few items down is locked bytests/lint-prompts.test.js:542("The marker shape is part of the contract — if it moves, verify.md will not detect it. Lock the shape here."), andCLAUDE.mdsays a surface-area contract a grep can catch ships its test in the same PR. The behavioral check lives in awos-qa, but nothing in this repo stops an innocent reword of this line from silently drifting from what the harness expects. A one-line assertion mirroring the skip-tests test would close it:assert.ok( body.includes('tasks.md saved as drafted — not user-reviewed'), 'commands/tasks.md must keep the literal unreviewed-output marker so awos-qa can detect draft-grade tasks.md' );
-
The em-dash makes the marker a fragile join key. Unlike the pure-ASCII
<!-- skip-tests: true -->, this string carries a—that a model emitting the final report can easily render as-or--, and it's hard to eyeball the difference when a harness match fails. Worth considering an ASCII form. If you keep the em-dash, the Layer 1 pin from (1) at least makes the exact glyph load-bearing in the test.
There was a problem hiding this comment.
Both points addressed in 75d2697.
-
Layer 1 test — added in
tests/lint-prompts.test.js, mirroring the skip-tests lock: pins the marker shape and the removal-on-review instruction.npm run test:lint→ 32/32. -
Em-dash fragility — rather than ASCII-ize the chat line, the marker moved out of the chat report and into the saved file as
<!-- not-user-reviewed -->, directly analogous to<!-- skip-tests: true -->: pure ASCII, persisted in the artifact, grep-detectable deterministically (not parsed from freeform chat), and visible to anyone opening the file later. Step 4 writes it by default; Step 5 removes it once the plan is reviewed.
Validated live both ways: a headless (dismissed-review) run keeps the marker (verify 4/4), and an approval answered via a PreToolUse hook removes it (0 occurrences).
On the two CodeRabbit out-of-scope notes — agreed they target pre-existing lines this PR doesn't touch; leaving them for a separate change.
There was a problem hiding this comment.
Both addressed cleanly in 75d2697 — thanks. Moving the marker into the saved file as <!-- not-user-reviewed --> is a better fix than the ASCII chat-line I floated: persisted in the artifact, deterministic to grep, visible to anyone opening the file.
The new test checks two things: that the marker string is present, and that some "remove the marker" instruction exists. What it doesn't check is the order that actually matters — Step 4 writes the marker before review, and Step 5 removes it only after the user responds. So someone could later move the write to after review, or make Step 5 always delete it, and the test would still pass.
The real behavioral check belongs in awos-qa. But Layer 1 owns the prompt text, so it can go a bit further cheaply: assert that the write lives in Step 4 and that the removal is tied to a review response (e.g. the text "once the user has reviewed" or "When the user responds"). That closes the gap without having to parse sections.
…d on review #135's headline ask (write-then-review so the deliverable survives a headless turn) landed in #132. This closes the residual gap: a saved tasks.md that nobody reviewed was indistinguishable from a reviewed one. - Step 4 writes a literal '<!-- not-user-reviewed -->' marker at the top of tasks.md (it is written before review, so it starts a draft). - Step 5 removes the marker once the user reviews the plan (approves, or after their requested changes are applied and re-saved); if the review question is dismissed or the run is non-interactive, the marker stays — its presence is the draft signal. - Review feedback is asked strictly via AskUserQuestion, never plain text (a plain-text ask ends a non-interactive turn before the review outcome can be recorded). A file-level HTML-comment marker mirrors the existing '<!-- skip-tests: true -->' contract: pure ASCII, persisted in the artifact, grep-detectable deterministically (not parsed from freeform chat), and visible to anyone opening the file later. Locked by a new Layer 1 lint test, per CLAUDE.md (a grep-catchable contract ships its test in the same PR). Validated live (claude 2.1.175, effort medium): headless run keeps the marker (verify 4/4 pass); an approval answered via a PreToolUse hook removes it. Closes #135 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
98d5cbf to
75d2697
Compare
AlexanderMakarov
left a comment
There was a problem hiding this comment.
Re-review after 75d2697. Both points from my earlier pass are addressed, and well.
Two non-blocking follow-ups, neither a merge blocker:
- The new lint test pins that the marker exists and that a removal instruction exists, but not the lifecycle that's the actual contract — written in Step 4 before review, removed only once the user responds (reply on my earlier thread).
- Minor:
<!-- not-user-reviewed -->and<!-- skip-tests: true -->now share "the top", so Step 5's removal is worth scoping explicitly to the one line (inline).
Happy to approve as-is.
|
|
||
| 1. Write the complete slice/task list to `tasks.md` in the chosen spec directory. **Write the file without waiting for approval** — generating a task list is reversible (re-run `/awos:tasks` to revise), so the deliverable must never be gated behind a confirmation that an unattended run cannot answer. | ||
| 2. If `SKIP_TESTS = true`, record a one-line note at the top of the generated `tasks.md` so that downstream commands (e.g. `/awos:verify`) can detect the choice: `<!-- skip-tests: true -->`. | ||
| 2. Record a one-line marker at the very top of the generated `tasks.md`: `<!-- not-user-reviewed -->`. The file is written before review (Step 5), so it starts as a draft; Step 5 removes this marker once the user has reviewed it. Keep the marker shape exactly — downstream automations grep for it to tell a draft from a reviewed plan. |
There was a problem hiding this comment.
The <!-- not-user-reviewed --> marker now shares "the top" with <!-- skip-tests: true --> (Step 4.3) whenever SKIP_TESTS = true. Both minor:
- Step 5.3 says "remove the
<!-- not-user-reviewed -->marker from the top oftasks.md" — naming the exact string is the right call, but worth stating explicitly to leave any<!-- skip-tests: true -->line intact. That one has to survive review, and a less careful pass could read "remove the top comment" as the whole block. - On write, the relative order of the two is unspecified. Harmless for grep (it's position-independent), but pinning "not-user-reviewed first, skip-tests second" reads cleaner.
Optional hardening, not a blocker.
| 2. If any tasks were assigned to `general-purpose` (because no specialist exists) or verification cannot be performed (missing MCPs/services), surface a table: | ||
| 1. Report the saved path and present the slice/task plan for review. | ||
| 2. Ask for review feedback strictly via the `AskUserQuestion` tool (e.g. options "Looks good — keep it as saved" / "I want changes") — never in plain text, which would end a non-interactive turn before the review outcome can be reported. If the user requests changes (adjust, split, merge slices or tasks, or reassign subagents), apply them and re-save; otherwise they can revise later by re-running `/awos:tasks`. | ||
| 3. If the review question goes unanswered (dismissed, or the run is non-interactive), keep the saved file as-is and include this exact line in your final report: `tasks.md saved as drafted — not user-reviewed`. Keep the wording — downstream automations depend on it to tell a reviewed plan from a draft. |
There was a problem hiding this comment.
Both addressed cleanly in 75d2697 — thanks. Moving the marker into the saved file as <!-- not-user-reviewed --> is a better fix than the ASCII chat-line I floated: persisted in the artifact, deterministic to grep, visible to anyone opening the file.
The new test checks two things: that the marker string is present, and that some "remove the marker" instruction exists. What it doesn't check is the order that actually matters — Step 4 writes the marker before review, and Step 5 removes it only after the user responds. So someone could later move the write to after review, or make Step 5 always delete it, and the test would still pass.
The real behavioral check belongs in awos-qa. But Layer 1 owns the prompt text, so it can go a bit further cheaply: assert that the write lives in Step 4 and that the removal is tied to a review response (e.g. the text "once the user has reviewed" or "When the user responds"). That closes the gap without having to parse sections.
Closes #135.
Change
A file-level marker on
tasks.md, mirroring the existing<!-- skip-tests: true -->contract:<!-- not-user-reviewed -->at the top oftasks.md. The file is written before review, so it starts as a draft.AskUserQuestion, never plain text (a plain-text ask ends a non-interactive turn before the review outcome can be recorded).Why a file marker (this PR's earlier revision used a chat-report line)
Per review feedback, the marker moved from the final chat report into the saved file. A file-level HTML-comment marker is pure ASCII (no fragile em-dash join key), persisted in the artifact, grep-detectable deterministically instead of parsed from freeform chat, and visible to anyone opening the file later. It's the same mechanism
/awos:verifyalready relies on for<!-- skip-tests: true -->.Test (Layer 1)
New
tests/lint-prompts.test.jscase locks both the marker shape and the removal-on-review instruction — perCLAUDE.md, a grep-catchable contract ships its test in the same PR (the adjacentskip-testsmarker is locked the same way).npm run test:lint→ 32/32.Validation (live, claude 2.1.175, effort medium, branch on current main)
Both contract branches exercised end-to-end via the awos-qa harness:
claude -p, review question dismissedPreToolUseanswer-map hook🤖 Generated with Claude Code
Summary by CodeRabbit
<!-- not-user-reviewed -->marker that is removed only after a completed review, while unanswered/dismissed reviews preserve the draft marker and reflect that status in the final report.