Skip to content

fix(tasks): mark unreviewed tasks.md with a file-level marker, cleared on review#136

Merged
FlySpot merged 1 commit into
mainfrom
fix/tasks-write-before-review
Jun 15, 2026
Merged

fix(tasks): mark unreviewed tasks.md with a file-level marker, cleared on review#136
FlySpot merged 1 commit into
mainfrom
fix/tasks-write-before-review

Conversation

@FlySpot

@FlySpot FlySpot commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Closes #135.

Scope note: #135's headline ask — write-then-review so the deliverable survives a headless turn — already landed in #132. This PR closes the residual gap: a saved tasks.md that nobody reviewed was indistinguishable from a reviewed one.

Change

A file-level marker on tasks.md, mirroring the existing <!-- skip-tests: true --> contract:

  • Step 4 writes <!-- not-user-reviewed --> at the top of tasks.md. The file is written before review, so it starts as a draft.
  • Step 5 removes the marker once the user reviews the plan — on approval, or after their requested changes are applied and re-saved. If the review question is dismissed or the run is non-interactive, the marker stays; its presence is the draft signal.
  • Review feedback is asked strictly via AskUserQuestion, never plain text (a plain-text ask ends a non-interactive turn before the review outcome can be recorded).

Why a file marker (this PR's earlier revision used a chat-report line)

Per review feedback, the marker moved from the final chat report into the saved file. A file-level HTML-comment marker is pure ASCII (no fragile em-dash join key), persisted in the artifact, grep-detectable deterministically instead of parsed from freeform chat, and visible to anyone opening the file later. It's the same mechanism /awos:verify already relies on for <!-- skip-tests: true -->.

Test (Layer 1)

New tests/lint-prompts.test.js case locks both the marker shape and the removal-on-review instruction — per CLAUDE.md, a grep-catchable contract ships its test in the same PR (the adjacent skip-tests marker is locked the same way). npm run test:lint → 32/32.

Validation (live, claude 2.1.175, effort medium, branch on current main)

Both contract branches exercised end-to-end via the awos-qa harness:

Branch Setup tasks.md marker verify
Headless / no review claude -p, review question dismissed present (Claude left it, citing the rule) 4/4 pass
Review approved approval answered via a PreToolUse answer-map hook removed (0 occurrences)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
    • Updated the “Surface for Review and Recommend Next Step” workflow to collect all review feedback exclusively via the AskUserQuestion tool. Added draft-state handling using a <!-- not-user-reviewed --> marker that is removed only after a completed review, while unanswered/dismissed reviews preserve the draft marker and reflect that status in the final report.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 77209c12-ccc5-469d-8dd3-70a74c14b8a8

📥 Commits

Reviewing files that changed from the base of the PR and between 98d5cbf and 75d2697.

📒 Files selected for processing (2)
  • commands/tasks.md
  • tests/lint-prompts.test.js

📝 Walkthrough

Walkthrough

Step 4 in commands/tasks.md now writes a <!-- not-user-reviewed --> draft marker to the file immediately after saving the initial task checklist. Step 5 mandates review feedback exclusively through the AskUserQuestion tool; the marker is removed only after a user responds (with changes applied if requested). If the review question is unanswered or dismissed, the file is preserved exactly as saved with the marker intact, signaling draft status. A new test validates the marker presence and removal instructions. The recommendations table and /awos:implement final step remain unchanged.

Changes

Task Planning Workflow Fix

Layer / File(s) Summary
Draft marker metadata in Step 4
commands/tasks.md
Step 4 now writes the <!-- not-user-reviewed --> marker to the top of tasks.md immediately after saving the initial task checklist, and conditionally adds a <!-- skip-tests: true --> note when SKIP_TESTS = true.
Step 5 review tool enforcement and marker removal
commands/tasks.md
Step 5 requires review feedback exclusively through the AskUserQuestion tool; removes the <!-- not-user-reviewed --> marker only after receiving a user response (including re-save after requested edits); if unanswered or non-interactive, preserves the file and marker as-is to signal draft status.
Preserving recommendations table and final step
commands/tasks.md
Conditional recommendations table remains unchanged: displayed when tasks use general-purpose or verification MCPs/services are missing. Final report continues to surface /awos:implement as the next command regardless of review outcome.
Test validation of draft marker and removal instructions
tests/lint-prompts.test.js
New test case validates that commands/tasks.md contains the literal <!-- not-user-reviewed --> marker and contains prompt text instructing Step 5 to remove that marker upon review completion.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • provectus/awos#132: Modifies commands/tasks.md's Step 4–5 flow around AskUserQuestion handling and when tasks.md is written with a draft marker, directly connected at the same workflow points.

Suggested labels

patch

Suggested reviewers

  • workshur
  • AlexanderMakarov
  • dustyo-O

Poem

🐰 I wrote the marker "not-reviewed" at the start,
Then asked the question tool to play its part.
If answered, off it goes—the marker flies away,
If silent? Draft stays put for yet another day.
Now /awos:implement hops forth, hip hooray!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: marking unreviewed tasks.md with a file-level marker that is cleared on review.
Linked Issues check ✅ Passed The PR implements core objectives from #135: mandating AskUserQuestion for review feedback, implementing a stable unreviewed-output marker line, and ensuring deterministic headless runs with detectable draft output.
Out of Scope Changes check ✅ Passed All changes are directly scoped to addressing #135: Step 5 workflow updates, draft marker implementation, and validation test for the new marker—no extraneous modifications present.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/tasks-write-before-review

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
commands/tasks.md (1)

82-90: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove the pre-save interactive gate for the QA-agent fallback.

This AskUserQuestion branch still runs before Step 4, so a missing QA agent can block non-interactive runs before tasks.md exists. Make the no-agent path deterministic (for example, default to general-purpose and flag it in the Recommendations table) or defer this choice until after the file is saved.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commands/tasks.md` around lines 82 - 90, Remove the pre-save interactive gate
that invokes AskUserQuestion before tasks.md is persisted; instead make the
no-QA-agent path deterministic by defaulting to the built-in "general-purpose"
agent (mark the slice as "**[Agent: general-purpose]**") and add a clear entry
in the Recommendations table indicating the fallback, or alternatively defer the
AskUserQuestion until after tasks.md is saved so non-interactive runs are not
blocked; update logic around the QA-agent selection (the block that currently
offers /awos:hire, /awos:tasks, and SKIP_TESTS choices) to implement one of
these deterministic behaviors and ensure the Feature Testing & Regression slice
emission still uses the {qa-agent} substitution and the exact required wording.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@commands/tasks.md`:
- Around line 82-90: Remove the pre-save interactive gate that invokes
AskUserQuestion before tasks.md is persisted; instead make the no-QA-agent path
deterministic by defaulting to the built-in "general-purpose" agent (mark the
slice as "**[Agent: general-purpose]**") and add a clear entry in the
Recommendations table indicating the fallback, or alternatively defer the
AskUserQuestion until after tasks.md is saved so non-interactive runs are not
blocked; update logic around the QA-agent selection (the block that currently
offers /awos:hire, /awos:tasks, and SKIP_TESTS choices) to implement one of
these deterministic behaviors and ensure the Feature Testing & Regression slice
emission still uses the {qa-agent} substitution and the exact required wording.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 4510a4cd-ba5a-47d0-84b0-2910b9ef828b

📥 Commits

Reviewing files that changed from the base of the PR and between 1fbba1a and 6a864ce.

📒 Files selected for processing (1)
  • commands/tasks.md

@FlySpot FlySpot requested a review from AlexanderMakarov June 12, 2026 16:08
@FlySpot FlySpot changed the title fix(tasks): write tasks.md before the review ask; mandate AskUserQuestion in Step 5 fix(tasks): make unreviewed headless output detectable in Step 5 Jun 12, 2026
@FlySpot FlySpot force-pushed the fix/tasks-write-before-review branch from 6a864ce to 98d5cbf Compare June 12, 2026 17:54

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
commands/tasks.md (1)

83-87: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the stale Step 3.4 cross-reference.

There is no Step 3.4 in this file, so the QA-agent selection instruction points to a non-existent location. Please change it to Step 3 or Step 3a so the prompt stays self-consistent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commands/tasks.md` around lines 83 - 87, Update the stale cross-reference
"Step 3.4" in the QA-agent selection paragraph to point to the correct existing
section (use "Step 3" or "Step 3a" as appropriate) so the instruction refers to
a real location; edit the sentence that reads "introspecting the `Agent` tool's
description block from Step 3.4" and replace "Step 3.4" with the chosen valid
section label, ensuring the rest of the sentence and the ordered selection rules
remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@commands/tasks.md`:
- Around line 31-32: Change the blanket fallback sentence so only prompts that
have an explicitly documented default fall back when unanswered; keep Step 1
(spec selection) as a hard stop if unanswered/ambiguous. Update the paragraph
referencing AskUserQuestion and the fallback rule to clarify “A skipped or
unanswered question only falls back to the documented default if that specific
question includes a documented default; otherwise treat it as unanswered and
halt at Step 1 (do not auto-continue to write tasks.md).” Ensure references to
AskUserQuestion, Step 1, and writing tasks.md are updated so the behavior for
unattended runs is unambiguous.

---

Outside diff comments:
In `@commands/tasks.md`:
- Around line 83-87: Update the stale cross-reference "Step 3.4" in the QA-agent
selection paragraph to point to the correct existing section (use "Step 3" or
"Step 3a" as appropriate) so the instruction refers to a real location; edit the
sentence that reads "introspecting the `Agent` tool's description block from
Step 3.4" and replace "Step 3.4" with the chosen valid section label, ensuring
the rest of the sentence and the ordered selection rules remain unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: bc659ece-27b3-4bc5-b542-cb7049d0d8f3

📥 Commits

Reviewing files that changed from the base of the PR and between 6a864ce and 98d5cbf.

📒 Files selected for processing (1)
  • commands/tasks.md

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
commands/tasks.md (1)

83-87: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the stale Step 3.4 cross-reference.

There is no Step 3.4 in this file, so the QA-agent selection instruction points to a non-existent location. Please change it to Step 3 or Step 3a so the prompt stays self-consistent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commands/tasks.md` around lines 83 - 87, Update the stale cross-reference
"Step 3.4" in the QA-agent selection paragraph to point to the correct existing
section (use "Step 3" or "Step 3a" as appropriate) so the instruction refers to
a real location; edit the sentence that reads "introspecting the `Agent` tool's
description block from Step 3.4" and replace "Step 3.4" with the chosen valid
section label, ensuring the rest of the sentence and the ordered selection rules
remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@commands/tasks.md`:
- Around line 31-32: Change the blanket fallback sentence so only prompts that
have an explicitly documented default fall back when unanswered; keep Step 1
(spec selection) as a hard stop if unanswered/ambiguous. Update the paragraph
referencing AskUserQuestion and the fallback rule to clarify “A skipped or
unanswered question only falls back to the documented default if that specific
question includes a documented default; otherwise treat it as unanswered and
halt at Step 1 (do not auto-continue to write tasks.md).” Ensure references to
AskUserQuestion, Step 1, and writing tasks.md are updated so the behavior for
unattended runs is unambiguous.

---

Outside diff comments:
In `@commands/tasks.md`:
- Around line 83-87: Update the stale cross-reference "Step 3.4" in the QA-agent
selection paragraph to point to the correct existing section (use "Step 3" or
"Step 3a" as appropriate) so the instruction refers to a real location; edit the
sentence that reads "introspecting the `Agent` tool's description block from
Step 3.4" and replace "Step 3.4" with the chosen valid section label, ensuring
the rest of the sentence and the ordered selection rules remain unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: bc659ece-27b3-4bc5-b542-cb7049d0d8f3

📥 Commits

Reviewing files that changed from the base of the PR and between 6a864ce and 98d5cbf.

📒 Files selected for processing (1)
  • commands/tasks.md
🛑 Comments failed to post (1)
commands/tasks.md (1)

31-32: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Narrow the fallback to questions with a documented default.

Line 32 turns every unanswered prompt into “continue and write tasks.md,” but Step 1 explicitly says not to proceed until a valid spec is selected. In an unattended run with an ambiguous prompt, this can push the workflow forward without a chosen spec.

Proposed fix
- - **A skipped or unanswered question — as happens in an unattended `claude -p` run — is never a stop signal. Fall back to the documented default for that question and continue through the remaining steps, including writing `tasks.md`.**
+ - **If a question has a documented default, use it when the answer is skipped or unanswered. Otherwise, stop and wait for a response.**
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commands/tasks.md` around lines 31 - 32, Change the blanket fallback sentence
so only prompts that have an explicitly documented default fall back when
unanswered; keep Step 1 (spec selection) as a hard stop if unanswered/ambiguous.
Update the paragraph referencing AskUserQuestion and the fallback rule to
clarify “A skipped or unanswered question only falls back to the documented
default if that specific question includes a documented default; otherwise treat
it as unanswered and halt at Step 1 (do not auto-continue to write tasks.md).”
Ensure references to AskUserQuestion, Step 1, and writing tasks.md are updated
so the behavior for unattended runs is unambiguous.

@AlexanderMakarov AlexanderMakarov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is well-scoped. #135's headline ask — write-then-review so the deliverable survives a headless turn — already landed in #132, and this PR correctly limits itself to the residual gap: a saved tasks.md that nobody reviewed is indistinguishable from one that was. The tasks.md saved as drafted — not user-reviewed marker plus the "ask only through AskUserQuestion" rule cover that, and it sits cleanly under the INTERACTION guard (an unanswered review question falls back to "keep as saved + emit marker" rather than stalling). The Step 4/5 renumber is correct and nothing else references the old item numbers.

One thing I'd like resolved before merge: you've introduced a pinned, machine-detected marker string but no Layer 1 test pins it — the directly analogous <!-- skip-tests: true --> marker right below it is locked by a lint test, and CLAUDE.md says a grep-catchable contract ships its test in the same PR. Detail inline, plus a small robustness note on the marker glyph.

I considered CodeRabbit's two outside-diff notes and left both: the "remove the pre-save QA gate" (Step 3a) and "Step 3.4 is stale" points target pre-existing lines this PR doesn't touch, and the QA-gate concern is already handled by the post-#132 INTERACTION fallback (a skipped hire question defaults to general-purpose), so neither is in scope here.

Comment thread commands/tasks.md Outdated
2. If any tasks were assigned to `general-purpose` (because no specialist exists) or verification cannot be performed (missing MCPs/services), surface a table:
1. Report the saved path and present the slice/task plan for review.
2. Ask for review feedback strictly via the `AskUserQuestion` tool (e.g. options "Looks good — keep it as saved" / "I want changes") — never in plain text, which would end a non-interactive turn before the review outcome can be reported. If the user requests changes (adjust, split, merge slices or tasks, or reassign subagents), apply them and re-save; otherwise they can revise later by re-running `/awos:tasks`.
3. If the review question goes unanswered (dismissed, or the run is non-interactive), keep the saved file as-is and include this exact line in your final report: `tasks.md saved as drafted — not user-reviewed`. Keep the wording — downstream automations depend on it to tell a reviewed plan from a draft.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've added a marker that downstream automation greps for (tasks.md saved as drafted — not user-reviewed), and the PR body says the wording is load-bearing for awos-qa. Two things:

  1. No Layer 1 test guards it. The <!-- skip-tests: true --> marker a few items down is locked by tests/lint-prompts.test.js:542 ("The marker shape is part of the contract — if it moves, verify.md will not detect it. Lock the shape here."), and CLAUDE.md says a surface-area contract a grep can catch ships its test in the same PR. The behavioral check lives in awos-qa, but nothing in this repo stops an innocent reword of this line from silently drifting from what the harness expects. A one-line assertion mirroring the skip-tests test would close it:

    assert.ok(
      body.includes('tasks.md saved as drafted — not user-reviewed'),
      'commands/tasks.md must keep the literal unreviewed-output marker so awos-qa can detect draft-grade tasks.md'
    );
  2. The em-dash makes the marker a fragile join key. Unlike the pure-ASCII <!-- skip-tests: true -->, this string carries a that a model emitting the final report can easily render as - or --, and it's hard to eyeball the difference when a harness match fails. Worth considering an ASCII form. If you keep the em-dash, the Layer 1 pin from (1) at least makes the exact glyph load-bearing in the test.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both points addressed in 75d2697.

  1. Layer 1 test — added in tests/lint-prompts.test.js, mirroring the skip-tests lock: pins the marker shape and the removal-on-review instruction. npm run test:lint → 32/32.

  2. Em-dash fragility — rather than ASCII-ize the chat line, the marker moved out of the chat report and into the saved file as <!-- not-user-reviewed -->, directly analogous to <!-- skip-tests: true -->: pure ASCII, persisted in the artifact, grep-detectable deterministically (not parsed from freeform chat), and visible to anyone opening the file later. Step 4 writes it by default; Step 5 removes it once the plan is reviewed.

Validated live both ways: a headless (dismissed-review) run keeps the marker (verify 4/4), and an approval answered via a PreToolUse hook removes it (0 occurrences).

On the two CodeRabbit out-of-scope notes — agreed they target pre-existing lines this PR doesn't touch; leaving them for a separate change.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both addressed cleanly in 75d2697 — thanks. Moving the marker into the saved file as <!-- not-user-reviewed --> is a better fix than the ASCII chat-line I floated: persisted in the artifact, deterministic to grep, visible to anyone opening the file.

The new test checks two things: that the marker string is present, and that some "remove the marker" instruction exists. What it doesn't check is the order that actually matters — Step 4 writes the marker before review, and Step 5 removes it only after the user responds. So someone could later move the write to after review, or make Step 5 always delete it, and the test would still pass.

The real behavioral check belongs in awos-qa. But Layer 1 owns the prompt text, so it can go a bit further cheaply: assert that the write lives in Step 4 and that the removal is tied to a review response (e.g. the text "once the user has reviewed" or "When the user responds"). That closes the gap without having to parse sections.

…d on review

#135's headline ask (write-then-review so the deliverable survives a
headless turn) landed in #132. This closes the residual gap: a saved
tasks.md that nobody reviewed was indistinguishable from a reviewed
one.

- Step 4 writes a literal '<!-- not-user-reviewed -->' marker at the
  top of tasks.md (it is written before review, so it starts a draft).
- Step 5 removes the marker once the user reviews the plan (approves,
  or after their requested changes are applied and re-saved); if the
  review question is dismissed or the run is non-interactive, the
  marker stays — its presence is the draft signal.
- Review feedback is asked strictly via AskUserQuestion, never plain
  text (a plain-text ask ends a non-interactive turn before the review
  outcome can be recorded).

A file-level HTML-comment marker mirrors the existing
'<!-- skip-tests: true -->' contract: pure ASCII, persisted in the
artifact, grep-detectable deterministically (not parsed from freeform
chat), and visible to anyone opening the file later. Locked by a new
Layer 1 lint test, per CLAUDE.md (a grep-catchable contract ships its
test in the same PR).

Validated live (claude 2.1.175, effort medium): headless run keeps the
marker (verify 4/4 pass); an approval answered via a PreToolUse hook
removes it.

Closes #135

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@FlySpot FlySpot force-pushed the fix/tasks-write-before-review branch from 98d5cbf to 75d2697 Compare June 15, 2026 11:56
@FlySpot FlySpot changed the title fix(tasks): make unreviewed headless output detectable in Step 5 fix(tasks): mark unreviewed tasks.md with a file-level marker, cleared on review Jun 15, 2026

@AlexanderMakarov AlexanderMakarov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review after 75d2697. Both points from my earlier pass are addressed, and well.

Two non-blocking follow-ups, neither a merge blocker:

  1. The new lint test pins that the marker exists and that a removal instruction exists, but not the lifecycle that's the actual contract — written in Step 4 before review, removed only once the user responds (reply on my earlier thread).
  2. Minor: <!-- not-user-reviewed --> and <!-- skip-tests: true --> now share "the top", so Step 5's removal is worth scoping explicitly to the one line (inline).

Happy to approve as-is.

Comment thread commands/tasks.md

1. Write the complete slice/task list to `tasks.md` in the chosen spec directory. **Write the file without waiting for approval** — generating a task list is reversible (re-run `/awos:tasks` to revise), so the deliverable must never be gated behind a confirmation that an unattended run cannot answer.
2. If `SKIP_TESTS = true`, record a one-line note at the top of the generated `tasks.md` so that downstream commands (e.g. `/awos:verify`) can detect the choice: `<!-- skip-tests: true -->`.
2. Record a one-line marker at the very top of the generated `tasks.md`: `<!-- not-user-reviewed -->`. The file is written before review (Step 5), so it starts as a draft; Step 5 removes this marker once the user has reviewed it. Keep the marker shape exactly — downstream automations grep for it to tell a draft from a reviewed plan.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The <!-- not-user-reviewed --> marker now shares "the top" with <!-- skip-tests: true --> (Step 4.3) whenever SKIP_TESTS = true. Both minor:

  • Step 5.3 says "remove the <!-- not-user-reviewed --> marker from the top of tasks.md" — naming the exact string is the right call, but worth stating explicitly to leave any <!-- skip-tests: true --> line intact. That one has to survive review, and a less careful pass could read "remove the top comment" as the whole block.
  • On write, the relative order of the two is unspecified. Harmless for grep (it's position-independent), but pinning "not-user-reviewed first, skip-tests second" reads cleaner.

Optional hardening, not a blocker.

Comment thread commands/tasks.md Outdated
2. If any tasks were assigned to `general-purpose` (because no specialist exists) or verification cannot be performed (missing MCPs/services), surface a table:
1. Report the saved path and present the slice/task plan for review.
2. Ask for review feedback strictly via the `AskUserQuestion` tool (e.g. options "Looks good — keep it as saved" / "I want changes") — never in plain text, which would end a non-interactive turn before the review outcome can be reported. If the user requests changes (adjust, split, merge slices or tasks, or reassign subagents), apply them and re-save; otherwise they can revise later by re-running `/awos:tasks`.
3. If the review question goes unanswered (dismissed, or the run is non-interactive), keep the saved file as-is and include this exact line in your final report: `tasks.md saved as drafted — not user-reviewed`. Keep the wording — downstream automations depend on it to tell a reviewed plan from a draft.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both addressed cleanly in 75d2697 — thanks. Moving the marker into the saved file as <!-- not-user-reviewed --> is a better fix than the ASCII chat-line I floated: persisted in the artifact, deterministic to grep, visible to anyone opening the file.

The new test checks two things: that the marker string is present, and that some "remove the marker" instruction exists. What it doesn't check is the order that actually matters — Step 4 writes the marker before review, and Step 5 removes it only after the user responds. So someone could later move the write to after review, or make Step 5 always delete it, and the test would still pass.

The real behavioral check belongs in awos-qa. But Layer 1 owns the prompt text, so it can go a bit further cheaply: assert that the write lives in Step 4 and that the removal is tied to a review response (e.g. the text "once the user has reviewed" or "When the user responds"). That closes the gap without having to parse sections.

@FlySpot FlySpot merged commit b906876 into main Jun 15, 2026
7 checks passed
@FlySpot FlySpot deleted the fix/tasks-write-before-review branch June 15, 2026 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

tasks: Step 4 draft-review gate stalls headless runs — tasks.md never written when the feedback ask is plain text

2 participants