Add optional skills-mode improve and verify stages #144
Summary
This PR adds two optional skills-mode pipeline stages (
Architecture & Design ✅
The approach is solid:
Concerns
1. Verify retry loop on blocking gate
When the verify gate throws ( Is this intentional? The existing
Suggestion: either add a
2. Budget borrowing
But the naming is misleading. If someone configures
This is a follow-up concern, not a blocker.
3. The
Specific Code Comments
function getStageSuccessStatus(task: TaskRow, stage: StatusTransition): TaskStatus {
  if (stage.label === "planner" && shouldRunSkillsModeImprove(task)) {
    return "improve";
  }
  if (stage.label === "implementer" && shouldRunSkillsModeVerify(task)) {
    return "verify";
  }
  if (stage.label === "verifier" && task.skipReview) {
    return "done";
  }
  return stage.onSuccess;
}
Clean. The three conditions are mutually exclusive (different stage labels), so no ordering ambiguity. The fallback to
if (!looksLikeFullPlanUpdate(currentPlan, improvedPlan)) {
  log.warn(/* ... */);
  persistTaskPlanForTask({ /* ... planText: currentPlan ... */ });
  return;
}
Good defensive pattern. The improver won't accidentally nuke a plan with partial output. The warning log provides enough context for debugging.
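A minimal sketch of this kind of guard follows. The real `looksLikeFullPlanUpdate` is not shown in this thread, so the heuristic (non-empty output, retained headings, a length floor), the `minLengthRatio` threshold, and the `choosePlanToPersist` helper are all hypothetical and only illustrate the fallback-to-current-plan idea.

```typescript
// Hypothetical heuristic: the real looksLikeFullPlanUpdate is not shown here.
// The improved plan counts as a full update only if it is non-empty, keeps at
// least one markdown heading when the current plan has headings, and is not
// drastically shorter than the current plan.
function looksLikeFullPlanUpdateSketch(
  currentPlan: string,
  improvedPlan: string,
  minLengthRatio = 0.5, // assumed threshold, not taken from the PR
): boolean {
  const improved = improvedPlan.trim();
  if (improved.length === 0) return false;

  const current = currentPlan.trim();
  if (current.length === 0) return true; // nothing to lose

  const hasHeading = (text: string): boolean => /^#{1,6}\s+\S/m.test(text);
  if (hasHeading(current) && !hasHeading(improved)) return false;

  return improved.length >= current.length * minLengthRatio;
}

// Hypothetical helper mirroring the fallback: keep the current plan when the
// improver's output looks partial.
function choosePlanToPersist(currentPlan: string, improvedPlan: string): string {
  return looksLikeFullPlanUpdateSketch(currentPlan, improvedPlan)
    ? improvedPlan
    : currentPlan;
}
```

With a check like this, a one-line improver reply such as "Looks good." would leave the stored plan untouched, while a complete rewritten plan would replace it.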
function extractVerifyGateResult(resultText: string): VerifyGateResult | null {
  const fence = resultText.match(/```aif-gate-result\s*([\s\S]*?)```/);
  if (!fence) return null;
  // ... JSON.parse with type narrowing
}
The parsing is defensive (returns
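For illustration, the elided "JSON.parse with type narrowing" step could look roughly like the sketch below. The actual `VerifyGateResult` fields are not shown in this thread, so the `status` and `blockers` fields, the type guard, and the `Sketch` names are assumptions. The fence pattern is assembled from a constant only to avoid literal triple backticks inside the example.

```typescript
// Hypothetical result shape: the real VerifyGateResult fields are not shown
// in the review, so `status` and `blockers` are illustrative only.
interface VerifyGateResultSketch {
  status: "pass" | "fail";
  blockers: string[];
}

// Type guard that narrows an unknown parsed value to the sketch shape.
function isVerifyGateResultSketch(value: unknown): value is VerifyGateResultSketch {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    (candidate.status === "pass" || candidate.status === "fail") &&
    Array.isArray(candidate.blockers) &&
    candidate.blockers.every((item) => typeof item === "string")
  );
}

// Three backticks, built at runtime so the example contains no nested fence.
const FENCE_MARK = "`".repeat(3);
const GATE_RESULT_PATTERN = new RegExp(
  FENCE_MARK + "aif-gate-result\\s*([\\s\\S]*?)" + FENCE_MARK,
);

function extractVerifyGateResultSketch(resultText: string): VerifyGateResultSketch | null {
  const fence = resultText.match(GATE_RESULT_PATTERN);
  if (!fence) return null;
  try {
    const parsed: unknown = JSON.parse(fence[1] ?? "");
    return isVerifyGateResultSketch(parsed) ? parsed : null;
  } catch {
    return null; // malformed JSON is treated like a missing result
  }
}
```

Returning `null` for a missing fence, malformed JSON, and a wrong shape alike keeps the caller's handling to a single branch, at the cost of losing the reason for the failure unless it is logged separately.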
The conditional chain for stage matching is getting long:
stage === "implementer"
  ? or(
      eq(tasks.status, "implementing"),
      and(eq(tasks.status, "plan_ready"), eq(tasks.autoMode, true)),
    )
  : stage === "improver"
    ? inArray(tasks.status, ["improve"])
    : stage === "plan-checker"
      ? and(eq(tasks.status, "plan_ready"), eq(tasks.autoMode, true))
      : stage === "planner"
        ? inArray(tasks.status, ["planning"])
        : stage === "verifier"
          ? inArray(tasks.status, ["verify"])
          : inArray(tasks.status, ["review"]);
This is a nested ternary chain with 6 branches. It works but is hard to read. Consider refactoring to a
Minor Issues
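Regarding the nested ternary chain in the code comments above, one possible way to flatten it is a lookup table keyed by stage. The sketch below is not necessarily the refactor the reviewer had in mind: it uses plain predicates over a stand-in `TaskLike` type rather than the query-builder expressions, and `"reviewer"` is an assumed name for the final fallback branch.

```typescript
// Stage names other than "reviewer" come from the ternary chain; "reviewer"
// is an assumed name for the final fallback branch.
type Stage =
  | "implementer"
  | "improver"
  | "plan-checker"
  | "planner"
  | "verifier"
  | "reviewer";

// Minimal stand-in for the task row: only the two fields the chain reads.
interface TaskLike {
  status: string;
  autoMode: boolean;
}

type TaskFilter = (task: TaskLike) => boolean;

// Shared condition used by both the implementer and plan-checker branches.
const isAutoPlanReady: TaskFilter = (task) =>
  task.status === "plan_ready" && task.autoMode;

// One entry per stage. Record<Stage, ...> makes the compiler flag a missing stage.
const stageFilters: Record<Stage, TaskFilter> = {
  implementer: (task) => task.status === "implementing" || isAutoPlanReady(task),
  improver: (task) => task.status === "improve",
  "plan-checker": isAutoPlanReady,
  planner: (task) => task.status === "planning",
  verifier: (task) => task.status === "verify",
  reviewer: (task) => task.status === "review",
};

function matchesStage(stage: Stage, task: TaskLike): boolean {
  return stageFilters[stage](task);
}
```

In the real query, each table value would hold the corresponding `eq` / `and` / `or` / `inArray` expression from the chain instead of a predicate. The benefit is the same either way: adding a stage becomes a one-line change that the type checker enforces.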
Questions for the Author
Files I'd Want to See Tests For
Overall: Clean, well-structured PR. The additive design, defense-in-depth guards, and defensive parsing make this safe to merge. The verify retry loop is the only behavioral concern worth discussing before merge.
Thanks for the review. I addressed the blocking verify retry concern in 1e1277a. Changes made:
Validation:
I left the budget naming and data query refactor notes as follow-up/non-blocking, since they do not affect the retry-loop behavior.
PR Review
Summary
The PR extends the workflow engine by adding optional Improve and Verify stages for skills-mode ( This is not a narrow, localized change but an extension of the execution model with conditional stages.
Strengths
Risks / Concerns
1. Growth of state machine complexity
Conditional stages (
2. Flag interaction complexity
Behavior depends on the combination of:
There is no explicit truth table, which makes the system's behavior harder to predict.
3. Verify semantics
4. Watchdog behavior
Extending the stale-stage watchdog to
Suggestions
Conclusion
The PR substantially expands the pipeline and fixes a critical retry-loop bug in verify. The main trade-off is increased FSM complexity and conditional branching, which calls for a stricter specification of behavior.
Addressed the latest review feedback in 7017e09. Added an explicit skills-mode flag truth table to docs/architecture.md, clarified Improve as plan refinement vs Verify as the execution validation gate, and documented why verify remains a coordinator stage for lifecycle/claim/timeout/profile/activity-log reuse. Added coordinator coverage for all flag combinations, including subagent-mode ignore behavior and skipReview interactions. Validation: npx vitest run packages/agent/src/tests/coordinator.test.ts, npm run format:check, git diff --check, npm run ai:validate.
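As an illustration of what covering all flag combinations can involve, here is a small sketch that enumerates every on/off combination of a set of flags, for example to drive table-style tests or to generate truth-table rows. It is not taken from the commit: the helper name is invented, and of the example flag names only `skipReview` appears in this thread, so the others are hypothetical.

```typescript
// Generic helper: every on/off combination of the given flag names.
// Bit i of the mask decides the value of names[i].
function allFlagCombinations(names: readonly string[]): Array<Record<string, boolean>> {
  const rows: Array<Record<string, boolean>> = [];
  const total = 1 << names.length; // 2^n combinations
  for (let mask = 0; mask < total; mask++) {
    const row: Record<string, boolean> = {};
    names.forEach((name, bit) => {
      row[name] = (mask & (1 << bit)) !== 0;
    });
    rows.push(row);
  }
  return rows;
}

// Hypothetical flag names for illustration; only skipReview appears in the thread.
const exampleFlagRows = allFlagCombinations([
  "skillsMode",
  "improveEnabled",
  "verifyEnabled",
  "skipReview",
]);
```

Four flags yield sixteen rows, which is small enough to list exhaustively in documentation and to assert one expected stage sequence per row in tests.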
Review: Request changes
Thanks for the update. The structured
I found one blocking behavior issue: in the normal
Please either persist verification output in a dedicated field/log, or make the reviewer preserve/merge the existing Verification section when writing its final
Non-blocking note: issue #131 describes
Inline comment
File:
Summary
Validation
Related issue: #131