run-uat stuck loop: verdict written to S{sid}-ASSESSMENT.md but checkNeedsRunUat reads S{sid}-UAT.md #2644

@chrisleduc

Problem

Auto-mode enters a stuck loop on run-uat units (3 consecutive dispatches, no progress). The UAT runner writes its verdict to S{sid}-ASSESSMENT.md (via gsd_summary_save(artifact_type: "ASSESSMENT") as instructed by the prompt), but checkNeedsRunUat checks S{sid}-UAT.md for a verdict. The UAT spec file has no verdict, so hasVerdict returns false on every loop iteration and the unit keeps re-dispatching.

Root Cause

Three-way mismatch across three files:

File 1: prompts/run-uat.md (line 58)
The prompt instructs the LLM to call gsd_summary_save with artifact_type: "ASSESSMENT":

Call `gsd_summary_save` with `milestone_id: …`, `slice_id: …`, `artifact_type: "ASSESSMENT"`

→ This writes {sid}-ASSESSMENT.md (e.g. S01-ASSESSMENT.md) with verdict: PASS in frontmatter.

File 2: auto-prompts.ts lines ~781–810 (checkNeedsRunUat)
The dispatch guard reads S{sid}-UAT.md — the original spec file — to detect whether UAT has already run:

const uatFile = resolveSliceFile(base, mid, sid, "UAT");
const uatContent = await loadFile(uatFile);   // loads S01-UAT.md (the spec)
if (hasVerdict(uatContent)) return null;       // spec has no verdict → always false

File 3: verdict-parser.ts line 35 (hasVerdict)
Correct function, wrong file being passed to it. The verdict is in S01-ASSESSMENT.md, not S01-UAT.md.

Sequence:

  1. run-uat dispatched → agent calls gsd_summary_save(artifact_type: "ASSESSMENT") → S01-ASSESSMENT.md written with verdict: PASS
  2. verifyExpectedArtifact("run-uat", …) resolves to S01-UAT.md (exists as spec file) → returns true — unit appears complete
  3. Next iteration: checkNeedsRunUat loads S01-UAT.md, calls hasVerdict → false (spec has no verdict)
  4. checkNeedsRunUat returns { sliceId, uatType } → re-dispatches run-uat
  5. Stuck-loop detection fires after 3 identical dispatches
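
The sequence above can be reproduced with a small in-memory model. This is only a sketch: `hasVerdictStub`, `needsRunUatAsReported`, `simulateDispatches`, and the `Map`-backed file store are hypothetical stand-ins, not the real GSD functions, and the frontmatter check is a simplified guess at what hasVerdict does.

```typescript
// Hypothetical in-memory stand-in for the slice directory (file name -> content).
type SliceFiles = Map<string, string>;

// Simplified verdict check: looks for a `verdict:` key inside YAML frontmatter.
// The real hasVerdict in verdict-parser.ts may behave differently.
function hasVerdictStub(content: string | undefined): boolean {
  if (!content) return false;
  const fm = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  return fm !== null && /^verdict:\s*\S+/m.test(fm[1] ?? "");
}

// Mirrors the guard as reported: only the UAT spec file is consulted.
function needsRunUatAsReported(files: SliceFiles, sid: string): boolean {
  return !hasVerdictStub(files.get(`${sid}-UAT.md`));
}

// Counts dispatches until the guard clears or the stuck-loop limit is reached.
function simulateDispatches(files: SliceFiles, sid: string, limit = 3): number {
  let dispatches = 0;
  while (needsRunUatAsReported(files, sid) && dispatches < limit) {
    // The agent writes its verdict to the ASSESSMENT file, as the prompt instructs.
    files.set(`${sid}-ASSESSMENT.md`, "---\nverdict: PASS\n---\n");
    dispatches++;
  }
  return dispatches;
}

const sliceFiles: SliceFiles = new Map([["S01-UAT.md", "# UAT spec\n\n- step 1\n"]]);
const dispatchCount = simulateDispatches(sliceFiles, "S01");
console.log(dispatchCount); // 3: the guard never sees the verdict
```

The ASSESSMENT file carries a verdict after the first pass, yet the guard keeps returning true because it never reads that file.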

Expected Behavior

checkNeedsRunUat should also check S{sid}-ASSESSMENT.md for a verdict (the file the prompt actually tells the agent to write). If either the UAT spec file or the ASSESSMENT file contains a verdict, UAT has been run and dispatch should be skipped.

Proposed fix in auto-prompts.ts (~line 783, DB path):

// After: if (hasVerdict(uatContent)) return null;
// Add:
const assessmentFile = resolveSliceFile(base, mid, sid, "ASSESSMENT");
if (assessmentFile) {
  const assessmentContent = await loadFile(assessmentFile);
  if (assessmentContent && hasVerdict(assessmentContent)) return null;
}

Apply the same check to the file-based fallback path (~line 809).
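
To keep the two paths from drifting apart, the combined check could live in one pure helper that both paths call after loading the file contents. This is a minimal sketch: `uatAlreadyRan` and `verdictInFrontmatter` are hypothetical names, and the frontmatter parsing is a simplified stand-in for the real hasVerdict.

```typescript
// Simplified stand-in for hasVerdict: true when YAML frontmatter has a `verdict:` key.
function verdictInFrontmatter(content: string | null | undefined): boolean {
  if (!content) return false;
  const fm = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  return fm !== null && /^verdict:\s*\S+/m.test(fm[1] ?? "");
}

// Hypothetical helper: UAT counts as done if either file carries a verdict.
// Missing files are passed as null/undefined and simply count as "no verdict".
function uatAlreadyRan(
  uatContent: string | null | undefined,
  assessmentContent: string | null | undefined,
): boolean {
  return verdictInFrontmatter(uatContent) || verdictInFrontmatter(assessmentContent);
}

// Example: spec without a verdict, assessment with one.
const specContent = "# UAT spec\n\n- step 1\n";
const assessmentText = "---\nverdict: PASS\n---\nAll checks passed.\n";
const skipDispatch = uatAlreadyRan(specContent, assessmentText);
console.log(skipDispatch); // true
```

Taking already-loaded contents rather than paths keeps the helper synchronous and easy to unit test. Each path would still do its own file resolution and loading.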

Alternative fix: Change the run-uat prompt to write the verdict into S{sid}-UAT.md directly (updating the existing file's frontmatter) instead of creating a separate ASSESSMENT file. This would keep the verdict in the file that checkNeedsRunUat already reads.
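
For the alternative fix, updating the spec file's frontmatter in place might look roughly like the sketch below. `withVerdict` is a hypothetical helper, not an existing GSD function, and it assumes a flat `key: value` frontmatter block delimited by `---` lines.

```typescript
// Hypothetical helper: returns `content` with `verdict: <verdict>` set in its
// YAML frontmatter, adding a frontmatter block if none exists.
function withVerdict(content: string, verdict: string): string {
  const fm = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(content);
  if (fm === null) {
    // No frontmatter yet: prepend a new block and keep the body untouched.
    return `---\nverdict: ${verdict}\n---\n${content}`;
  }
  const body = content.slice(fm[0].length);
  // Drop any earlier verdict line so the key appears exactly once.
  const lines = (fm[1] ?? "").split(/\r?\n/).filter((line) => !/^verdict:/.test(line));
  lines.push(`verdict: ${verdict}`);
  return `---\n${lines.join("\n")}\n---\n${body}`;
}

const updatedSpec = withVerdict("---\ntitle: UAT\nverdict: FAIL\n---\n# Spec\n", "PASS");
console.log(updatedSpec);
```

One trade-off of this route is that the spec file stops being a pure specification, since it now also records the run result.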

Environment

  • GSD version: 2.50.0
  • Model: claude-sonnet-4-6
  • Unit: run-uat M022-tj9p23/S01 (dispatched 3 times, $1.75 wasted)

Reproduction Context

  • Phase: post-slice UAT for run-uat unit type with artifact-driven mode
  • The S{sid}-UAT.md spec file exists (written by gsd_slice_complete)
  • Agent correctly calls gsd_summary_save(artifact_type: "ASSESSMENT") as instructed
  • S{sid}-ASSESSMENT.md is written with verdict: PASS in YAML frontmatter
  • S{sid}-UAT.md retains original spec content — no verdict field
  • Loop re-dispatches run-uat until the stuck-loop guard fires

Forensic Evidence

  • Anomaly: run-uat/M022-tj9p23/S01 dispatched 2 times (stuck-loop warning)
  • S01-ASSESSMENT.md present with verdict: PASS — correct output
  • S01-UAT.md present with no verdict — causes hasVerdict to return false
  • checkNeedsRunUat (auto-prompts.ts ~line 781) only checks resolveSliceFile(…, "UAT"), never checks resolveSliceFile(…, "ASSESSMENT")
  • Session cost: $7.47 / 14.38M tokens for 8 units (3 redundant UAT runs)

Auto-generated by /gsd forensics
