What happened
On PR #2758, the review agent ran 4 times and approved the final commit. It correctly identified test-integrity issues (weakened assertions) but missed a cross-file consistency bug: env.sandbox in harness/review.yaml references ${REVIEW_FINDING_SEVERITY_THRESHOLD}, but the reusable workflow that invokes the harness does not define this variable. The Qodo code review bot caught this; the fullsend review agent did not. The ADR 0055 migration will repeat this exact pattern for 5 more agents (triage, code, fix, retro, prioritize), making this a recurring risk.
What could go better
The review agent performed well on code-level analysis (test assertions) but lacked a check for cross-file consistency between harness configs and their calling workflows. This class of bug — where a config references a variable that no caller provides — is particularly important during migration PRs that change the variable delivery mechanism. Confidence is high that this is a real gap: the agent had access to both files and could have detected the mismatch. Existing issues #2676 (incomplete migrations) and #2703 (env var validation before LLM start) are adjacent but do not cover review-agent detection of env.sandbox/workflow mismatches specifically.
Proposed change
Add a review dimension or checklist item to the review agent's pr-review skill (or its sub-agent definitions) for harness/workflow consistency checks. When a PR modifies env.sandbox or env.runner sections in harness YAML files, the review agent should:
- Identify all
${VAR} references in the modified env sections.
- Check whether each referenced variable is defined in the corresponding reusable workflow's
env: block for the agent execution step.
- Flag any variable that is referenced but not provided by the workflow.
This could be implemented as guidance in skills/pr-review/SKILL.md or as a dedicated sub-agent for harness-config reviews. The specific file to update depends on the review agent's architecture, but the skill definition at internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md is a natural starting point.
Validation criteria
On the next PR that migrates a harness from host_files/runner_env to env.sandbox (e.g., triage or code agent migration under ADR 0055), the review agent should flag any ${VAR} references in env.sandbox that are not defined in the corresponding reusable workflow. Validate by checking the review agent's findings on at least 2 such migration PRs.
Generated by retro agent from #2758
What happened
On PR #2758, the review agent ran 4 times and approved the final commit. It correctly identified test-integrity issues (weakened assertions) but missed a cross-file consistency bug:
env.sandboxinharness/review.yamlreferences${REVIEW_FINDING_SEVERITY_THRESHOLD}, but the reusable workflow that invokes the harness does not define this variable. The Qodo code review bot caught this; the fullsend review agent did not. The ADR 0055 migration will repeat this exact pattern for 5 more agents (triage, code, fix, retro, prioritize), making this a recurring risk.What could go better
The review agent performed well on code-level analysis (test assertions) but lacked a check for cross-file consistency between harness configs and their calling workflows. This class of bug — where a config references a variable that no caller provides — is particularly important during migration PRs that change the variable delivery mechanism. Confidence is high that this is a real gap: the agent had access to both files and could have detected the mismatch. Existing issues #2676 (incomplete migrations) and #2703 (env var validation before LLM start) are adjacent but do not cover review-agent detection of env.sandbox/workflow mismatches specifically.
Proposed change
Add a review dimension or checklist item to the review agent's pr-review skill (or its sub-agent definitions) for harness/workflow consistency checks. When a PR modifies
env.sandboxorenv.runnersections in harness YAML files, the review agent should:${VAR}references in the modified env sections.env:block for the agent execution step.This could be implemented as guidance in
skills/pr-review/SKILL.mdor as a dedicated sub-agent for harness-config reviews. The specific file to update depends on the review agent's architecture, but the skill definition atinternal/scaffold/fullsend-repo/skills/pr-review/SKILL.mdis a natural starting point.Validation criteria
On the next PR that migrates a harness from
host_files/runner_envtoenv.sandbox(e.g., triage or code agent migration under ADR 0055), the review agent should flag any${VAR}references inenv.sandboxthat are not defined in the corresponding reusable workflow. Validate by checking the review agent's findings on at least 2 such migration PRs.Generated by retro agent from #2758