Fix agent should not dismiss CI failures it cannot reproduce locally

## What happened

On PR #2909 fix iteration 3 (https://github.com/fullsend-ai/fullsend/pull/2909), the fix agent was asked to get CI passing and raise Codecov coverage. The agent added 4 tests and brought `queryMintHealth` to 100% coverage, but reported that it 'disagreed' with a CI lint failure because it could not reproduce the failure locally. The CI failure was a real `go vet` error: test code still used the old signature of `bundleFunctionSource`, which the refactor had changed from 1 argument to 3. The dismissal cost a 3-day delay: the human had to come back on July 5 and trigger a 4th `/fs-fix` to resolve the issue. The fix was a 1-line change per call site.
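To make the shape of the failure concrete, here is a minimal sketch of a stale call site after a 1-argument to 3-argument refactor. The real parameter list of `bundleFunctionSource` is not given in this issue, so the parameters `name`, `dir`, and `minify` and the function body are hypothetical stand-ins.

```go
package main

import "fmt"

// bundleFunctionSource stands in for the refactored production function.
// Assumption: the parameters (name, dir, minify) are invented for this
// sketch; the issue only says the function went from 1 argument to 3.
func bundleFunctionSource(name, dir string, minify bool) string {
	return fmt.Sprintf("%s/%s minify=%t", dir, name, minify)
}

func main() {
	// Stale call site left over from the 1-argument signature. If it were
	// uncommented, the package would no longer type-check:
	//   got := bundleFunctionSource("mint")

	// Updated call site: a 1-line change.
	got := bundleFunctionSource("mint", "functions", false)
	fmt.Println(got)
}
```

`go vet` type-checks `_test.go` files along with the rest of the package, so a stale call like the commented-out one surfaces as a vet failure when it lives in a test file.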

Related open issues: #1884 ('Fix agent should check CI status and address failing checks') covers the broader category of CI awareness, but does not address the specific anti-pattern of the agent investigating a failure, failing to reproduce it, and then dismissing it as spurious.

## What could go better

The fix agent's failure mode was not 'ignoring CI' (which #1884 covers) but 'investigating CI, failing to reproduce, and then concluding the failure is wrong.' The local/CI divergence was likely caused by the agent's local environment not having the same merged state as CI, or by the agent running only a subset of the tests. When a CI failure can't be reproduced locally, the correct behavior is to trust CI over the local result, read the exact CI error output, and fix based on that output. The agent should never conclude that a CI failure is wrong without evidence (e.g., demonstrating that the same test fails on the main branch). Confidence: high. This is a clear anti-pattern with a concrete 3-day cost on this PR.

## Proposed change

Add guidance to AGENTS.md (in the 'Go code' section, near the unit tests/coverage guidance) or to the fix agent's skill/definition:

> **CI failures take precedence over local results.** If CI reports a failure you cannot reproduce locally, trust CI. Read the full CI log output and fix based on the error message; do not dismiss failures as spurious. Common causes of local/CI divergence: stale dependencies, unmerged changes from main, different Go toolchain versions, running only a subset of tests. Only report a CI failure as a pre-existing flake if you can demonstrate that the same failure exists on the main branch.

If this is better addressed upstream in the fix agent definition (fullsend-ai/fullsend agent configs), add the guidance there instead. The key principle is: never dismiss a CI failure without positive evidence it's unrelated to the PR's changes.
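The rule in the proposed guidance can be sketched as a small decision function. This is only an illustration of the principle; the `Failure` type, its fields, and the `Action` values are hypothetical names and are not part of the fix agent's actual definition.

```go
package main

import "fmt"

// Failure describes what the agent knows about one failing CI check.
// Assumption: both fields are invented for this sketch.
type Failure struct {
	ReproducedLocally bool // whether the agent saw the same failure locally
	FailsOnMain       bool // whether the same failure was demonstrated on main
}

// Action is what the agent should do about the failure.
type Action string

const (
	FixFromCILog      Action = "fix based on the CI log output"
	ReportPreExisting Action = "report as pre-existing, with evidence from main"
)

// Triage applies the proposed rule: only positive evidence (the same
// failure on main) permits reporting a failure as pre-existing.
// ReproducedLocally is deliberately ignored: failing to reproduce a
// failure is not evidence that the failure is wrong.
func Triage(f Failure) Action {
	if f.FailsOnMain {
		return ReportPreExisting
	}
	return FixFromCILog
}

func main() {
	// The situation on PR #2909: not reproduced locally, no evidence from main.
	fmt.Println(Triage(Failure{ReproducedLocally: false, FailsOnMain: false}))
}
```

The point of the sketch is that `ReproducedLocally` never changes the outcome, which is the behavior the guidance asks for.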

## Validation criteria

Over the next 10 fix agent iterations where CI fails, the agent should never dismiss a failure without providing evidence it's pre-existing. Track by searching fix agent PR comments for dismissal language ('disagree', 'cannot reproduce', 'spurious', 'flaky') and verifying each is accompanied by evidence (e.g., link to same failure on main). Success: zero unsubstantiated dismissals in the next 10 relevant fix iterations.
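The tracking step could be partly automated along these lines. This is a rough sketch, not an existing tool: the function names are hypothetical, and treating any `https://` link as candidate evidence is an assumed heuristic, so a human would still need to confirm that the link really shows the same failure on main.

```go
package main

import (
	"fmt"
	"strings"
)

// dismissalTerms are the phrases listed in the validation criteria.
var dismissalTerms = []string{"disagree", "cannot reproduce", "spurious", "flaky"}

// hasEvidence is an assumed heuristic: any link counts as candidate
// evidence. It cannot tell whether the link shows the failure on main.
func hasEvidence(comment string) bool {
	return strings.Contains(comment, "https://")
}

// unsubstantiatedDismissal returns the first dismissal term found in the
// comment (matched case-insensitively) and whether the comment lacks
// candidate evidence. It returns ("", false) when no term is present.
func unsubstantiatedDismissal(comment string) (string, bool) {
	lower := strings.ToLower(comment)
	for _, term := range dismissalTerms {
		if strings.Contains(lower, term) {
			return term, !hasEvidence(comment)
		}
	}
	return "", false
}

func main() {
	// Made-up sample comments for illustration only.
	comments := []string{
		"I disagreed with the lint failure because it does not fail locally.",
		"This check looks flaky; same failure on main: https://example.com/run/1",
		"Added 4 tests and fixed the vet error.",
	}
	for _, c := range comments {
		if term, flagged := unsubstantiatedDismissal(c); flagged {
			fmt.Printf("needs review (%q): %s\n", term, c)
		}
	}
}
```

Substring matching means "disagree" also catches "disagreed"; the success criterion is still zero flagged comments across the next 10 relevant fix iterations.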

---
_Generated by retro agent from https://github.com/fullsend-ai/fullsend/pull/2909_

Fix agent should not dismiss CI failures it cannot reproduce locally #3019
