Skip to content

Commit 23f2119

Browse files
authored
agents.md: ban adding test skips without explicit user request (erigontech#21232)
## Summary Adds a project-level policy in [agents.md](agents.md) for test skips. The rule applies to every contributor and to every automated agent (LLM coding assistants etc.) working in the repo. ## Three-part policy 1. **Why skips are dangerous** — concrete case: erigontech#21153 removed a `t.Skip` documenting a known parallel-exec SD/CREATE2 bug; the underlying bug was never actually fixed, so removing the skip suddenly red'd CI across downstream PRs. 2. **Two valid skip reasons (humans only, with tracking issue)**: - External test suites we import where we can't pass all the tests yet - Flakes with very low tolerance (general rule: reproduce locally and fix; only skip after serious investigation, with the investigation linked in the tracking issue) 3. **Strict ban for automated agents**: agents must never add a skip, period. Not as a tactical unblock, not behind a flag, not as an `AskUserQuestion` option. Default trajectory for a failing test is investigate → reproduce → fix → verify. If genuinely can't fix in-session, escalate with investigation findings — never silently mute. ## Why Skips convert a loud "this is broken" CI signal into silence, then back into surprise when the skip is later removed. The end state we want: humans add skips only for the two narrow reasons with tracking issues; automated agents always go through investigate → reproduce → fix, never → skip. ## Test plan - Docs-only change to `agents.md`; no code paths affected. - No CI gating to add — this is a behavioural directive for both human contributors reading the repo's contributor docs and for automated agents that read `agents.md`/`CLAUDE.md` at session start.
1 parent 1cf1ea8 commit 23f2119

1 file changed

Lines changed: 33 additions & 0 deletions

File tree

agents.md

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,39 @@ Before committing, always verify changes with: `make lint && make erigon integra
4242
./build/bin/erigon --datadir=dev --chain=dev --beacon.api=beacon,validator,node,config # PoS dev mode
4343
```
4444

45+
## Test skips
46+
47+
These rules apply project-wide — to every contributor and to every automated agent (LLM coding assistants, CI bots, etc.) working in this repository.
48+
49+
### Why skips are dangerous
50+
51+
A failing test is a real failure that must be diagnosed and fixed. Skipping it hides the failure and pushes the cost onto whoever later removes the skip — at which point the underlying bug is still there and now also surprises them. Skipping converts a loud "this is broken" signal into silence, then back into surprise. Concrete case: `#21153` removed a `t.Skip` for `TestGeneratedTraceApiCollision` that had documented a known parallel-exec SD/CREATE2-reincarnation bug; the underlying bug was never actually fixed (the comment said "fixed on `exec3/remove-rwtx-threading` branch" — that branch's fix never merged), so removing the skip suddenly red'd CI across downstream PRs (notably #21017).
52+
53+
### Two valid reasons a skip may exist
54+
55+
Both apply to human contributors. Both require an explicit, linked tracking issue. Neither permits an automated agent to add the skip on its own.
56+
57+
1. **External test suites we import where we can't pass all the tests** — typically because we haven't done the corresponding development yet (e.g., an upstream Ethereum spec test for a feature we haven't implemented). The skip documents the gap rather than hiding a regression.
58+
59+
2. **Flaky tests** — partially valid, with very low (not zero) tolerance. The general rule for flakes is **reproduce locally and fix**. Only after a serious attempt at local repro and root-cause analysis (not "I ran it three times and it passed") should a skip be considered. The tracking issue must include the local-repro investigation attached, and the test owner accepts responsibility for un-skipping once the flake is fixed.
60+
61+
In both cases the skip carries an inline comment with the linked tracking issue, and the issue gets closed by removing the skip — not by closing the issue with the skip still in place.
62+
63+
### Rule for automated agents (LLM assistants etc.)
64+
65+
**Automated agents must never add a skip. Period.** Not even with a "the user can review it" framing. Not as an option in `AskUserQuestion` menus. Not as a "tactical unblock" suggestion in text answers. Not behind any conditional or env-var gate.
66+
67+
When an agent encounters a failing test:
68+
- Investigate the failure: read logs, reproduce locally, narrow to a minimal repro
69+
- Fix the underlying bug
70+
- If the agent genuinely can't fix it in-session, the correct outcomes are: (a) escalate to the user with the investigation findings, or (b) report it as a tracked blocker — never (c) skip it
71+
72+
If a flaky test is blocking the agent's own CI iteration, the agent reproduces locally and either fixes the flake or hands off to the user with the repro recipe. Adding a skip "just to get CI green" is exactly the pattern that produced #21153's surprise.
73+
74+
Applies to all forms of test muting: `t.Skip`, `t.SkipNow`, `t.Skipf`, `SkipLoad`, `bt.SkipLoad`, build-tag exclusions, conditional bypasses behind `dbg.*` env flags, removing tests from a runner matrix without a tracking issue. **All off-limits for automated agents.**
75+
76+
If a user explicitly directs an agent to add a skip in the current turn (overriding this rule for a specific case), the agent should still flag the trade-off and ensure a tracking issue exists.
77+
4578
## Conventions
4679

4780
Commit messages: prefix with package(s) modified, e.g., `eth, rpc: make trace configs optional`

0 commit comments

Comments
 (0)