When a developer posts a `/retest` comment on a PR, they are manually triggering a CI rerun — the human-visible equivalent of the automated signal above. A PR with one or more `/retest` comments *may* indicate a flaky test, but this signal is weak on its own because:
+
+- `/retest` often follows a real fix (e.g. after pushing a correction), so not every rerun indicates flakiness
+- A PR with multiple `/retest` comments on a check that keeps failing is *more* suggestive of an intermittent issue
+- This signal is only visible when manually reading PR comments; the skill does not scan for it automatically
+
+Use it as a prompt for investigation, not as a classification. If you notice `/retest` comments while reviewing a PR and the check eventually passed, treat that as supporting evidence alongside Signals 1–3 above.
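As a rough illustration of how this weak signal could be weighed by hand, the sketch below counts `/retest` comments in a list of PR comment bodies and reports the result only as supporting evidence. The function name, the input shape, and the wording of the notes are hypothetical; the skill itself does not scan for this signal.

```python
def retest_evidence(comment_bodies, check_eventually_passed):
    """Summarize /retest comments as supporting evidence, never a verdict.

    comment_bodies: list of PR comment strings (hypothetical input shape).
    check_eventually_passed: True if the failing check later went green.
    """
    # Count comments whose first whitespace-separated token is exactly "/retest".
    retests = sum(
        1 for body in comment_bodies
        if body.strip().split()[:1] == ["/retest"]
    )
    if retests == 0:
        return {"retests": 0, "note": "no signal"}
    if check_eventually_passed:
        note = "supporting evidence only; confirm with stronger signals"
    else:
        note = "inconclusive; may be a real failure"
    return {"retests": retests, "note": note}
```

The function deliberately returns a note rather than a classification, mirroring the guidance that `/retest` comments prompt investigation and do not settle it.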
+

### Signal 5 — Symptom pattern match (weak, starting point only)
The error message matches a known timing or infrastructure error pattern:
@@ -103,7 +113,7 @@ Machine-readable source of truth for known flaky tests. Each entry has:
|`symptoms`| List of error strings or patterns observed |
|`first_seen` / `last_seen`| ISO dates |
|`pr_occurrences`| PR numbers where this was observed |
-|`status`|`active` / `intermittent` / `resolved`|
+|`status`|`suspected` / `confirmed` / `resolved`|
|`resolution`| What to do when this failure appears |
-This returns all active and intermittent registry entries as JSON. Load this once and use it for all Confirmed flaky checks across every failing test.
+This returns all suspected and confirmed registry entries as JSON. Load this once and use it for all registry-match checks across every failing test.
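A minimal sketch of the load-once, match-many pattern described here, under stated assumptions: the JSON is taken to be a list of objects carrying the `status` and `symptoms` fields from the registry table, and a symptom is treated as a plain substring of the error message. The function name `match_registry` and the `test` key in the usage example are hypothetical, and the real skill may match patterns differently.

```python
import json

# Statuses that the registry query is described as returning.
LIVE_STATUSES = {"suspected", "confirmed"}

def match_registry(registry_json, error_message):
    """Return registry entries whose symptoms appear in error_message.

    registry_json: JSON text assumed to be a list of entry objects.
    """
    entries = json.loads(registry_json)
    matches = []
    for entry in entries:
        # Skip resolved (or unknown) entries defensively.
        if entry.get("status") not in LIVE_STATUSES:
            continue
        # Assumption: each symptom is a literal substring, not a regex.
        if any(s in error_message for s in entry.get("symptoms", [])):
            matches.append(entry)
    return matches
```

For example, with a hypothetical registry payload:

```python
registry = json.dumps([
    {"test": "example-a", "status": "confirmed", "symptoms": ["timed out"]},
    {"test": "example-b", "status": "resolved", "symptoms": ["timed out"]},
])
hits = match_registry(registry, "request timed out after 30s")
# Only the entry with a live status is returned.
```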
### Step 4 — Classify each failing test
@@ -254,7 +254,7 @@ For each suspected flaky test, compare the PR's changed file paths against the f
## `mark-flaky` Sub-command
-Register a confirmed flaky test in `packages/gen-ai/.claude/skills/flake-check/flaky-tests.yaml` after a developer has verified the test is genuinely intermittent.
+Register a flaky test in `packages/gen-ai/.claude/skills/flake-check/flaky-tests.yaml` after a developer has seen it fail in a way that looks intermittent. Only genuinely flaky tests belong here; consistent failures caused by product bugs should be tracked in Jira and quarantined with `@Bug` in the test suite.
### Step 1 — Gather information
@@ -265,7 +265,7 @@ Ask the user for (or infer from a just-completed investigation):
- **PR numbers** — where it was observed (e.g. `#4821,#4897`)
- **Symptom** — the actual error message seen
- **Resolution** — what to do when it appears (e.g. `Rerun — passes on retry`)
-- **Status** — `intermittent` (default) or `active` (consistent blocker) or `resolved`
+- **Status** — `suspected` (default, seen once or twice) or `confirmed` (verified across multiple PRs) or `resolved`
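The gathered fields could be assembled into a registry entry along the following lines. This is a sketch only: `build_entry` is a hypothetical helper, the comma-separated `#NNNN` input format is inferred from the example above, and the output keys reuse the `symptoms`, `pr_occurrences`, `status`, and `resolution` names from the registry table. Writing the entry to `flaky-tests.yaml` is left out.

```python
VALID_STATUSES = ("suspected", "confirmed", "resolved")

def build_entry(pr_numbers, symptom, resolution, status="suspected"):
    """Build a registry entry dict from the information gathered in Step 1.

    pr_numbers: string such as "#4821,#4897" (assumed input format).
    status: defaults to "suspected", matching the documented default.
    """
    if status not in VALID_STATUSES:
        raise ValueError(f"status must be one of {VALID_STATUSES}, got {status!r}")
    # Strip the leading "#" and whitespace from each PR reference.
    prs = [
        int(part.strip().lstrip("#"))
        for part in pr_numbers.split(",")
        if part.strip()
    ]
    return {
        "symptoms": [symptom],
        "pr_occurrences": prs,
        "status": status,
        "resolution": resolution,
    }
```

Rejecting the old `active` and `intermittent` values up front keeps entries aligned with the renamed status vocabulary.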