.agents/skills/flaky-test-investigator/SKILL.md
Lines changed: 2 additions & 22 deletions
@@ -42,8 +42,6 @@ For every failure, try to retrieve:
 - **Server logs** (`kibana.log`, `elasticsearch.log` when present). Cross-reference the failure timestamp with any errors in the logs — a server-side 500 or unexpected warning is strong evidence the failure is a product bug, not a test bug.
 - **Full session trace** when the framework supports it (Scout / Playwright). Lets you scrub through every step, locator query, network call, and DOM snapshot.

-How to actually find and download each artifact type is framework-specific — see "Retrieve failure artifacts" below.
-
 Things to specifically check in the artifacts before forming a root-cause hypothesis:

 - **Did the expected element render at all?** If yes and the selector missed it → flaky selector (Tier 2 fix territory). If no → real rendering / race / data issue (Tier 1 territory).
@@ -53,27 +51,9 @@ Things to specifically check in the artifacts before forming a root-cause hypoth

 If artifacts are not available (expired, not uploaded, no `read_artifacts` token), say so in the report rather than fabricating a hypothesis. "Screenshot would have resolved this; not available" is a valid open question.

-### Retrieve failure artifacts
-
-The standard recipe is **list → filter by path → download by ID**, always scoped to the failed job's UUID. Two Buildkite gotchas to know about first:
-
-- **Failed-attempt jobs are hidden by default.** `/builds/<n>` returns only the latest attempt; append `?include_retried_jobs=true` to find the original failing job (the one cited in `failed-test` comments). `retried` and `retried_in_job_id` link the two.
-- **Per-job artifacts use a different endpoint than build-wide artifacts.** If a build retried to green, failure artifacts only live on the failed job's listing (`bk artifacts list <build> -p <pipeline> --job-uuid <jobId>`). Don't conclude "no screenshot uploaded" until you've checked there.
-
-**Scout** (`@kbn/scout-reporting`, not standard Playwright output — `playwright-report/`, `trace.zip`, and video are NOT published):
-
-- `.scout/reports/scout-playwright-test-failures-<runId>/test-failures-summary.json` — maps test name → HTML report. Start here.
-- `.scout/reports/scout-playwright-test-failures-<runId>/<testId>.html` — self-contained: error, stdout, embedded screenshot. Usually sufficient on its own.
-- `.scout/reports/scout-playwright-test-failures-<runId>/scout-failures-<runId>.ndjson` — one record per failure (`id` = `<testId>`, `owner`, `location`, `error.*`) for programmatic use.
-- `**/.scout/test-artifacts/<test-slug>/test-failed-<N>.png` — plain Playwright screenshot; the PNG doesn't carry `<testId>`, so correlate via spec path.
-
-**FTR** (a single content `<hash>` links every artifact for one failure):
-
-- `target/test_failures/<jobId>_<hash>.{json,log,html}` — `.json` is source of truth; full Kibana/ES stdout lives in `system-out` (there is no separate `kibana.log`). Pull this first.
-- `<test-root>/screenshots/failure/*-<hash>.png` and `<test-root>/failure_debug/html/*-<hash>.html` — UI tests only; fetch only when the failure is UI-side.
-- `.es/*.log` — transport/cluster-shaped failures.
+### List failure artifacts

-`target/test_failures/` is shared with Scout; filter by `.jobName` (e.g. `FTR Configs #90` vs `Scout Lane #12`) to keep only FTR. On Cloud FTR pipelines the layout differs: one self-contained HTML per failure at `<config-path-with-underscores>-<unix-timestamp>/html/<contentHash>.html` — no `target/test_failures/`, screenshot, or DOM artifacts.
+`bk artifacts list <build> -p <pipeline> --job-uuid <jobId>` returns a JSON listing of every artifact uploaded for the failing job. Pass `--job-uuid <jobId>` for the failed attempt (without it, `bk` only returns the latest attempt and hides retried failures). If a build retried to green, failure artifacts only live on the failed job's listing; don't conclude "no screenshot" until you've scoped to the right job UUID.
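The "list, then filter by path" step that the new section relies on can be sketched roughly as follows. This is only an illustration: it assumes the JSON listing is an array of objects carrying `id` and `path` fields (the exact shape of `bk artifacts list` output is not shown in this diff), and the `filter_artifacts` helper and the glob patterns are hypothetical names chosen for the example.

```python
import json
from fnmatch import fnmatchcase


def filter_artifacts(listing_json, patterns):
    """Return artifacts whose path matches at least one glob pattern.

    Assumption: `listing_json` is a JSON array of objects with a "path"
    key, as a job-scoped artifact listing might provide.
    """
    artifacts = json.loads(listing_json)
    return [
        artifact
        for artifact in artifacts
        # fnmatchcase is case-sensitive on every OS and its "*" also
        # matches "/" characters, so one pattern can span directories.
        if any(fnmatchcase(artifact.get("path", ""), p) for p in patterns)
    ]


if __name__ == "__main__":
    # Hypothetical listing; real IDs and paths come from the failed job.
    sample = json.dumps([
        {"id": "a1", "path": "target/test_failures/job1_abc.json"},
        {"id": "a2", "path": "build/output.txt"},
    ])
    for artifact in filter_artifacts(sample, ["target/test_failures/*.json"]):
        print(artifact["id"], artifact["path"])
```

Filtering on the job-scoped listing, rather than a build-wide one, keeps the "retried to green" case from hiding the failure artifacts.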
.github/workflows/failed-test-investigator.md
Lines changed: 3 additions & 14 deletions
@@ -143,28 +143,17 @@ Post exactly one comment. Keep the visible portion very short and easy to read:

 1. **One-line bold headline** stating the result kind and one identifying detail.
 2. **Diagnosis** (≤5 concise bullet points): what broke and where, the most likely root cause.
-3. **Next steps** (≤5 concise bullet points).
+3. **Recommended next steps** (≤5 concise bullet points).

-Put the full `flaky-test-investigator` skill output inside a collapsed `<details><summary>Investigation details</summary> ... </details>` block (not in the visible portion). Open the block with a `#### Findings` subsection containing exactly these four bullets in this order — downstream tooling parses them, so preserve keys, casing, and `` - `key`: value `` shape. These bullets must live **inside `<details>`**, never in the visible portion:
+Put the full `flaky-test-investigator` skill output inside a collapsed `<details><summary>Investigation details</summary> ... </details>` block (not in the visible portion).

 The skill's "Reporting" subsections should also be inside the collapsible section:

 - What the test does
-- What failed and when
 - Where it ran
 - Root cause hypothesis
 - Evidence
-- Failure screenshot
-- Recommended next step
+- Failure screenshot (omit this section if not available)
 - Open questions

 Blank lines around `</summary>` and `</details>` are required for the inner markdown to render.
-
-End the comment with this footer line (verbatim, on its own line after the `</details>` block):
-
-`<sup>AI-generated, share feedback in [#appex-qa](https://elastic.slack.com/archives/C04HT4P1YS3)</sup>`
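To make the resulting comment shape concrete, here is a minimal sketch of how the post-change layout could be assembled. The `build_comment` function, its parameter names, and the bold section labels are assumptions made for illustration and are not part of the workflow. The only rules taken from the text are the five-bullet caps, the collapsed `<details>` block, and the blank lines around `</summary>` and `</details>`.

```python
def build_comment(headline, diagnosis, next_steps, details_md):
    """Assemble a comment: short visible part plus a collapsed details block."""
    for name, bullets in (("diagnosis", diagnosis), ("next_steps", next_steps)):
        if len(bullets) > 5:
            raise ValueError(f"{name} allows at most 5 bullet points")

    lines = [f"**{headline}**", ""]
    lines.append("**Diagnosis**")
    lines.extend(f"- {bullet}" for bullet in diagnosis)
    lines.append("")
    lines.append("**Recommended next steps**")
    lines.extend(f"- {bullet}" for bullet in next_steps)
    lines.append("")
    # Blank lines after </summary> and around </details> let the inner
    # markdown render when the block is expanded.
    lines.append("<details><summary>Investigation details</summary>")
    lines.append("")
    lines.append(details_md.strip())
    lines.append("")
    lines.append("</details>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_comment(
        "Flaky selector: dashboard panel title",  # hypothetical headline
        ["Selector missed an element that did render"],
        ["Switch to a stable test subject"],
        "#### Root cause hypothesis\nSelector raced the panel render.",
    ))
```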