Commit 0db996b

committed
statistics
1 parent a996505 commit 0db996b

2 files changed

Lines changed: 18 additions & 3 deletions

.github/aw/actions-lock.json

Lines changed: 5 additions & 0 deletions
@@ -5,6 +5,11 @@
 "version": "v8",
 "sha": "ed597411d8f924073f98dfc5c65a23a2325f34cd"
 },
+"actions/upload-artifact@v4": {
+"repo": "actions/upload-artifact",
+"version": "v4",
+"sha": "ea165f8d65b6e75b540449e92b4886f43607fa02"
+},
 "github/gh-aw/actions/setup@v0.51.5": {
 "repo": "github/gh-aw/actions/setup",
 "version": "v0.51.5",

.github/workflows/ci-doctor.md

Lines changed: 13 additions & 3 deletions
@@ -325,7 +325,7 @@ You are the CI Failure Doctor, an expert investigative agent that analyzes faile
 - **Do NOT use** ISO 8601 format with colons (e.g., `2026-02-12T11:20:45.458Z`) - colons are not allowed in artifact filenames
 - Store error patterns in `/tmp/gh-aw/cache-memory/patterns/`
 - Maintain an index file of all investigations for fast searching
-2. **Update Pattern Database**: Enhance knowledge with new findings by updating pattern files. For every investigation, also update or create a per-pattern record under `/tmp/gh-aw/cache-memory/patterns/<signature-hash>.json` with the following schema, so reproduction frequency and timing can be tracked across runs:
+2. **Update Pattern Database — read-modify-write, never overwrite**: Maintain one record per failure signature under `/tmp/gh-aw/cache-memory/patterns/<signature-hash>.json`. Schema:
 
 ~~~json
 {
@@ -339,8 +339,16 @@ You are the CI Failure Doctor, an expert investigative agent that analyzes faile
 }
 ~~~
 
-When the file already exists, increment `count`, refresh `last_seen`, and prepend the current run URL to `recent_run_urls` (keep at most 10 entries). Never recompute `first_seen`.
-3. **Build Statistics Snapshot**: Before sending the Teams notification, aggregate all per-pattern files into a single in-memory database snapshot used to populate `notify_teams.statistics` and `notify_teams.statistics_json` (see Output Requirements). Sort patterns by `count` descending, ties broken by most recent `last_seen`.
+The persisted `count` MUST equal the number of times this signature has actually been observed across runs. The most common bug is overwriting the file with `count: 1` on every run; follow these rules to avoid it:
+
+- **Stable signature first.** Compute `<signature-hash>` deterministically from inputs that do NOT change between reruns of the same failure: normalized primary error message (strip absolute paths, line/column numbers, hex addresses, PIDs, timestamps, run IDs, commit SHAs, tmp dirs, and UUIDs), failed job name, and failure category. Two reruns of the same failure MUST produce the same hash. If they don't, `count` will stay at `1` forever — fix the normalization before writing anything.
+- **Read before write.** Always check whether `/tmp/gh-aw/cache-memory/patterns/<signature-hash>.json` already exists and load it. Do NOT generate a fresh record from scratch and clobber the existing file.
+- **If the file exists**: set `count = previous.count + 1`, set `last_seen = <now UTC>`, keep `first_seen` unchanged, prepend the current run URL to `recent_run_urls` and truncate to the 10 most recent entries (deduplicate by URL). Refresh `title`/`category` only if previously empty.
+- **If the file does not exist** but the same signature appears in prior `/tmp/gh-aw/cache-memory/investigations/*.json` entries (e.g., the patterns dir was lost or never populated), reconstruct the record: set `count` to the number of matching investigation files including the current one, derive `first_seen`/`last_seen` from those investigations' timestamps, and seed `recent_run_urls` from them. Then write the file.
+- **Only if neither exists**: create a new record with `count: 1`, `first_seen = last_seen = <now UTC>`.
+- **Reconciliation invariant.** After writing, the persisted `count` for the current signature MUST equal `notify_teams.occurrence_count`. If they differ, the persisted record is wrong — fix it (typically by switching to the read-modify-write path above) before sending the notification. Apply the same reconciliation pass to every other pattern file you touch when building the snapshot: if a pattern file's `count` is lower than the number of matching investigation files for the same signature, raise `count` to that number.
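The stable-signature and read-modify-write rules above can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual implementation: the agent is driven by the prompt text, and the helper names (`signature_hash`, `record_occurrence`), the regex list, the 16-character hash length, and the `history` parameter are all assumptions made for the example. Only the field names `count`, `first_seen`, `last_seen`, `recent_run_urls`, `title`, and `category` come from the prompt; the rest of the schema is elided in this diff.

```python
import hashlib
import json
import re
from pathlib import Path

# Hypothetical normalization rules, applied in order. UUIDs and timestamps go
# first so the broader rules below do not split them into pieces.
_VOLATILE = [
    (re.compile(r"\b[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\b", re.I), "<uuid>"),
    (re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?"), "<ts>"),
    (re.compile(r"0x[0-9a-f]+", re.I), "<hex>"),
    (re.compile(r"\b[0-9a-f]{7,40}\b", re.I), "<sha>"),
    (re.compile(r"(?:/[\w.\-]+)+"), "<path>"),
    (re.compile(r":\d+(?::\d+)?"), ":<line>"),
    (re.compile(r"\b\d{4,}\b"), "<id>"),
]


def signature_hash(error_message, job_name, category):
    """Hash only the parts of a failure that stay the same across reruns."""
    normalized = error_message
    for pattern, token in _VOLATILE:
        normalized = pattern.sub(token, normalized)
    normalized = " ".join(normalized.split()).lower()
    key = "\n".join([normalized, job_name, category])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def record_occurrence(patterns_dir, sig, run_url, title, category, now, history=()):
    """Read-modify-write one pattern record and return it.

    history is an assumed input: (timestamp, run_url) pairs taken from earlier
    investigation files with the same signature, oldest first. It is used only
    when no pattern file exists yet.
    """
    path = Path(patterns_dir) / f"{sig}.json"
    if path.exists():
        # Existing record: increment, never reset count or first_seen.
        record = json.loads(path.read_text(encoding="utf-8"))
        record["count"] = int(record.get("count", 0)) + 1
        record["last_seen"] = now
        record["title"] = record.get("title") or title
        record["category"] = record.get("category") or category
        urls = [run_url] + list(record.get("recent_run_urls", []))
    elif history:
        # Pattern file lost: rebuild the count from prior investigations.
        record = {"count": len(history) + 1, "first_seen": history[0][0],
                  "last_seen": now, "title": title, "category": category}
        urls = [run_url] + [url for _, url in reversed(history)]
    else:
        record = {"count": 1, "first_seen": now, "last_seen": now,
                  "title": title, "category": category}
        urls = [run_url]
    # Deduplicate while keeping newest-first order, then cap at 10 entries.
    record["recent_run_urls"] = list(dict.fromkeys(urls))[:10]

    # Write to a temporary file and rename, so a crash cannot leave a partial record.
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.parent / (path.name + ".tmp")
    tmp.write_text(json.dumps(record, indent=2), encoding="utf-8")
    tmp.replace(path)
    return record
```

The normalization list is deliberately coarse and would need tuning against real logs. The property to test is the one the prompt states: two reruns of the same failure must map to the same hash.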
+
+3. **Build Statistics Snapshot**: After step 2 has updated and reconciled every pattern record, aggregate all files under `/tmp/gh-aw/cache-memory/patterns/` into the snapshot used to populate `notify_teams.statistics` and `notify_teams.statistics_json`. The `count` reported for each pattern in both fields MUST be read directly from the (just-updated) per-pattern files — do NOT recompute it as `1` for the current pattern, and do NOT use a per-run counter. Sort patterns by `count` descending, ties broken by most recent `last_seen`. The current failure's row MUST show the same `count` value as `notify_teams.occurrence_count`; if it doesn't, return to step 2 and fix the persisted record before emitting the notification.
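The snapshot step can be sketched as below, again as an illustration under assumptions rather than the workflow's own code. `build_snapshot` and `to_statistics_json` are hypothetical names. The tie-break assumes `last_seen` values share one UTC format that sorts lexicographically, which the prompt does not state.

```python
import json
from pathlib import Path


def build_snapshot(patterns_dir):
    """Load every per-pattern file and order it as the prompt requires."""
    records = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(patterns_dir).glob("*.json"))
    ]
    # Python's sort is stable, so sorting by the tie-breaker first and the
    # primary key second yields: count descending, then newest last_seen.
    records.sort(key=lambda r: r.get("last_seen", ""), reverse=True)
    records.sort(key=lambda r: int(r.get("count", 0)), reverse=True)
    return records


def to_statistics_json(records):
    """Serialize as compact single-line JSON, as statistics_json expects."""
    return json.dumps(records, separators=(",", ":"))
```

Counts are read from disk and never recomputed here, which is the point of the rule: the snapshot can only be as correct as the records persisted in step 2.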
 4. **Save Artifacts**: Store detailed logs and analysis in the cached directories.
 
 ### Phase 6: Reporting and Recommendations
@@ -399,6 +407,8 @@ Provide all required fields and include the optional PR-related fields whenever
 
 - **`statistics_json`** (required) — Full pattern database serialized as a compact JSON string (single line, no surrounding code fence). Must include **every** pattern currently tracked, not just the top 20. Schema is documented on the input field. This payload is uploaded as the `ci-doctor-statistics` workflow artifact (alongside the rendered Markdown) and is intended for offline analysis or dashboarding. Keep `recent_run_urls` capped at 10 entries per pattern.
 
+**Count consistency (mandatory):** the `count` value for every pattern in `statistics_json` (and in the rendered `statistics` table) MUST be the persisted `count` read from the corresponding `/tmp/gh-aw/cache-memory/patterns/<signature-hash>.json` file *after* Phase 5 step 2 has updated it. In particular, the current failure's pattern MUST report `count == occurrence_count`. Do NOT emit `count: 1` for every pattern — that is a symptom of either (a) overwriting the persisted record instead of read-modify-write, or (b) generating a fresh signature hash on each run. Validate this invariant before calling `notify_teams`; if it fails, fix the persistence step rather than the reported numbers.
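The count-consistency check above could be expressed as a pre-flight validation along these lines. This is a sketch under assumptions: `count_mismatches` is a hypothetical helper, and `reported_counts` (a mapping from signature hash to the count about to be emitted) is an assumed shape, since the `statistics_json` schema is not shown in this diff.

```python
import json
from pathlib import Path


def count_mismatches(patterns_dir, reported_counts, current_sig, occurrence_count):
    """Return human-readable problems; an empty list means the invariant holds."""
    problems = []
    for sig, reported in reported_counts.items():
        path = Path(patterns_dir) / f"{sig}.json"
        if not path.exists():
            problems.append(f"{sig}: no persisted record")
            continue
        persisted = json.loads(path.read_text(encoding="utf-8")).get("count")
        if reported != persisted:
            problems.append(f"{sig}: reported {reported}, persisted {persisted}")
    # The current failure's row must agree with occurrence_count.
    if reported_counts.get(current_sig) != occurrence_count:
        problems.append(
            f"{current_sig}: reported {reported_counts.get(current_sig)}, "
            f"occurrence_count {occurrence_count}"
        )
    return problems
```

A non-empty result means the persistence step needs repair. As the prompt says, the reported numbers should not be adjusted to make the check pass.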
+
 - **`description`** (required) — Thorough Markdown body. Microsoft Teams Adaptive Cards render only a **limited subset of Markdown** — specifically: headings (`#`/`##`/`###`), bold/italic, inline code, fenced code blocks, ordered/unordered lists, and links. **Do not** use raw HTML tags such as `<details>`, `<summary>`, `<br>`, `<b>`, `<table>`, etc. — they appear as literal text in Teams. Use `###` headings for every section (no collapsibles). Use this structure:
 
 ```markdown
