Commit 707c2a3
feat: detect and flag infrastructure failures in trend reports
When Bedrock is unavailable during evaluation (throttling, service
outages), runs produce 0 test passes and the trend report incorrectly
signals a regression. This change adds infrastructure failure detection
so the gate skips unreliable runs instead of firing false regressions.
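A minimal sketch of what conservative detection might look like. The names `InfraFailureReason`, `InfraFailure`, and `detect_infra_failure` come from this commit message; the enum members, the hypothetical `RunSummary` container, its field names, and the exact rule (zero passes plus at least one throttle or service-unavailable error) are assumptions for illustration, not the commit's actual logic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InfraFailureReason(Enum):
    # Member names are assumed; the commit message does not list them.
    THROTTLED = "throttled"
    SERVICE_UNAVAILABLE = "service_unavailable"


@dataclass(frozen=True)
class InfraFailure:
    reason: InfraFailureReason
    detail: str


@dataclass
class RunSummary:
    # Hypothetical container holding only what the detector needs.
    tests_passed: int
    throttle_count: int = 0
    service_unavailable_count: int = 0
    model_error_count: int = 0


def detect_infra_failure(run: RunSummary) -> Optional[InfraFailure]:
    """Flag a run only when the evidence for an infra failure is unambiguous."""
    # Conservative: any passing test means the run produced a real signal.
    if run.tests_passed > 0:
        return None
    if run.service_unavailable_count > 0:
        return InfraFailure(
            InfraFailureReason.SERVICE_UNAVAILABLE,
            f"{run.service_unavailable_count} service_unavailable errors, 0 passes",
        )
    if run.throttle_count > 0:
        return InfraFailure(
            InfraFailureReason.THROTTLED,
            f"{run.throttle_count} throttle errors, 0 passes",
        )
    # Zero passes without infrastructure errors is treated as a genuine regression.
    return None
```

Under these assumptions a run with zero passes and no infrastructure errors is still reported as a regression, so the detector can only suppress a signal when there is positive evidence of an outage.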
- Add InfraFailureReason enum and InfraFailure dataclass to models
- Preserve individual error type counts (throttle, service_unavailable,
  model_error) instead of collapsing to a single error_count
- Read actual server_started/server_error from contract test results
  instead of hardcoding server_startup_success=True
- Add detect_infra_failure() with conservative detection logic
- Gate passes with annotation when latest run is an infra failure
- Gate falls back to older non-infra run when comparison is infra failure
- Add prominent warning banner to both MD and HTML trend reports
- Expand Section F with per-error-type columns and infra failure flag
- Serialize InfraFailureReason in YAML output via generalized Enum handler

1 parent a756b71 commit 707c2a3
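The two gate behaviours listed in the commit message (pass with an annotation when the latest run is an infra failure, and fall back to an older non-infra run for comparison) could be sketched roughly as follows. The `Run` type, `pick_comparison`, `evaluate_gate`, and the `max_drop` threshold are hypothetical names chosen for this illustration and are not taken from the commit.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Run:
    # Hypothetical, simplified view of one evaluation run.
    run_id: str
    pass_rate: float
    infra_failure: bool = False


def pick_comparison(history: List[Run]) -> Optional[Run]:
    """Return the newest earlier run that is not an infra failure."""
    for run in reversed(history):
        if not run.infra_failure:
            return run
    return None


def evaluate_gate(runs: List[Run], max_drop: float = 0.05) -> Tuple[bool, str]:
    """Evaluate runs ordered oldest to newest; returns (passed, annotation)."""
    if not runs:
        return True, "no runs to evaluate"
    latest = runs[-1]
    # An unreliable latest run must not fire a regression.
    if latest.infra_failure:
        return True, f"{latest.run_id}: infrastructure failure, run skipped"
    # Skip over infra failures when choosing the baseline.
    baseline = pick_comparison(runs[:-1])
    if baseline is None:
        return True, "no reliable baseline to compare against"
    drop = baseline.pass_rate - latest.pass_rate
    if drop > max_drop:
        return False, f"regression vs {baseline.run_id}: pass rate fell by {drop:.2f}"
    return True, f"compared against {baseline.run_id}"
```

For example, with runs `r1` (0.90), `r2` (infra failure), and `r3` (0.88), this sketch compares `r3` against `r1` rather than against the unreliable `r2`.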
15 files changed
Lines changed: 774 additions & 90 deletions
File tree
- scripts/aidlc-evaluator/packages/trend-reports
- src/trend_reports
- tests
Lines changed: 4 additions & 0 deletions
Added lines (new file numbering): 22-23, 38-39
Lines changed: 5 additions & 0 deletions
Added lines (new file numbering): 220-224
Lines changed: 91 additions & 12 deletions
Added lines (new file numbering): 21-22, 152-157, 160-165, 186-191, 232-234, 241-242, 323-375, 394-395, 406-409, 422, 429-433, 445
Removed lines (original file numbering): 152-157, 322-326, 349
Lines changed: 54 additions & 0 deletions
Added lines (new file numbering): 15-17, 28-57, 88-108
Lines changed: 32 additions & 0 deletions
Added lines (new file numbering): 37-57, 143-148, 182-183, 242, 291-292