stress: tighten sustained-load threshold to baseline-relative once 5-10 runs accumulate (#153)

## Context

`tests/stress/test-sustained-load.sh` currently asserts `error_pct <= 55%` (relaxed from a previous 30% that deterministically failed on v1.1.9-rc.1). The release-gate job runs with `continue-on-error: true`, so the assertion is informational only. The 55% figure is a fixed ceiling chosen from a tiny sample: the worst observed run (53%) plus margin.

A real regression that produces a 40% error rate would slip past, because the threshold is set to absorb the worst observed flake.

## Followup from PR #148

PR #148 (Fresh-Eyes review, Finding 1) called this out. The original review asked for either tightening the threshold or making it telemetry-only. We kept the assertion plus `continue-on-error: true` for now because the suite is genuinely flake-prone on shared ARC runners and a hard fail would block release-gate without giving signal.

## Proposal

Once 5-10 successful baseline runs have accumulated on the current Helm CPU layout (post-#140 rebalance, post-#132 admin-exempt credential), record their error-rate distribution and switch the assertion to:

```
error_pct <= median(baseline_error_pct) + 15
```

If the `tests/stress/baseline-sustained-error-pct.txt` file is absent, the script should warn loudly and pass (instead of failing with "no baseline" as the original review suggested), so the release-gate stays unblocked while the baseline file is being populated.
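A rough sketch of how the median-plus-delta check and the missing-baseline fallback could look in bash. The function names, the `BASELINE_FILE` and `DELTA_PCT_POINTS` variables, and the one-value-per-line file format are assumptions for illustration, not what `test-sustained-load.sh` does today. Treating an empty baseline file like a missing one is also an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: names and the one-value-per-line file format are assumptions.

BASELINE_FILE="${BASELINE_FILE:-tests/stress/baseline-sustained-error-pct.txt}"
DELTA_PCT_POINTS=15   # the "+ 15" from the proposal, read as percentage points

# Print the median of the numeric lines in file $1.
# Blank lines and lines starting with '#' are skipped; returns 1 if no values remain.
baseline_median() {
  grep -Ev '^[[:space:]]*(#|$)' "$1" | sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      if (NR % 2) print v[(NR + 1) / 2]
      else        print (v[NR / 2] + v[NR / 2 + 1]) / 2
    }'
}

# Return 0 (pass) or 1 (fail) for an observed error percentage $1.
check_sustained_error_pct() {
  local observed="$1" median threshold
  if [ ! -f "$BASELINE_FILE" ]; then
    # Proposal: warn loudly and pass while the baseline is being populated.
    echo "WARNING: $BASELINE_FILE not found; skipping error_pct assertion" >&2
    return 0
  fi
  median="$(baseline_median "$BASELINE_FILE")" || {
    # Assumption: an empty baseline file is handled like a missing one.
    echo "WARNING: $BASELINE_FILE has no values; skipping error_pct assertion" >&2
    return 0
  }
  threshold="$(awk -v m="$median" -v d="$DELTA_PCT_POINTS" 'BEGIN { print m + d }')"
  echo "error_pct=$observed threshold=$threshold (median $median + $DELTA_PCT_POINTS)"
  # awk exits 0 when observed <= threshold, 1 otherwise.
  awk -v o="$observed" -v t="$threshold" 'BEGIN { exit !(o <= t) }'
}
```

As a purely hypothetical illustration (these are not recorded runs): a baseline file containing 41, 48, 53, 45 and 50 has a median of 48, so the assertion would become `error_pct <= 63`.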

## Acceptance

- Baseline file checked in with at least 5 successful run values
- Script computes median + delta at runtime, not a hard-coded number
- `SUSTAINED_ERROR_PCT_THRESHOLD` env override still works for dedicated-runner environments
- TODO comment in `test-sustained-load.sh` removed when this lands
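
One possible way to keep the `SUSTAINED_ERROR_PCT_THRESHOLD` override working alongside a baseline-derived value is sketched below. `resolve_threshold` is a hypothetical helper, and the precedence shown (explicit override first, baseline second) is an assumption about the intended behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose the threshold source in priority order.
# $1 = threshold derived from the baseline file (empty when no baseline exists).
resolve_threshold() {
  local baseline_threshold="${1:-}"
  if [ -n "${SUSTAINED_ERROR_PCT_THRESHOLD:-}" ]; then
    # Explicit override, e.g. for dedicated-runner environments.
    echo "$SUSTAINED_ERROR_PCT_THRESHOLD"
  elif [ -n "$baseline_threshold" ]; then
    echo "$baseline_threshold"
  else
    # Nothing to assert against; the caller decides whether to warn and pass.
    return 1
  fi
}
```

A caller could then run something like `threshold="$(resolve_threshold "$computed")" || warn_and_pass`, where `computed` and `warn_and_pass` are likewise placeholder names.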
