stress: tighten sustained-load threshold to baseline-relative once 5-10 runs accumulate #153

@brandonrc

Context

tests/stress/test-sustained-load.sh currently asserts error_pct <= 55% (raised from a previous 30% that deterministically failed on v1.1.9-rc.1). The release-gate job runs with continue-on-error: true, so the assertion is informational only. The 55% figure is a fixed ceiling chosen from a tiny sample: the worst observed run (53%) plus a small margin.

A real regression that produces a 40% error rate would slip past, because the threshold is set to absorb the worst observed flake.

Follow-up from PR #148

PR #148 (Fresh-Eyes review, Finding 1) called this out. The original review asked for either tightening the threshold or making it telemetry-only. We kept the assertion plus continue-on-error: true for now because the suite is genuinely flake-prone on shared ARC runners and a hard fail would block release-gate without giving signal.

Proposal

Once 5-10 successful baseline runs have accumulated on the current Helm CPU layout (post-#140 rebalance, post-#132 admin-exempt credential), record their error-rate distribution and switch the assertion to:

error_pct <= median(baseline_error_pct) + 15

If tests/stress/baseline-sustained-error-pct.txt is absent, the script should warn loudly and pass (instead of failing with "no baseline", as the original review suggested) so that release-gate stays unblocked while the baseline file is being populated.
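A minimal sketch of how the median-plus-delta computation and the missing-baseline fallback might look. The function names (baseline_median, baseline_threshold) and the file format (one numeric error percentage per line, with blank lines and # comments ignored) are assumptions for illustration; the issue does not specify either.

```shell
# Sketch only: derive the sustained-load threshold from a baseline file.
# ASSUMPTION: the baseline file holds one numeric error percentage per line;
# blank lines and lines starting with '#' are ignored.

BASELINE_DELTA=15  # the "+ 15" from the proposal

# baseline_median <file>: print the median of the values in <file>.
# Returns non-zero if the file contains no values.
baseline_median() {
  grep -Ev '^[[:space:]]*(#|$)' "$1" | sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      if (NR % 2) print v[(NR + 1) / 2]
      else print (v[NR / 2] + v[NR / 2 + 1]) / 2
    }'
}

# baseline_threshold <file>: print median + BASELINE_DELTA.
# Warns on stderr and returns 1 when the file is missing or empty, so the
# caller can decide to pass instead of failing.
baseline_threshold() {
  if [ ! -s "$1" ]; then
    echo "WARNING: baseline file $1 is missing or empty; skipping threshold check" >&2
    return 1
  fi
  bt_median=$(baseline_median "$1") || return 1
  awk -v m="$bt_median" -v d="$BASELINE_DELTA" 'BEGIN { print m + d }'
}
```

A caller could treat a non-zero return from baseline_threshold as "warn and pass", which keeps release-gate unblocked until the baseline file is populated.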

Acceptance

  • Baseline file checked in with at least 5 successful run values
  • Script computes median + delta at runtime, not a hard-coded number
  • SUSTAINED_ERROR_PCT_THRESHOLD env override still works for dedicated-runner environments
  • TODO comment in test-sustained-load.sh removed when this lands
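One way the env override and the final assertion could fit together is sketched below. The helper names (effective_threshold, error_pct_within) are hypothetical, and the precedence shown (SUSTAINED_ERROR_PCT_THRESHOLD wins over the computed value whenever it is set) is an assumption about how the existing override behaves.

```shell
# Sketch only: choose the threshold to enforce, then compare against it.
# ASSUMPTION: a non-empty SUSTAINED_ERROR_PCT_THRESHOLD takes precedence
# over the baseline-derived value.

# effective_threshold <computed>: print the threshold to enforce.
effective_threshold() {
  if [ -n "${SUSTAINED_ERROR_PCT_THRESHOLD:-}" ]; then
    printf '%s\n' "$SUSTAINED_ERROR_PCT_THRESHOLD"
  else
    printf '%s\n' "$1"
  fi
}

# error_pct_within <error_pct> <threshold>: succeed when error_pct <= threshold.
# awk does the comparison because shell test operators only handle integers.
error_pct_within() {
  awk -v e="$1" -v t="$2" 'BEGIN { exit !(e + 0 <= t + 0) }'
}
```

Keeping the override check in a single helper means dedicated-runner environments can pin a stricter number without touching the baseline file.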
