perf(ab_test): ignore noisy network and MMDS metrics on flaky instances #5874
Merged
Manciukic merged 2 commits on May 11, 2026
Codecov Report
All modified and coverable lines are covered by tests.
Coverage diff, main vs. #5874:
Metric      main      #5874
Coverage    82.80%    82.80%
Files       277       277
Lines       29892     29892
Hits        24753     24753
Misses      5139      5139
JackThomson2
previously approved these changes
May 11, 2026
Add m5n.metal (al2 host), m8g.metal-{24,48}xl, and m8i.metal-{48,96}xl
to the IGNORED list for network latency and MMDS A/B tests respectively.
These instance/metric combinations exhibit high variance (up to 67%
swings in both directions) on unrelated commits, causing persistent
false-positive pipeline failures. The noise on m8g correlates with the
May 2 AL2023 rootfs update and affects only guest kernel linux-6.1
dimensions. MMDS noise on m8i is bidirectional within the same build.
The tests still run and collect data on these instances; they just no
longer block the pipeline.
Signed-off-by: Riccardo Mancini <mancio@amazon.com>
JackThomson2
approved these changes
May 11, 2026
ilstam
approved these changes
May 11, 2026
Summary
Add noisy instance/metric combinations to the A/B test IGNORED list to stop false-positive pipeline failures.
Changes
m5n.metal (al2 host), test_network_latency: repeated identical false positives
m8g.metal-{24,48}xl, test_network_latency: up to 67% swings in both directions since the May 2 AL2023 rootfs update, exclusively on guest kernel linux-6.1
m8i.metal-{48,96}xl, test_mmds_performance: bidirectional noise (e.g. +13% PCI on, -19% PCI off in the same build)
Evidence
Occurrence matrix from recent builds:
ping_latency on m8g (started May 5, after the AL2023 rootfs update on May 2)
ping_latency on m5n (recurring since March)
MMDS on m8i (recurring since March)
Impact
The A/B alarm has not cleared in 6+ weeks because of these false positives, which have generated 7+ duplicate tickets. Tests still run and collect data; they just no longer fail the pipeline.
Testing
No functional change: the IGNORED list only suppresses the assertion in analyze_data(). Metrics are still collected and available in artifacts for manual inspection.
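For readers unfamiliar with the mechanism, here is a minimal sketch of how an ignore list of this kind could filter results before the assertion. It is an illustration under assumptions, not the code in this PR: the dimension keys (instance, performance_test), the helper names is_ignored and blocking_regressions, and the shape of the result tuples are all hypothetical. The m5n.metal entry is left out because the way the al2 host qualifier is keyed is not shown here.

```python
# Hypothetical sketch: key names, helper names and data shapes are assumptions,
# not the actual contents of the A/B test tooling.
IGNORED = [
    {"instance": "m8g.metal-24xl", "performance_test": "test_network_latency"},
    {"instance": "m8g.metal-48xl", "performance_test": "test_network_latency"},
    {"instance": "m8i.metal-48xl", "performance_test": "test_mmds_performance"},
    {"instance": "m8i.metal-96xl", "performance_test": "test_mmds_performance"},
]


def is_ignored(dimensions: dict) -> bool:
    """Return True if every key/value pair of some IGNORED entry matches."""
    return any(
        all(dimensions.get(key) == value for key, value in entry.items())
        for entry in IGNORED
    )


def blocking_regressions(flagged: list) -> list:
    """Keep only flagged (dimensions, relative_change) pairs that should fail
    the pipeline. Ignored combinations are dropped here, after measurement,
    so their data is still collected upstream."""
    return [(dims, change) for dims, change in flagged if not is_ignored(dims)]


if __name__ == "__main__":
    flagged = [
        ({"instance": "m8g.metal-24xl",
          "performance_test": "test_network_latency",
          "guest_kernel": "linux-6.1"}, 0.67),
        ({"instance": "m6i.metal",
          "performance_test": "test_network_latency",
          "guest_kernel": "linux-6.1"}, 0.12),
    ]
    # Only the m6i.metal entry remains as a blocking regression.
    print(blocking_regressions(flagged))
```

Because an entry matches on a subset of dimensions, a single line can silence one test on one instance type across all guest kernels, while every other combination keeps failing the pipeline as before.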