perf(ab_test): ignore noisy network and MMDS metrics on flaky instances#5874

Merged
Manciukic merged 2 commits into firecracker-microvm:main from Manciukic:ignore-noisy-ab-metrics on May 11, 2026

@Manciukic Manciukic commented May 11, 2026

Summary

Add noisy instance/metric combinations to the A/B test IGNORED list to stop false-positive pipeline failures.

Changes

  • m5n.metal (al2/linux-5.10 host): ignore test_network_latency — repeated identical false positives
  • m8g.metal-{24,48}xl: ignore test_network_latency — up to 67% swings in both directions since the May 2 AL2023 rootfs update, exclusively on guest kernel linux-6.1
  • m8i.metal-{48,96}xl: ignore test_mmds_performance — bidirectional noise (e.g. +13% PCI on, -19% PCI off in the same build)
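
To illustrate the kind of entries this change adds, here is a minimal sketch of an ignore list and a matcher. The dictionary keys (`instance`, `host_kernel`, `performance_test`) and the `is_ignored` helper are assumptions made for illustration and may not match the real script; only the instance and test names come from the list above.

```python
# Hypothetical sketch: key names and the helper below are assumptions,
# not copied from the actual A/B test script.
IGNORED = [
    # m5n.metal on the al2/linux-5.10 host: repeated identical false positives
    {
        "instance": "m5n.metal",
        "host_kernel": "linux-5.10",
        "performance_test": "test_network_latency",
    },
    # m8g.metal-{24,48}xl: large swings in both directions
    {"instance": "m8g.metal-24xl", "performance_test": "test_network_latency"},
    {"instance": "m8g.metal-48xl", "performance_test": "test_network_latency"},
    # m8i.metal-{48,96}xl: bidirectional MMDS noise
    {"instance": "m8i.metal-48xl", "performance_test": "test_mmds_performance"},
    {"instance": "m8i.metal-96xl", "performance_test": "test_mmds_performance"},
]


def is_ignored(dimensions: dict) -> bool:
    """Return True if every key/value of some IGNORED entry appears in dimensions."""
    return any(
        all(dimensions.get(key) == value for key, value in entry.items())
        for entry in IGNORED
    )
```

An entry with fewer keys matches more broadly, which is why the m5n entry in this sketch carries the extra host kernel key while the m8g and m8i entries do not.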

Evidence

Occurrence matrix from recent builds:

ping_latency on m8g (started May 5, after AL2023 rootfs update May 2):

  • Build 899: +67% (m8g-48xl, PCI on), -38% (m8g-24xl, PCI off), -48% (m8g-48xl, PCI on) — all guest linux-6.1
  • Build 903: +53% (m8g-48xl, PCI on, guest linux-6.1)

ping_latency on m5n (recurring since March):

  • Build 900: +11.84%, +5.33%, +14.21% — identical numbers repeated in 903
  • Always vcpus=1, al2 host, both guest kernels

MMDS on m8i (recurring since March):

  • Build 903: +13% PCI on, -19% PCI off (same build, same instance)
  • Build 896: +11% PCI on, -5% PCI off
  • Exclusively m8i, exclusively al2 host
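
One way to make "bidirectional noise" concrete is to check whether a single build reports both a sizeable increase and a sizeable decrease for the same metric. The sketch below is illustrative only: the `is_bidirectional` helper and the 5% threshold are assumptions, and the percentages are the ones quoted in the evidence above.

```python
def is_bidirectional(changes_pct, min_abs_pct=5.0):
    """True if the changes include both an increase and a decrease
    of at least min_abs_pct percent (threshold is an assumed value)."""
    has_up = any(change >= min_abs_pct for change in changes_pct)
    has_down = any(change <= -min_abs_pct for change in changes_pct)
    return has_up and has_down


# Percent changes quoted in the evidence above.
mmds_m8i_build_903 = [13.0, -19.0]
ping_m8g_build_899 = [67.0, -38.0, -48.0]
ping_m5n_build_900 = [11.84, 5.33, 14.21]

for name, changes in [
    ("MMDS on m8i, build 903", mmds_m8i_build_903),
    ("ping_latency on m8g, build 899", ping_m8g_build_899),
    ("ping_latency on m5n, build 900", ping_m5n_build_900),
]:
    print(name, "bidirectional:", is_bidirectional(changes))
```

Under this check the m8g and m8i cases swing both ways, while the m5n case is one-directional and instead stands out because the same values repeat across builds.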

Impact

The A/B alarm has not cleared in 6+ weeks due to these false positives, generating 7+ duplicate tickets. Tests still run and collect data — they just no longer fail the pipeline.

Testing

No functional change — the IGNORED list only suppresses the assertion in analyze_data(). Metrics are still collected and available in artifacts for manual inspection.
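
The behavior described here, where ignored results are still collected but do not fail the run, can be sketched as follows. This is not the body of analyze_data(); `split_regressions`, the result tuples, and the predicate are hypothetical names used only to show the idea.

```python
def split_regressions(results, is_ignored):
    """Separate flagged results into blocking and suppressed ones.

    results: iterable of (dimensions, change_pct) pairs (assumed shape).
    is_ignored: callable that takes a dimensions dict and returns bool.
    """
    blocking, suppressed = [], []
    for dimensions, change_pct in results:
        target = suppressed if is_ignored(dimensions) else blocking
        target.append((dimensions, change_pct))
    return blocking, suppressed


results = [
    ({"instance": "m8i.metal-48xl", "performance_test": "test_mmds_performance"}, 13.0),
    # Hypothetical result on an instance that is not ignored.
    ({"instance": "m6i.metal", "performance_test": "test_mmds_performance"}, 9.0),
]

# Stand-in predicate for this example only.
blocking, suppressed = split_regressions(
    results, lambda dims: dims["instance"].startswith("m8i.")
)

# Suppressed results stay available for reporting; only blocking ones
# would make a pipeline step fail.
should_fail = len(blocking) > 0
```

Keeping the suppressed results in a separate list rather than dropping them matches the intent stated above: the data remains available for manual inspection.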

@Manciukic Manciukic force-pushed the ignore-noisy-ab-metrics branch from 7828bb8 to 72af264 Compare May 11, 2026 12:59
codecov Bot commented May 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.80%. Comparing base (7df7152) to head (7fe39cf).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5874   +/-   ##
=======================================
  Coverage   82.80%   82.80%           
=======================================
  Files         277      277           
  Lines       29892    29892           
=======================================
  Hits        24753    24753           
  Misses       5139     5139           
Flag Coverage Δ
5.10-m5n.metal 83.10% <ø> (+<0.01%) ⬆️
5.10-m6a.metal 82.42% <ø> (-0.01%) ⬇️
5.10-m6g.metal 79.73% <ø> (-0.01%) ⬇️
5.10-m6i.metal 83.10% <ø> (ø)
5.10-m7a.metal-48xl 82.42% <ø> (+<0.01%) ⬆️
5.10-m7g.metal 79.73% <ø> (-0.01%) ⬇️
5.10-m7i.metal-24xl 83.07% <ø> (ø)
5.10-m7i.metal-48xl 83.07% <ø> (+<0.01%) ⬆️
5.10-m8g.metal-24xl 79.73% <ø> (+<0.01%) ⬆️
5.10-m8g.metal-48xl 79.73% <ø> (-0.01%) ⬇️
5.10-m8i.metal-48xl 83.07% <ø> (+<0.01%) ⬆️
5.10-m8i.metal-96xl 83.07% <ø> (ø)
6.1-m5n.metal 83.13% <ø> (+<0.01%) ⬆️
6.1-m6a.metal 82.46% <ø> (ø)
6.1-m6g.metal 79.73% <ø> (ø)
6.1-m6i.metal 83.12% <ø> (+<0.01%) ⬆️
6.1-m7a.metal-48xl 82.45% <ø> (ø)
6.1-m7g.metal 79.73% <ø> (-0.01%) ⬇️
6.1-m7i.metal-24xl 83.13% <ø> (+<0.01%) ⬆️
6.1-m7i.metal-48xl 83.14% <ø> (+<0.01%) ⬆️
6.1-m8g.metal-24xl 79.72% <ø> (-0.01%) ⬇️
6.1-m8g.metal-48xl 79.73% <ø> (ø)
6.1-m8i.metal-48xl 83.14% <ø> (+<0.01%) ⬆️
6.1-m8i.metal-96xl 83.14% <ø> (ø)

JackThomson2 previously approved these changes May 11, 2026
Add m5n.metal (al2 host), m8g.metal-{24,48}xl, and m8i.metal-{48,96}xl
to the IGNORED list for network latency and MMDS A/B tests respectively.

These instance/metric combinations exhibit high variance (up to 67%
swings in both directions) on unrelated commits, causing persistent
false-positive pipeline failures. The noise on m8g correlates with the
May 2 AL2023 rootfs update and affects only guest kernel linux-6.1
dimensions. MMDS noise on m8i is bidirectional within the same build.

The tests still run and collect data on these instances; they just no
longer block the pipeline.

Signed-off-by: Riccardo Mancini <mancio@amazon.com>
@Manciukic Manciukic force-pushed the ignore-noisy-ab-metrics branch from 72af264 to ccf4933 Compare May 11, 2026 13:32
@Manciukic Manciukic enabled auto-merge (rebase) May 11, 2026 13:43
@Manciukic Manciukic added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label May 11, 2026
@Manciukic Manciukic added Status: Awaiting review Indicates that a pull request is ready to be reviewed and removed Status: Awaiting review Indicates that a pull request is ready to be reviewed labels May 11, 2026
@Manciukic Manciukic merged commit 0f76352 into firecracker-microvm:main May 11, 2026
8 checks passed
@Manciukic Manciukic deleted the ignore-noisy-ab-metrics branch May 12, 2026 08:38