
compile-perf: Slack notification on scheduled runs + 10% regression threshold #11745

Open
jvepsalainen-nv wants to merge 4 commits into master from dev/jvepsalainen/compile-perf-slack-notify

Conversation

@jvepsalainen-nv

Contributor

Two improvements to the nightly compile-perf monitoring.

Changes

tools/compile-perf/trend.py
Lower --rel default from 1.25 (25%) to 1.10 (10%). The runner is a dedicated quiesced machine with 5-sample medians, giving a noise floor of ~1–3%, so 10% catches real medium regressions while avoiding false positives. The --abs 2ms guard still prevents alerts on small absolute deltas.
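
The interaction between the relative threshold and the absolute guard can be sketched as follows. This is a minimal illustration, not the actual logic in trend.py: the function name `is_regression`, the use of a trailing median as the baseline, and the rule that both conditions must hold are assumptions made for the example.

```python
from statistics import median


def is_regression(history_ms, current_ms, rel=1.10, abs_ms=2.0):
    """Hypothetical check: flag only when BOTH thresholds are exceeded.

    history_ms: earlier timings in milliseconds (assumed trailing window).
    rel:        ratio threshold, mirroring the --rel default of 1.10.
    abs_ms:     absolute guard, mirroring the --abs 2ms guard.
    """
    baseline = median(history_ms)
    delta = current_ms - baseline
    # The relative test catches proportional slowdowns; the absolute test
    # suppresses alerts on timers so small that 10% is under 2 ms.
    return current_ms > baseline * rel and delta > abs_ms
```

Under these assumptions, a 100 ms timer that rises to 112 ms is flagged at `rel=1.10` but not at the old `rel=1.25`, while a 5 ms timer that rises to 6 ms (a 20% change, but only 1 ms) is not flagged at either setting.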

.github/workflows/compile-perf-nightly.yml

  • Add id: trend to the Check trend step so its outcome is accessible
  • Add Slack notification step that fires on schedule runs with if: always() — posts pass/fail status, CI run link, and report link to the channel configured by the SLANG_PERF_SLACK_WEBHOOK secret (silently skips if secret not set)

Note on in-flight PRs

This PR may conflict with #11698 at the trend step line (#11702 and #11727 also touch the same file). The conflict is trivial: a python → python3 rename at the same location, which whoever merges last can resolve.

On a dedicated quiesced runner with 5-sample medians the noise floor is
~1-3%, so 1.25x was too loose to catch medium regressions. 1.10x (10%)
catches real regressions while the --abs 2ms guard prevents alerts on
small absolute deltas.
…hreshold

- trend.py: lower --rel default from 1.25 to 1.10 (10% threshold on
  quiesced runner with 5-sample medians; --abs 2ms still guards against
  small absolute deltas)
- nightly workflow: add Slack notification step for scheduled runs
  (fires with if: always() so it reports both success and regression);
  uses SLANG_PERF_SLACK_WEBHOOK secret; includes CI run link and
  report link with caveat about Pages update delay
@jvepsalainen-nv jvepsalainen-nv requested a review from a team as a code owner June 25, 2026 07:20
@jvepsalainen-nv jvepsalainen-nv added the pr: non-breaking PRs without breaking changes label Jun 25, 2026
@jvepsalainen-nv jvepsalainen-nv requested review from bmillsNV and removed request for a team June 25, 2026 07:20
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Contributor


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e96fd6e9-9ead-429d-b888-39bda9ff8f04

📥 Commits

Reviewing files that changed from the base of the PR and between 365f2a8 and 1dc9815.

📒 Files selected for processing (2)
  • .github/workflows/compile-perf-nightly.yml
  • tools/compile-perf/lib/analyze.py

📝 Walkthrough

Walkthrough

The nightly compile-perf workflow now records the trend check result and posts a Slack notification on scheduled runs. The compile-perf drift threshold default changes from 1.25 to 1.10, and the example command and documentation are updated to match.

Changes

Compile perf drift alerts

| Layer / File(s) | Summary |
| --- | --- |
| Lower drift threshold default: tools/compile-perf/trend.py, tools/compile-perf/DESIGN.md, tools/compile-perf/lib/analyze.py | The default --rel value changes to 1.10, and the example invocation, inline comments, docstring, and design text are updated to match. |
| Nightly trend notification: .github/workflows/compile-perf-nightly.yml | The trend check step gets an id, and a scheduled Slack step reads steps.trend.outcome, formats the status, and posts to SLACK_WEBHOOK_COMPILE_PERF. |

Suggested reviewers

  • bmillsNV
  • jkiviluoto-nv
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly captures the main changes: a Slack notification on scheduled runs and a lower 10% regression threshold. |
| Description check | ✅ Passed | The description accurately summarizes the compile-perf threshold change and scheduled Slack notification workflow update. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9b4f552d-5643-4eaa-9ed5-2397068476de

📥 Commits

Reviewing files that changed from the base of the PR and between 1161c35 and ac2b2c3.

📒 Files selected for processing (3)
  • .github/workflows/compile-perf-nightly.yml
  • tools/compile-perf/DESIGN.md
  • tools/compile-perf/trend.py

Comment thread .github/workflows/compile-perf-nightly.yml Outdated
Comment thread tools/compile-perf/trend.py
@jvepsalainen-nv jvepsalainen-nv self-assigned this Jun 25, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor

♻️ Duplicate comments (1)
.github/workflows/compile-perf-nightly.yml (1)

251-267: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Export payload vars before invoking Python.

DATE, ICON, and STATUS are shell locals, but the Python snippet reads them from os.environ; without export, this step can fail with KeyError and skip Slack notification.

Suggested fix
-          DATE=$(date -u +%Y-%m-%d)
+          export DATE=$(date -u +%Y-%m-%d)
           if [ "$TREND_OUTCOME" = "success" ]; then
-            ICON=":white_check_mark:"
-            STATUS="No regressions detected"
+            export ICON=":white_check_mark:"
+            export STATUS="No regressions detected"
           elif [ "$TREND_OUTCOME" = "failure" ]; then
-            ICON=":warning:"
-            STATUS="Regression detected — see CI run for details"
+            export ICON=":warning:"
+            export STATUS="Regression detected — see CI run for details"
           else
-            ICON=":x:"
-            STATUS="Nightly job failed — see CI run for details"
+            export ICON=":x:"
+            export STATUS="Nightly job failed — see CI run for details"
           fi
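
The difference between a plain shell assignment and an exported one can be reproduced outside the workflow. The sketch below is illustrative only: the helper name `child_sees_icon` is hypothetical, and it assumes a POSIX `sh` is available on PATH.

```python
import os
import subprocess
import sys

# Shell command that starts a child Python and exits 0 only if ICON is
# present in that child's environment.
PROBE = '"$PY" -c "import os, sys; sys.exit(0 if \'ICON\' in os.environ else 1)"'


def child_sees_icon(shell_prefix):
    """Run shell_prefix in sh, then report whether a child process sees ICON."""
    # Start from a clean slate so an inherited ICON cannot mask the result.
    env = {k: v for k, v in os.environ.items() if k != "ICON"}
    env["PY"] = sys.executable
    result = subprocess.run(["sh", "-c", f"{shell_prefix}; {PROBE}"], env=env)
    return result.returncode == 0
```

With this helper, `child_sees_icon('ICON=":warning:"')` returns False because the variable stays local to the shell, while `child_sees_icon('export ICON=":warning:"')` returns True.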

As per path instructions, .github/workflows/** changes must verify secret handling and notification-path correctness.

Source: Path instructions


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 814a1cef-79f2-4c58-bf79-94c4ade668b6

📥 Commits

Reviewing files that changed from the base of the PR and between ac2b2c3 and 365f2a8.

📒 Files selected for processing (1)
  • .github/workflows/compile-perf-nightly.yml

- Export ICON/STATUS/DATE shell vars so os.environ can read them in Python
- Update analyze.py classify() docstring: step_thr references 1.10 not 1.25,
  drift_thr explains why it stays at 1.25 (release noise vs nightly noise)

@github-actions github-actions Bot left a comment

Contributor

Verdict: 🟡 Has issues — 0 bug(s), 2 gap(s)

Tooling/CI-only PR: lowers trend.py --rel default 1.25 → 1.10 and adds a Slack notification step to the nightly compile-perf workflow. Threshold-change rationale comments are updated consistently across trend.py, analyze.py, and DESIGN.md. Two coverage gaps around the new Slack path: the if: excludes the only currently-enabled trigger so the path never runs in CI, and the PR description names a secret that doesn't match the workflow.

Changes Overview

Regression threshold tightening (tools/compile-perf/trend.py, tools/compile-perf/lib/analyze.py, tools/compile-perf/DESIGN.md)

  • What changed: trend.py's --rel default drops from 1.25 (25%) to 1.10 (10%) with a new rationale citing a dedicated quiesced runner and ~1-3% noise floor. analyze.py's classify() keeps step_thr=1.4 and drift_thr=1.25 but the rationale comments are updated to explain why those release-history thresholds intentionally stay higher than trend.py's new nightly default. DESIGN.md updates the matching sentence.

Slack notification on schedule runs (.github/workflows/compile-perf-nightly.yml)

  • What changed: adds id: trend to the existing "Check trend" step and appends a new "Notify Slack" step gated by if: github.event_name == 'schedule' && always(). The step exits early if SLACK_WEBHOOK_COMPILE_PERF is unset, otherwise branches on steps.trend.outcome (success/failure/else → ✅/⚠️/❌) and POSTs a JSON payload with the run URL and report URL to the webhook via an inline python -c urllib.request.urlopen call.
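
The skip-if-unset and POST behaviour described above might look roughly like the sketch below. It is not the inline snippet from the workflow: the function name `post_to_slack`, the injectable `opener` parameter, and the 10-second timeout are assumptions added so the logic can be exercised without a network call.

```python
import json
import urllib.request


def post_to_slack(webhook_url, payload, opener=urllib.request.urlopen):
    """Return True if a POST was attempted, False if the step was skipped."""
    if not webhook_url:
        # Mirrors the "secret not set" branch: succeed quietly, send nothing.
        print("Slack webhook not set; skipping")
        return False
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # opener is injectable (hypothetical design choice) so tests can pass a fake.
    with opener(request, timeout=10) as response:
        response.read()
    return True
```

In a workflow step this would be called with the webhook read from the environment, for example `post_to_slack(os.environ.get("SLACK_WEBHOOK_COMPILE_PERF", ""), {"text": "..."})`.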
Findings (2 total)
| Severity | Location | Finding |
| --- | --- | --- |
| 🟡 Gap | .github/workflows/compile-perf-nightly.yml:233 | New Slack step is unreachable in CI until the disabled cron is re-enabled: workflow_dispatch runs are excluded by == 'schedule', so payload/branch logic gets zero coverage |
| 🟡 Gap | .github/workflows/compile-perf-nightly.yml:237 | PR description names secret SLANG_PERF_SLACK_WEBHOOK; workflow uses SLACK_WEBHOOK_COMPILE_PERF, so an operator provisioning from the description will silently land in the "skipping" branch |

# primary timer drifts past threshold vs the trailing median.
- name: Check trend (fail on regression)
  id: trend
  shell: bash

Contributor

🟡 Gap: New Slack step is unreachable in CI until the cron is re-enabled

The new step is gated if: github.event_name == 'schedule' && always(), but the schedule: block at lines 14-15 of this same workflow is intentionally commented out, and workflow_dispatch is explicitly excluded by the == 'schedule' check. Result: the entire Slack notification path (bash branch selection, inline python -c, urllib.request.urlopen call, payload JSON shape, secret-missing skip) has zero execution coverage in this PR and will continue to have zero coverage until someone uncomments the cron weeks or months later. A typo in the inline Python, an env-var name mismatch, or a payload-shape regression would surface only at that point, exactly when the nightly alert is supposed to start working.

Example: If os.environ["ICON"] were misspelled in the f-string, manual workflow_dispatch runs would never catch it; the first scheduled run would fail silently (job log shows a Python KeyError, but no Slack message arrives) and the on-call user would assume "no regressions" instead of "alerting broken".

Suggestion: Either (a) temporarily widen the gate to also fire on a tagged workflow_dispatch input so the path can be exercised end-to-end once before the cron is enabled — e.g. add a notify-slack boolean input and gate on (github.event_name == 'schedule' || github.event.inputs.notify-slack == 'true') && always(); or (b) extract the inline Python into tools/compile-perf/notify_slack.py with a small unit test asserting the JSON payload shape and the success/failure/else branch selection. Option (b) is preferable because it pins the contract without requiring a live run.
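
As a rough illustration of option (b), the branch selection and payload construction could be pulled into pure functions that a unit test can pin down. Everything here is a sketch under assumptions: the function names, the shortened status strings, and the single-field `{"text": ...}` payload are hypothetical and may not match the workflow's actual message format.

```python
def select_status(trend_outcome):
    """Map steps.trend.outcome to an (icon, status) pair."""
    if trend_outcome == "success":
        return ":white_check_mark:", "No regressions detected"
    if trend_outcome == "failure":
        return ":warning:", "Regression detected"
    # Any other outcome (e.g. skipped or cancelled) is reported as a job failure.
    return ":x:", "Nightly job failed"


def build_payload(date, trend_outcome, run_url, report_url):
    """Build a Slack incoming-webhook payload (assumed shape: one text field)."""
    icon, status = select_status(trend_outcome)
    text = (
        f"{icon} compile-perf nightly {date}: {status}\n"
        f"<{run_url}|CI run> | <{report_url}|Report>"
    )
    return {"text": text}
```

A test can then assert on the contract directly, for example that `select_status("cancelled")` returns the `:x:` icon and that `build_payload(...)["text"]` contains the run URL, without needing a live scheduled run.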

  run: |
    set -euo pipefail
    cd "$GITHUB_WORKSPACE/tools/compile-perf"
    python trend.py --results "$GITHUB_WORKSPACE/perf-results"

Contributor

🟡 Gap: PR description names the wrong secret

The workflow reads secrets.SLACK_WEBHOOK_COMPILE_PERF, but the PR description tells the reader the secret is SLANG_PERF_SLACK_WEBHOOK. Whoever provisions the org secret based on the PR description will set the wrong name, and the new step will silently take the exit 0 "not set; skipping" branch on every scheduled run, so the cron will look green, but no notifications will ever fire. This is the same failure mode as gap #1 above, triggered through a different mechanism.

Suggestion: Update the PR description to name SLACK_WEBHOOK_COMPILE_PERF (the workflow is the source of truth), or rename the workflow's secrets.SLACK_WEBHOOK_COMPILE_PERF reference to match whichever convention is preferred. If the team has a naming convention (e.g. SLANG_* prefix for shader-slang org secrets), pick the conforming name and align both sides.

