compile-perf: fix Windows runner — add bash to PATH, use python3#11698

Open
jvepsalainen-nv wants to merge 20 commits into master from fix/compile-perf-bash-path
Conversation

@jvepsalainen-nv

@jvepsalainen-nv jvepsalainen-nv commented Jun 23, 2026

Contributor

Fixes several issues discovered when running the compile-perf workflows on the nvrgfx-perf Windows self-hosted runner after the initial merge of #11485.

Changes

Release sweep workflow (compile-perf-release-sweep.yml)

  • Add an "Add bash to PATH (Windows)" step: the release sweep doesn't use common-setup (no build needed), so Git Bash wasn't on PATH for shell: bash steps
  • Add a "Set up Python" step: Python is not on PATH by default on the self-hosted runner
  • Replace cp releases/index.json "$RESULTS/index.json" with track.py merge-index, which merges newly swept releases into the existing index rather than overwriting it, so HTML reports are generated from all historical data regardless of the sweep date window

Nightly workflow (compile-perf-nightly.yml)

  • Add a "Set up Python" step: Python is not on PATH by default on the self-hosted runner
  • Use python3 instead of python throughout: python is not available on the runner

bench.py

  • Remove diagnostics_errors workload — slangc aborts before emitting timer output when compilation fails with many errors; the workload contributed no performance data
  • Remove dead expect_fail plumbing — no workload sets expect_fail=True after diagnostics_errors is removed

report.py / breakdown.py

  • Write HTML/SVG files with explicit encoding="utf-8": Windows Python defaults to cp1252, which can't encode all of the Unicode characters used in the generated HTML, causing UnicodeEncodeError
  • Read SVG files with explicit encoding="utf-8": fixes the mojibake that appeared in place of × on the generated pages
  • Fix remaining write calls in breakdown.py that were missing encoding="utf-8" (workload .html files and breakdown.svg)
  • Fix workload page back link: ../report_per_workload.html → ../index.html
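The two encoding failure modes described in the list above can be reproduced in a few lines. This is a minimal sketch, not code from report.py or breakdown.py: the helper names `can_encode`, `write_text`, and `read_text` are hypothetical, and the minus sign U+2212 is only an assumed example of a character that cp1252 cannot represent.

```python
import os
import tempfile

MINUS = "\u2212"  # MINUS SIGN: assumed example, has no cp1252 encoding
TIMES = "\u00d7"  # MULTIPLICATION SIGN: encodable in both cp1252 and UTF-8


def can_encode(text, codec):
    """Return True if `text` can be encoded with `codec` without error."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def write_text(path, text):
    """Write with an explicit encoding so the platform default is irrelevant."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_text(path, encoding="utf-8"):
    """Read with an explicit encoding; a mismatched codec yields mojibake."""
    with open(path, encoding=encoding) as f:
        return f.read()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.svg")
    write_text(path, "2" + TIMES)
    print(can_encode(MINUS, "cp1252"))        # write-side failure mode
    print(read_text(path) == "2" + TIMES)     # matching codecs round-trip
    print(read_text(path, "cp1252") == "2" + TIMES)  # mismatched read
```

The write-side error only occurs for characters outside cp1252, while the read-side mojibake affects any non-ASCII character written as UTF-8, which is why both the write and the read calls need the explicit argument.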

lib/manifest.py

  • Remove diagnostics_errors WorkloadSpec entry and expect_fail field

lib/workloads.py

  • Remove gen_diagnostics_errors generator; update gen_diagnostics_clean docstring

tools/compile-perf/README.md

  • Update diagnostics_clean description — no longer a "control for errors − clean" since diagnostics_errors is gone

track.py

  • Add merge-index command: merges a new index.json into the existing results repo index by tag, preserving all prior release entries
  • Add encoding="utf-8" to all JSON file writes

Test plan

  • Validated end-to-end on the nvrgfx-perf runner: release sweep completes 25/25 workloads, pushes results, generates HTML reports, deploys to GitHub Pages successfully
  • Unicode characters (×, ·) render correctly on the generated pages
  • Workload page back links work correctly

@jvepsalainen-nv jvepsalainen-nv requested a review from a team as a code owner June 23, 2026 08:09
@jvepsalainen-nv jvepsalainen-nv added the pr: non-breaking PRs without breaking changes label Jun 23, 2026
@jvepsalainen-nv jvepsalainen-nv requested review from bmillsNV and removed request for a team June 23, 2026 08:09
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Two GitHub Actions workflows are updated to use explicit Python 3 setup via actions/setup-python and invoke compile-perf scripts via python3 instead of python. The release-sweep workflow adds Windows Git bash directories to PATH before bash steps execute. A logic update in bench.py relaxes timer validation for expected-failure workloads that may not emit timer output, with corresponding configuration changes in manifest.py and documentation updates.

Changes

Compile Perf Workflow and Tool Updates

| Layer / File(s) | Summary |
|---|---|
| Windows Git bash and Python setup for release-sweep (.github/workflows/compile-perf-release-sweep.yml) | Adds Git bash directories to PATH on Windows runners and introduces actions/setup-python@v6 to configure Python 3.x before running compile-perf tools. |
| Release-sweep workflow Python3 interpreter migration (.github/workflows/compile-perf-release-sweep.yml) | Updates release-sweep tool invocations (fetch_releases.py, sweep.py, track.py in resync pipeline) to use python3, switches the commit message runner-id computation to python3 track.py, and updates report.py invocation in HTML generation. |
| Nightly workflow Python setup and Python3 migration (.github/workflows/compile-perf-nightly.yml) | Adds actions/setup-python@v4 to configure Python 3.x and updates all tool invocations (bench.py, track.py, report.py, trend.py) to use python3 instead of python. |
| Expected-failure workload timer validation and documentation (tools/compile-perf/bench.py, tools/compile-perf/lib/manifest.py, tools/compile-perf/README.md) | Relaxes timer validation in run_spec to accept expect_fail workloads that produce compile errors before timer output is emitted, removes primary_timers configuration from the diagnostics_errors workload, and updates documentation to explain that error-heavy compilation aborts before timer output. |

Suggested reviewers

  • bmillsNV
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main changes: adding bash to PATH and using python3 on Windows. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The pull request description clearly relates to the changeset, detailing specific fixes for Windows runner compatibility issues including bash PATH setup, Python availability, encoding fixes, and workload removal. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands.

@jvepsalainen-nv jvepsalainen-nv changed the title from "compile-perf: add bash to PATH on Windows in release sweep" to "compile-perf: fix Windows runner — add bash to PATH, use python3" Jun 23, 2026
The release sweep uses shell: bash steps but doesn't go through
common-setup (no build needed), so Git Bash wasn't on PATH.
Also replace bare 'python' with 'python3' throughout both workflows —
Windows doesn't have a 'python' alias by default.
@jvepsalainen-nv jvepsalainen-nv force-pushed the fix/compile-perf-bash-path branch from 070e28d to 00bf550 Compare June 23, 2026 08:30

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 4


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a2e0bc20-ad08-4142-a3fc-ccb3707b0ecc

📥 Commits

Reviewing files that changed from the base of the PR and between dacb6c5 and 070e28d.

📒 Files selected for processing (2)
  • .github/workflows/compile-perf-nightly.yml
  • .github/workflows/compile-perf-release-sweep.yml

@coderabbitai coderabbitai Bot left a comment

Contributor

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 4


🛑 Comments failed to post (4)
.github/workflows/compile-perf-nightly.yml (4)

18-20: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Default blank ref to master, not the dispatch branch.

Line 19 promises “blank = master HEAD”, but Line 50 uses github.ref, which is the branch/tag selected when manually dispatching the workflow. That can record non-master runs as nightly ToT data.

🐛 Proposed fix
       - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5
         with:
-          ref: ${{ github.event.inputs.ref || github.ref }}
+          ref: ${{ github.event.inputs.ref || 'master' }}
           persist-credentials: false

As per path instructions, CI/CD workflow changes must verify correct trigger conditions.

Also applies to: 48-51

Source: Path instructions


28-30: 🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect compile-perf workflows that write the perf-results repo and compare concurrency groups.
# Expected: workflows that push to shader-slang/slang-compile-perf share one non-canceling concurrency group.

set -euo pipefail

for f in .github/workflows/compile-perf-nightly.yml .github/workflows/compile-perf-release-sweep.yml; do
  [ -f "$f" ] || continue
  echo "### $f"
  sed -n '/^concurrency:/,/^[^[:space:]]/p' "$f" | sed '$d' || true
  rg -n 'PERF_RESULTS_REPO|slang-compile-perf|git push' "$f" || true
  echo
done

Repository: shader-slang/slang

Length of output: 1633


Use the same concurrency group for both compile-perf-nightly and compile-perf-release-sweep workflows.

Both workflows push to the same shader-slang/slang-compile-perf repository and gh-pages branch, but they use separate concurrency groups (compile-perf-nightly and compile-perf-release-sweep). This allows them to run concurrently, risking non-fast-forward push failures when both attempt simultaneous pushes. Consolidate them under a single shared concurrency group with cancel-in-progress: false to serialize perf-results writes.


44-46: 📐 Maintainability & Code Quality | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check whether the custom nvrgfx-perf runner label is configured for actionlint.
# Expected: actionlint configuration includes nvrgfx-perf, or the workflow label matches an existing configured custom label.

set -euo pipefail

echo "## actionlint-related files"
fd -i 'actionlint|\.github' . -t f | sed -n '1,120p'

echo
echo "## nvrgfx-perf references"
rg -n 'nvrgfx-perf|runner-label|self-hosted' .github -g '*.yml' -g '*.yaml' || true

Repository: shader-slang/slang

Length of output: 3295


🏁 Script executed:

cat -n .github/actionlint.yaml

Repository: shader-slang/slang

Length of output: 408


🏁 Script executed:

sed -n '40,50p' .github/workflows/compile-perf-nightly.yml

Repository: shader-slang/slang

Length of output: 567


Add nvrgfx-perf to the actionlint configuration or update the runner label.

The nvrgfx-perf label at line 46 is not configured in .github/actionlint.yaml. If actionlint is enforced in CI, this workflow will fail linting. Either add nvrgfx-perf to the self-hosted-runner.labels list in .github/actionlint.yaml, or use an existing configured label like perf.

🧰 Tools
🪛 actionlint (1.7.12)

[error] 46-46: label "nvrgfx-perf" is unknown. available labels are "windows-latest", "windows-latest-8-cores", "windows-2025", "windows-2025-vs2026", "windows-2022", "windows-11-arm", "ubuntu-slim", "ubuntu-latest", "ubuntu-latest-4-cores", "ubuntu-latest-8-cores", "ubuntu-latest-16-cores", "ubuntu-24.04", "ubuntu-24.04-arm", "ubuntu-22.04", "ubuntu-22.04-arm", "macos-latest", "macos-latest-xlarge", "macos-latest-large", "macos-26-intel", "macos-26-xlarge", "macos-26-large", "macos-26", "macos-15-intel", "macos-15-xlarge", "macos-15-large", "macos-15", "macos-14-xlarge", "macos-14-large", "macos-14", "self-hosted", "x64", "arm", "arm64", "linux", "macos", "windows", "arc", "benchmark", "build", "falcor", "GCP-T4", "GPU", "perf", "regression-test", "SM80Plus", "vulkancts". if it is a custom label for self-hosted runner, set list of labels in actionlint.yaml config file

(runner-label)

Source: Linters/SAST tools


155-179: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Don’t deploy Pages after a failed report generation.

Because report generation is continue-on-error, later deploy steps still run. If report.py leaves a partial analysis directory before failing, Lines 209-217 can publish incomplete reports.

🛡️ Proposed fix
       - name: Generate HTML reports
+        id: generate_html_reports
         continue-on-error: true
         shell: bash
         run: |
           set -euo pipefail
@@
 
       - name: Checkout gh-pages branch
+        if: steps.generate_html_reports.outcome == 'success'
         continue-on-error: true
         uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5
         with:
@@
 
       - name: Deploy HTML reports to GitHub Pages
+        if: steps.generate_html_reports.outcome == 'success'
         continue-on-error: true
         shell: bash
         env:

Also applies to: 209-217

🧰 Tools
🪛 zizmor (1.26.1)

[warning] 169-176: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)

github-actions[bot]

This comment was marked as outdated.

@jvepsalainen-nv jvepsalainen-nv self-assigned this Jun 23, 2026

@jkiviluoto-nv jkiviluoto-nv left a comment

Contributor

LGTM

@jvepsalainen-nv jvepsalainen-nv added this pull request to the merge queue Jun 24, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to no response for status checks Jun 24, 2026
@jvepsalainen-nv jvepsalainen-nv added this pull request to the merge queue Jun 25, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 25, 2026
@jvepsalainen-nv jvepsalainen-nv added this pull request to the merge queue Jun 26, 2026
@jvepsalainen-nv jvepsalainen-nv removed this pull request from the merge queue due to a manual request Jun 26, 2026
@jvepsalainen-nv jvepsalainen-nv added this pull request to the merge queue Jun 26, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 26, 2026
@nv-slang-bot

Heads-up: this PR was evicted from the merge queue by an infra flake — a single 600s per-test timeout in Falcor test_Materials_scene2_d3d12 (HSigmoid passed, so not a numeric/relErr failure). It's approved and head-green, but GitHub didn't auto-requeue it, so it's currently out of the queue and needs a manual re-add to the merge queue when you get a chance.

@github-actions github-actions Bot left a comment

Contributor

Verdict: 🟡 Has issues — 0 bugs, 2 gaps

The PR fixes two real Windows-runner regressions (bash not on PATH on the release-sweep workflow; python vs python3 interpreter mismatch) and replaces the destructive cp releases/index.json "$RESULTS/index.json" with a track.py merge-index subcommand that preserves the historical release set. The remaining concerns are a stale doc reference outside the touched-file set and the absence of any automated test for the new merge_index logic.

Changes Overview

Windows runner fixes (.github/workflows/compile-perf-nightly.yml, .github/workflows/compile-perf-release-sweep.yml)

  • Adds actions/setup-python@v6 (pinned to SHA a309ff8…, same pin already used elsewhere in the repo).
  • Release-sweep adds a pwsh step that appends C:\Program Files\Git\bin and Git\usr\bin to GITHUB_PATH (matches the wording in .github/actions/common-setup/action.yml, which nightly already inherits via common-setup).
  • All inline python … invocations are switched to python3 …, including the $(python3 … track.py runner-id) command substitution inside the release-sweep commit-message line.
  • Release-sweep gets a new if: always() cleanup step that rm -rf "$GITHUB_WORKSPACE/tools/compile-perf/releases".
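For readers unfamiliar with the GITHUB_PATH mechanism mentioned above: GitHub Actions exposes the path of a file in the GITHUB_PATH environment variable, and directories appended to that file are added to PATH for the following steps of the job. The PR does this in a pwsh step; the Python below is only an illustrative equivalent, with a hypothetical function name and an optional parameter added so it can run outside a workflow.

```python
import os


def add_to_github_path(directories, github_path_file=None):
    """Append each directory, one per line, to the file named by $GITHUB_PATH.

    `github_path_file` is a hypothetical override for local experiments;
    inside a workflow the environment variable is set by the runner.
    """
    target = github_path_file or os.environ["GITHUB_PATH"]
    with open(target, "a", encoding="utf-8") as f:
        for directory in directories:
            f.write(directory + "\n")
    return target


if __name__ == "__main__" and "GITHUB_PATH" in os.environ:
    # The first directory is quoted in the review; the full form of the
    # second one is an assumption about where Git for Windows installs it.
    add_to_github_path([
        r"C:\Program Files\Git\bin",
        r"C:\Program Files\Git\usr\bin",
    ])
```

The change only takes effect for later steps, so the step that writes the file cannot itself rely on bash being found.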

track.py merge-index and index merge semantics (.github/workflows/compile-perf-release-sweep.yml, tools/compile-perf/track.py)

  • The destructive cp releases/index.json "$RESULTS/index.json" line is replaced with python3 track.py merge-index --results "$RESULTS" --index releases/index.json.
  • The new merge_index function reads any existing index.json, keys both inputs by tag (new wins on collision), sorts merged entries by date, and writes the result. register, rebuild, and stamp-runner continue to run afterwards.
  • All JSON file writes in track.py gain encoding="utf-8".

Drop diagnostics_errors workload (tools/compile-perf/bench.py, lib/manifest.py, lib/workloads.py, README.md)

  • The intentionally-failing diagnostics_errors workload (and its expect_fail / expected_fail plumbing on WorkloadSpec and the per-sample result record) is removed; slangc was aborting before emitting timer output for it, so it contributed no perf data.
  • gen_diagnostics_clean's docstring is rewritten; the README row and the errors − clean caveat are dropped.

Encoding fixes for Windows HTML/SVG generation (tools/compile-perf/breakdown.py, tools/compile-perf/report.py)

  • Every open(…, "w") writing HTML/SVG and every open(…) reading SVG content gains encoding="utf-8". Fixes the UnicodeEncodeError produced by Windows Python's cp1252 default on glyphs it cannot encode, and the mojibake that replaced × when SVG output was re-read.
  • Per-workload page back link changes from ../report_per_workload.html to ../index.html, matching the deploy step that renames the file.
Findings (2 total)
| Severity | Location | Finding |
|---|---|---|
| 🟡 Gap | tools/compile-perf/lib/analyze.py:12 | Module docstring still claims it "derives the diagnostics path-cost series (errors - clean)"; with diagnostics_errors removed there is nothing to subtract. File is not in the PR's touched-file set. |
| 🟡 Gap | tools/compile-perf/track.py:166-186 (new merge_index) | New non-trivial merge logic has no automated test; missing-tag raises KeyError, missing-date silently sorts to the front, and several JSON loads use bare open() without with. |

@@ -10,7 +10,6 @@
- Every compilable workload exposes a ``[shader("compute")]`` entry named
``computeMain`` so ``slangc`` auto-discovers it; results are written to a

Contributor

🟡 Gap: Stale diagnostics_errors references remain in lib/analyze.py and other docstrings the PR did not touch

The PR drops the diagnostics_errors workload from manifest.py and the gen_diagnostics_errors generator from workloads.py, and updates the README table. But tools/compile-perf/lib/analyze.py is not in the touched-file set, and its module docstring still claims:

"""Stack per-release perf results into time-series and flag regressions.
...
Also derives the diagnostics path-cost series (errors - clean).
"""

(lib/analyze.py:12)

After this PR there is no errors series to subtract, so the docstring describes a behavior that the module no longer (and no longer can) implement. This is the same class of stale-doc issue the PR is fixing in workloads.py and the README — it just missed this file.

Suggestion: drop line 12 of lib/analyze.py, since no remaining function computes errors − clean. While there, grep once more for diagnostics_errors, expect_fail, expected_fail, and errors - clean / errors − clean across tools/compile-perf/ to make sure nothing else points at the removed workload — e.g. trailing references in sweep.py, trend.py, or fixture data.



def stamp_runner(results_dir, label):
    rp = os.path.join(results_dir, "runner.json")

Contributor

🟡 Gap: merge_index is new non-trivial logic with no automated test, and several edge cases are silently ambiguous

def merge_index(results_dir, new_index_path):
    ...
    dest = os.path.join(results_dir, "index.json")
    existing = {}
    if os.path.exists(dest):
        for r in json.load(open(dest)):
            existing[r["tag"]] = r
    n_before = len(existing)
    for r in json.load(open(new_index_path)):
        existing[r["tag"]] = r
    merged = sorted(existing.values(), key=lambda r: r.get("date", ""))

merge_index replaces the previous cp releases/index.json "$RESULTS/index.json" line and is now the single source of truth for what releases the report covers. There is no test_*.py anywhere under tools/compile-perf/ (Glob tools/compile-perf/**/test_*.py returns zero results) and no CI step that exercises the function with anything other than the real production input on the self-hosted runner — i.e. the only signal that the merge worked is that the next release-sweep nightly produced a non-broken report.

Behaviors that are not pinned by any test:

  • Missing tag field. existing[r["tag"]] = r raises KeyError for an entry without tag. Cold-start with a freshly fetched releases/index.json of unknown shape will abort the whole sweep job mid-way through commit/push rather than fail early.
  • Missing date field. key=lambda r: r.get("date", "") silently sorts dateless entries to the front of the series, which then propagates into the tracking series and the HTML report — a bug an automated test would catch instantly but a human eyeballing the live report likely will not.
  • Duplicate tags within a single input. Last-wins, but this is undocumented.
  • Mixed-shape merge (existing index from a previous tool version vs new index). No schema check.
  • Empty/malformed JSON in either file. Bubbles up as a raw JSONDecodeError; the runner is left with the cleanup step racing the half-written index.json.
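Three of the behaviors listed above can be observed on in-memory data. The sketch below mirrors only the dictionary-and-sort logic quoted from merge_index; the function name `merge_by_tag`, the tags, and the ISO-style date strings are hypothetical and chosen for illustration.

```python
def merge_by_tag(existing_entries, new_entries):
    """Key both inputs by "tag" (later entries win), then sort by "date"."""
    merged = {}
    for entry in existing_entries:
        merged[entry["tag"]] = entry      # KeyError if "tag" is absent
    for entry in new_entries:
        merged[entry["tag"]] = entry      # new index overrides on collision
    # Entries without "date" get "" as their key and therefore sort first.
    return sorted(merged.values(), key=lambda e: e.get("date", ""))


if __name__ == "__main__":
    old = [
        {"tag": "v1", "date": "2025-01-01"},
        {"tag": "v2", "date": "2025-02-01"},
    ]
    new = [
        {"tag": "v2", "date": "2025-02-01", "note": "re-swept"},
        {"tag": "v3"},                    # dateless entry
    ]
    for entry in merge_by_tag(old, new):
        print(entry)
    try:
        merge_by_tag(old, [{"date": "2025-03-01"}])
    except KeyError as exc:
        print("missing tag raised KeyError:", exc)
```

Running it shows the dateless entry ahead of older releases, which is the ordering the review warns would flow into the tracking series unnoticed.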

Coupled with the fact that json.load(open(dest)) and json.load(open(new_index_path)) don't use with, a KeyError mid-iteration leaves the read handle on dest open while the cleanup paths run, which on Windows can interact badly with the subsequent merged → open(dest, "w") in a retry / re-run scenario.

Suggestion: add a minimal tools/compile-perf/tests/test_track_merge_index.py (pytest, stdlib-only) covering at least: (a) cold-start (no dest file); (b) merge overrides by tag; (c) sort by date; (d) missing-tag and missing-date raise or have a documented fallback. Wire it into a pull_request paths-filtered job (paths: [tools/compile-perf/**, .github/workflows/compile-perf-*.yml]) so changes in this directory get the test run on ubuntu-latest before they reach the self-hosted runner. While there, wrap the two json.load(open(...)) calls in with blocks for symmetry with the new with open(dest, "w", encoding="utf-8") in the same function.
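One possible shape for that suggestion, offered as a sketch rather than as the PR's implementation: a variant that reads both files inside with blocks and validates every entry before the destination is opened for writing. The names `load_index` and `merge_index_checked`, the choice of ValueError, and the decision to reject dateless entries are all assumptions.

```python
import json
import os


def load_index(path):
    """Load a release index and fail early if an entry lacks "tag" or "date"."""
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict) or "tag" not in entry or "date" not in entry:
            raise ValueError(f"{path}: entry {i} lacks 'tag' or 'date'")
    return entries


def merge_index_checked(results_dir, new_index_path):
    """Merge new_index_path into results_dir/index.json by tag, sorted by date."""
    dest = os.path.join(results_dir, "index.json")
    merged = {}
    if os.path.exists(dest):
        for entry in load_index(dest):
            merged[entry["tag"]] = entry
    for entry in load_index(new_index_path):
        merged[entry["tag"]] = entry          # new wins on collision
    ordered = sorted(merged.values(), key=lambda e: e["date"])
    # Both inputs are fully validated before the destination is rewritten.
    with open(dest, "w", encoding="utf-8") as f:
        json.dump(ordered, f, indent=2)
    return ordered
```

Because validation happens before the write, a malformed input leaves the existing index.json untouched, and the cold-start, override, and sort cases reduce to short assertions against a temporary directory.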

Labels

pr: non-breaking PRs without breaking changes

4 participants