Print ci commands#11497
Conversation
The conditional logic that assembles slang_test_args only shows the bash source in the log, not the final arguments, making it hard to tell which branches fired. Print the resolved command with printf %q before each invocation so the exact arguments are visible and copy-pasteable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Remove the test-category input/usage from the reusable test workflows so slang-test runs without -category and falls back to its default category (full). Drops the input definitions in ci-slang-test.yml and ci-slang-sanitizer.yml, the hardcoded -category full in the container workflow, all test-category: values passed from ci.yml and ci-nightly-sanitizer.yml, and the unused test-category matrix entries in the falcor/regression/perf/cts/release workflows. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
macOS has no native Vulkan backend, so exclude the vk API on macOS test jobs; Vulkan-only test variants are filtered out by the test runner rather than run and failed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Address review feedback: separate the printf command-echo block from the slang-test invocation with a blank line for readability. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
slang-test honors only one -api option; a second occurrence silently overrides the first. Build one API expression from the cpu-only, gpu-api-only, and macOS-no-vk restrictions and append it once, folding the macOS vk exclusion into the existing expression (all-vk on its own, or -vk subtracted from a cpu/gpu restriction) instead of adding a second -api. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Restructure the API selection into one explicit if/elif/else that picks exactly one expression: cpu-only -> cpu+llvm; gpu-api-only -> the GPU API list (without vk on macOS); otherwise all (all-vk on macOS). macOS has no native Vulkan, so its expressions exclude vk; -api all matches slang-test's default of all APIs enabled. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
slang-test parses each -api expression relative to the previous result,
but RenderApiUtil::parseApiFlags only keeps that previous value when the
expression begins with an operator ('+' or '-'). A name-leading expression
resets the API set from scratch, so a second -api like 'cpu' after '-api vk'
silently dropped the earlier selection with no diagnostic.
Reject a second-or-later -api whose expression does not begin with '+' or
'-', pointing the user at the two valid forms (combine into one expression,
or lead with an operator to accumulate). Operator-leading expressions such
as '-api all -api -vk' still accumulate as before.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
-synthesizedTestApi feeds RenderApiUtil::parseApiFlags the same way as -api, so a name-leading expression after an earlier -synthesizedTestApi silently discards it. Reject that case symmetrically, tracking the two options independently. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The script required Bash 4 only for readarray and the |& pipe operator. Replace readarray with a read_lines helper (a portable readarray -t emulation), swap |& for 2>&1 |, and lower the version gate to 3.2 so the formatter runs on a stock macOS shell without 'brew install bash'. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ted-target This neural diagnostic test fails in CI; tracked as a separate issue. Add all four target variants (-target hlsl/wgsl/glsl/cpp) to expected-failure-github.txt to unblock CI for now. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Only the base (-target hlsl) variant fails on macOS; the .1/.2/.3 (wgsl/glsl/cpp) variants pass. Drop the over-broad entries so those variants keep reporting real status. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Dropping -category means the GitHub-hosted aarch64 jobs (no GPU) now run the full suite, which includes gfx-unit-test-tool Vulkan tests that fail to create a Vulkan instance without a GPU. Add the four that were not yet covered to expected-failure-no-gpu.txt, alongside the existing Vulkan entries. The list is only applied to full-gpu-tests=false jobs, so GPU runners still expect these to pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…macOS This dynamic-dispatch test is flaky on the macOS test server: the two existential-typed constant buffers' outputs come back swapped, so the FileCheck for 15/16 fails. It passes locally and on the macOS debug job. Add it to expected-failure-github.txt to unblock CI; the ordering root cause (from shader-slang#10393) is tracked as a separate issue. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… test The macOS failure is intermittent (passes locally and on the macOS debug job), so it does not belong in expected-failure-github.txt, which is for consistent failures. Handle it as a flaky test via CI rerun instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Expand the -api help to document the expression syntax (names reset the set; +/- add/remove) and enumerate every available keyword: all, none, vk (vulkan), dx11 (d3d11), dx12 (d3d12), mtl (metal), cuda, cpu, llvm, wgpu (webgpu). Also note -synthesizedTestApi accepts the same expression. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Metal is the only GPU API available on macOS, so select -api mtl for the macOS jobs instead of all-vk. (cpu)/(llvm) variants are skipped here and covered by the CPU-only Linux tier; no-API tests still run because their requiredFlags==0. This also stops running CPU-variant GPU-dispatch tests on the macOS test server. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Need the big picture first? Review this PR in Change Stack to see what changed before going file by file. No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR removes ChangesTest Category Removal and Command Logging
Bash 3.2 Compatibility for Formatting Script
Test API Filter Option Validation and Tests
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
For the failing aarch64 tests, I think we need to introduce a new file tests/expected-failure-aarch64 and silence it for this PR. |
Dropping -category means the GitHub-hosted aarch64 jobs run the full suite with -api all, which includes gfx-unit-test-tool Vulkan tests that fail to create a Vulkan instance on the aarch64 runners. Add a dedicated tests/expected-failure-aarch64.txt, applied only when platform == aarch64, listing the four affected tests. The underlying aarch64 Vulkan behavior is tracked as a separate follow-up issue.
All callers pass the literal name 'files', so the dynamic-name indirection that motivated eval is unnecessary. Populate the global 'files' array directly, which is Bash 3.2 compatible (no readarray/mapfile or namerefs) and avoids eval entirely.
|
@coderabbitai review |
✅ Action performedReview finished.
|
jkwak-work
left a comment
There was a problem hiding this comment.
Looks good to me
There was a problem hiding this comment.
Verdict: 🟡 Has issues — 2 gap(s)
CI-infrastructure refactor that drops -category (slang-test default = full), selects a single -api per job, and prints the resolved command via printf %q. Adds a new apiOptionSeen / synthesizedApiOptionSeen guard in tools/slang-test/options.cpp that turns silent override of repeated -api into a hard error. Also bumps extras/formatting.sh to support bash 3.2 and bumps the HLSL profile in one neural diagnostic test from cs_6_0 → cs_6_6 (verified necessary because descriptor_handle requires _sm_6_6 on HLSL per slang-capabilities.capdef:1425).
Changes Overview
slang-test option parser (tools/slang-test/options.cpp)
- New: rejects a second-or-later
-api(and-synthesizedTestApi) whose expression begins with a name (e.g.-api vk -api cpu), sinceRenderApiUtil::parseApiFlagswould silently discard the earlier value. Operator-leading expressions (+cuda,-vk) still accumulate. Help text rewritten to enumerate keywords and operator semantics.
CI workflows (.github/workflows/ci.yml, ci-slang-test.yml, ci-slang-test-container.yml, ci-slang-sanitizer.yml, ci-nightly-sanitizer.yml, plus compile-regression-test.yml, falcor-*.yml, release.yml, vk-gl-cts-nightly.yml)
- Removed the
test-categoryworkflow input/usage everywhere; slang-test falls back to its defaultfull. Smoke tiers (test-macos-debug-clang-aarch64,test-linux-debug-gcc-aarch64,test-linux-release-gcc-aarch64) widen fromsmoketofull. ci-slang-test.ymlcollapses the previous two--apichain into a single expression:cpu-only→cpu+llvm;os == macos→mtl;gpu-api-only→vk+cuda+dx11+dx12+mtl+wgpu; otherwiseall.- Adds
printf '+ %q' …immediately before eachslang-testinvocation in three workflows so CI logs show the fully-resolved command line.
New aarch64 expected-failure list (tests/expected-failure-aarch64.txt)
- Silences four
gfx-unit-test-tool/*Vulkan.internaltests that fail to create a Vulkan instance on the aarch64 GitHub-hosted runners. Wired in underinputs.platform == aarch64inci-slang-test.yml.
Build script (extras/formatting.sh)
- Lower minimum bash from 4 to 3.2 (macOS default
/bin/bash). Adds aread_linesfunction emulatingreadarray -t files. Replaces|&(bash 4) with2>&1 |(bash 3.2-compatible) in two clang-format pipelines.
Test profile bump (tests/neural/network-parameter-layout-converter-unsupported-target.slang)
- HLSL diagnostic test profile bumped
cs_6_0→cs_6_6so thedescriptor_handlecapability (needed byRWStructuredBuffer<T>.Handle) is satisfied and the__target_switchdefault-armstatic_assertis the diagnostic that fires.
Findings (2 total)
| Severity | Location | Finding |
|---|---|---|
| 🟡 Gap | tools/slang-test/options.cpp:~447,~486 |
New -api / -synthesizedTestApi rejection path is a user-visible CLI behavior change with no regression test; the duplicated guard is also a future-drift hazard. |
| 🟡 Gap | tests/expected-failure-aarch64.txt + ci.yml smoke→full conversions |
smoke→full widening covers Linux aarch64 only — test-macos-debug-clang-aarch64 has no tests/expected-failure-macos*.txt safety net; the new aarch64 file also lacks a tracking-issue reference. |
It turned out that we haven't been testing all of
mtltests on mac-os.This PR removes the confusing option of
-test-categoryand enablesmtltest on mac-os tests.Summary
CI maintenance for how
slang-testis invoked in GitHub Actions, plus a relatedslang-test option-parsing fix surfaced along the way. Changes:
1. Print the fully-resolved
slang-testcommandThe conditional bash that assembles
slang_test_argsis echoed as source in thelog (the
if [[ ... ]]; then ...lines), but the final argument list is neverprinted — so it's impossible to tell which branches fired or which arguments
slang-testactually ran with. Print the resolved command withprintf %q(shell-quoted, copy-pasteable, prefixed with
+likeset -x) immediatelybefore each invocation in
ci-slang-test.yml,ci-slang-sanitizer.yml, andci-slang-test-container.yml.2. Stop passing
-category; use slang-test's defaultRemove the
test-categoryinput/usage everywhere soslang-testruns without-categoryand falls back to its default category (full). Drops the inputdefinitions, the hardcoded
-category fullin the container workflow, alltest-category:values passed fromci.yml/ci-nightly-sanitizer.yml, andthe unused
test-categorymatrix entries in the falcor/regression/perf/cts/releaseworkflows.
3. Select a single
-apiexpression per job (macOS → Metal only)slang-testhonors only one-apioption, so the workflow now picks exactly oneexpression by branch:
cpu-only→cpu+llvm; macOS →mtl(Metal is the only GPUAPI on macOS);
gpu-api-only→ the GPU API list; otherwiseall. On macOS,(cpu)/(llvm)variants are skipped (covered by the CPU-only Linux tier) whileno-API tests still run (their
requiredFlags==0).-api allmatches slang-test'sdefault, so non-macOS behavior is unchanged.
4. slang-test: error on a
-apithat silently discards an earlier oneRenderApiUtil::parseApiFlagsonly preserves the previous-apiresult when thenext expression begins with an operator (
+/-); a name-leading expression(e.g.
cpu,vk,all) resets the API set from scratch, so-api vk -api cpusilently dropped
vkwith no diagnostic. slang-test now rejects a second-or-later-apiwhose expression does not begin with+or-, pointing the user at thetwo valid forms. Operator-leading accumulation (
-api all -api -vk) still works.Notes to the reviewers
printf %qverified compatible with macOS bash 3.2.57; empty-array expansion issafe because the Actions default shell uses
-e -o pipefail, notset -u.now run the
fullsuite sincefullis slang-test's default category.-apiguard is intentionally narrow: it fires only on the name-leadingcase that silently overrides; operator-leading expressions still accumulate.