render-test: capture shader-abort message to stdout + abort message tests #11799
nv-slang-bot[bot] wants to merge 2 commits
When `-render-features shader-abort` is requested, render-test now captures the host-delivered "Shader abort: ..." debug message (forwarded by slang-rhi via the Vulkan device-fault path) and prints it to stdout, so a slang-test `filecheck=` line can assert the formatted abort message. The capture is strictly gated on the shader-abort feature being requested, leaving every other render test byte-for-byte unchanged. A fired abort loses the device, so update()'s result is swallowed in this mode (it was already unchecked on the normal path) to reach the print. tests/spirv/abort-message.slang adds plain / %f / multi-arg / mixed-type abort cases, each asserting the captured message; empty .expected.txt siblings satisfy the harness buffer comparison against the empty (device-loss) output buffer. These -vk lines skip cleanly where the feature is unavailable; the GPU message round-trip is verified on capable hardware. Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
Remove all of the "*.expected.txt" files.
We prefer the inline `//CHK:` syntax.
Done — pushed fa1a35d976. Removed all four *.expected.txt files and moved the assertions inline.
Each case now combines two inline FileCheck prefixes in one directive:
- `filecheck=<PREFIX>` asserts the captured `Shader abort: …` message on render-test's stdout (what the capture mode prints), and
- `filecheck-buffer=EMPTY` asserts the readback buffer stays empty: a fired abort loses the device, so nothing is written back; the shared `// EMPTY-NOT: {{.}}` satisfies `COMPARE_COMPUTE`'s mandatory buffer comparison without a sidecar file.
I kept filecheck= for the message per your earlier note. One choice worth your call: since the abort loses the device there's no readback, so the buffer side is an emptiness assert rather than a positive value check. If you'd rather have render-test write the captured Shader abort: … text into the output buffer so a single positive filecheck-buffer= checks it (dropping the stdout path and the empty-assert entirely), I'm happy to switch — just say the word.
Verified locally: the combined directives parse and the four -vk lines skip cleanly on a GPU-less runner, and the existing -target spirv emit case still passes. As before, the abort-fires → message round-trip → FileCheck match is the on-hardware run.
🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify.
Per review on #11799: remove the four empty .expected.txt sidecar files and assert inline instead. Each case now combines `filecheck=<MSG>` (asserts the captured "Shader abort: ..." text on render-test stdout) with `filecheck-buffer=EMPTY` (a CHECK-NOT asserting the device-loss readback buffer stays empty), satisfying COMPARE_COMPUTE's stdout and buffer comparisons inline without sidecar expected files. Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
Looks good to me. But blocked until the following PR gets merged on slang-rhi:
Thanks! Agreed on the gating. I'll hold this draft, and the underlying submodule bump it stacks on (#11792), until slang-rhi#782 merges. At that point I'll move the
Motivation
Follow-up to #11790 / #11792. PR #11792 bumps `external/slang-rhi` to the head of slang-rhi#782, which adds the Vulkan `Feature::ShaderAbort` ("shader-abort") path that recovers a shader `abort()`'s message after the device-fault and forwards it to the host debug callback. To actually exercise that message round-trip from a `.slang` test, `render-test` needs to surface the captured message where a `slang-test` directive can assert it.

Per maintainer guidance on #11792: "capture the handleMessage output automatically when `-render-features shader-abort` is set, and let a `filecheck=` line assert it." This PR implements exactly that, plus the requested `%f` / multiple-argument / mixed-type abort cases.

Proposed solution
When (and only when) `-render-features shader-abort` is requested, `render-test` enters a device-loss-tolerant capture mode: it scopes a local debug callback over the dispatch, then prints any `"Shader abort: ..."` message it received to stdout. A `COMPARE_COMPUTE(filecheck=...)` line FileChecks render-test's stdout, so the formatted abort text can be asserted directly. The mode is strictly gated, so every other render test is byte-for-byte unchanged.
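The capture mode described above can be sketched in isolation. This is a minimal model under stated assumptions, not the code in `render-test-main.cpp`: `FakeDevice`, `DebugSink`, `ScopedCapture` and `runDispatch` are hypothetical stand-ins invented for illustration, and only the `"Shader abort: "` prefix and the gating idea come from this PR.

```cpp
#include <cassert>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a host debug-message sink (not the slang-rhi interface).
using DebugSink = std::function<void(const std::string&)>;

// Hypothetical device: forwards debug messages to whichever sink is installed.
struct FakeDevice
{
    DebugSink sink;
    void emit(const std::string& message) const
    {
        if (sink)
            sink(message);
    }
};

// RAII scope: installs a capturing sink, restores the previous sink on exit.
class ScopedCapture
{
public:
    ScopedCapture(FakeDevice& device, std::vector<std::string>& out)
        : m_device(device), m_previous(device.sink)
    {
        m_device.sink = [&out](const std::string& message) { out.push_back(message); };
    }
    ~ScopedCapture() { m_device.sink = m_previous; }

private:
    FakeDevice& m_device;
    DebugSink m_previous;
};

// Runs one dispatch and returns how many abort messages were printed to `out`.
int runDispatch(
    FakeDevice& device,
    bool shaderAbortRequested,
    const std::function<int()>& update,
    std::ostream& out)
{
    if (!shaderAbortRequested)
    {
        update(); // normal path: same call, result unchecked
        return 0;
    }

    std::vector<std::string> captured;
    {
        ScopedCapture scope(device, captured); // capture only around the dispatch
        (void)update();                        // a fired abort loses the device; ignore the failure
    }

    const std::string prefix = "Shader abort: ";
    int printed = 0;
    for (const std::string& message : captured)
    {
        if (message.compare(0, prefix.size(), prefix) == 0)
        {
            out << message << '\n'; // this is the text a filecheck= line would match
            ++printed;
        }
    }
    return printed;
}
```

Passing `std::cout` as `out` gives the stdout behavior; taking a stream parameter is only a convenience of this sketch so the output can be inspected.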
Change summary
| File | Change |
|---|---|
| `tools/render-test/render-test-main.cpp` | `_isShaderAbortRequested` + `_printCapturedShaderAbortMessages` helpers; in `_innerMain`, a gated branch that scopes a capture callback over `app.update()` and prints the abort text to stdout. Non-abort path unchanged. |
| `tests/spirv/abort-message.slang` (new) | `COMPARE_COMPUTE(filecheck=...)` `-vk` cases: plain string, `%f`, multi-arg, mixed `%d %f`. Each is its own entry point + `//TEST:` line (a fired abort loses the device, terminal for that invocation). |
| `tests/spirv/abort-message.slang{,.1,.2,.3}.expected.txt` (new, empty) | Satisfy `runComputeComparisonImpl`'s mandatory buffer comparison: a lost device writes no buffer, so the empty pre-cleared `.actual.txt` matches an empty `.expected.txt`. |

The emit-side coverage (`-target spirv -capability abort` → `OpAbortKHR`) and the device-keep-alive buffer-compare line continue to live in `tests/spirv/abort-runtime.slang` (PR #11792, unchanged).

Concepts and vocabulary
- `filecheck=` vs `filecheck-buffer=`: in a render-test COMPARE_COMPUTE, `filecheck=` runs FileCheck against render-test's stdout/stderr (`_validateOutput`), while `filecheck-buffer=` runs against the readback buffer file. The abort message arrives on stdout, so these tests use `filecheck=`.
filecheck=.message back via
vkGetDeviceFaultDebugInfoKHRand delivers it as a host debug message prefixed"Shader abort: ".ScopedCoreDebugCallback/CoreDebugCallback— the existing render-test debug-callbackbridge/buffer (
tools/render-test/slang-support.h); the capture mode reuses them rather thanadding any new interface or ABI.
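For the buffer side, the review thread settles on a lone `EMPTY-NOT: {{.}}` check. What that check accepts can be modelled as below. This is a sketch, assuming FileCheck's `.` does not match a line break; `passesEmptyNot` is a hypothetical name, and the real matching is done by FileCheck, not by this function.

```cpp
#include <cassert>
#include <string>

// Models a single `EMPTY-NOT: {{.}}` directive with no other checks:
// the check fails as soon as any character other than a line break appears.
// Assumption: `.` in the pattern does not match '\n'.
bool passesEmptyNot(const std::string& bufferText)
{
    for (char c : bufferText)
    {
        if (c != '\n')
            return false; // a written value (or any other text) would match `.`
    }
    return true; // empty, or only blank lines: nothing for `.` to match
}
```

Under this model an untouched, never-written readback buffer passes, while a buffer holding even one value (for example `0`) fails, which is why the emptiness assert can stand in for the sidecar `.expected.txt` files.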
Process report
- The gate is `shaderAbortMode = _isShaderAbortRequested(options)`; the else-branch is the identical `app.update();`. Zero behavior change for any test that does not request `shader-abort`. `_isShaderAbortRequested` reuses the canonical `_getFeatureFromName` name→feature map (one source of truth), which returns `_Count` for unknown names with no side effects, so failure timing for other tests is unchanged.
- `app.update()`'s result is swallowed in this mode (it was already unchecked on the normal path) so we reach the print; `filecheck=` ignores the process result code. The capture callback is scoped to just the dispatch.
- Empty `.expected.txt` siblings match the empty (never-written) output buffer, which is the cleanest neutralizer for the always-on buffer comparison.
- Verified locally: the existing `abort-runtime.slang` SIMPLE emit line still PASSES; all four new `-vk -render-features shader-abort` lines SKIP cleanly with no GPU (`SLANG_E_NOT_AVAILABLE` → ignored, not failed). Not verifiable here (no GPU): the abort actually firing → device-fault → `"Shader abort: ..."` round-trip → FileCheck match. That end-to-end check is a maintainer hardware run.
Dependency / merge status
Draft, and cannot merge before #11792. This branch is based on #11792's `fix/issue-11790` branch because `rhi::Feature::ShaderAbort` only exists in the bumped submodule, so the capture code does not compile against `master`. #11792 itself is gated on slang-rhi#782 merging. Kept separate (rather than folded into #11792) so #11792 stays a minimal submodule-bump + gated-test that can come out of draft the moment slang-rhi#782 lands; this can fold into #11792 instead if preferred.
Related to #11790. Does not auto-close it (the GPU round-trip is verified separately).