Integration: bump slang-rhi #782 (VK_KHR_shader_abort) + abort test #11792
nv-slang-bot[bot] wants to merge 1 commit into …
…d abort test
Bump external/slang-rhi 687dc18 -> dfd2e66 (shader-slang/slang-rhi#782 head, branch fix/issue-781), which adds Feature::ShaderAbort / "shader-abort", enables VK_KHR_shader_abort + VK_KHR_device_fault, and bumps slang-rhi's Vulkan-Headers dependency to v1.4.347. slang-rhi self-fetches those headers via FetchPackage, so bumping the pin is sufficient; slang's own external/vulkan is not used by the slang-rhi Vulkan backend.
Add tests/spirv/abort-runtime.slang: a SIMPLE SPIR-V emit line (verified locally: OpAbortKHR) plus a -vk COMPARE_COMPUTE line gated on -render-features shader-abort that skips cleanly where the feature/device is unavailable. The abort() call sits on a branch the dispatch never reaches, so the abort-capable pipeline is exercised without device loss.
slang-rhi#782 is unmerged, so the pin points at a PR-branch commit; this PR stays draft and cannot merge until #782 lands and the pin moves to a merged main SHA.
Related to #11790, shader-slang/slang-rhi#782 and #781.
Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
Thanks @jkwak-work, taking both points together:
Testing the messages from … To genuinely test the messages I'd add a …
🤖 Generated by an automated Slang coworker; may be inaccurate. A human maintainer should verify.
@nv-slang-bot, that sounds good. Let's go with "capture the handleMessage output automatically when -render-features shader-abort is set and let a filecheck= line assert it".
On it, @jkwak-work. Auto-capture it is: when …
One logistics note (your call, since you own the harness): I'd suggest landing this capture mode + message tests as a separate follow-up PR rather than in #11792. That keeps #11792 a minimal submodule-bump + gated-skip test that can come out of draft the moment slang-rhi#782 merges, decoupled from the larger render-test C++ change. Like the rest of this work, I can only compile-verify the capture mode here; the actual GPU abort-firing + message round-trip is your hardware run. If you'd rather have it all in #11792, just say so. I'll start building it now either way.
Done: implemented the auto-capture you described in #11799 (kept as a small follow-up so this PR stays a clean submodule-bump + gated-test that can leave draft the moment slang-rhi#782 merges; happy to fold it into #11792 instead if you'd rather). What it does, matching your steer: …
Verified here: builds clean; the existing …
Looks good to me, but this is blocked until the following PR is merged in slang-rhi: …
Thanks! Agreed on the gating. Holding this as a draft until slang-rhi#782 merges; at that point I'll move the …
Summary
Integration PR for the abort() runtime path on Vulkan (VK_KHR_shader_abort), so the slang-rhi support in shader-slang/slang-rhi#782 can be exercised through slang-test. It does two things:
1. Bumps the external/slang-rhi submodule from 687dc18 to dfd2e66, the head of slang-rhi PR #782 (fix/issue-781, "Vulkan: support VK_KHR_shader_abort + VK_KHR_device_fault", Fixes shader-slang/slang-rhi#781). That commit adds Feature::ShaderAbort / "shader-abort", enables VK_KHR_shader_abort + VK_KHR_device_fault, and bumps slang-rhi's Vulkan-Headers dependency to v1.4.347 (the first release with the VK_KHR_shader_abort symbols).
2. Adds tests/spirv/abort-runtime.slang, which compiles and runs a compute shader containing abort(), gated on the new render feature so it skips cleanly where unsupported.
Important
This PR is intentionally a draft and cannot merge yet. slang-rhi #782 is still open, so the submodule pin (dfd2e66) points at a PR-branch commit, not a merged SHA. This is deliberate: it lets #782 be integration-tested against slang-test before it lands. Once #782 merges, the pin must be moved to the merged slang-rhi main SHA before this can come out of draft.
How the Vulkan-Headers v1.4.347 bump reaches slang's build
slang-rhi sources its own Vulkan headers: its CMakeLists.txt does FetchPackage(vulkan_headers URL "${SLANG_RHI_VULKAN_HEADERS_URL}") (guarded only by SLANG_RHI_ENABLE_VULKAN) and links the slang-rhi library against the resulting slang-rhi-vulkan-headers INTERFACE target. slang does not define SLANG_RHI_VULKAN_HEADERS_URL, so slang-rhi's own value wins, and #782 changed that value from v1.4.318 to v1.4.347. Bumping the submodule pin alone therefore brings the new headers in. slang's own external/vulkan submodule (currently v1.4.307) is not used by slang-rhi's Vulkan backend, so it is intentionally left untouched.
The test
tests/spirv/abort-runtime.slang has two lines:
- A SIMPLE SPIR-V emit line (-target spirv -capability abort) that confirms abort() lowers to OpAbortKHR with the AbortKHR capability + SPV_KHR_abort extension. This has no device dependency, so it runs and is verified everywhere, including GPU-less CI.
- A COMPARE_COMPUTE -vk line gated on -render-features shader-abort. render-test maps the "shader-abort" name to rhi::Feature::ShaderAbort via the SLANG_RHI_FEATURES X-macro (no render-test change needed: both options.cpp's kValidFeatureNames and render-test-main.cpp's kFeatureNameMap are generated from that macro). On a device/driver without the feature, or with no Vulkan device at all, render-test returns SLANG_E_NOT_AVAILABLE and the line is skipped cleanly.
The abort() call sits on a branch the 4-thread dispatch never reaches (if (tid.x > 0x1000)). tid.x is a runtime value, so the compiler still emits OpAbortKHR (and the abort-capable pipeline is created), but the abort never fires, so the device is not lost and the dispatch -> readback -> compare model still produces a clean PASS on capable hardware. This is deliberate: a firing OpAbortKHR causes device loss, which slang-test's compare model cannot tolerate, and there is no slang-test directive for device-fault validation.
What is verified, and what is not
Verified locally (GPU-less Linux):
- src/vulkan/vk-device.cpp / vk-api.h / vk-utils.cpp compile against the fetched v1.4.347 headers (render-test, which links slang-rhi, builds).
- The SIMPLE emit line passes (OpAbortKHR emitted).
- The -vk COMPARE_COMPUTE line skips cleanly (no Vulkan device here).
- The existing tests/spirv/abort*.slang emit tests still pass (no regression).
What a PASS on capable hardware demonstrates (for whoever runs it there): slang-rhi selecting and advertising Feature::ShaderAbort, creating a Vulkan device with VK_KHR_shader_abort enabled, and successfully dispatching a non-aborting path through a pipeline that contains OpAbortKHR. It does not demonstrate the abort firing.
NOT verified here or in CI (no capable hardware): the abort actually firing, the device-fault path, and the abort-message round-trip via vkGetDeviceFaultDebugInfoKHR. Those need a device with both VK_KHR_shader_abort and VK_KHR_device_fault and a device-loss-tolerant harness mode; they must be validated on capable hardware by a maintainer. This PR exists precisely so that validation can happen.
Files changed
- external/slang-rhi: submodule pin 687dc18 → dfd2e66 (slang-rhi #782 head).
- tests/spirv/abort-runtime.slang: new gated runtime test.
Related
- … abort feature (VK_KHR_shader_abort) #11790 (this work; not auto-closing, see the draft/merge note above)
- … abort testing (prereq for shader-slang/slang#11790) slang-rhi#781 (the slang-rhi tracking issue #782 fixes)
- abort() support shipped in Add abort intrinsic for VK_KHR_shader_abort support #11542.
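For readers who want the gated test flow in one place, here is a minimal CPU-side sketch of the behaviour described under "The test": an X-macro generates both the feature enum and the name table, a missing feature turns into a skip, and the abort branch is never reached by a 4-thread dispatch. It is an illustration under stated assumptions, not the render-test or slang-rhi code: SKETCH_FEATURES, OtherFeature, Result, findFeature, shaderBody and runGatedTest are hypothetical names, and only the "shader-abort" string, the ShaderAbort enumerator name, and the tid > 0x1000 condition come from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <set>
#include <vector>

// Hypothetical stand-in for slang-rhi's SLANG_RHI_FEATURES X-macro.
// Only the "shader-abort" entry is taken from the PR; OtherFeature is made up.
#define SKETCH_FEATURES(X)           \
    X(OtherFeature, "other-feature") \
    X(ShaderAbort, "shader-abort")

// The enum is generated from the macro...
enum class Feature
{
#define X(id, name) id,
    SKETCH_FEATURES(X)
#undef X
};

struct FeatureName
{
    const char* name;
    Feature feature;
};

// ...and so is the name table, so adding a feature needs no extra mapping code.
static const FeatureName kFeatureNameMap[] = {
#define X(id, name) {name, Feature::id},
    SKETCH_FEATURES(X)
#undef X
};

// Hypothetical outcome type; NotAvailable plays the role of SLANG_E_NOT_AVAILABLE.
enum class Result { Pass, Fail, NotAvailable };

// Look a feature up by its command-line name.
bool findFeature(const char* name, Feature& out)
{
    for (const FeatureName& entry : kFeatureNameMap)
    {
        if (std::strcmp(entry.name, name) == 0)
        {
            out = entry.feature;
            return true;
        }
    }
    return false;
}

// CPU stand-in for the shader body. Returns true if the thread would abort.
bool shaderBody(std::uint32_t tid, std::vector<std::uint32_t>& out)
{
    if (tid > 0x1000)
        return true; // abort() branch: present in the code, unreachable for 4 threads
    assert(tid < out.size());
    out[tid] = tid;
    return false;
}

// Gate on the feature, then dispatch 4 threads and compare the read-back buffer.
Result runGatedTest(const char* requiredFeature, const std::set<Feature>& deviceFeatures)
{
    Feature feature{};
    // Sketch assumption: an unknown name is treated like a missing feature.
    if (!findFeature(requiredFeature, feature) || deviceFeatures.count(feature) == 0)
        return Result::NotAvailable; // the harness reports a skip, not a failure

    std::vector<std::uint32_t> out(4, 0xFFFFFFFFu);
    for (std::uint32_t tid = 0; tid < 4; ++tid)
    {
        if (shaderBody(tid, out))
            return Result::Fail; // on a GPU this would be device loss
    }
    const std::vector<std::uint32_t> expected = {0, 1, 2, 3};
    return out == expected ? Result::Pass : Result::Fail;
}
```

Under these assumptions, calling runGatedTest("shader-abort", {}) models a machine without the feature and yields Result::NotAvailable, while passing a set containing Feature::ShaderAbort yields Result::Pass because no thread index in 0..3 exceeds 0x1000.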