Summary
The RHI test texture-shared-cuda.vulkan (CUDA↔Vulkan shared-texture interop) intermittently fails a CHECK_GE numeric-tolerance assertion (in the test harness around tests/testing.h:228). It appears to be a PR-agnostic numeric-tolerance flake, not a real regression.
Observed in shader-slang/slang CI (which runs the vendored slang-rhi test suite) on 3 unrelated PRs over ~1 week:
In each occurrence, 965 of 966 rhi cases pass and only texture-shared-cuda.vulkan fails, and only on windows-release-gpu-rhi; the same rhi suite passes on windows-debug-gpu-rhi and every other platform. A rerun clears it. The three PRs are unrelated (varied frontend / CI / refactor changes that cannot affect CUDA↔Vulkan interop), which strongly suggests a flake rather than a regression.
Impact
Recurring windows-release-gpu-rhi reruns across unrelated shader-slang/slang PRs — each occurrence costs a rerun cycle.
Suggested fix
Either widen the CHECK_GE numeric tolerance for the CUDA↔Vulkan interop comparison (if the intermittent miss is within acceptable numeric error), or quarantine / mark-flaky texture-shared-cuda.vulkan on that config until the tolerance is right-sized. A maintainer familiar with the expected interop numeric bounds should choose the tolerance.
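To help right-size the tolerance, it may be useful to log the worst observed error on failing runs before changing any threshold. The following is a minimal sketch, not the actual slang-rhi test code: `maxAbsError` and `withinTolerance` are hypothetical helper names, and it assumes the interop comparison operates on float buffers read back from the CUDA and Vulkan sides.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of slang-rhi): largest absolute difference
// between two float buffers, measured over their common length.
inline float maxAbsError(const std::vector<float>& actual, const std::vector<float>& expected)
{
    float worst = 0.0f;
    const std::size_t count = std::min(actual.size(), expected.size());
    for (std::size_t i = 0; i < count; ++i)
        worst = std::max(worst, std::fabs(actual[i] - expected[i]));
    return worst;
}

// Hypothetical tolerance check: true only when the buffers have the same
// size and every element differs by at most absTol.
inline bool withinTolerance(
    const std::vector<float>& actual,
    const std::vector<float>& expected,
    float absTol)
{
    return actual.size() == expected.size() && maxAbsError(actual, expected) <= absTol;
}
```

Recording the value returned by `maxAbsError` across a few flaky runs would show whether the misses sit just outside the current bound (a tolerance problem) or far outside it (more likely a synchronization problem that widening the tolerance would only hide).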
Notes
- Surfaced from CI-health monitoring of shader-slang/slang; the specific failing runs are on the referenced PRs' test-slang-rhi / windows-release-gpu-rhi jobs.
🤖 Filed by an automated Slang CI coworker — may be inaccurate. A human maintainer should verify run-level evidence before acting.