Add fp16 vector atomic capability#11324
Conversation
|
@coderabbitai review |
|
@coderabbitai review |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
source/slang/slang-capabilities.capdef (1)
1272-1272:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
GL_NV_shader_atomic_fp16_vectorstill permits SPIR-V without the NV vector atomic capability.The
| _spirv_1_0branch makes this alias satisfiable on plain SPIR-V 1.0, which can bypassspvAtomicFloat16VectorNVand break the intended “distinct fp16 vector atomic capability” contract.Suggested fix
-alias GL_NV_shader_atomic_fp16_vector = _GL_NV_shader_atomic_fp16_vector + _GL_NV_gpu_shader5 | spvAtomicFloat16VectorNV | _spirv_1_0; +alias GL_NV_shader_atomic_fp16_vector = _GL_NV_shader_atomic_fp16_vector + _GL_NV_gpu_shader5 | spvAtomicFloat16VectorNV;As per coding guidelines, "source/slang/** ... (6) Cross-backend consistency — changes to one emitter may need parallel changes in others."
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 84a02cd5-d1be-42da-9ba8-3ad3c684b583
📒 Files selected for processing (2)
source/slang/slang-capabilities.capdeftests/spirv/gl-nv-shader-atomic-fp16-vector-capability.slang
|
@coderabbitai review |
|
Looks good to me |
|
@coderabbitai review |
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: bde1a37a-e5bd-426b-85d7-9b4a79ff7397
📒 Files selected for processing (3)
source/slang/hlsl.meta.slangtests/hlsl-intrinsic/byte-address-buffer/byte-address-half-atomics-capability.slangtests/spirv/gl-nv-shader-atomic-fp16-vector-compatibility.slang
|
[Agent] Deleted 56 old LLM/bot general PR conversation comments older than 2026-06-04T14:36:56Z as requested. I limited the cleanup to old CodeRabbit/LLM bot issue comments and left review threads, review bodies, human comments, and agent-authored comments intact. |
| if (!entryPoints) | ||
| return; | ||
|
|
||
| for (auto entryPoint : *entryPoints) |
There was a problem hiding this comment.
[Agent] Addressed in c65af3a. The SPIR-V emitter no longer scans referencing entry points when requiring the fp16-vector atomic capability. I removed the emitter-side maybeDiagnoseCapabilityUse()/dedup state and moved the implicit capability diagnostic to diagnoseSPIRVAtomicCapabilityUses(), which runs after SPIR-V legalization builds the entry-point reference graph and before SPIR-V emission. The emitter now only declares SPV_NV_shader_atomic_fp16_vector / AtomicFloat16VectorNV for already-validated fp16-vector atomic ops.
I also recorded this implementation constraint in the PR description so later passes do not move that scan back into emission.
Validation:
cmake.exe --build --preset debug --target slangcgit.exe diff --check(CRLF warning only)- focused
slangc.exeprobes for direct vector atomic missing capability, pointer-form helper missing capability, GL_NV profile warning, unsupported compare-exchange, and positive%v2halfvector atomic codegen.
| @@ -2659,7 +2660,7 @@ struct SPIRVLegalizationContext : public SourceEmitterBase | |||
| // For SPIR-V, we don't skip this validation, because we might then be generating | |||
| // invalid SPIR-V. | |||
| bool skipFuncParamValidation = false; | |||
| validateAtomicOperations(skipFuncParamValidation, m_sink, m_module->getModuleInst()); | |||
| validateSPIRVAtomicOperations(skipFuncParamValidation, m_sink, m_module->getModuleInst()); | |||
There was a problem hiding this comment.
Let's keep the old name, validateAtomicOperations
There was a problem hiding this comment.
[Agent] Addressed in abd537c. I removed the validateSPIRVAtomicOperations entry point and kept the validation API under the existing validateAtomicOperations name, using a target-aware overload for the SPIR-V-specific fp16-vector checks. I also recorded this naming directive in the PR description so later passes do not reintroduce the separate SPIR-V-named API.
Validation:
cmake.exe --build --preset debug --target slangc slang-testgit.exe diff --check HEAD~1..HEADrg -n "validateSPIRVAtomicOperations" source/slangreturned no matches.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
💤 Files with no reviewable changes (1)
📝 WalkthroughWalkthroughAdds SPIR-V FP16-vector atomic capability support (SPV_NV_shader_atomic_fp16_vector / spvAtomicFloat16VectorNV): capability defs and docs, IR validation and legalization diagnostics, SPIR‑V emission capability gating, HLSL intrinsic target paths, new diagnostics, and comprehensive tests. ChangesNVIDIA FP16 Vector Atomics Capability
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 155f6b04-c5e7-4ae6-98e0-4b4f45a790c2
📒 Files selected for processing (3)
source/slang/slang-ir-spirv-legalize.cppsource/slang/slang-ir-validate.cppsource/slang/slang-ir-validate.h
| return; | ||
|
|
||
| auto elementCount = elementCountInst->getValue(); | ||
| SLANG_ASSERT(elementCount == 2 || elementCount == 4); |
There was a problem hiding this comment.
Remove this assert.
We should just return false for 3
There was a problem hiding this comment.
[Agent] Addressed in a9e1943. I removed the SLANG_ASSERT(elementCount == 2 || elementCount == 4) from maybeRequireFp16VectorAtomicCapability(), so unsupported fp16-vector atomic widths such as 3 now just return without declaring SPV_NV_shader_atomic_fp16_vector / AtomicFloat16VectorNV. I also recorded this directive in the PR description.
Validation:
cmake.exe --build --preset debug --target slangc slang-testgit.exe diff --check HEAD~1..HEAD
| @@ -2659,7 +2660,7 @@ struct SPIRVLegalizationContext : public SourceEmitterBase | |||
| // For SPIR-V, we don't skip this validation, because we might then be generating | |||
| // invalid SPIR-V. | |||
| bool skipFuncParamValidation = false; | |||
| validateAtomicOperations(skipFuncParamValidation, m_sink, m_module->getModuleInst()); | |||
| validateSPIRVAtomicOperations(skipFuncParamValidation, m_sink, m_module->getModuleInst()); | |||
There was a problem hiding this comment.
[Agent] Addressed in abd537c. I removed the validateSPIRVAtomicOperations entry point and kept the validation API under the existing validateAtomicOperations name, using a target-aware overload for the SPIR-V-specific fp16-vector checks. I also recorded this naming directive in the PR description so later passes do not reintroduce the separate SPIR-V-named API.
Validation:
cmake.exe --build --preset debug --target slangc slang-testgit.exe diff --check HEAD~1..HEADrg -n "validateSPIRVAtomicOperations" source/slangreturned no matches.
| @@ -2994,6 +3124,7 @@ void legalizeIRForSPIRV( | |||
| eliminateDeadCode(module); | |||
|
|
|||
| buildEntryPointReferenceGraph(context->m_referencingEntryPoints, module); | |||
| diagnoseSPIRVAtomicCapabilityUses(context, module); | |||
There was a problem hiding this comment.
Do not add this diagnostics.
We will have to revisit on how to globally handle the capability check later; as a separate issue
There was a problem hiding this comment.
[Agent] Addressed in 6e92f2d. I removed diagnoseSPIRVAtomicCapabilityUses() and its helper functions/call from SPIR-V legalization, and removed the missing-spvAtomicFloat16VectorNV diagnostic test expectations. Unsupported fp16-vector atomic operation/width validation remains in validateAtomicOperations, and the emitter still declares the NV vector capability when it emits supported fp16-vector atomics.
I also updated the PR description and reviewer directives to record that fp16-vector atomic capability diagnostics are out of scope for this PR and should be revisited separately.
Validation:
cmake.exe --build --preset debug --target slangc slang-testgit.exe diff --check HEAD~1..HEAD- direct
slangc.exeprobe with-restrictive-capability-check -capability spvAtomicFloat16AddEXT -DTEST_DIRECT_VECTOR_ATOMICproduced no implicit-capability diagnostic and emittedAtomicFloat16VectorNV,SPV_NV_shader_atomic_fp16_vector, andOpAtomicFAddEXT %v2half - direct unsupported-width probe produced
E50013 - direct unsupported compare-exchange probe produced
E50014
The focused local slang-test.exe file harness still exits 1 with no output in this WSL/Windows checkout, matching prior local behavior.
|
@coderabbitai review |
✅ Action performedReview finished.
|
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Verdict: ✅ Clean — no significant issues found
PR adds a distinct spvAtomicFloat16VectorNV SPIR-V capability (mapping to SPV_NV_shader_atomic_fp16_vector/AtomicFloat16VectorNV) so half-vector atomics are gated separately from the scalar AtomicFloat16AddEXT path, with the vector capability inheriting from the scalar one so vector-capable targets still satisfy scalar half-add. RWByteAddressBuffer.InterlockedAddF16 selects the %v2half lowering when the vector capability is present and falls back to the scalar OpAtomicFAddEXT otherwise.
Changes Overview
Capability model + GLSL/SPIR-V aliasing (source/slang/slang-capabilities.capdef, docs/command-line-slangc-reference.md, docs/user-guide/a3-02-reference-capability-atoms.md, docs/user-guide/a2-01-spirv-target-specific.md)
- Adds
SPV_NV_shader_atomic_fp16_vectorextension atom andspvAtomicFloat16VectorNV : SPV_NV_shader_atomic_fp16_vector + spvAtomicFloat16AddEXTso the vector capability subsumes scalar half-add. Updates theGL_NV_shader_atomic_fp16_vectoralias to require the new SPIR-V capability instead of falling through to_spirv_1_0. Documents the new atom in the user-facing capability/extension tables.
Intrinsic surface + lowering (source/slang/hlsl.meta.slang)
RWByteAddressBuffer.InterlockedAddF16(uint, half, out half)keeps a single[require(cuda_hlsl_spirv, sm_5_0)]declaration and target-switches between scalar__atomic_add(half)and a%v2halfpath gated onspvAtomicFloat16VectorNV(with an even/oddbyteAddress & 2lane selection). Pointer-formInterlockedAddF16Emulated(half*, ...)andInterlockedAddF16x2(half2*, ...)stay at[require(spirv)]; their vector capability is emitted on use.
SPIR-V emit + IR shared utilities (source/slang/slang-emit-spirv.cpp, source/slang/slang-ir-spirv-legalize.cpp, source/slang/slang-ir-util.{cpp,h}, source/slang/slang-ir-validate.{cpp,h}, source/slang/slang-check-shader.cpp, source/slang/slang-compiler.h, source/slang/slang-type-layout.cpp, source/slang/slang-diagnostics.lua)
- Introduces
getAtomicOperationValueType()shared by SPIR-V legalize and IR validation so both paths infer the atomic value type identically (with consistent null/VoidTypehandling). Replaces inlineIRVectorType-of-half checks in emit with amaybeRequireFp16VectorAtomicCapability()helper that explicitly accepts only width 2 and 4 (early-returns on widths like 3 without declaring the vector capability — paired withvalidateAtomicOperationsrejecting unsupported widths up-front rather than emittingOpUndef). Adds a function-type capability hook (requireFunctionTypeCapabilitiesIfNeeded) soWorkgroup/GroupSharedpointer params in emitted function signatures pull inVariablePointers. Adds diagnosticE50014for unsupported half-vector compare-exchange.
Test coverage (tests/hlsl-intrinsic/byte-address-buffer/byte-address-half-atomics-capability.slang, tests/spirv/atomic-float16-vector.slang, tests/spirv/gl-nv-shader-atomic-fp16-vector-compatibility.slang, tests/language-feature/pointer/ptr-to-groupshared.slang)
- Capability test covers: scalar-only profile declares
AtomicFloat16AddEXT/SPV_EXT_shader_atomic_float16_addwithout NV; vector-capability profile routesInterlockedAddF16through%v2half; combined scalar+vector still selects vector path; restrictive (no-fp16) profile diagnoses missing scalar capability; emulated half pointer path declares the NV capability and emits%v2half; CUDA emits scalar halfatomicAdd; pointer-formInterlockedAddF16Emulated/InterlockedAddF16x2; unsupported half-vector compare-exchange triggersE50014; unsupported widths handled. DirectSPV_NV_shader_atomic_fp16_vectorruntime coverage stays gated until RHI exposes a feature gate forVK_NV_shader_atomic_float16_vector.
Fixes #11083
Summary of the problem from the end user perspective
Targets can expose different SPIR-V fp16 atomic support: scalar half atomic add and half-vector atomics are covered by separate extensions. Slang needs distinct capabilities for those operations instead of treating all fp16 atomic add support as one capability.
Minimal repro shader; if applicable
Root cause
Slang's capability model did not have a distinct
SPV_NV_shader_atomic_fp16_vector/AtomicFloat16VectorNVcapability path for the NV half-vector extension. That made the scalarSPV_EXT_shader_atomic_float16_addpath and half-vector atomic support too easy to collapse into one requirement.Solution in this PR
This PR adds
spvAtomicFloat16VectorNV, wiresGL_NV_shader_atomic_fp16_vectorto it without the prior SPIR-V 1.0 fallback, and makes the vector capability imply scalarspvAtomicFloat16AddEXTsupport.InterlockedAddF16remains available through the scalar SPIR-V requirement and now selects the half2 vector atomic fallback whenspvAtomicFloat16VectorNVis present.The focused capability test checks that the scalar path declares
AtomicFloat16AddEXT/SPV_EXT_shader_atomic_float16_addwithout the NV vector extension, that the vector-capability profile satisfies the scalar requirement and routesInterlockedAddF16through%v2halfatomics, that the combined scalar/vector-capability case also selects the vector path, that the no-fp16 restrictive profile diagnoses the missing scalar fp16 add capability, that the emulated path declaresAtomicFloat16VectorNV/SPV_NV_shader_atomic_fp16_vectorand emits%v2halfatomics, that CUDA codegen emits the scalar halfatomicAddpath, and that unsupported half-vector compare-exchange reportsE50014instead of emitting invalid SPIR-V.Runtime
COMPARE_COMPUTEcoverage is feature-gated for the scalar fp16 add path withatomic-half. DirectSPV_NV_shader_atomic_fp16_vectorruntime directives remain disabled until RHI exposes a feature gate forVK_NV_shader_atomic_float16_vector; the active tests still cover scalar/vector SPIR-V codegen, the scalar missing-capability diagnostic, and unsupported vector operation/width diagnostics. Pointer-form vector helpers keep runtime coverage disabled because SPIR-V variable-pointer runtime coverage is disabled on current GCP runners; active tests cover static%v2halfcodegen.Notes to the reviewers; where to focus on
The SPIR-V extension split is intentional:
SPV_EXT_shader_atomic_float_adddefinesOpAtomicFAddEXTfor scalar fp32/fp64 viaAtomicFloat32AddEXTandAtomicFloat64AddEXT.SPV_EXT_shader_atomic_float16_addextends that scalar add extension for fp16 viaAtomicFloat16AddEXT; the SPIR-V spec says scalar fp16 add modules need bothSPV_EXT_shader_atomic_float16_addand the baseSPV_EXT_shader_atomic_float_addextension declared.SPV_NV_shader_atomic_fp16_vectoraddsAtomicFloat16VectorNVfor fp16 vectors with 2 or 4 components, including add, sub, min, max, and exchange. For this PR's capability model,spvAtomicFloat16VectorNVintentionally impliesspvAtomicFloat16AddEXT, so targets advertising the NV vector path also satisfy scalar half-add requirements while codegen can still select the vector opcode path when that capability is present.Please focus review on the capability definitions and aliases in
source/slang/slang-capabilities.capdef, the overload requirements and selection logic insource/slang/hlsl.meta.slang, and the SPIR-V extension/capability checks intests/hlsl-intrinsic/byte-address-buffer/byte-address-half-atomics-capability.slang.Reviewer Directives (maintained by agent)
[human @jkwak-work] Remove the
_spirv_1_0disjunct fromGL_NV_shader_atomic_fp16_vector; the alias should requirespvAtomicFloat16VectorNVfor SPIR-V. (Add fp16 vector atomic capability #11324 (comment))[human @jkwak-work] Keep
RWByteAddressBuffer.InterlockedAddF16Emulated(uint byteAddress, ...)on[require(cuda_hlsl_spirv, sm_5_0)]rather than splitting it into per-target requirements. (Add fp16 vector atomic capability #11324 (comment))[human @jkwak-work] Keep pointer-form
InterlockedAddF16Emulated(half* ...)at[require(spirv)]; its vector capability is emitted when used. (Add fp16 vector atomic capability #11324 (comment))[human @jkwak-work] Keep pointer-form
InterlockedAddF16x2(half2* ...)at[require(spirv)]; its vector capability is emitted when used. (Add fp16 vector atomic capability #11324 (comment))[human @jkwak-work] Make
spvAtomicFloat16VectorNVinherit fromspvAtomicFloat16AddEXT, and keepRWByteAddressBuffer.InterlockedAddF16(uint byteAddress, ...)available on both scalar and vector SPIR-V paths, selecting the vector path whenspvAtomicFloat16VectorNVis present. (Add fp16 vector atomic capability #11324 (comment))[human @jkwak-work] Avoid one-use lambdas in this PR; inline them or extract regular/static helpers when they only need a few variables. (Add fp16 vector atomic capability #11324 (comment))
[human @jkwak-work] Keep the atomic validation API under the existing
validateAtomicOperationsname rather than exposing avalidateSPIRVAtomicOperationsentry point. (Add fp16 vector atomic capability #11324 (comment))[human @jkwak-work] Do not handle unsupported fp16-vector atomic error cases by emitting
OpUndefin SPIR-V emit; validate them before SPIR-V emission and let the emitter assume already-validated IR. (Add fp16 vector atomic capability #11324 (comment))[human @jkwak-work] Do not add fp16-vector atomic capability diagnostics in this PR; global capability-check handling will be revisited separately. (Add fp16 vector atomic capability #11324 (comment))
[human @jkwak-work] Do not assert on unsupported fp16-vector atomic widths in SPIR-V emit; return without declaring the vector capability for width 3. (Add fp16 vector atomic capability #11324 (comment))
[LLM @github-actions] Keep legalization and validation
getAtomicOperationValueType()helpers aligned on null-safety before checking forVoidType. (Add fp16 vector atomic capability #11324 (comment))[LLM @github-actions] Keep active vector-path coverage for lane-selective
InterlockedAddF16lowering; direct NV fp16-vector runtime coverage remains gated until an RHI feature gate exists. (Add fp16 vector atomic capability #11324 (comment))[LLM @github-actions] Keep atomic operation value-type inference in one shared IR utility so SPIR-V legalization and IR validation cannot diverge. (Add fp16 vector atomic capability #11324 (comment))
[LLM @github-actions] Emit
VariablePointerswhen a Workgroup/GroupShared pointer appears in an emitted SPIR-V function signature, including noinline helpers; keep positive coverage for the explicitSPV_KHR_variable_pointerspath. (Add fp16 vector atomic capability #11324 (comment), Add fp16 vector atomic capability #11324 (comment))Related PRs in the past
None identified.