80 commits
6a359cc
Add fp16 vector atomic capability
jkwak-work May 26, 2026
5588f5a
Address fp16 atomic capability wording
jkwak-work May 26, 2026
c84eb0e
Update command line capability reference
jkwak-work May 27, 2026
474f9ca
Address fp16 atomic review feedback
jkwak-work May 28, 2026
272c12b
Remove ignored capability from fp16 atomic test
jkwak-work May 28, 2026
7c549da
Add negative fp16 atomic extension checks
jkwak-work May 28, 2026
d272020
Enable SPIR-V validation for fp16 atomic test
jkwak-work May 28, 2026
ba59dda
Keep fp16 vector GLSL capability compatible
jkwak-work May 28, 2026
198733b
Strengthen fp16 vector capability fallback test
jkwak-work May 28, 2026
8a800c4
Preserve half lane in fp16 atomic fallback
jkwak-work May 28, 2026
91fcc4c
Cover fp16 vector atomic capability emission
jkwak-work May 28, 2026
9abc8ec
Use default CUDA fp16 atomic lowering
jkwak-work May 28, 2026
f273945
Cover fp16 vector atomic review gaps
jkwak-work May 28, 2026
867e3f3
Cover fp16 CAS retry behavior
jkwak-work May 28, 2026
4fdbdf8
Fix fp16 CAS fallback tests
jkwak-work May 28, 2026
84558cc
Restore fp16 vector fallback semantics
jkwak-work May 28, 2026
1db3352
Address fp16 vector review feedback
jkwak-work May 28, 2026
72ffecb
Tighten fp16 vector atomic checks
jkwak-work May 28, 2026
77f3696
Avoid fp16 capability disjunction
jkwak-work May 28, 2026
d9e5cd4
Address fp16 capability review gaps
jkwak-work May 28, 2026
754e79a
Diagnose half2 atomic vector capability
jkwak-work May 28, 2026
b42c27d
Tighten fp16 vector atomic checks
jkwak-work May 28, 2026
bd156fa
Cover fp16 vector min max review gaps
jkwak-work May 28, 2026
f9b9469
Wire fp16 vector atomic exchange capability
jkwak-work May 28, 2026
0b27dc3
Cover fp16 vector alias min max atomics
jkwak-work May 28, 2026
762b16a
Diagnose unsupported fp16 vector atomic width
jkwak-work May 28, 2026
52ea0bc
Cover fp16 vector atomic review gaps
jkwak-work May 28, 2026
f01ac0f
Assert fp16 vector add opcode
jkwak-work May 28, 2026
dba2059
Format fp16 vector atomic capability call
jkwak-work May 28, 2026
bbbe5ca
Tighten pointer fp16 vector atomic capability
jkwak-work May 29, 2026
72c9e55
Clarify fp16 scalar atomic target switch comment
jkwak-work May 29, 2026
902c910
Warn on late fp16 vector atomic capability
jkwak-work May 29, 2026
2f4ad19
Format fp16 vector capability diagnostic guard
jkwak-work May 29, 2026
cc0abda
Assert fp16 vector compatibility add opcode
jkwak-work May 29, 2026
956848e
Cover half4 fp16 vector atomics
jkwak-work May 29, 2026
0032f5a
Address fp16 atomic intrinsic review
jkwak-work May 29, 2026
4ded6d0
Keep pointer fp16x2 helper broadly available
jkwak-work May 30, 2026
522e860
Add scalar exchange capability regression
jkwak-work May 30, 2026
fca599e
Address fp16 vector capability review
jkwak-work Jun 2, 2026
9ec066f
Address fp16 vector review gaps
jkwak-work Jun 2, 2026
56d8ab6
Polish fp16 vector atomic docs
jkwak-work Jun 2, 2026
d5675e4
Clarify fp16 vector atomic invariant
jkwak-work Jun 2, 2026
6e80d73
Fix fp16 vector atomic formatting
jkwak-work Jun 2, 2026
6589bb4
Cover fp16 vector atomic diagnostics
jkwak-work Jun 2, 2026
19f9be2
Clarify fp16 vector atomic diagnostic test
jkwak-work Jun 2, 2026
7934fa6
Clarify fp16 atomic test directives
jkwak-work Jun 2, 2026
aa1a948
Add fp16 atomic runtime coverage
jkwak-work Jun 2, 2026
9d4be3f
Cover all fp16 vector runtime lanes
jkwak-work Jun 2, 2026
a16695a
Cover half4 vector atomic diagnostics
jkwak-work Jun 2, 2026
5ac48a0
Add half4 vector atomic runtime coverage
jkwak-work Jun 2, 2026
bb824e1
Gate fp16 atomic runtime tests
jkwak-work Jun 2, 2026
b5345fc
Diagnose unsupported fp16 vector atomics
jkwak-work Jun 2, 2026
4287d88
Avoid invalid fp16 atomic emission
jkwak-work Jun 2, 2026
214d377
Tighten fp16 atomic coverage
jkwak-work Jun 2, 2026
65d91a7
Prefer fp16 vector atomic fallback
jkwak-work Jun 6, 2026
e70efa8
Extract fp16 atomic helpers
jkwak-work Jun 6, 2026
6bdba7a
Fix fp16 atomic helper formatting
jkwak-work Jun 6, 2026
65dcf54
Validate fp16 vector atomics before SPIR-V emit
jkwak-work Jun 6, 2026
054d6a7
Fix fp16 vector atomic validation formatting
jkwak-work Jun 6, 2026
7e19877
Clarify fp16 vector atomic diagnostic
jkwak-work Jun 6, 2026
c65af3a
Move fp16 atomic capability diagnostics before emit
jkwak-work Jun 6, 2026
a46301a
Fix SPIR-V legalize include order
jkwak-work Jun 6, 2026
0dcb1ca
Address fp16 atomic review nits
jkwak-work Jun 6, 2026
3ecda6d
Document fp16 vector atomic sub support
jkwak-work Jun 6, 2026
8fda506
Cover pointer fp16 vector atomics
jkwak-work Jun 6, 2026
471c576
Fix fp16 pointer atomic filecheck order
jkwak-work Jun 6, 2026
efdd34a
Cover fp16 vector atomic sub
jkwak-work Jun 6, 2026
1867c38
Tighten pointer atomic checks
jkwak-work Jun 6, 2026
0475083
Address fp16 atomic review gaps
jkwak-work Jun 6, 2026
e483548
Avoid unnecessary groupshared variable pointers
jkwak-work Jun 6, 2026
b31e1b8
Share fp16 atomic value type helper
jkwak-work Jun 6, 2026
bbbf8f7
Assert groupshared variable pointer signature
jkwak-work Jun 6, 2026
ce70257
Emit variable pointer capability for workgroup signatures
jkwak-work Jun 7, 2026
2a97eae
Check memory model for groupshared variable pointers
jkwak-work Jun 7, 2026
e61856a
Fix SPIR-V workgroup pointer capabilities
jkwak-work Jun 7, 2026
98a681f
Use OpPtrAccessChain base type for capabilities
jkwak-work Jun 7, 2026
59b7cf8
Address latest PR review gaps
jkwak-work Jun 7, 2026
abd537c
Keep atomic validation name
jkwak-work Jun 7, 2026
a9e1943
Address atomic capability review
jkwak-work Jun 7, 2026
6e92f2d
Remove fp16 vector capability diagnostics
jkwak-work Jun 7, 2026
2 changes: 2 additions & 0 deletions docs/command-line-slangc-reference.md
@@ -1314,6 +1314,7 @@ A capability describes an optional feature that a target may or may not support.
* `SPV_EXT_descriptor_indexing` : enables the SPV_EXT_descriptor_indexing extension
* `SPV_EXT_shader_atomic_float_add` : enables the SPV_EXT_shader_atomic_float_add extension
* `SPV_EXT_shader_atomic_float16_add` : enables the SPV_EXT_shader_atomic_float16_add extension
* `SPV_NV_shader_atomic_fp16_vector` : enables the SPV_NV_shader_atomic_fp16_vector extension
* `SPV_EXT_shader_atomic_float_min_max` : enables the SPV_EXT_shader_atomic_float_min_max extension
* `SPV_EXT_mesh_shader` : enables the SPV_EXT_mesh_shader extension
* `SPV_EXT_demote_to_helper_invocation` : enables the SPV_EXT_demote_to_helper_invocation extension
@@ -1351,6 +1352,7 @@ A capability describes an optional feature that a target may or may not support.
* `spvDeviceGroup`
* `spvAtomicFloat32AddEXT`
* `spvAtomicFloat16AddEXT`
* `spvAtomicFloat16VectorNV`
* `spvAtomicFloat64AddEXT`
* `spvInt64Atomics`
* `spvAtomicFloat32MinMaxEXT`
15 changes: 8 additions & 7 deletions docs/user-guide/a2-01-spirv-target-specific.md
@@ -170,13 +170,14 @@ GLSL 4.6 with [GLSL_EXT_shader_atomic_float](https://github.com/KhronosGroup/GLS
GLSL 4.6 with [GLSL_EXT_shader_atomic_float2](https://github.com/KhronosGroup/GLSL/blob/main/extensions/ext/GLSL_EXT_shader_atomic_float2.txt) can use atomic operations for 16-bit float type.

SPIR-V 1.5 with [SPV_EXT_shader_atomic_float_add](https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/EXT/SPV_EXT_shader_atomic_float_add.asciidoc) and [SPV_EXT_shader_atomic_float_min_max](https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/EXT/SPV_EXT_shader_atomic_float_min_max.asciidoc) can use atomic operations for 32-bit float type and 64-bit float type.
SPIR-V 1.5 with [SPV_EXT_shader_atomic_float16_add](https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/EXT/SPV_EXT_shader_atomic_float16_add.asciidoc) can use atomic operations for 16-bit float type

| | 32-bit integer | 64-bit integer | 32-bit float | 64-bit float | 16-bit float |
| ------ | -------------- | --------------- | --------------------- | ---------------- | ---------------- |
| HLSL | Yes (SM5.0) | Yes (SM6.6) | Only bit-wise (SM6.6) | No | No |
| GLSL | Yes (GL4.3) | Yes (GL4.4+ext) | Yes (GL4.6+ext) | Yes (GL4.6+ext) | Yes (GL4.6+ext) |
| SPIR-V | Yes | Yes | Yes (SPV1.5+ext) | Yes (SPV1.5+ext) | Yes (SPV1.5+ext) |
SPIR-V 1.5 with [SPV_EXT_shader_atomic_float16_add](https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/EXT/SPV_EXT_shader_atomic_float16_add.asciidoc) can use atomic operations for 16-bit float type.
SPIR-V 1.5 with [SPV_NV_shader_atomic_fp16_vector](https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/NV/SPV_NV_shader_atomic_fp16_vector.asciidoc) can use vector atomic add/min/max/exchange operations for 16-bit float vector types with 2 or 4 components. Vector atomic sub is emitted as a negated vector atomic add.

| | 32-bit integer | 64-bit integer | 32-bit float | 64-bit float | 16-bit float | 16-bit float vector |
| ------ | -------------- | --------------- | --------------------- | ---------------- | ---------------- | ----------------------- |
| HLSL | Yes (SM5.0) | Yes (SM6.6) | Only bit-wise (SM6.6) | No | No | No |
| GLSL | Yes (GL4.3) | Yes (GL4.4+ext) | Yes (GL4.6+ext) | Yes (GL4.6+ext) | Yes (GL4.6+ext) | Yes (GL_NV ext) |
| SPIR-V | Yes | Yes | Yes (SPV1.5+ext) | Yes (SPV1.5+ext) | Yes (SPV1.5+ext) | Yes (SPV_NV ext) |

## ConstantBuffer, StructuredBuffer and ByteAddressBuffer

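The documentation hunk above states that vector atomic sub is emitted as a negated vector atomic add. The following is only a minimal C++ model of that arithmetic, not the compiler's lowering: `Vec2`, `modelAtomicAdd`, and `modelAtomicSub` are hypothetical names, `float` lanes stand in for `half`, and nothing here is actually atomic.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a two-lane half vector; float lanes keep the sketch portable.
using Vec2 = std::array<float, 2>;

// Models the add primitive: adds lane-wise and returns the previous value.
// This is NOT atomic; it only illustrates the arithmetic.
inline Vec2 modelAtomicAdd(Vec2& dest, const Vec2& value)
{
    Vec2 original = dest;
    dest[0] += value[0];
    dest[1] += value[1];
    return original;
}

// Models sub as an add of the negated operand, as the docs describe.
inline Vec2 modelAtomicSub(Vec2& dest, const Vec2& value)
{
    return modelAtomicAdd(dest, Vec2{-value[0], -value[1]});
}
```

Because the subtraction reuses the add primitive, the returned "original" value has the same meaning for both operations.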
9 changes: 9 additions & 0 deletions docs/user-guide/a3-02-reference-capability-atoms.md
@@ -701,6 +701,10 @@ Extensions
`SPV_NV_ray_tracing_motion_blur`
> Represents the SPIR-V extension for ray tracing motion blur.

`SPV_NV_shader_atomic_fp16_vector`
> Represents the SPIR-V extension for vector atomic float 16 add/min/max/exchange operations.
> Vector atomic sub is emitted as a negated vector atomic add.

`SPV_NV_shader_image_footprint`
> Represents the SPIR-V extension for shader image footprint.

@@ -723,6 +727,11 @@ Extensions
`spvAtomicFloat16MinMaxEXT`
> Represents the SPIR-V capability for atomic float 16 min/max operations.

`spvAtomicFloat16VectorNV`
> Represents the SPIR-V capability for vector atomic float 16 add/min/max/exchange operations.
> Vector atomic sub is emitted as a negated vector atomic add.
> Implies scalar atomic float 16 add support.

`spvAtomicFloat32AddEXT`
> Represents the SPIR-V capability for atomic float 32 add operations.

27 changes: 24 additions & 3 deletions source/slang/hlsl.meta.slang
@@ -6491,9 +6491,13 @@ $}
/// @param byteAddress The address at which to perform the atomic add operation.
/// @param fp16x2Value Two 16-bit floating point values are packed into a 32-bit unsigned integer.
/// @return The 2 16-bit floating point values packed into a 32-bit unsigned integer.
/// @remarks For SPIR-V, this helper requires `SPV_NV_shader_atomic_fp16_vector`
/// and emits a `half2` `OpAtomicFAdd`; the packed fp16x2 representation matches
/// the NVAPI HLSL ABI, but the underlying operation is a vector atomic.
[__requiresNVAPI]
[ForceInline]
[require(cuda_hlsl_spirv)]
[require(cuda_hlsl, sm_5_0)]
[require(spirv, spvAtomicFloat16VectorNV)]
uint _NvInterlockedAddFp16x2(uint byteAddress, uint fp16x2Value)
{
__target_switch
@@ -6511,14 +6515,17 @@ $}
/// @param byteAddress The address at which to perform the atomic add operation.
/// @param value The value to add to the value at `byteAddress`.
/// @param originalValue The original value at `byteAddress` before the add operation.
/// @remarks For SPIR-V, this function maps to `OpAtomicFAdd` and requires `SPV_EXT_shader_atomic_float16_add` extension.
/// @remarks For SPIR-V, this function requires `SPV_EXT_shader_atomic_float16_add`
/// and maps to `OpAtomicFAdd` on a `half`. When `SPV_NV_shader_atomic_fp16_vector`
/// is available, it uses the half-vector atomic path instead.
///
/// For HLSL, this function translates to an NVAPI call
/// due to lack of native HLSL intrinsic for floating point atomic add. For CUDA, this function
/// maps to `atomicAdd`.
[__requiresNVAPI]
[ForceInline]
[require(cuda_hlsl_spirv, sm_5_0)]
[require(cuda_hlsl, sm_5_0)]
[require(spirv, spvAtomicFloat16AddEXT)]
void InterlockedAddF16(uint byteAddress, half value, out half originalValue)
{
__target_switch
@@ -6536,6 +6543,20 @@ $}
originalValue = asfloat16((uint16_t)(_NvInterlockedAddFp16x2(byteAddress, packedInput) >> 16));
}
return;
case spvAtomicFloat16VectorNV:
{
let buf = __getEquivalentStructuredBuffer<half2>(this);
if ((byteAddress & 2) == 0)
{
originalValue = __atomic_add(buf[byteAddress/4], half2(value, half(0.0))).x;
}
else
{
originalValue = __atomic_add(buf[byteAddress/4], half2(half(0.0), value)).y;
}
return;
}
case spvAtomicFloat16AddEXT:
default:
{
let buf = __getEquivalentStructuredBuffer<half>(this);
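The `spvAtomicFloat16VectorNV` case above views the buffer as `half2` elements, indexes it with `byteAddress/4`, and tests `byteAddress & 2` to choose between the `.x` and `.y` lanes. The C++ sketch below models only that address arithmetic. `Half2Slot`, `locateHalf`, and `extractLane` are hypothetical names, and the mapping of `.x` to the low 16 bits of a packed word assumes a little-endian layout, which the diff does not state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of where a 16-bit value lives inside a half2 view.
struct Half2Slot
{
    uint32_t elementIndex; // index into the half2 view (byteAddress / 4)
    bool highLane;         // false selects .x, true selects .y
};

// Mirrors the hunk: bit 1 of the byte address selects the lane.
inline Half2Slot locateHalf(uint32_t byteAddress)
{
    return Half2Slot{byteAddress / 4u, (byteAddress & 2u) != 0u};
}

// Pulls one 16-bit lane out of a packed 32-bit word.
// ASSUMPTION: little-endian packing, so .x is the low 16 bits.
inline uint16_t extractLane(uint32_t packed, bool highLane)
{
    return static_cast<uint16_t>(highLane ? (packed >> 16) : (packed & 0xFFFFu));
}
```

In the hunk, the lane that is not selected receives `half(0.0)` as its addend, so a single `half2` add carries the scalar update.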
13 changes: 12 additions & 1 deletion source/slang/slang-capabilities.capdef
@@ -543,6 +543,11 @@ def SPV_EXT_shader_atomic_float_add : _spirv_1_0;
/// [EXT]
def SPV_EXT_shader_atomic_float16_add : SPV_EXT_shader_atomic_float_add;

/// Represents the SPIR-V extension for vector atomic float 16 add/min/max/exchange operations.
/// Vector atomic sub is emitted as a negated vector atomic add.
/// [EXT]
def SPV_NV_shader_atomic_fp16_vector : _spirv_1_0;

/// Represents the SPIR-V extension for atomic float min/max operations.
/// [EXT]
def SPV_EXT_shader_atomic_float_min_max : _spirv_1_0;
@@ -700,6 +705,12 @@ def spvAtomicFloat32AddEXT : SPV_EXT_shader_atomic_float_add;
/// [EXT]
def spvAtomicFloat16AddEXT : SPV_EXT_shader_atomic_float16_add;

/// Represents the SPIR-V capability for vector atomic float 16 add/min/max/exchange operations.
/// Vector atomic sub is emitted as a negated vector atomic add.
/// Implies scalar atomic float 16 add support.
/// [EXT]
def spvAtomicFloat16VectorNV : SPV_NV_shader_atomic_fp16_vector + spvAtomicFloat16AddEXT;

/// Represents the SPIR-V capability for atomic float 64 add operations.
/// [EXT]
def spvAtomicFloat64AddEXT : SPV_EXT_shader_atomic_float_add;
@@ -1261,7 +1272,7 @@ alias GL_NV_ray_tracing_motion_blur = _GL_NV_ray_tracing_motion_blur | spvRayTra

/// Represents the GL_NV_shader_atomic_fp16_vector extension.
/// [EXT]
alias GL_NV_shader_atomic_fp16_vector = _GL_NV_shader_atomic_fp16_vector + _GL_NV_gpu_shader5 | _spirv_1_0;
alias GL_NV_shader_atomic_fp16_vector = _GL_NV_shader_atomic_fp16_vector + _GL_NV_gpu_shader5 | spvAtomicFloat16VectorNV;

/// Represents the GL_NV_shader_invocation_reorder extension (NVIDIA-specific).
/// [EXT]
24 changes: 1 addition & 23 deletions source/slang/slang-check-shader.cpp
@@ -1956,28 +1956,6 @@ void validateEntryPoint(EntryPoint* entryPoint, DiagnosticSink* sink)
else
{
auto& targetOptionSet = target->getOptionSet();
bool specificProfileRequested =
targetOptionSet.hasOption(CompilerOptionName::Profile) &&
(targetOptionSet.getIntOption(CompilerOptionName::Profile) !=
SLANG_PROFILE_UNKNOWN);
bool specificCapabilityRequested = false;
for (auto atomVal : targetOptionSet.getArray(CompilerOptionName::Capability))
{
switch (atomVal.kind)
{
case CompilerOptionValueKind::Int:
if (atomVal.intValue != SLANG_CAPABILITY_UNKNOWN)
specificCapabilityRequested = true;
break;
case CompilerOptionValueKind::String:
// User made a specific capability request
specificCapabilityRequested = true;
break;
}
if (specificCapabilityRequested)
break;
}

if (auto declaredCapsMod =
entryPointFuncDecl->findModifier<ExplicitlyDeclaredCapabilityModifier>())
{
@@ -1988,7 +1966,7 @@ void validateEntryPoint(EntryPoint* entryPoint, DiagnosticSink* sink)
}

// Only attempt to error if a specific profile or capability is requested
if ((specificCapabilityRequested || specificProfileRequested) &&
if (isSpecificProfileOrCapabilityRequested(targetOptionSet) &&
targetCaps.atLeastOneSetImpliedInOther(
CapabilitySet{entryPointFuncDecl->inferredCapabilityRequirements}) ==
CapabilitySet::ImpliesReturnFlags::NotImplied)
26 changes: 26 additions & 0 deletions source/slang/slang-compiler.h
@@ -213,6 +213,32 @@ enum class DiagnosticCategory
None = 0,
Capability = 1 << 0,
};

inline bool isSpecificProfileRequested(CompilerOptionSet& optionSet)
{
return optionSet.hasOption(CompilerOptionName::Profile) &&
(optionSet.getIntOption(CompilerOptionName::Profile) != SLANG_PROFILE_UNKNOWN);
}

inline bool isSpecificCapabilityRequested(CompilerOptionSet& optionSet)
{
for (auto atomVal : optionSet.getArray(CompilerOptionName::Capability))
{
if ((atomVal.kind == CompilerOptionValueKind::Int &&
atomVal.intValue != SLANG_CAPABILITY_UNKNOWN) ||
atomVal.kind == CompilerOptionValueKind::String)
{
return true;
}
}
return false;
}

inline bool isSpecificProfileOrCapabilityRequested(CompilerOptionSet& optionSet)
{
return isSpecificProfileRequested(optionSet) || isSpecificCapabilityRequested(optionSet);
}

template<typename P, typename... Args>
bool maybeDiagnose(
DiagnosticSink* sink,
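The helpers added to `slang-compiler.h` replace the inline loop removed from `validateEntryPoint`. As a rough illustration of their behavior, the sketch below re-implements the same predicates against a mock option set. `MockOptionSet`, `CapabilityValue`, `kProfileUnknown`, and `kCapabilityUnknown` are hypothetical stand-ins for `CompilerOptionSet`, the option value kinds, `SLANG_PROFILE_UNKNOWN`, and `SLANG_CAPABILITY_UNKNOWN`; the sentinel values used here are placeholders, not taken from the Slang headers.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Placeholder sentinels standing in for the SLANG_*_UNKNOWN constants.
constexpr int kProfileUnknown = 0;
constexpr int kCapabilityUnknown = 0;

// A capability option is either an integer atom or a string name.
using CapabilityValue = std::variant<int, std::string>;

// Hypothetical, simplified model of the compiler option set.
struct MockOptionSet
{
    std::optional<int> profile;                // unset means "no Profile option"
    std::vector<CapabilityValue> capabilities; // Capability option array
};

inline bool isSpecificProfileRequested(const MockOptionSet& options)
{
    return options.profile.has_value() && *options.profile != kProfileUnknown;
}

inline bool isSpecificCapabilityRequested(const MockOptionSet& options)
{
    for (const auto& value : options.capabilities)
    {
        // Any string counts as a specific request; an int counts unless it is the sentinel.
        if (std::holds_alternative<std::string>(value))
            return true;
        if (std::get<int>(value) != kCapabilityUnknown)
            return true;
    }
    return false;
}

inline bool isSpecificProfileOrCapabilityRequested(const MockOptionSet& options)
{
    return isSpecificProfileRequested(options) || isSpecificCapabilityRequested(options);
}
```

Keeping the combined predicate as a thin wrapper over the two narrower ones lets other call sites query the profile or the capability condition on its own.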
14 changes: 14 additions & 0 deletions source/slang/slang-diagnostics.lua
@@ -4808,6 +4808,20 @@ warning(
span { loc = "location", message = "Slang's SPIR-V backend only supports SPIR-V version 1.3 and later. Use `-emit-spirv-via-glsl` option to produce SPIR-V 1.0 through 1.2." }
)

err(
"spirv-fp16-vector-atomic-unsupported-width",
50013,
"invalid SPIR-V fp16 vector atomic width",
span { loc = "location", message = "SPIR-V fp16 vector atomics only support half2 and half4." }
)

err(
"spirv-fp16-vector-atomic-unsupported-operation",
50014,
"invalid SPIR-V fp16 vector atomic operation",
span { loc = "location", message = "SPIR-V fp16 vector atomics only support add, sub, min, max, and exchange operations." }
)

err(
"invalid-mesh-stage-output-topology",
50060,
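The two diagnostics added in `slang-diagnostics.lua` describe the accepted shapes: widths of 2 or 4 components, and the operations add, sub, min, max, and exchange. The check that raises them is not part of the visible hunks, so the following C++ is only a sketch of what such a guard could look like. `AtomicOp`, `isSupportedFp16VectorWidth`, and `isSupportedFp16VectorOp` are hypothetical names, and `AtomicOp::And` is included purely as an example of a rejected operation.

```cpp
#include <cassert>

// Hypothetical operation tags; And is here only as an example of an unsupported op.
enum class AtomicOp { Add, Sub, Min, Max, Exchange, And };

// Matches the message for diagnostic 50013: only half2 and half4 are accepted.
inline bool isSupportedFp16VectorWidth(int componentCount)
{
    return componentCount == 2 || componentCount == 4;
}

// Matches the message for diagnostic 50014: add, sub, min, max, and exchange.
inline bool isSupportedFp16VectorOp(AtomicOp op)
{
    switch (op)
    {
    case AtomicOp::Add:
    case AtomicOp::Sub:
    case AtomicOp::Min:
    case AtomicOp::Max:
    case AtomicOp::Exchange:
        return true;
    default:
        return false;
    }
}
```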