Add fp16 vector atomic capability #11324

Open
jkwak-work wants to merge 80 commits into shader-slang:master from jkwak-work:issue-11083

@jkwak-work
Collaborator

@jkwak-work jkwak-work commented May 28, 2026

Fixes #11083

Summary of the problem from the end user perspective

Targets can expose different SPIR-V fp16 atomic support: scalar half atomic add and half-vector atomics are covered by separate extensions. Slang needs distinct capabilities for those operations instead of treating all fp16 atomic add support as one capability.

Minimal repro shader; if applicable

RWByteAddressBuffer tmpBuffer;

[numthreads(1, 1, 1)]
void computeMain(uint3 dispatchThreadID : SV_DispatchThreadID)
{
    half originalValue;
    tmpBuffer.InterlockedAddF16(2, 1.0h, originalValue);
}

Root cause

Slang's capability model did not have a distinct SPV_NV_shader_atomic_fp16_vector / AtomicFloat16VectorNV capability path for the NV half-vector extension. That made the scalar SPV_EXT_shader_atomic_float16_add path and half-vector atomic support too easy to collapse into one requirement.

Solution in this PR

This PR adds spvAtomicFloat16VectorNV, wires GL_NV_shader_atomic_fp16_vector to it without the prior SPIR-V 1.0 fallback, and makes the vector capability imply scalar spvAtomicFloat16AddEXT support. InterlockedAddF16 remains available through the scalar SPIR-V requirement and now selects the half2 vector atomic fallback when spvAtomicFloat16VectorNV is present.

The focused capability test checks that:

  • the scalar path declares AtomicFloat16AddEXT / SPV_EXT_shader_atomic_float16_add without the NV vector extension;
  • the vector-capability profile satisfies the scalar requirement and routes InterlockedAddF16 through %v2half atomics;
  • the combined scalar/vector-capability case also selects the vector path;
  • the no-fp16 restrictive profile diagnoses the missing scalar fp16 add capability;
  • the emulated path declares AtomicFloat16VectorNV / SPV_NV_shader_atomic_fp16_vector and emits %v2half atomics;
  • CUDA codegen emits the scalar half atomicAdd path;
  • unsupported half-vector compare-exchange reports E50014 instead of emitting invalid SPIR-V.

Runtime COMPARE_COMPUTE coverage is feature-gated for the scalar fp16 add path with atomic-half. Direct SPV_NV_shader_atomic_fp16_vector runtime directives remain disabled until RHI exposes a feature gate for VK_NV_shader_atomic_float16_vector; the active tests still cover scalar/vector SPIR-V codegen, the scalar missing-capability diagnostic, and unsupported vector operation/width diagnostics. Pointer-form vector helpers keep runtime coverage disabled because SPIR-V variable-pointer runtime coverage is disabled on current GCP runners; active tests cover static %v2half codegen.

Notes to the reviewers; where to focus on

The SPIR-V extension split is intentional:

  • SPV_EXT_shader_atomic_float_add defines OpAtomicFAddEXT for scalar fp32/fp64 via AtomicFloat32AddEXT and AtomicFloat64AddEXT.
  • SPV_EXT_shader_atomic_float16_add extends that scalar add extension for fp16 via AtomicFloat16AddEXT; the SPIR-V spec says scalar fp16 add modules need both SPV_EXT_shader_atomic_float16_add and the base SPV_EXT_shader_atomic_float_add extension declared.
  • SPV_NV_shader_atomic_fp16_vector adds AtomicFloat16VectorNV for fp16 vectors with 2 or 4 components, including add, sub, min, max, and exchange. For this PR's capability model, spvAtomicFloat16VectorNV intentionally implies spvAtomicFloat16AddEXT, so targets advertising the NV vector path also satisfy scalar half-add requirements while codegen can still select the vector opcode path when that capability is present.
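
The implication described in the last bullet can be pictured as a small closure over capability atoms. The following is only an illustrative sketch in plain C++, not Slang's capability implementation: the `Cap` enum and the `expand` and `satisfies` helpers are hypothetical names introduced here.

```cpp
#include <set>

// Hypothetical atoms mirroring the capability names in this PR.
// This is NOT Slang's real capability data model.
enum class Cap { SpvAtomicFloat16AddEXT, SpvAtomicFloat16VectorNV };

// Expand a capability set with everything it implies.
// Assumption taken from the PR text: the NV vector capability
// implies the scalar half-add capability.
std::set<Cap> expand(std::set<Cap> caps)
{
    if (caps.count(Cap::SpvAtomicFloat16VectorNV))
        caps.insert(Cap::SpvAtomicFloat16AddEXT);
    return caps;
}

// A requirement is satisfied when every required atom
// is present in the expanded set of available atoms.
bool satisfies(const std::set<Cap>& available, const std::set<Cap>& required)
{
    const std::set<Cap> expanded = expand(available);
    for (Cap c : required)
        if (!expanded.count(c))
            return false;
    return true;
}
```

Under this model, a target that advertises only the vector atom satisfies a scalar half-add requirement, while a target that advertises only the scalar atom does not satisfy a vector requirement.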

Please focus review on the capability definitions and aliases in source/slang/slang-capabilities.capdef, the overload requirements and selection logic in source/slang/hlsl.meta.slang, and the SPIR-V extension/capability checks in tests/hlsl-intrinsic/byte-address-buffer/byte-address-half-atomics-capability.slang.

Reviewer Directives (maintained by agent)

Related PRs in the past

None identified.

jkwak-work linked an issue May 28, 2026 that may be closed by this pull request
jkwak-work added the "pr: non-breaking" (PRs without breaking changes) and "CoPilot" labels May 28, 2026
jkwak-work self-assigned this May 28, 2026
@jkwak-work
Collaborator Author

@coderabbitai review

jkwak-work marked this pull request as ready for review May 28, 2026 10:13
jkwak-work requested a review from a team as a code owner May 28, 2026 10:13
jkwak-work requested review from bmillsNV and removed request for a team May 28, 2026 10:13
github-actions[bot]

This comment was marked as outdated.

@jkwak-work
Collaborator Author

@coderabbitai review

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
source/slang/slang-capabilities.capdef (1)

1272-1272: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

GL_NV_shader_atomic_fp16_vector still permits SPIR-V without the NV vector atomic capability.

The | _spirv_1_0 branch makes this alias satisfiable on plain SPIR-V 1.0, which can bypass spvAtomicFloat16VectorNV and break the intended “distinct fp16 vector atomic capability” contract.

Suggested fix
-alias GL_NV_shader_atomic_fp16_vector = _GL_NV_shader_atomic_fp16_vector + _GL_NV_gpu_shader5 | spvAtomicFloat16VectorNV | _spirv_1_0;
+alias GL_NV_shader_atomic_fp16_vector = _GL_NV_shader_atomic_fp16_vector + _GL_NV_gpu_shader5 | spvAtomicFloat16VectorNV;

As per coding guidelines, "source/slang/** ... (6) Cross-backend consistency — changes to one emitter may need parallel changes in others."
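
To see why the extra `| _spirv_1_0` alternative matters, here is a minimal sketch that models an alias as a list of alternatives, each a set of atoms that must all be present. It assumes, as the review comment implies, that `+` means "all of" and `|` means "any of"; the `Alias` type and `isSatisfiedBy` helper are hypothetical, and Slang's real capability evaluation is more involved.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model: an alias is a disjunction of conjunctions of atom names.
using Conjunction = std::set<std::string>;
using Alias = std::vector<Conjunction>;

// True when at least one alternative has all of its atoms in the target set.
bool isSatisfiedBy(const Alias& alias, const std::set<std::string>& target)
{
    for (const Conjunction& alternative : alias)
    {
        bool allPresent = true;
        for (const std::string& atom : alternative)
        {
            if (!target.count(atom))
            {
                allPresent = false;
                break;
            }
        }
        if (allPresent)
            return true;
    }
    return false;
}

// The alias before the suggested fix, including the SPIR-V 1.0 alternative.
Alias oldAlias()
{
    return {Conjunction{"_GL_NV_shader_atomic_fp16_vector", "_GL_NV_gpu_shader5"},
            Conjunction{"spvAtomicFloat16VectorNV"},
            Conjunction{"_spirv_1_0"}};
}

// The alias after the suggested fix.
Alias newAlias()
{
    return {Conjunction{"_GL_NV_shader_atomic_fp16_vector", "_GL_NV_gpu_shader5"},
            Conjunction{"spvAtomicFloat16VectorNV"}};
}
```

In this model a target holding only `_spirv_1_0` satisfies the old alias but not the new one, which is the bypass the comment describes.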


📥 Commits

Reviewing files that changed from the base of the PR and between d272020 and ba59dda.

📒 Files selected for processing (2)
  • source/slang/slang-capabilities.capdef
  • tests/spirv/gl-nv-shader-atomic-fp16-vector-capability.slang

Comment thread tests/spirv/gl-nv-shader-atomic-fp16-vector-compatibility.slang Outdated
@jkwak-work
Collaborator Author

@coderabbitai review

@jkwak-work
Collaborator Author

Looks good to me

github-actions[bot]

This comment was marked as outdated.

@jkwak-work
Collaborator Author

@coderabbitai review

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


📥 Commits

Reviewing files that changed from the base of the PR and between 198733b and 8a800c4.

📒 Files selected for processing (3)
  • source/slang/hlsl.meta.slang
  • tests/hlsl-intrinsic/byte-address-buffer/byte-address-half-atomics-capability.slang
  • tests/spirv/gl-nv-shader-atomic-fp16-vector-compatibility.slang

shader-slang deleted 17 comments from coderabbitai Bot Jun 7, 2026
@jkwak-work
Collaborator Author

[Agent] Deleted 56 old LLM/bot general PR conversation comments older than 2026-06-04T14:36:56Z as requested. I limited the cleanup to old CodeRabbit/LLM bot issue comments and left review threads, review bodies, human comments, and agent-authored comments intact.

Comment thread source/slang/slang-emit-spirv.cpp Outdated
if (!entryPoints)
return;

for (auto entryPoint : *entryPoints)
Collaborator Author

[Agent] Addressed in c65af3a. The SPIR-V emitter no longer scans referencing entry points when requiring the fp16-vector atomic capability. I removed the emitter-side maybeDiagnoseCapabilityUse()/dedup state and moved the implicit capability diagnostic to diagnoseSPIRVAtomicCapabilityUses(), which runs after SPIR-V legalization builds the entry-point reference graph and before SPIR-V emission. The emitter now only declares SPV_NV_shader_atomic_fp16_vector / AtomicFloat16VectorNV for already-validated fp16-vector atomic ops.

I also recorded this implementation constraint in the PR description so later passes do not move that scan back into emission.

Validation:

  • cmake.exe --build --preset debug --target slangc
  • git.exe diff --check (CRLF warning only)
  • focused slangc.exe probes for direct vector atomic missing capability, pointer-form helper missing capability, GL_NV profile warning, unsupported compare-exchange, and positive %v2half vector atomic codegen.

Comment thread source/slang/slang-ir-spirv-legalize.cpp Outdated
Comment thread source/slang/slang-compiler.h
Comment thread source/slang/slang-ir-spirv-legalize.cpp Outdated
Comment thread source/slang/slang-ir-validate.cpp
Comment thread source/slang/slang-capabilities.capdef
Comment thread tests/language-feature/pointer/ptr-to-groupshared.slang
@@ -2659,7 +2660,7 @@ struct SPIRVLegalizationContext : public SourceEmitterBase
// For SPIR-V, we don't skip this validation, because we might then be generating
// invalid SPIR-V.
bool skipFuncParamValidation = false;
-validateAtomicOperations(skipFuncParamValidation, m_sink, m_module->getModuleInst());
+validateSPIRVAtomicOperations(skipFuncParamValidation, m_sink, m_module->getModuleInst());
Collaborator Author

Let's keep the old name, validateAtomicOperations

Collaborator Author

[Agent] Addressed in abd537c. I removed the validateSPIRVAtomicOperations entry point and kept the validation API under the existing validateAtomicOperations name, using a target-aware overload for the SPIR-V-specific fp16-vector checks. I also recorded this naming directive in the PR description so later passes do not reintroduce the separate SPIR-V-named API.

Validation:

  • cmake.exe --build --preset debug --target slangc slang-test
  • git.exe diff --check HEAD~1..HEAD
  • rg -n "validateSPIRVAtomicOperations" source/slang returned no matches.

@coderabbitai
Contributor

coderabbitai Bot commented Jun 7, 2026

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between a9e1943 and 6e92f2d.

📒 Files selected for processing (2)
  • source/slang/slang-ir-spirv-legalize.cpp
  • tests/hlsl-intrinsic/byte-address-buffer/byte-address-half-atomics-capability.slang
💤 Files with no reviewable changes (1)
  • source/slang/slang-ir-spirv-legalize.cpp

📝 Walkthrough

Adds SPIR-V FP16-vector atomic capability support (SPV_NV_shader_atomic_fp16_vector / spvAtomicFloat16VectorNV): capability defs and docs, IR validation and legalization diagnostics, SPIR‑V emission capability gating, HLSL intrinsic target paths, new diagnostics, and comprehensive tests.

Changes

NVIDIA FP16 Vector Atomics Capability

Layer / File(s): Summary

  • Capability definitions and documentation (source/slang/slang-capabilities.capdef, docs/command-line-slangc-reference.md, docs/user-guide/a3-02-reference-capability-atoms.md, docs/user-guide/a2-01-spirv-target-specific.md): New SPV_NV_shader_atomic_fp16_vector extension atom and spvAtomicFloat16VectorNV capability are defined. GL alias updated; docs and CLI entries list the new capability and add 16-bit float vector support to the target compatibility table.
  • Compiler option and profile/capability helpers (source/slang/slang-compiler.h, source/slang/slang-check-shader.cpp, source/slang/slang-type-layout.cpp): Adds inline helpers to detect a specific requested Profile or Capability in CompilerOptionSet and consolidates gating logic to use the combined predicate.
  • Diagnostics and IR atomic utility (source/slang/slang-diagnostics.lua, source/slang/slang-ir-util.h, source/slang/slang-ir-util.cpp): Adds diagnostics for unsupported fp16-vector widths/operations and introduces getAtomicOperationValueType(IRInst*) to compute the effective atomic value type.
  • IR validation for SPIR-V fp16 vector atomics (source/slang/slang-ir-validate.h, source/slang/slang-ir-validate.cpp): Refactors atomic validation to optionally enforce SPIR-V-specific checks. Rejects unsupported fp16-vector atomic opcodes and limits supported vector atomics to half2/half4 via the new flagged overload.
  • Legalization-time fp16 vector atomic diagnostics (source/slang/slang-ir-spirv-legalize.cpp): Enables SPIR-V atomic validation at legalization time and passes the SPIR-V validation flag into module processing to emit per-entry-point capability/compatibility diagnostics.
  • SPIR-V emission for fp16 vector atomics (source/slang/slang-emit-spirv.cpp): Adds detection and request helpers for fp16 vector atomic types, routes capability requests for 2/4-element vectors, moves atomic capability checks before emission, and tightens variable-pointer/workgroup-pointer capability gating and function-type capability requests.
  • HLSL intrinsics capability split and vector path (source/slang/hlsl.meta.slang): Splits CUDA vs SPIR-V requires for InterlockedAddF16/_NvInterlockedAddFp16x2 and adds a spvAtomicFloat16VectorNV target-switch implementation for half2 vector atomic add with alignment handling.
  • SPIR-V vector atomic operation tests (tests/spirv/atomic-float16-vector.slang): Expands tests to cover half2 and half4 add/min/max/exchange, resizes output buffer, and tightens FileCheck to require the NV vector capability/extension and exact opcode counts.
  • GL_NV shader atomic fp16 vector compatibility tests (tests/spirv/gl-nv-shader-atomic-fp16-vector-compatibility.slang): Adds NEGATIVE/POSITIVE/POSITIVE_MIN_MAX/IGNORE-CAPS runs verifying capability/extension emission, atomic opcode selection, and profile-compatibility diagnostics.
  • HLSL byte-address buffer fp16 atomic capability matrix (tests/hlsl-intrinsic/byte-address-buffer/byte-address-half-atomics-capability.slang): Adds comprehensive compile-time test matrix covering scalar/vector fp16 atomics, unsupported-width/operation diagnostics, emulation lowering paths (buffer/pointer), multi-entry-point diagnostics, and CUDA backend checks.
  • GroupShared pointer VariablePointers capability test (tests/language-feature/pointer/ptr-to-groupshared.slang): Refactors the test to spirv-asm/filecheck runs, adds a noinline entry-point variant for variable-pointers capability checks, and marks foo as [ForceInline] for default behavior.

Suggested reviewers

  • bmillsNV
  • csyonghe
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 20.00%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Title check: ✅ Passed. The title 'Add fp16 vector atomic capability' clearly summarizes the main change: introducing a new spvAtomicFloat16VectorNV capability for fp16 vector atomics.
  • Description check: ✅ Passed. The description is detailed and directly related to the changeset, explaining the problem, solution, and implementation details for the new fp16 vector atomic capability.
  • Linked Issues check: ✅ Passed. The PR successfully addresses #11083 by adding spvAtomicFloat16VectorNV capability, wiring GL_NV_shader_atomic_fp16_vector to it, making it imply spvAtomicFloat16AddEXT, and enabling InterlockedAddF16 to select between scalar, vector, and emulated paths based on available capabilities.
  • Out of Scope Changes check: ✅ Passed. All changes are in-scope: capability definitions, HLSL overload requirements, diagnostic additions, IR validation and legalization, SPIR-V emission, test coverage, and documentation for the fp16 vector atomic feature. The variable-pointer capability refinements in the emitter are supporting infrastructure for function signature capability handling related to the feature.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


📥 Commits

Reviewing files that changed from the base of the PR and between 59b7cf8 and abd537c.

📒 Files selected for processing (3)
  • source/slang/slang-ir-spirv-legalize.cpp
  • source/slang/slang-ir-validate.cpp
  • source/slang/slang-ir-validate.h

Comment thread source/slang/slang-ir-validate.h
Comment thread source/slang/slang-emit-spirv.cpp Outdated
return;

auto elementCount = elementCountInst->getValue();
SLANG_ASSERT(elementCount == 2 || elementCount == 4);
Collaborator Author

Remove this assert.
We should just return false for 3

Collaborator Author

[Agent] Addressed in a9e1943. I removed the SLANG_ASSERT(elementCount == 2 || elementCount == 4) from maybeRequireFp16VectorAtomicCapability(), so unsupported fp16-vector atomic widths such as 3 now just return without declaring SPV_NV_shader_atomic_fp16_vector / AtomicFloat16VectorNV. I also recorded this directive in the PR description.

Validation:

  • cmake.exe --build --preset debug --target slangc slang-test
  • git.exe diff --check HEAD~1..HEAD
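
The width handling discussed in this thread can be sketched as a plain predicate. This is a hypothetical stand-in written for illustration: the function name, its parameters, and the `bool` return are assumptions, and the real `maybeRequireFp16VectorAtomicCapability()` works on IR types inside the emitter.

```cpp
#include <cstdint>

// Hypothetical stand-in for the emitter-side width check.
// Returns true only when the NV fp16-vector capability would be declared:
// the element type is half and the vector has 2 or 4 components.
// Any other width (for example 3) simply yields false, with no assert.
bool shouldRequireFp16VectorAtomicCapability(bool isHalfElement, std::int64_t elementCount)
{
    if (!isHalfElement)
        return false;
    return elementCount == 2 || elementCount == 4;
}
```

With this shape, a `half3` atomic never declares the NV extension, and rejecting it is left to the earlier validation step.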

@@ -2994,6 +3124,7 @@ void legalizeIRForSPIRV(
eliminateDeadCode(module);

buildEntryPointReferenceGraph(context->m_referencingEntryPoints, module);
+diagnoseSPIRVAtomicCapabilityUses(context, module);
Collaborator Author

Do not add this diagnostic.
We will have to revisit how to handle the capability check globally later, as a separate issue.

Collaborator Author

[Agent] Addressed in 6e92f2d. I removed diagnoseSPIRVAtomicCapabilityUses() and its helper functions/call from SPIR-V legalization, and removed the missing-spvAtomicFloat16VectorNV diagnostic test expectations. Unsupported fp16-vector atomic operation/width validation remains in validateAtomicOperations, and the emitter still declares the NV vector capability when it emits supported fp16-vector atomics.

I also updated the PR description and reviewer directives to record that fp16-vector atomic capability diagnostics are out of scope for this PR and should be revisited separately.

Validation:

  • cmake.exe --build --preset debug --target slangc slang-test
  • git.exe diff --check HEAD~1..HEAD
  • direct slangc.exe probe with -restrictive-capability-check -capability spvAtomicFloat16AddEXT -DTEST_DIRECT_VECTOR_ATOMIC produced no implicit-capability diagnostic and emitted AtomicFloat16VectorNV, SPV_NV_shader_atomic_fp16_vector, and OpAtomicFAddEXT %v2half
  • direct unsupported-width probe produced E50013
  • direct unsupported compare-exchange probe produced E50014

The focused local slang-test.exe file harness still exits 1 with no output in this WSL/Windows checkout, matching prior local behavior.
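
The two validation outcomes probed above (E50013 for an unsupported width, E50014 for an unsupported compare-exchange) can be summarized with a small decision function. This is a sketch under stated assumptions: the enum names, the function, and the order of the two checks are hypothetical, and the actual logic lives in validateAtomicOperations.

```cpp
// Hypothetical operation kinds for a half-vector atomic.
enum class AtomicOp { Add, Sub, Min, Max, Exchange, CompareExchange };

// Diagnostic codes as reported in this thread; the enum itself is illustrative.
enum class Diag { None, E50013_UnsupportedWidth, E50014_UnsupportedOperation };

// Classify a half-vector atomic. The check order is an assumption:
// the unsupported operation is tested before the width.
Diag validateHalfVectorAtomic(AtomicOp op, int elementCount)
{
    if (op == AtomicOp::CompareExchange)
        return Diag::E50014_UnsupportedOperation;
    if (elementCount != 2 && elementCount != 4)
        return Diag::E50013_UnsupportedWidth;
    return Diag::None;
}
```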

@jkwak-work
Collaborator Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented Jun 7, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@jkwak-work
Collaborator Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented Jun 7, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@github-actions github-actions Bot left a comment

Verdict: ✅ Clean — no significant issues found

PR adds a distinct spvAtomicFloat16VectorNV SPIR-V capability (mapping to SPV_NV_shader_atomic_fp16_vector/AtomicFloat16VectorNV) so half-vector atomics are gated separately from the scalar AtomicFloat16AddEXT path, with the vector capability inheriting from the scalar one so vector-capable targets still satisfy scalar half-add. RWByteAddressBuffer.InterlockedAddF16 selects the %v2half lowering when the vector capability is present and falls back to the scalar OpAtomicFAddEXT otherwise.

Changes Overview

Capability model + GLSL/SPIR-V aliasing (source/slang/slang-capabilities.capdef, docs/command-line-slangc-reference.md, docs/user-guide/a3-02-reference-capability-atoms.md, docs/user-guide/a2-01-spirv-target-specific.md)

  • Adds SPV_NV_shader_atomic_fp16_vector extension atom and spvAtomicFloat16VectorNV : SPV_NV_shader_atomic_fp16_vector + spvAtomicFloat16AddEXT so the vector capability subsumes scalar half-add. Updates the GL_NV_shader_atomic_fp16_vector alias to require the new SPIR-V capability instead of falling through to _spirv_1_0. Documents the new atom in the user-facing capability/extension tables.

Intrinsic surface + lowering (source/slang/hlsl.meta.slang)

  • RWByteAddressBuffer.InterlockedAddF16(uint, half, out half) keeps a single [require(cuda_hlsl_spirv, sm_5_0)] declaration and target-switches between scalar __atomic_add(half) and a %v2half path gated on spvAtomicFloat16VectorNV (with an even/odd byteAddress & 2 lane selection). Pointer-form InterlockedAddF16Emulated(half*, ...) and InterlockedAddF16x2(half2*, ...) stay at [require(spirv)]; their vector capability is emitted on use.
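
The even/odd `byteAddress & 2` lane selection can be sketched as follows. This is an illustrative model only: the struct and function names are hypothetical, `float` stands in for `half`, and placing zero in the untouched lane is an assumption about how a scalar add is expressed as a half2 add.

```cpp
#include <cstdint>

// Where a scalar half lands inside its enclosing 4-byte half2 pair.
struct VectorAtomicTarget
{
    std::uint32_t alignedByteAddress; // 4-byte aligned address of the half2
    int lane;                         // 0 = low half, 1 = high half
};

// Map a 2-byte aligned scalar address to its half2 slot and lane.
VectorAtomicTarget selectHalf2Lane(std::uint32_t byteAddress)
{
    VectorAtomicTarget target;
    target.alignedByteAddress = byteAddress & ~std::uint32_t(3);
    target.lane = (byteAddress & 2u) ? 1 : 0;
    return target;
}

// float stands in for half in this sketch.
struct Half2
{
    float x;
    float y;
};

// Build the vector operand: the value goes in the selected lane and the
// other lane is zero (assumed), so the neighbouring half is not changed.
Half2 makeAddOperand(float value, int lane)
{
    return lane == 0 ? Half2{value, 0.0f} : Half2{0.0f, value};
}
```

For the repro shader in the description, byte address 2 maps to the high lane of the half2 at byte address 0.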

SPIR-V emit + IR shared utilities (source/slang/slang-emit-spirv.cpp, source/slang/slang-ir-spirv-legalize.cpp, source/slang/slang-ir-util.{cpp,h}, source/slang/slang-ir-validate.{cpp,h}, source/slang/slang-check-shader.cpp, source/slang/slang-compiler.h, source/slang/slang-type-layout.cpp, source/slang/slang-diagnostics.lua)

  • Introduces getAtomicOperationValueType() shared by SPIR-V legalize and IR validation so both paths infer the atomic value type identically (with consistent null/VoidType handling). Replaces inline IRVectorType-of-half checks in emit with a maybeRequireFp16VectorAtomicCapability() helper that explicitly accepts only width 2 and 4 (early-returns on widths like 3 without declaring the vector capability — paired with validateAtomicOperations rejecting unsupported widths up-front rather than emitting OpUndef). Adds a function-type capability hook (requireFunctionTypeCapabilitiesIfNeeded) so Workgroup/GroupShared pointer params in emitted function signatures pull in VariablePointers. Adds diagnostic E50014 for unsupported half-vector compare-exchange.

Test coverage (tests/hlsl-intrinsic/byte-address-buffer/byte-address-half-atomics-capability.slang, tests/spirv/atomic-float16-vector.slang, tests/spirv/gl-nv-shader-atomic-fp16-vector-compatibility.slang, tests/language-feature/pointer/ptr-to-groupshared.slang)

  • Capability test covers: scalar-only profile declares AtomicFloat16AddEXT/SPV_EXT_shader_atomic_float16_add without NV; vector-capability profile routes InterlockedAddF16 through %v2half; combined scalar+vector still selects vector path; restrictive (no-fp16) profile diagnoses missing scalar capability; emulated half pointer path declares the NV capability and emits %v2half; CUDA emits scalar half atomicAdd; pointer-form InterlockedAddF16Emulated/InterlockedAddF16x2; unsupported half-vector compare-exchange triggers E50014; unsupported widths handled. Direct SPV_NV_shader_atomic_fp16_vector runtime coverage stays gated until RHI exposes a feature gate for VK_NV_shader_atomic_float16_vector.

Labels

CoPilot pr: non-breaking PRs without breaking changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make Fp16x2 atomics its own capability.

2 participants