Commit 30690e4

Address node shader support and a couple other small issues (#767)
Call out that these intrinsics are:
- available in Node Shaders (except for Thread Launch Mode)
- Not available in library export functions i.e. DXR
- Are defined as WaveOps

Addresses issues #765 and #766
1 parent 26ffca6 commit 30690e4

1 file changed

Lines changed: 35 additions & 8 deletions

proposals/0048-group-wave-index.md

@@ -22,7 +22,7 @@ The proposal is for two new shader intrinsics:

 ## Motivation

-Compute, Amplification and Mesh shader workloads consist of some number of
+Compute, Amplification, Node and Mesh shader workloads consist of some number of
 thread groups, with each thread group containing some number of waves and there
 being a number of threads in the wave. Certain algorithms can be accelerated by
 specializing work done by individual waves in a thread group.
@@ -112,9 +112,32 @@ uint GetGroupWaveCount();

 #### Shader Stage Compatibility

-Both `GetGroupWaveIndex` and `GetGroupWaveCount` are valid in compute, mesh, and
-amplification shaders. Using these intrinsics in any other shader stage will
-result in a compilation error.
+Both `GetGroupWaveIndex` and `GetGroupWaveCount` are valid in compute, mesh,
+Node (see [Node Shader Support](#node-shader-support)) and amplification
+shaders. Using these intrinsics in any other shader stage will result in a
+compilation error.
+
+#### Library Restrictions
+
+These intrinsics are not valid in exported functions of shader libraries (e.g.,
+DXR raytracing libraries). Raytracing shaders do not have thread group semantics
+and therefore have no meaningful wave index or wave count within a group
+context.
+
+#### Node Shader Support
+
+Both intrinsics are valid in node shaders with the exception of Thread Launch
+mode Nodes. In Thread launch mode, each thread executes independently without
+thread group semantics; there is no thread group context, and therefore no
+meaningful wave index or wave count within a group.
+
+#### Feature Flag Requirements
+
+Both `GetGroupWaveIndex` and `GetGroupWaveCount` are classified as wave
+operations. When a shader uses either intrinsic, the compiler must set the
+`WaveOps` feature flag in the shader's feature flags metadata. This ensures
+that the runtime can verify the device supports wave operations before
+attempting to execute the shader.

 #### Value Ranges and Guarantees

@@ -227,10 +250,10 @@ the total number of subgroups executing the workgroup.
 The following new compilation errors are introduced:

 1. **Invalid Shader Stage**
-- Error: `error: GetGroupWaveIndex is only valid in compute, mesh, and
+- Error: `error: GetGroupWaveIndex is only valid in compute, node, mesh, and
 amplification shaders`
 - Occurs when: `GetGroupWaveIndex` is called in other shader stages
-- Error: `error: GetGroupWaveCount is only valid in compute, mesh, and
+- Error: `error: GetGroupWaveCount is only valid in compute, node, mesh, and
 amplification shaders`
 - Occurs when: `GetGroupWaveCount` is called in other shader stages
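
Reviewer note: the updated diagnostics in this hunk can be read as a single stage lookup that produces the same message text for either intrinsic. The following is a minimal Python sketch of that reading, not the compiler's implementation. The names `ALLOWED_STAGES` and `stage_diagnostic` and the lowercase stage strings are assumptions made for illustration; only the message wording comes from the diff. The hunk does not give a diagnostic for thread launch mode nodes, so the sketch does not model that case.

```python
from typing import Optional

# Stages the updated proposal text lists as valid for both intrinsics.
# The order matches the order used in the diagnostic message.
ALLOWED_STAGES = ("compute", "node", "mesh", "amplification")


def stage_diagnostic(intrinsic: str, stage: str) -> Optional[str]:
    """Return the diagnostic text for an invalid stage, or None if the stage is allowed.

    `intrinsic` is the HLSL name, e.g. "GetGroupWaveIndex".
    `stage` is a hypothetical lowercase stage name, e.g. "pixel".
    """
    if stage in ALLOWED_STAGES:
        return None
    # Build "compute, node, mesh, and amplification" from the tuple.
    listed = ", ".join(ALLOWED_STAGES[:-1]) + ", and " + ALLOWED_STAGES[-1]
    return f"error: {intrinsic} is only valid in {listed} shaders"
```

Deriving the message from the stage tuple keeps the allowed-stage list and the diagnostic wording in one place, so adding a stage (as this commit does for node shaders) is a one-line change in the sketch.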

@@ -242,8 +265,12 @@ DXIL validation is updated to verify:
 `dx.op.getGroupWaveCount` operations are only valid in Shader Model 6.10 or
 later.

-2. **Shader Stage Check**: Both operations only appear in compute, amplification
-and mesh shaders.
+2. **Shader Stage Check**: Both operations only appear in compute, node,
+amplification and mesh shaders.
+- Use in VS, GS, HS, PS or in exported library functions will result in a
+compilation error.
+- Special case to check for invalid usage in node shaders with thread launch
+mode.

 3. **Well-formed Usage**: Both operations are called with the correct signature
 (single i32 opcode operand, returns i32).
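
Reviewer note: the validation rules touched by this commit (minimum shader model, allowed stages, the thread launch mode special case, and the `WaveOps` flag from the Feature Flag Requirements section) can be sketched together as follows. This is a hedged Python model under stated assumptions, not the DXIL validator's code. `ShaderInfo`, `validate_group_wave_ops`, `required_feature_flags`, the stage strings, the launch mode strings, and the error wording are all hypothetical; the diff only specifies the rules themselves.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# Stages named by the updated "Shader Stage Check" rule.
GROUP_STAGES = {"compute", "node", "amplification", "mesh"}

# Shader Model 6.10 is kept as a (major, minor) tuple. As a float, 6.10
# equals 6.1 and would compare lower than 6.9, which is the wrong order.
MIN_SHADER_MODEL: Tuple[int, int] = (6, 10)


@dataclass
class ShaderInfo:
    shader_model: Tuple[int, int]
    stage: str  # hypothetical names: "compute", "node", "pixel", "library_export", ...
    node_launch_mode: Optional[str] = None  # assumed spelling; only "thread" is special-cased


def validate_group_wave_ops(info: ShaderInfo) -> List[str]:
    """Return a list of rule violations for a shader that uses either intrinsic."""
    errors: List[str] = []
    if info.shader_model < MIN_SHADER_MODEL:
        errors.append("group wave ops require Shader Model 6.10 or later")
    if info.stage not in GROUP_STAGES:
        # Covers VS, GS, HS, PS and exported library functions alike.
        errors.append(f"group wave ops are not valid in stage '{info.stage}'")
    elif info.stage == "node" and info.node_launch_mode == "thread":
        # Thread launch mode has no thread group context.
        errors.append("group wave ops are not valid in thread launch mode nodes")
    return errors


def required_feature_flags(uses_group_wave_ops: bool) -> Set[str]:
    """Both intrinsics are classified as wave operations, so they imply WaveOps."""
    return {"WaveOps"} if uses_group_wave_ops else set()
```

For example, `validate_group_wave_ops(ShaderInfo((6, 10), "node", "thread"))` reports one violation, while the same node shader with any other launch mode reports none.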
