Commit 35d5189

Fix dropped entry-point uniform on struct-returning Metal vertex shaders (#11607)
Associated issue: #11606
LLM used while working on the issue: Claude Opus 4.8 High

## 1. Motivation

I was working on my own toy RHI for Vulkan 1.3+ and Metal 4 and realized that the emitted Metal shaders are missing the entry-point uniforms, while the emitted SPIR-V comes out as expected.

This bug hits the ordinary way to write a vertex shader on Metal, not a corner case:

- **Returning a struct is how vertex shaders normally work.** A vertex shader almost always outputs more than position. It hands interpolated varyings (color, UVs, normals, world-space position, …) to the fragment stage, and the idiomatic Slang/HLSL spelling for "position plus user varyings" is a struct of semantic-tagged members. A bare `float4 : SV_Position` return is the rare position-only passthrough. So the broken path is the common one.
- **Entry-point `uniform` parameters are Slang's portable mechanism for per-draw/root constants.** They lower to a push constant on SPIR-V and `buffer(0)` on Metal, and they already work on every *other* stage (compute, fragment) and every *other* target. The alternative, a module-scope global, lowers to a `$Globals` UBO at `set=0, binding=0` on Vulkan, which collides with a bindless descriptor heap, so entry-point uniforms are precisely the collision-free form users are steered toward.
- **It breaks Slang's single-source, multi-target promise, silently.** The identical shader compiles correctly for SPIR-V but miscompiles for Metal with no diagnostics, emitting a kernel that reads uninitialized memory. A user gets a clean compile and wrong pixels, with nothing to react to.

Concretely, any shader of the shape "read per-draw config from a uniform, emit per-vertex varyings" is affected.
- E.g. an instanced grid / sprite-batch / UI-quad vertex shader that reads grid dimensions or a transform from a root constant and outputs position + color (see the reproducer in the associated issue above).
On SPIR-V it works, but on Metal the uniform silently vanishes.

## 2. Proposed solution

The bug is a stale cross-reference, not a codegen-emit issue. `moveEntryPointUniformParamsToGlobalScope` records, on each hoisted global uniform, which entry-point function it came from (an `IREntryPointParamDecoration`). `lowerOutParameters` later replaces that entry-point function with a wrapper but does not update those back-references, and `introduceExplicitGlobalContext` matches uniforms to entry points by exactly this reference.

The principled fix is to keep the invariant "a hoisted uniform's `IREntryPointParamDecoration` names the live entry-point function" true at the moment the entry point is replaced. That means re-pointing the decorations from the old function to the wrapper in `legalizeVertexShaderOutputParamsForMetal`, the same place where the code already strips the old function's entry-point decorations. This is target-correct because the wrapper rewrite is Metal-vertex-specific. No emit path or other target is touched.

## 3. Change summary

| File | Change |
|---|---|
| `source/slang/slang-ir-legalize-varying-params.cpp` | Add `retargetEntryPointParamDecorations(oldFunc, newFunc)` and call it in `legalizeVertexShaderOutputParamsForMetal` right after `lowerOutParameters` replaces the entry point, alongside the existing entry-point-decoration cleanup. |
| `tests/metal/entry-point-uniform-vertex-struct-output.slang` | New regression test: a struct-returning vertex `uniform` must bind as a buffer argument on Metal (index-agnostic: it matches `buffer(`, since the concrete slot is Slang's default layout); compute likewise; SPIR-V stays a push constant. |

## 4. Concepts and vocabulary

- **`IREntryPointParamDecoration`**: decoration placed on a global `IRGlobalParam` that was originally an entry-point `uniform` parameter; operand 0 names the entry-point function it came from. Read by `introduceExplicitGlobalContext` to decide which entry point a global uniform belongs to.
- **`lowerOutParameters` / output-struct wrapper**: on Metal, a vertex entry whose result is a struct (or that has `out` params) is rewritten so a new *wrapper* function becomes the entry point and calls the original; the original becomes an ordinary callee.
- **`introduceExplicitGlobalContext`**: threads global uniforms into the actual entry point as a `KernelContext` carried from a `[[buffer]]` kernel argument.

## 5. Process report

- **`retargetEntryPointParamDecorations`** iterates `oldFunc`'s use list, collects the `IREntryPointParamDecoration`s referencing it (operand 0), and re-points each to `newFunc` (`setOperand(0, newFunc)`). This is necessary because, without it, `introduceExplicitGlobalContext` (`slang-ir-explicit-global-context.cpp:487`) skips the uniform for the wrapper entry point (`originatingEntryPoint != entryPointFunc`), so the `EntryPointParams ... [[buffer(0)]]` argument is never created and the `KernelContext` field is never initialized (the motivating garbage read). The old function survives (the wrapper still `call`s it), so re-pointing only the decorations, not its other uses, is correct and there is no use-after-free.
- **Call site** is in `legalizeVertexShaderOutputParamsForMetal`, immediately after the `oldFunc == entryPoint.entryPointFunc` guard and before the loop that strips `IREntryPointDecoration`/`IRKeepAliveDecoration` from `oldFunc`. That block is already the single place that transfers entry-point identity from the original function to the wrapper, so the back-reference repair belongs with it.
- **Input-shape check.** The shape handled is "an `IREntryPointParamDecoration` naming a function that is no longer the entry point." That shape is a transient, valid consequence of `lowerOutParameters` legitimately introducing a wrapper entry point, and not a malformed input from a broken producer.
The producer (`lowerOutParameters`) is doing the right thing; the missing step is maintaining the decoration invariant across the replacement, which is what this change adds. An alternative placement inside `lowerOutParameters` was considered, but the entry-point-decoration cleanup already lives in the caller, so keeping both halves of "retarget entry-point identity" together there is clearer and avoids changing the general utility's contract for a Metal-specific rewrite.

---

## Reproducer Code

`repro.slang` (self-contained):

```slang
struct GridCfg { uint dim; };
struct VOut { float4 pos : SV_Position; float4 color : COLOR; };

[shader("vertex")]
VOut vsMain(uint vid : SV_VertexID, uint iid : SV_InstanceID, uniform GridCfg* cfg)
{
    VOut o;
    o.pos = float4(float(cfg.dim) * 0.001, 0, 0, 1);
    o.color = float4(1, 0, 0, 1);
    return o;
}
```

```
slangc repro.slang -target metal -stage vertex -entry vsMain
```

## Old Behavior

The `[[vertex]]` entry has no buffer argument; the uniform is read from a default-initialized local, producing garbage:

```cpp
[[vertex]] vsMain_Result_0 vsMain(uint vid_1 [[vertex_id]], uint iid_1 [[instance_id]])
{
    thread KernelContext_0 kernelContext_1; // never initialized
    VOut_0 _S6 = vsMain_0(&_S5, &_S4, &kernelContext_1);
    ...
```

## New Behavior

The entry-point `uniform GridCfg* cfg` is bound as a kernel buffer argument, as it is for the compute/fragment/scalar-vertex cases:

```cpp
[[vertex]] vsMain_Result_0 vsMain(uint vid_1 [[vertex_id]], uint iid_1 [[instance_id]], EntryPointParams_0 constant* entryPointParams_1 [[buffer(0)]])
{
    thread KernelContext_0 kernelContext_1;
    (&kernelContext_1)->entryPointParams_0 = entryPointParams_1; // initialized from buffer(0)
    VOut_0 _S3 = vsMain_0(&_S2, &_S1, &kernelContext_1);
    ...
```

For comparison, `slangc repro.slang -target spirv-asm -stage vertex -entry vsMain -fvk-use-entrypoint-name` correctly emits `%entryPointParams = OpVariable ... PushConstant`.
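The stale back-reference described in the proposed solution can be illustrated with a small standalone C++ model. This is only a sketch under stated assumptions: `Func`, `GlobalUniform`, `uniformsBoundTo`, and `retarget` are hypothetical stand-ins invented for illustration, not Slang's IR classes or passes. The model keeps just the one rule the commit message attributes to `introduceExplicitGlobalContext`, namely that a uniform is bound only when its recorded origin is the live entry point.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR function (not Slang's IRFunc).
struct Func
{
    std::string name;
};

// Hypothetical stand-in for a hoisted entry-point uniform.
// `originatingEntryPoint` plays the role of operand 0 of the
// IREntryPointParamDecoration described in the commit message.
struct GlobalUniform
{
    std::string name;
    const Func* originatingEntryPoint;
};

// Models the matching rule: a uniform is bound to `entryPoint` only when
// its recorded origin is exactly that function.
std::vector<std::string> uniformsBoundTo(
    const Func* entryPoint,
    const std::vector<GlobalUniform>& uniforms)
{
    std::vector<std::string> bound;
    for (const auto& u : uniforms)
    {
        if (u.originatingEntryPoint != entryPoint)
            continue; // stale or foreign back-reference: skipped without any error
        bound.push_back(u.name);
    }
    return bound;
}

// Models the repair: every uniform that named `oldFunc` now names `newFunc`.
void retarget(std::vector<GlobalUniform>& uniforms, const Func* oldFunc, const Func* newFunc)
{
    for (auto& u : uniforms)
    {
        if (u.originatingEntryPoint == oldFunc)
            u.originatingEntryPoint = newFunc;
    }
}
```

In this model, a uniform tagged with the original function yields an empty binding list for the wrapper until `retarget` runs, which mirrors the "clean compile, missing `[[buffer]]` argument" symptom without any diagnostic being possible at the matching step.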
---

---------

Co-authored-by: Sami Kiminki (NVIDIA) <235843927+skiminki-nv@users.noreply.github.com>
1 parent 7448987 commit 35d5189

2 files changed

Lines changed: 88 additions & 0 deletions

source/slang/slang-ir-legalize-varying-params.cpp

Lines changed: 25 additions & 0 deletions
@@ -4773,6 +4773,27 @@ class LegalizeWGSLEntryPointContext : public LegalizeShaderEntryPointContext
     const UnownedStringSlice userSemanticName = toSlice("user_semantic");
 };
 
+// Re-point at `newFunc` every `IREntryPointParamDecoration` that currently names `oldFunc` as
+// its originating entry point. When entry-point `uniform` parameters are hoisted to global scope
+// (moveEntryPointUniformParamsToGlobalScope), each resulting global param is tagged with an
+// IREntryPointParamDecoration recording the entry-point function it came from. If a later pass
+// replaces the entry point with a wrapper (as lowerOutParameters does below), those tags still
+// reference the old function; introduceExplicitGlobalContext binds a global uniform to an entry
+// point only when this decoration names that entry point, so without re-pointing the uniform is
+// silently dropped — on Metal a struct-returning vertex shader's `uniform T*` gets no [[buffer]]
+// argument and reads uninitialized memory.
+static void retargetEntryPointParamDecorations(IRFunc* oldFunc, IRFunc* newFunc)
+{
+    List<IREntryPointParamDecoration*> decorationsToRetarget;
+    for (auto use = oldFunc->firstUse; use; use = use->nextUse)
+    {
+        if (auto decor = as<IREntryPointParamDecoration>(use->getUser()))
+            decorationsToRetarget.add(decor);
+    }
+    for (auto decor : decorationsToRetarget)
+        decor->setOperand(0, newFunc);
+}
+
 void legalizeVertexShaderOutputParamsForMetal(DiagnosticSink* sink, EntryPointInfo& entryPoint)
 {
     const auto oldFunc = entryPoint.entryPointFunc;
@@ -4793,6 +4814,10 @@ void legalizeVertexShaderOutputParamsForMetal(DiagnosticSink* sink, EntryPointIn
     if (oldFunc == entryPoint.entryPointFunc)
         return;
 
+    // The wrapper is now the entry point, so global uniform params that recorded `oldFunc` as
+    // their originating entry point must follow it (see retargetEntryPointParamDecorations).
+    retargetEntryPointParamDecorations(oldFunc, entryPoint.entryPointFunc);
+
     // Since this will no longer be the entry point function, remove those decorations
     List<IRDecoration*> ds;
     for (auto decor : oldFunc->getDecorations())
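`retargetEntryPointParamDecorations` gathers the matching decorations into a list before calling `setOperand`, rather than rewriting them during the walk. A plausible reason, sketched here with a toy intrusive use list, is that re-pointing an operand relinks its use edge and would derail an in-progress traversal of the old value's list. The `Value` and `Use` types and the `set` method below are hypothetical simplifications written for this example; they are assumed to resemble, but are not, Slang's actual IR types.

```cpp
#include <cassert>
#include <vector>

struct Use;

// Toy value that keeps an intrusive singly linked list of its uses.
struct Value
{
    Use* firstUse = nullptr;
};

// Toy operand slot (a use edge). The flag marks the uses we want to move.
struct Use
{
    Value* usedValue = nullptr;
    Use* nextUse = nullptr;
    bool isEntryPointParamDecoration = false;

    // Point this use at `v`, keeping both use lists consistent.
    void set(Value* v)
    {
        if (usedValue)
        {
            // Unlink from the old value's list.
            Use** link = &usedValue->firstUse;
            while (*link != this)
                link = &(*link)->nextUse;
            *link = nextUse;
        }
        usedValue = v;
        nextUse = nullptr;
        if (v)
        {
            // Push onto the front of the new value's list.
            nextUse = v->firstUse;
            v->firstUse = this;
        }
    }
};

// Collect first, mutate second: `set` rewrites `nextUse`, so changing a use
// while walking the list it belongs to would send the walk into the new list.
int retargetDecorationUses(Value* oldValue, Value* newValue)
{
    std::vector<Use*> toRetarget;
    for (Use* use = oldValue->firstUse; use; use = use->nextUse)
    {
        if (use->isEntryPointParamDecoration)
            toRetarget.push_back(use);
    }
    for (Use* use : toRetarget)
        use->set(newValue);
    return static_cast<int>(toRetarget.size());
}

// Inspection helper: number of uses currently linked to `v`.
int countUses(const Value* v)
{
    int n = 0;
    for (const Use* use = v->firstUse; use; use = use->nextUse)
        ++n;
    return n;
}
```

Uses that are not flagged, such as the wrapper's call to the original function, stay on the old value's list, which matches the commit message's point that only the decorations are re-pointed.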
tests/metal/entry-point-uniform-vertex-struct-output.slang

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+// Regression: an entry-point `uniform T*` parameter on a vertex shader must still be bound as a
+// buffer argument on the Metal target when the vertex entry's output is lowered into a return
+// struct by legalizeVertexShaderOutputParamsForMetal -> lowerOutParameters. That pass replaces
+// the entry point with a wrapper function; the moved-to-global uniform's
+// IREntryPointParamDecoration used to keep pointing at the original function, so
+// introduceExplicitGlobalContext failed to thread it into the wrapper and the [[buffer]] argument
+// was silently dropped (it read uninitialized memory). The wrapper is produced for BOTH triggers
+// of that pass -- a struct return AND `out` parameters -- so both are covered here.
+//
+// Each check pins not just that a buffer argument exists, but that the wrapper initializes its
+// KernelContext from that argument (`kernelContext...= entryPointParams`). That assignment is
+// exactly what the buggy compiler omitted, so the check fails against the unfixed compiler rather
+// than merely matching an incidental `buffer(` elsewhere. The concrete buffer index is Slang's own
+// default layout (entry-point params take the first buffer slot), so the signature check matches
+// `buffer(` without pinning a slot number.
+
+//TEST:SIMPLE(filecheck=METAL): -target metal -stage vertex -entry vsMain
+//TEST:SIMPLE(filecheck=METALOUT): -target metal -stage vertex -entry vsOut
+//TEST:SIMPLE(filecheck=METALCOMP): -target metal -stage compute -entry csMain
+//TEST:SIMPLE(filecheck=SPIRV): -target spirv-asm -stage vertex -entry vsMain -fvk-use-entrypoint-name
+
+struct GridCfg { uint dim; };
+struct VOut { float4 pos : SV_Position; float4 color : COLOR; };
+
+// Struct-return vertex: uniform must be a buffer argument and be threaded into the context.
+//METAL: {{\[\[}}vertex{{\]\]}}
+//METAL-SAME: buffer(
+//METAL: kernelContext{{.*}}= entryPointParams
+[shader("vertex")]
+VOut vsMain(uint vid : SV_VertexID, uint iid : SV_InstanceID, uniform GridCfg* cfg)
+{
+    VOut o;
+    o.pos = float4(float(cfg.dim) * 0.001 + float(iid) * 0.0 + float(vid) * 0.0, 0, 0, 1);
+    o.color = float4(1, 0, 0, 1);
+    return o;
+}
+
+// `out`-parameter vertex: same lowerOutParameters wrapper swap, so the same regression applies.
+//METALOUT: {{\[\[}}vertex{{\]\]}}
+//METALOUT-SAME: buffer(
+//METALOUT: kernelContext{{.*}}= entryPointParams
+[shader("vertex")]
+void vsOut(uint vid : SV_VertexID, uniform GridCfg* cfg, out float4 pos : SV_Position,
+    out float4 col : COLOR)
+{
+    pos = float4(float(cfg.dim) * 0.001 + float(vid) * 0.0, 0, 0, 1);
+    col = float4(1, 0, 0, 1);
+}
+
+// Compute never goes through the vertex output-struct wrapper; guards the shared lowering.
+//METALCOMP: {{\[\[}}kernel{{\]\]}}
+//METALCOMP-SAME: buffer(
+[shader("compute")]
+[numthreads(1, 1, 1)]
+void csMain(uint3 tid : SV_DispatchThreadID, uniform GridCfg* cfg)
+{
+    if (tid.x < cfg.dim)
+        return;
+}
+
+// On Vulkan the entry-point uniform lowers to a push constant, not a descriptor in set 0.
+//SPIRV: OpEntryPoint Vertex
+//SPIRV: PushConstant
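To make concrete what the three `METAL` checks demand, here is a rough C++ approximation of that check sequence, applied to the old and new Metal output quoted in the commit message. It is a simplified sketch and not how FileCheck is implemented: `passesMetalChecks`, `kOldOutput`, and `kNewOutput` are hypothetical names, and the sample strings assume the entry-point signature is emitted on a single line with the elided remainder of the function left out.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Old output from the commit message (assumed single-line signature).
const std::string kOldOutput =
    "[[vertex]] vsMain_Result_0 vsMain(uint vid_1 [[vertex_id]], uint iid_1 [[instance_id]])\n"
    "{\n"
    "    thread KernelContext_0 kernelContext_1;\n"
    "    VOut_0 _S6 = vsMain_0(&_S5, &_S4, &kernelContext_1);\n";

// New output from the commit message (assumed single-line signature).
const std::string kNewOutput =
    "[[vertex]] vsMain_Result_0 vsMain(uint vid_1 [[vertex_id]], uint iid_1 [[instance_id]], "
    "EntryPointParams_0 constant* entryPointParams_1 [[buffer(0)]])\n"
    "{\n"
    "    thread KernelContext_0 kernelContext_1;\n"
    "    (&kernelContext_1)->entryPointParams_0 = entryPointParams_1;\n"
    "    VOut_0 _S3 = vsMain_0(&_S2, &_S1, &kernelContext_1);\n";

// Approximates: METAL: [[vertex]] / METAL-SAME: buffer( /
// METAL: kernelContext{{.*}}= entryPointParams
bool passesMetalChecks(const std::string& emitted)
{
    // First check: the entry-point attribute must appear somewhere.
    const std::string entryAttr = "[[vertex]]";
    const std::size_t attrPos = emitted.find(entryAttr);
    if (attrPos == std::string::npos)
        return false;

    // Same-line check: `buffer(` must follow the attribute before the line ends.
    std::size_t lineEnd = emitted.find('\n', attrPos);
    if (lineEnd == std::string::npos)
        lineEnd = emitted.size();
    const std::size_t bufPos = emitted.find("buffer(", attrPos + entryAttr.size());
    if (bufPos == std::string::npos || bufPos >= lineEnd)
        return false;

    // Final check: the context assignment must appear after the previous match.
    // `.` does not match a newline here, so the assignment must sit on one line.
    const std::regex assignment("kernelContext.*= entryPointParams");
    return std::regex_search(
        emitted.begin() + static_cast<std::ptrdiff_t>(bufPos),
        emitted.end(),
        assignment);
}
```

Under these assumptions the old output is rejected at the same-line `buffer(` step, and an output that had a buffer argument but no context assignment would be rejected at the final step, which is the distinction the test's header comment draws.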
