Move @SpecConst rate propagation from IR cloning to IR instruction creation (#10694)

aidanfnv · web-flow · commit f09c7e8f48ae · 2026-04-10T22:46:03.000Z
Fixes #10373.

Spec-const rate propagation was performed in `cloneInstAndOperands`, which applies the `@SpecConst` rate only when instructions are cloned (e.g. during generic specialization), not when they are first created. As a result, `static const` expressions over specialization constants (like `FOO2 == 0 ? 0 : FOO1 / FOO2`) that are created directly did not receive the `@SpecConst` rate and failed to be emitted as `OpSpecConstantOp` in SPIR-V. PR #10391 attempted to fix this but caused regressions and was partially reverted in #10663.

This change moves `shouldHaveSpecConstRate` and `ensureSpecConstRate` (formerly `maybeAddSpecConstRate`) out of `slang-ir-clone.cpp` into `slang-ir-util.cpp` as general utilities, and calls them from `IRBuilder::_createInst` so that the spec-const rate is applied at instruction creation time regardless of how the instruction is produced. Instructions that receive the spec-const rate are routed through `_findOrEmitHoistableInst` for deduplication, matching the prior behavior for hoistable ops.

`ensureSpecConstRate` now replaces an existing `ConstExpr` rate with `SpecConst` when `shouldHaveSpecConstRate` determines that an instruction should be a specialization constant. This handles `static const` expressions whose operands are specialization constants: such values are only known at pipeline creation time, so `ConstExpr` (compile-time) would be incorrect.

`isInstHoistable` is simplified to only check the `kIROpFlag_Hoistable` flag, and the now-unused `isSpecConstOpHoistable` is removed.

`isLegalGlobalInstForTarget` in the SPIR-V legalization pass treats any instruction with a spec-const rate type as a legal global instruction, preventing the global inst inlining pass from moving them into function bodies. This is required because `OpSpecConstantOp` results must appear at module scope per the SPIR-V spec.
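
To make the operand rule behind `shouldHaveSpecConstRate` and the rate replacement in `ensureSpecConstRate` concrete, here is a minimal sketch on a toy model. `ToyRate`, `ToyValue`, `toyShouldHaveSpecConstRate`, and `toyEnsureSpecConstRate` are hypothetical names invented for illustration and are not part of Slang; the sketch also leaves out the `canOperationBeSpecConst` opcode check that the real function performs first.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the compiler's IR; these types do not exist in Slang.
enum class ToyRate { None, ConstExpr, SpecConst };

struct ToyValue
{
    ToyRate rate;   // rate qualifier carried by the value's type
    bool isLiteral; // true for plain constants such as `0`
};

// Operand rule only: at least one operand must be spec-const rate, and every
// other operand must be a plain constant. The opcode check done by
// canOperationBeSpecConst in the real code is intentionally not modeled.
bool toyShouldHaveSpecConstRate(const std::vector<ToyValue>& operands)
{
    if (operands.empty())
        return false;

    bool hasSpecConstOperand = false;
    for (const ToyValue& operand : operands)
    {
        if (operand.rate == ToyRate::SpecConst)
            hasSpecConstOperand = true;
        else if (!operand.isLiteral)
            return false; // a runtime value disqualifies the instruction
    }
    return hasSpecConstOperand;
}

// Any existing rate (including ConstExpr) is replaced rather than wrapped.
ToyRate toyEnsureSpecConstRate(ToyRate /*existingRate*/)
{
    return ToyRate::SpecConst;
}

// Small demonstration of the cases described in the commit message.
void toyRateDemo()
{
    const ToyValue specConst{ToyRate::SpecConst, false};
    const ToyValue literal{ToyRate::None, true};
    const ToyValue runtime{ToyRate::None, false};

    assert(toyShouldHaveSpecConstRate({specConst, literal}));  // e.g. FOO2 == 0
    assert(!toyShouldHaveSpecConstRate({specConst, runtime})); // runtime operand
    assert(!toyShouldHaveSpecConstRate({literal, literal}));   // no spec const
    assert(toyEnsureSpecConstRate(ToyRate::ConstExpr) == ToyRate::SpecConst);
}
```

In this model, an expression such as `FOO1 / FOO2` qualifies because both operands are spec-const rate, while an expression that mixes in a function parameter does not.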

A targeted fix in `_cloneInstDecorationsAndChildren` prevents `NameHintDecoration` accumulation on deduplicated hoistable spec-const instructions across repeated generic-specialization passes, without blocking children cloning (which is needed for collection-like hoistable insts such as `FuncSet`).

Re-enables the test disabled in #10663 and adds a regression test that more closely resembles the reproducer code from #10605 (the regression issue fixed by #10663) to accompany the minimal regression test.

Also closes #10665. The changes from that PR are required for the changes described above and have been folded into this change. The description of that PR follows:

When a bool specialization constant is cast to an integer type (or vice versa) inside a global constant initializer, the SPIR-V emitter needs to lower the cast as an `OpSpecConstantOp`. The existing integer cast handling in `emitSpecializationConstantOp` only covered integer-to-integer conversions via `UConvert`/`SConvert`, but those opcodes cannot accept `OpTypeBool` as a source or destination type, so casts involving bools were unhandled.

This adds two cases before the integer-to-integer path. For bool-to-integer, the emitter now produces `OpSpecConstantOp ... Select %bool_operand %one %zero`, picking between literal 1 and 0 since `UConvert`/`SConvert` cannot accept a bool operand. For integer-to-bool, it produces `OpSpecConstantOp ... INotEqual %int_operand %zero`, comparing the operand against zero since `UConvert`/`SConvert` cannot produce a bool result.

Also adds a regression test with both a SPIR-V assembly filecheck verifying the expected `OpSpecConstantOp` opcodes and a Vulkan compute test verifying runtime correctness.
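
The choice between the two new cast lowerings and the existing integer path can be sketched as follows. This is an illustrative model only: `ToyType`, `ToyCastLowering`, `chooseToyCastLowering`, `evalBoolToInt`, and `evalIntToBool` are hypothetical names that do not appear in the emitter, and the two `eval` helpers merely show the value each lowering is expected to yield once the specialization constant is known.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified type categories; the real emitter inspects IR types.
enum class ToyType { Bool, Int };

// The lowering strategies discussed above.
enum class ToyCastLowering { Select, INotEqual, IntConvert };

// Picks a lowering for a cast in spec-constant context.
ToyCastLowering chooseToyCastLowering(ToyType src, ToyType dst)
{
    // bool -> int: the conversion opcodes cannot take a bool operand.
    if (src == ToyType::Bool && dst == ToyType::Int)
        return ToyCastLowering::Select;

    // int -> bool: the conversion opcodes cannot produce a bool result.
    if (src == ToyType::Int && dst == ToyType::Bool)
        return ToyCastLowering::INotEqual;

    // Stand-in for the pre-existing integer-to-integer path
    // (bool-to-bool is not modeled in this sketch).
    return ToyCastLowering::IntConvert;
}

// Value produced by selecting between literal 1 and literal 0.
std::int32_t evalBoolToInt(bool condition)
{
    return condition ? 1 : 0;
}

// Value produced by comparing the operand against zero.
bool evalIntToBool(std::int32_t value)
{
    return value != 0;
}

// Small demonstration of the two new cases.
void toyCastDemo()
{
    assert(chooseToyCastLowering(ToyType::Bool, ToyType::Int) == ToyCastLowering::Select);
    assert(chooseToyCastLowering(ToyType::Int, ToyType::Bool) == ToyCastLowering::INotEqual);
    assert(evalBoolToInt(true) == 1);
    assert(evalIntToBool(0) == false);
}
```

Checking the bool cases before the integer-to-integer case mirrors the ordering described above, so a cast involving a bool never reaches the conversion-opcode path.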
diff --git a/source/slang/slang-emit-spirv.cpp b/source/slang/slang-emit-spirv.cpp
@@ -3549,15 +3549,55 @@ struct SPIRVEmitContext : public SourceEmitterBase, public SPIRVEmitSharedContex
         }
 
         // Handle integer casts in spec constant context.
-        // Inside OpSpecConstantOp, UConvert/SConvert require different bit widths
-        // and matching signedness on the result type, and OpBitcast is not in the
-        // set of allowed opcodes. See emitSpecConstantSignReinterpret for the
-        // workaround used for signedness changes.
         auto irOp = inst->getOp();
         if (irOp == kIROp_IntCast || irOp == kIROp_ConstexprIntCast)
         {
             auto srcType = inst->getOperand(0)->getDataType();
             auto dstType = inst->getDataType();
+
+            // bool to integer: UConvert/SConvert cannot accept a bool operand,
+            // so we use OpSelect to pick between literal 1 and 0 instead.
+            if (srcType->getOp() == kIROp_BoolType && isIntegralType(dstType))
+            {
+                auto operand = emitSpecializationConstantOp(inst->getOperand(0));
+                IRBuilder builder(m_irModule);
+                auto one = emitLit(builder.getIntValue(dstType, 1));
+                auto zero = emitLit(builder.getIntValue(dstType, 0));
+                return emitInst(
+                    getSection(SpvLogicalSectionID::ConstantsAndTypes),
+                    inst,
+                    SpvOpSpecConstantOp,
+                    inst->getFullType(),
+                    kResultID,
+                    SpvOpSelect,
+                    operand,
+                    one,
+                    zero);
+            }
+
+            // integer to bool: UConvert/SConvert cannot produce a bool result,
+            // so we compare the operand against zero with OpINotEqual instead.
+            if (isIntegralType(srcType) && dstType->getOp() == kIROp_BoolType)
+            {
+                auto operand = emitSpecializationConstantOp(inst->getOperand(0));
+                IRBuilder builder(m_irModule);
+                auto zero = emitLit(builder.getIntValue(srcType, 0));
+                return emitInst(
+                    getSection(SpvLogicalSectionID::ConstantsAndTypes),
+                    inst,
+                    SpvOpSpecConstantOp,
+                    inst->getFullType(),
+                    kResultID,
+                    SpvOpINotEqual,
+                    operand,
+                    zero);
+            }
+
+            // Inside OpSpecConstantOp, UConvert/SConvert require different bit
+            // widths and matching signedness on the result type, and OpBitcast
+            // is not in the set of allowed opcodes. See
+            // emitSpecConstantSignReinterpret for the workaround used for
+            // signedness changes.
             if (isIntegralType(srcType) && isIntegralType(dstType))
             {
                 auto srcInfo = getIntTypeInfo(m_targetRequest, srcType);
diff --git a/source/slang/slang-ir-clone.cpp b/source/slang/slang-ir-clone.cpp
@@ -48,43 +48,6 @@ IRInst* findCloneForOperand(IRCloneEnv* env, IRInst* oldOperand)
     return oldOperand;
 }
 
-static bool shouldHaveSpecConstRate(IRInst* oldInst, IRInst* const* newOperands, UInt operandCount)
-{
-    if (operandCount == 0)
-        return false;
-
-    if (!canOperationBeSpecConst(
-            oldInst->getOp(),
-            oldInst->getDataType(),
-            nullptr,
-            oldInst->getOperands()))
-        return false;
-
-    // An instruction whose result carries a spec-const rate will be hoisted
-    // to global scope and, for SPIR-V, emitted as OpSpecConstantOp. That is
-    // only valid when every operand is itself a specialization constant or a
-    // plain constant. Mixing in a runtime value (e.g. a function parameter)
-    // would produce invalid SPIR-V.
-    //
-    bool hasSpecConstOperand = false;
-    for (UInt ii = 0; ii < operandCount; ++ii)
-    {
-        if (isSpecConstRateType(newOperands[ii]->getFullType()))
-            hasSpecConstOperand = true;
-        else if (!as<IRConstant>(newOperands[ii]))
-            return false;
-    }
-    return hasSpecConstOperand;
-}
-
-static IRType* maybeAddSpecConstRate(IRBuilder* builder, IRType* type)
-{
-    // Do not add a spec-const rate if the type already carries a rate.
-    if (as<IRRateQualifiedType>(type))
-        return type;
-    return builder->getRateQualifiedType(builder->getSpecConstRate(), type);
-}
-
 IRInst* cloneInstAndOperands(IRCloneEnv* env, IRBuilder* builder, IRInst* oldInst)
 {
     SLANG_ASSERT(env);
@@ -136,8 +99,12 @@ IRInst* cloneInstAndOperands(IRCloneEnv* env, IRBuilder* builder, IRInst* oldIns
         newOperands[ii] = newOperand;
     }
 
-    if (shouldHaveSpecConstRate(oldInst, newOperands.getArrayView().getBuffer(), operandCount))
-        newType = maybeAddSpecConstRate(builder, newType);
+    if (shouldHaveSpecConstRate(
+            oldInst->getOp(),
+            newType,
+            operandCount,
+            newOperands.getArrayView().getBuffer()))
+        newType = ensureSpecConstRate(builder, newType);
 
     // Finally we create the inst with the updated operands.
     auto newInst = builder->emitIntrinsicInst(
@@ -283,6 +250,12 @@ static void _cloneInstDecorationsAndChildren(
         if (lookUp(env, oldChild))
             continue;
 
+        // When dedup returns a pre-existing instruction (e.g. a hoistable inst),
+        // cloning NameHintDecorations onto it again would cause unbounded
+        // accumulation across repeated generic-specialization passes.
+        if (as<IRNameHintDecoration>(oldChild) && newInst->findDecoration<IRNameHintDecoration>())
+            continue;
+
         // Now we can perform the first phase of cloning
         // on the child, and register it in our map from
         // old to new values.
diff --git a/source/slang/slang-ir-spirv-legalize.cpp b/source/slang/slang-ir-spirv-legalize.cpp
@@ -1780,6 +1780,13 @@ struct SPIRVLegalizationContext : public SourceEmitterBase
     {
         bool isLegalGlobalInstForTarget(IRInst* inst) override
         {
+            // Spec-const-rate instructions must stay at their current scope
+            // (module level or inside a generic) because SPIR-V requires
+            // OpSpecConstantOp results to appear outside function bodies.
+            // Only ops validated by canOperationBeSpecConst (via
+            // shouldHaveSpecConstRate in _createInst) acquire this rate.
+            if (isSpecConstRateType(inst->getFullType()))
+                return true;
             switch (inst->getOp())
             {
             case kIROp_MakeStruct:
diff --git a/source/slang/slang-ir-util.cpp b/source/slang/slang-ir-util.cpp
@@ -2855,6 +2855,27 @@ IRType* maybeAddRateType(IRBuilder* builder, IRType* rateQulifiedType, IRType* o
     return oldType;
 }
 
+// Ensures `type` carries a SpecConst rate.
+// If the type already has a different rate (e.g. ConstExpr from a `static const`
+// expression whose operands are specialization constants), the existing rate is
+// replaced, as a value that depends on a spec-const is only known at pipeline
+// creation time, so `ConstExpr` (compile-time) would be incorrect.
+//
+IRType* ensureSpecConstRate(IRBuilder* builder, IRType* type)
+{
+    if (isSpecConstRateType(type))
+        return type;
+
+    // Strip any existing rate (e.g. ConstExpr) to avoid double-wrapping,
+    // since getRateQualifiedType does not unwrap for us.
+    if (auto rateQualified = as<IRRateQualifiedType>(type))
+        return builder->getRateQualifiedType(
+            builder->getSpecConstRate(),
+            rateQualified->getValueType());
+
+    return builder->getRateQualifiedType(builder->getSpecConstRate(), type);
+}
+
 bool canOperationBeSpecConst(IROp op, IRType* resultType, IRInst* const* fixedArgs, IRUse* operands)
 {
     // Returns true for ops that can be declared as an operation under `OpSpecConstantOp`.
@@ -2915,18 +2936,44 @@ bool canOperationBeSpecConst(IROp op, IRType* resultType, IRInst* const* fixedAr
     }
 }
 
-bool isSpecConstOpHoistable(IROp op, IRType* type, IRInst* const* fixedArgs)
+bool shouldHaveSpecConstRate(
+    IROp op,
+    IRType* resultType,
+    UInt operandCount,
+    IRInst* const* operands)
 {
-    auto rateType = as<IRRateQualifiedType>(type);
-    return rateType && as<IRSpecConstRate>(rateType->getRate()) &&
-           canOperationBeSpecConst(op, rateType->getValueType(), fixedArgs, nullptr);
-}
+    if (operandCount == 0)
+        return false;
+
+    // Unwrap any rate qualification so canOperationBeSpecConst sees the bare
+    // value type. isFloatingType checks as<IRBasicType> which doesn't match
+    // rate-qualified types like @ConstExpr float, so without unwrapping we
+    // would incorrectly allow float arithmetic as `OpSpecConstantOp`.
+    IRType* valueType = resultType;
+    if (auto rateQualifiedType = as<IRRateQualifiedType>(resultType))
+        valueType = rateQualifiedType->getValueType();
+
+    if (!canOperationBeSpecConst(op, valueType, operands, nullptr))
+        return false;
 
+    // An instruction whose result carries a spec-const rate is hoisted and
+    // emitted as OpSpecConstantOp for SPIR-V. That is only valid when
+    // every operand is itself a specialization constant or a plain
+    // constant. Mixing in a runtime value would produce invalid SPIR-V.
+    bool hasSpecConstOperand = false;
+    for (UInt ii = 0; ii < operandCount; ++ii)
+    {
+        if (isSpecConstRateType(operands[ii]->getFullType()))
+            hasSpecConstOperand = true;
+        else if (!as<IRConstant>(operands[ii]))
+            return false;
+    }
+    return hasSpecConstOperand;
+}
 
-bool isInstHoistable(IROp op, IRType* type, IRInst* const* fixedArgs)
+bool isInstHoistable(IROp op)
 {
-    return (getIROpInfo(op).flags & kIROpFlag_Hoistable) ||
-           isSpecConstOpHoistable(op, type, fixedArgs);
+    return (getIROpInfo(op).flags & kIROpFlag_Hoistable);
 }
 
 IRType* getUnsignedTypeFromSignedType(IRBuilder* builder, IRType* type)
diff --git a/source/slang/slang-ir-util.h b/source/slang/slang-ir-util.h
@@ -453,12 +453,18 @@ bool isFirstBlock(IRInst* inst);
 bool isSpecConstRateType(IRType* type);
 void hoistInstAndOperandsToGlobal(IRBuilder* builder, IRInst* inst);
 IRType* maybeAddRateType(IRBuilder* builder, IRType* rateQulifiedType, IRType* oldType);
+IRType* ensureSpecConstRate(IRBuilder* builder, IRType* type);
 bool canOperationBeSpecConst(
     IROp op,
     IRType* resultType,
     IRInst* const* fixedArgs,
     IRUse* operands);
-bool isInstHoistable(IROp op, IRType* type, IRInst* const* fixedArgs);
+bool shouldHaveSpecConstRate(
+    IROp op,
+    IRType* resultType,
+    UInt operandCount,
+    IRInst* const* operands);
+bool isInstHoistable(IROp op);
 
 // most of <algorithm> doesn't work on out non-const iterators, so define this
 // version
diff --git a/source/slang/slang-ir.cpp b/source/slang/slang-ir.cpp
@@ -1861,7 +1861,19 @@ IRInst* IRBuilder::_createInst(
     m_dedupContext->getInstReplacementMap().tryGetValue(type, instReplacement);
     type = (IRType*)instReplacement;
 
-    if (isInstHoistable(op, type, fixedArgs))
+    if (type && shouldHaveSpecConstRate(op, type, fixedArgCount, fixedArgs))
+    {
+        type = ensureSpecConstRate(this, type);
+        return _findOrEmitHoistableInst(
+            type,
+            op,
+            fixedArgCount,
+            fixedArgs,
+            varArgListCount,
+            listArgCounts,
+            listArgs);
+    }
+    else if (isInstHoistable(op))
     {
         return _findOrEmitHoistableInst(
             type,
diff --git a/tests/bugs/neural-autodiff-constexpr-crash-full.slang b/tests/bugs/neural-autodiff-constexpr-crash-full.slang
@@ -0,0 +1,94 @@
+//TEST:SIMPLE(filecheck=CHECK): -target cuda -experimental-feature
+
+// Full regression test for https://github.com/shader-slang/slang/issues/10605
+
+// 4-layer MLP with autodiff backward pass targeting CUDA. Exercises the
+// interaction between `static const` globals computed from macro expansions,
+// generic struct members with constexpr arithmetic (division, ternary),
+// and the autodiff pass. The original crash was a null-operand segfault in
+// AutoDiffPass::processReferencedFunctions.
+
+// CHECK-DAG: __global__ void __kernel__backward
+// CHECK-DAG: s_apply_runWaveMLP
+
+import slang.neural;
+
+#ifndef TCNN_MLP_IN_DIM
+#define TCNN_MLP_IN_DIM 2
+#endif
+
+#ifndef WT_WAVE_WARPS
+#define WT_WAVE_WARPS 4
+#endif
+
+static const int IN_DIM = TCNN_MLP_IN_DIM;
+static const int HIDDEN_DIM = 128;
+static const int OUT_DIM = 3;
+static const int SubgroupSize = 32;
+static const int WARPS_PER_BLOCK = WT_WAVE_WARPS;
+static const float LEAKY_ALPHA = 0.01f;
+
+typealias Storage = TorchTensorViewAddress<half>;
+typealias ShMemSize = SharedMemorySize<half, TargetEnum.CUDA, ExecutionMode.Training, SubgroupSize, WARPS_PER_BLOCK>;
+typealias ShMemSizeLayer = ShMemSize.OfLayer4<IN_DIM, HIDDEN_DIM, HIDDEN_DIM, HIDDEN_DIM, OUT_DIM>;
+typealias InVec = WaveTangledVector<half, ShMemSizeLayer, IN_DIM, SubgroupSize>;
+typealias HidVec = WaveTangledVector<half, ShMemSizeLayer, HIDDEN_DIM, SubgroupSize>;
+typealias OutVec = WaveTangledVector<half, ShMemSizeLayer, OUT_DIM, SubgroupSize>;
+typealias Layer1 = FFLayer<half, InVec, HidVec, LinearLayout, LeakyReLU<half>, true>;
+typealias Layer2 = FFLayer<half, HidVec, HidVec, LinearLayout, LeakyReLU<half>, true>;
+typealias Layer3 = FFLayer<half, HidVec, HidVec, LinearLayout, LeakyReLU<half>, true>;
+typealias Layer4 = FFLayer<half, HidVec, OutVec, LinearLayout, Sigmoid<half>, true>;
+
+static const int TotalParamCount =
+    Layer1.ParameterCount + Layer2.ParameterCount + Layer3.ParameterCount + Layer4.ParameterCount;
+
+[Differentiable]
+OutVec runWaveMLP(InVec input, Storage params)
+{
+    LeakyReLU<half> leaky = LeakyReLU<half>(half(LEAKY_ALPHA));
+    Layer1 layer1 = Layer1(leaky);
+    Layer2 layer2 = Layer2(leaky);
+    Layer3 layer3 = Layer3(leaky);
+    Layer4 layer4 = Layer4();
+
+    int offset = 0;
+    HidVec h1 = layer1.eval<Storage>(input, params.getOffset(offset)); offset += Layer1.ParameterCount;
+    HidVec h2 = layer2.eval<Storage>(h1,    params.getOffset(offset)); offset += Layer2.ParameterCount;
+    HidVec h3 = layer3.eval<Storage>(h2,    params.getOffset(offset)); offset += Layer3.ParameterCount;
+    return     layer4.eval<Storage>(h3,    params.getOffset(offset));
+}
+
+[AutoPyBindCUDA]
+[CUDAKernel]
+void backward(
+    DiffTensorView input,
+    TensorView<half> params,
+    TensorView<half> paramsGrad,
+    DiffTensorView output)
+{
+    uint idx = cudaBlockIdx().x * cudaBlockDim().x + cudaThreadIdx().x;
+    bool isActive = idx < input.size(0);
+
+    InVec x = InVec(half(0));
+    if (isActive)
+    {
+        for (int i = 0; i < IN_DIM; ++i) x[i] = half(input.primal[idx, i]);
+    }
+
+    OutVec dL_dOutput = OutVec(half(0));
+    if (isActive)
+    {
+        for (int i = 0; i < OUT_DIM; ++i) dL_dOutput[i] = half(output.diff.diff[idx, i]);
+    }
+
+    InVec dx = InVec(half(0));
+    var dpInput = diffPair(x, dx);
+    bwd_diff(runWaveMLP)(dpInput,
+        DifferentialPtrPair<Storage>(Storage(params), Storage(paramsGrad)),
+        dL_dOutput);
+
+    if (isActive)
+    {
+        for (int i = 0; i < IN_DIM; ++i) input.diff.diff[idx, i] = float(dpInput.d[i]);
+    }
+}
diff --git a/tests/spirv/spec-constant-bool-int-cast.slang b/tests/spirv/spec-constant-bool-int-cast.slang
diff --git a/tests/spirv/spec-constant-select-global-const.slang b/tests/spirv/spec-constant-select-global-const.slang
