[Reflection]: Track "used" for uniforms#11135
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds byte-granularity uniform parameter usage tracking to the Slang compiler. An IR analysis pass traverses instruction chains from global parameters through field and array accesses to compute which byte ranges are actually read. Usage information is integrated into compilation metadata, exposed via a new reflection API, and emitted to JSON as byte-range intervals per parameter, enabling tools to see which uniform fields are accessed without over-reporting shared register space usage. ChangesUniform parameter byte-granularity usage tracking and reflection
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0d39a2f2-e6a5-46fb-a961-d6b54634a7c8
📒 Files selected for processing (6)
source/compiler-core/slang-artifact-associated-impl.hsource/slang/slang-ir-metadata.cppsource/slang/slang-ir-uniform-usage.cppsource/slang/slang-ir-uniform-usage.hsource/slang/slang-reflection-json.cppsource/slang/slang-target-program.cpp
| static UInt _getUniformSize(IRTypeLayout* typeLayout) | ||
| { | ||
| if (!typeLayout) | ||
| return 0; | ||
| auto sizeAttr = typeLayout->findSizeAttr(LayoutResourceKind::Uniform); | ||
| if (!sizeAttr) | ||
| return 0; | ||
| auto size = sizeAttr->getSize(); | ||
| if (!size.isFinite()) | ||
| return 0; | ||
| return UInt(size.getFiniteValue().getValidValue()); | ||
| } |
There was a problem hiding this comment.
Potential runtime error: Missing validity check before getValidValue().
Line 19 calls getValidValue() without first checking that size.getFiniteValue().isValid(). If the finite value is invalid, this may throw an exception or trigger an assertion.
🛡️ Proposed fix to add validation
static UInt _getUniformSize(IRTypeLayout* typeLayout)
{
if (!typeLayout)
return 0;
auto sizeAttr = typeLayout->findSizeAttr(LayoutResourceKind::Uniform);
if (!sizeAttr)
return 0;
auto size = sizeAttr->getSize();
if (!size.isFinite())
return 0;
- return UInt(size.getFiniteValue().getValidValue());
+ auto finiteValue = size.getFiniteValue();
+ if (!finiteValue.isValid())
+ return 0;
+ return UInt(finiteValue.getValidValue());
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| static UInt _getUniformSize(IRTypeLayout* typeLayout) | |
| { | |
| if (!typeLayout) | |
| return 0; | |
| auto sizeAttr = typeLayout->findSizeAttr(LayoutResourceKind::Uniform); | |
| if (!sizeAttr) | |
| return 0; | |
| auto size = sizeAttr->getSize(); | |
| if (!size.isFinite()) | |
| return 0; | |
| return UInt(size.getFiniteValue().getValidValue()); | |
| } | |
| static UInt _getUniformSize(IRTypeLayout* typeLayout) | |
| { | |
| if (!typeLayout) | |
| return 0; | |
| auto sizeAttr = typeLayout->findSizeAttr(LayoutResourceKind::Uniform); | |
| if (!sizeAttr) | |
| return 0; | |
| auto size = sizeAttr->getSize(); | |
| if (!size.isFinite()) | |
| return 0; | |
| auto finiteValue = size.getFiniteValue(); | |
| if (!finiteValue.isValid()) | |
| return 0; | |
| return UInt(finiteValue.getValidValue()); | |
| } |
🧰 Tools
🪛 Cppcheck (2.20.0)
[error] 19-19: There is an unknown macro here somewhere. Configuration is required. If FIDDLE is a macro then please configure it.
(unknownMacro)
tangent-vector
left a comment
There was a problem hiding this comment.
This PR needs significant work before it is anywhere in the vicinity of ready to accept into the codebase. At the very minimum I would need to see:
- A change to the public Slang API to allow querying for an explicit list of byte-offset ranges for a parameter binding range
- A change to the JSON reflection data emission so that it is driven by the byte ranges and not by recursive type walking that could potentially visit every element of a large array.
- Comprehensive test cases showing that the system produces good conservative byte ranges across all Slang compilation targets, in a variety of scenarios that cover all the interesting corner cases.
- Significantly improved naming for the all of the functions/types/etc. being introduced in the relevant pass.
- Proper documentation comments on all types/functions that actually say what things do.
- A comprehensive approach to commenting in the pass implementation itself, showing a technically informed perspective on the problem to be solved, along with a clear description of the solution strategy, and comments throughout that explain corner cases and the steps the pass needs to take to be properly conservative in its usage estimates
- A clear analysis of the usage characteristics of the backing structure(s) for the pass and the reflection representation, to provide reassurance that the approach isn't wasteful of space
- A dedication to robust coding practices, where assumptions and invariants are clearly stated/documented and then asserted, rather than using slippery "defensive" coding practices that actually serve to mask bugs and make everybody's life harder in the long run
I don't think the agent who authored this first version of the PR is up to the task, so I am inclined to say we should close this PR and let the issue remain open until we have time for a human (and/or a stronger coding agent) to try and tackle it.
| case slang::Uniform: | ||
| return true; |
There was a problem hiding this comment.
AFAICT returning true universally for the Uniform case is not actually correct. The Uniform "parameter category" is used to represent everything that is measured in bytes, which means that it applies to the contents of all buffers, ray-tracing payloads, and various other cases.
The real problem is that this existing isUsageTracked() query isn't really suited for the question that ones really wants to ask: "is byte-granularity usage tracking enabled for the contents of constant buffers?" (or whatever kind of buffer one has in mind).
There was a problem hiding this comment.
Will revert this. Uniform queries now bypass the categorical gate and consult the metadata directly. If there are recorded ranges we answer, otherwise SLANG_E_NOT_AVAILABLE. The IR pass only records ranges for parameter group contents, so RT payloads and byte address buffer contents are out of scope.
| if (!layoutDecoration) | ||
| continue; | ||
| auto varLayout = as<IRVarLayout>(layoutDecoration->getLayout()); | ||
| if (!varLayout) | ||
| continue; |
There was a problem hiding this comment.
These continues seem like an attempt at "defensive" programming of the wrong kind. I presume that an invariant of the collectUniformUsage() operation should be that all of the entries in the generated dictionary have keys that are shader parameters such that (1) they have a layout decoration, and (2) the layout on that decoration is an IRVarLayout. If that invariant is violated, then the right thing to do would be to assert-fail here (at the very least in debug builds), rather than silently glossing over the issue with a continue.
This kind of idiom is something I notice AI coding agents really like to write, because they are often incentivized to "cheat" and write code that looks safe and passes tests in the short term even if it is a lurking liability in the long run. A line of code that noisily crashes when an invariant is violated might stop their PR from getting merged today (even if noisy crashes are usually the best way to find an underlying bug), but the subtle semantic errors that can arise when an invariant is violated and LLM-authored code silently glosses over it? That will be much harder to debug, but from the LLM's perspective it will be somebody else who has to pay that cost.
I'd like to see asserts here (probably even release-mode asserts) and if this change cannot pass tests with those asserts in place, we should be debugging and fixing whatever underlying issue is leading to the invariant violation.
| Dictionary<IRGlobalParam*, List<UniformUsageRange>> uniformUsage; | ||
| collectUniformUsage(irModule, uniformUsage); | ||
| for (auto& kv : uniformUsage) |
There was a problem hiding this comment.
Having looked at the code that follows, I have to ask: why is this function "returning" a dictionary? I mean that in two senses:
-
Why is it "returning" via an out parameter rather than just returning the value?
-
Why is the returned value a dictionary instead of something like
List<ByteGranularityParameterUsageInfo>, which is a meaningful type that holds theIRGlobalParam*for the parameter along with aList<ByteGranularityUsageRange>?Nothing in the code below seems to make use of the dictionary-ness of the value, so it seems like its just a case of implementation details leaking out of the subroutine, along with a (typical of LLMs) desire to avoiding actually defining named types for concepts, resulting in uglier and less readable code.
| @@ -0,0 +1,162 @@ | |||
| // slang-ir-uniform-usage.cpp | |||
There was a problem hiding this comment.
This file/pass is incorrectly named. While much of the code still uses the name Uniform to refer to the thing being measured here, we should all be on the same page that what is actually being discussed here is bytes and byte-granularity usage information.
I would advocate for slang-ir-byte-granularity-usage.*.
A verbose-but-accurate name is always preferrable over a terse-but-meaningless/-confusing one. We should be aspiring to name things what they are as much as possible, even when some of the existing legacy cruft in the codebase makes shifting to the correct terminology slower than somebody like me would like.
| namespace Slang | ||
| { | ||
|
|
||
| static UInt _getUniformSize(IRTypeLayout* typeLayout) |
There was a problem hiding this comment.
This should be _getSizeInBytes()
| { | ||
|
|
||
| struct UniformUsageRange | ||
| { |
There was a problem hiding this comment.
This type is uncommented and, as I discussed in a previous comment, also poorly named for what it does.
| void collectUniformUsage( | ||
| const IRModule* module, | ||
| Dictionary<IRGlobalParam*, List<UniformUsageRange>>& outUsage); |
There was a problem hiding this comment.
Proposed name: collectApproximateByteGranularityUsageInformationForParameterGroups.
The current name is misleadingly wrong, and I'd rather have an extremely verbose name that actually says what the function does (or rather, what it wants to do, because I'm not convinced this code actually handles the various cases right).
There was a problem hiding this comment.
You were right, not all cases were handled. Added quick sanity tests for swizzle, vector subscript, matrix access, and struct-copy-then-call all soundly conservative but coarser than they should be.
Now handled:
IRSwizzle(v.x/v.xy): narrows to per-component byte ranges- Vector
IRGetElement[Ptr]with constant index: same, via the base'sIRVectorTypefor element size
Still widens to whole field:
- Matrix
m[i][j]: column-major makes row access discontiguous in bytes - Aggregate copies into helpers: walker stops at call boundaries
- Anything else hitting the default branch
| } | ||
|
|
||
|
|
||
| static void emitUniformFieldUsageJSON( |
There was a problem hiding this comment.
Incorrectly named. Again, this is "byte-granularity" or "byte-range" usage for specific shader parameter binding ranges. The name "uniform field usage" implies a level of generality that just isn't here.
| return; | ||
| } | ||
|
|
||
| if (kind == slang::TypeReflection::Kind::Array) |
There was a problem hiding this comment.
I'm not really happy with any of the code from this point down in this function, because in my mind the right deliverable here is a nice compact list of what offset/size byte ranges are assumed used in the given parameter. Trying to associate that information with fields or array elements, etc. Is... kind of wrong by design.
| if (!elementType || count == 0 || count == SLANG_UNBOUNDED_SIZE || stride == 0) | ||
| return; | ||
| for (size_t i = 0; i < count; ++i) | ||
| { |
There was a problem hiding this comment.
The concern here is very true, and should be taken seriously.
The approach taken here can easily result in a combinatorial explosion of output in the JSON path, which is actually something that I worked to avoid in the first iteration(s) of the logic. It is intentional that nothing in here drills down into arrays on a per-element basis, because that is is where the worst explosions in output size can come from.
The right output here would either be just plain byte ranges, or even more conservative output that, for something like an array, only cares whether any byte within a given element is used.
Either way, it is important to note that what would actually drive a good version of this logic is not the spIsParameterLocationUsed style of query. For this to not be a complete mess, there needs to be a change in the public API of the slang compiler to allow for directly requesting a list of ranges as (offset,size) pairs. Any recursion into a type's structure here should then be driven by primarily by decomposition/iteration over the ranges, and not primarily over elements/fields.
|
@tangent-vector thank you so much for your help! Let me know if there is anything else I did wrong :) |
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
source/slang/slang-reflection-json.cpp (1)
1106-1115:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBinding-category detection breaks for descriptor-table-slot index 0.
At Line 1109, fallback is gated by
parentBindingIndex == 0anddescriptorTableSlot != 0, so a valid parameter-block binding at descriptor slot 0 is misclassified as ConstantBuffer. That can suppressusedByteRangesfor legitimate bindings.Proposed fix
- SlangUInt spaceIndex = param->getBindingSpace(SLANG_PARAMETER_CATEGORY_CONSTANT_BUFFER); - SlangUInt parentBindingIndex = - param->getOffset(SLANG_PARAMETER_CATEGORY_CONSTANT_BUFFER); - if (parentBindingIndex == 0 && - param->getOffset(SLANG_PARAMETER_CATEGORY_DESCRIPTOR_TABLE_SLOT) != 0) - { - spaceIndex = param->getBindingSpace(SLANG_PARAMETER_CATEGORY_DESCRIPTOR_TABLE_SLOT); - parentBindingIndex = - param->getOffset(SLANG_PARAMETER_CATEGORY_DESCRIPTOR_TABLE_SLOT); - } + bool hasCB = false; + bool hasDT = false; + for (uint32_t i = 0; i < param->getCategoryCount(); ++i) + { + auto c = SlangParameterCategory(param->getCategoryByIndex(i)); + hasCB = hasCB || (c == SLANG_PARAMETER_CATEGORY_CONSTANT_BUFFER); + hasDT = hasDT || (c == SLANG_PARAMETER_CATEGORY_DESCRIPTOR_TABLE_SLOT); + } + + SlangParameterCategory parentCategory = SLANG_PARAMETER_CATEGORY_CONSTANT_BUFFER; + if ((kind == slang::TypeReflection::Kind::ParameterBlock && hasDT) || (!hasCB && hasDT)) + parentCategory = SLANG_PARAMETER_CATEGORY_DESCRIPTOR_TABLE_SLOT; + + SlangUInt spaceIndex = param->getBindingSpace(parentCategory); + SlangUInt parentBindingIndex = param->getOffset(parentCategory);As per coding guidelines, “source/slang/** … Review for … IR pass correctness … Null pointer safety and proper error handling via diagnostics,” this is a correctness issue in emitter-side binding resolution.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c7acab4f-12a7-437b-a266-3c3a95abeddc
📒 Files selected for processing (26)
include/slang.hsource/compiler-core/slang-artifact-associated-impl.cppsource/compiler-core/slang-artifact-associated-impl.hsource/compiler-core/slang-artifact-associated.hsource/slang/slang-end-to-end-request.cppsource/slang/slang-ir-byte-granularity-usage.cppsource/slang/slang-ir-byte-granularity-usage.hsource/slang/slang-ir-metadata.cppsource/slang/slang-reflection-json.cpptests/bindings/hlsl-to-vulkan-global.hlsl.expectedtests/bindings/hlsl-to-vulkan-shift.hlsl.expectedtests/reflection/used-parameters.slangtests/reflection/used-parameters.slang.expectedtests/reflection/used-uniforms-array-const.slangtests/reflection/used-uniforms-array-const.slang.expectedtests/reflection/used-uniforms-array-runtime.slangtests/reflection/used-uniforms-array-runtime.slang.expectedtests/reflection/used-uniforms-multi-ep.slangtests/reflection/used-uniforms-multi-ep.slang.1.expectedtests/reflection/used-uniforms-multi-ep.slang.expectedtests/reflection/used-uniforms-nested.slangtests/reflection/used-uniforms-nested.slang.expectedtests/reflection/used-uniforms-single-cb.slangtests/reflection/used-uniforms-single-cb.slang.expectedtests/reflection/used-uniforms-swizzle.slangtests/reflection/used-uniforms-swizzle.slang.expected
| [numthreads(1, 1, 1)] | ||
| void main(uint tid : SV_DispatchThreadID) | ||
| { | ||
| outBuf[tid] = params.used.a + float4(params.topUsed.xxxx); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Redundant float4() constructor.
The expression float4(params.topUsed.xxxx) contains a redundant float4() wrapper, since .xxxx on a scalar already produces a float4. While this is valid syntax and doesn't affect correctness, it can be simplified.
♻️ Optional simplification
- outBuf[tid] = params.used.a + float4(params.topUsed.xxxx);
+ outBuf[tid] = params.used.a + params.topUsed.xxxx;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| outBuf[tid] = params.used.a + float4(params.topUsed.xxxx); | |
| outBuf[tid] = params.used.a + params.topUsed.xxxx; |
There was a problem hiding this comment.
The proposed simplification is the wrong one. Always prefer float4(v.x) over `v.xxxx’. The former is more universally readable since it can be clear about what it means even to folks who don’t know what “swizzles” even are.
General-purpose swizzles should, at this point, be thought of more a historical curiosity than a central and important feature if GPU languages. We should all aspire to use swizzles as little as possible in our code, and make the swizzles we do use only be the obvious ones (xy, and xyz).
41eb88e to
8f4d46f
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
source/compiler-core/slang-artifact-associated-impl.h (1)
157-172:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not advertise
slang::Uniformas globally tracked here.This helper still drives flat
m_usedBindingscollection, so returningtrueforUniformreintroduces coarse byte-range tracking for every uniform-sized layout even though the new logic only supports per-parameter-group metadata. The dedicatedUniformbranch inArtifactPostEmitMetadata::isParameterLocationUsed()already bypasses this gate, so this case should stay disabled.Suggested fix
case slang::SpecializationConstant: case slang::SubElementRegisterSpace: return true; - case slang::Uniform: - return true; default: return false; }source/slang/slang-reflection-json.cpp (1)
1106-1115:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMatch the IR pass's CB-vs-DT precedence when computing the parent binding.
This fallback switches to
DescriptorTableSlotwhenever the constant-buffer offset is zero, but the producer side recordsparentBindingIndexbased on attribute presence, not on whether the CB binding number is non-zero. If both attrs exist and the real CB binding is0, the lookup misses andusedByteRangesdisappears from the JSON.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: fdc3e816-3465-4bb3-9094-81038b17679d
📒 Files selected for processing (26)
include/slang.hsource/compiler-core/slang-artifact-associated-impl.cppsource/compiler-core/slang-artifact-associated-impl.hsource/compiler-core/slang-artifact-associated.hsource/slang/slang-end-to-end-request.cppsource/slang/slang-ir-byte-granularity-usage.cppsource/slang/slang-ir-byte-granularity-usage.hsource/slang/slang-ir-metadata.cppsource/slang/slang-reflection-json.cpptests/bindings/hlsl-to-vulkan-global.hlsl.expectedtests/bindings/hlsl-to-vulkan-shift.hlsl.expectedtests/reflection/used-parameters.slangtests/reflection/used-parameters.slang.expectedtests/reflection/used-uniforms-array-const.slangtests/reflection/used-uniforms-array-const.slang.expectedtests/reflection/used-uniforms-array-runtime.slangtests/reflection/used-uniforms-array-runtime.slang.expectedtests/reflection/used-uniforms-multi-ep.slangtests/reflection/used-uniforms-multi-ep.slang.1.expectedtests/reflection/used-uniforms-multi-ep.slang.expectedtests/reflection/used-uniforms-nested.slangtests/reflection/used-uniforms-nested.slang.expectedtests/reflection/used-uniforms-single-cb.slangtests/reflection/used-uniforms-single-cb.slang.expectedtests/reflection/used-uniforms-swizzle.slangtests/reflection/used-uniforms-swizzle.slang.expected
Previously, spIsParameterLocationUsed returned SLANG_E_NOT_AVAILABLE for category=Uniform, and slangc -reflection-json had no "used" field for uniform data. This change tracks usage at byte-offset granularity within each global parameter's uniform region. A new IR pass (slang-ir-uniform-usage.cpp) walks each uniform-bearing IRGlobalParam forward through IRFieldAddress / IRFieldExtract / IRGetElement(Ptr) / IRLoad chains and records the byte ranges that reach a read. CollectMetadata inserts those ranges into m_usedBindings as Uniform-category entries, so both the C++ API and the JSON emitter can answer per-field queries. The JSON emitter gains a "fields" array under cbuffer/struct entries in entryPoints[].bindings[], with arrays decomposed per element when statically sized. Conservative widening where static analysis runs out. Dynamic array indices (arr[runtimeValue]) mark the whole array as used, bounded to the array's byte extent so sibling fields are unaffected. Whole aggregate values escaping across a call boundary mark the full passed sub-region. Any IR op the walker doesn't peel triggers the same sub-region widening. The result is always a correct over-approximation never a false used:false.
- Drop per-EP re-emit loop in _createWholeProgramResult. Reused a multi entry point CodeGenContext, held m_resultCacheMutex across N codegen calls, and could null vended artifacts. Per-EP results are already populated correctly by _createEntryPointResult. - collectUniformUsage now takes const IRModule*, matching collectMetadata's contract and removing the const_cast at the call.
- Rename uniform-usage IR pass to byte-granularity-usage to match what the pass actually tracks. - Scope per-uniform usage by parent CB binding via new UniformParamUsage struct, fixing multi-CB conflation in JSON output. - Fall back to untracked entry instead of asserting when a parameter group lacks a ConstantBuffer or DescriptorTableSlot offset attr (cooperative type metadata shaders hit this path). - Expand IMetadata::isParameterLocationUsed doc to describe byte-offset semantics and multi-CB over-reporting for Uniform queries. - Add reflection tests for single CB, nested structs, literal and runtime array indices, swizzles, and multi entry point scoping. - Regenerate stale .expected files for tests/bindings/hlsl-to-vulkan-global and hlsl-to-vulkan-shift now that uniform bindings carry usedByteRanges.
- Removed the swizzle for easier readability
8f4d46f to
71c15ea
Compare
|
@ramang-unity would you mind addressing the latest review comments and then tag tess for a final review, I'm happy to get this merged but no hurry! |
@expipiplus1 sure. Sorry I was waiting to see if these were valid. Should I fix the 🔴 only or all the gaps as well? |
- Add reflection tests for the uniform used byte range paths covering matrix and function call widening, the untracked push constant case, a Vulkan descriptor backed constant buffer, and under aligned array element stride. - Add a unit test for isParameterLocationUsed on the uniform category, including the untracked not available result. - Document the uniform used reflection contract and the used byte range JSON keys in the reflection guide, and fix the example to treat not available as used. - Refine the byte granularity usage pass, JSON emitter, and metadata query that back these paths.
|
@tangent-vector Let me know if there is anything else I did wrong :) |
Gap 1 is an issue and will be pushing a fix. Gap 2 is not a real issue. The early-return yields |
- Split the uniform-usage pass's parent-binding space into two fields: parentSpace (RegisterSpace offset, what the Uniform location query matches) and parentBindingSpace (includes the descriptor set, what the reflection emitter matches via getBindingSpace). - Fix usedByteRanges silently dropping off a constant buffer bound to a non-zero Vulkan descriptor set; isParameterLocationUsed and loose-global used reporting are unchanged. - Add used-uniforms-vulkan-cb-nonzero-set reflection test to pin the non-zero-set case.
Head branch was pushed to by a user without write access
There was a problem hiding this comment.
Verdict: 🟡 Minor gaps and questions — 0 bugs, 2 gaps, 3 questions
The PR adds a new IR pass slang-ir-byte-granularity-usage.cpp that walks each uniform-bearing IRGlobalParam forward through IRFieldAddress/IRFieldExtract/IRGetElement(Ptr)/IRSwizzle/IRLoad, recording the byte ranges that reach a read. CollectMetadata plumbs those ranges into m_uniformParamUsage so the C++ API and the JSON emitter can answer per-field queries; the IArtifactPostEmitMetadata interface gains an appended getUniformParamUsage() method (ABI-additive, no reordering of existing virtuals). The conservative widening rules (call escapes, runtime indices, unrecognized ops, matrix subscripts, non-finite layouts) are all explicit in the source and pinned by dedicated tests. No correctness bugs found; the comments below flag a serialization-precision regression, a divergent provider-of-truth between the IR pass and the JSON emitter, and three contract questions.
Changes Overview
Per-uniform byte-range tracking IR pass (source/slang/slang-ir-byte-granularity-usage.{cpp,h} — new)
- New pass
collectApproximateByteGranularityUsageInformationForParameterGroups()walks forward from eachIRGlobalParamwhose layout is wrapped in anIRParameterGroupTypeLayout, peeling field/element/swizzle/load ops and recording byte-offset ranges. Anything else (calls, stores, casts, runtime indices) widens to the current sub-region. Ranges are sorted+coalesced once per param. Unbounded uniform element types or params with no parent CB/descriptor-slot binding emit anisUntrackedmarker.
Public API + metadata plumbing (include/slang.h, source/compiler-core/slang-artifact-associated{.h,-impl.h,-impl.cpp}, source/slang/slang-end-to-end-request.cpp, source/slang/slang-ir-metadata.cpp)
IMetadata::isParameterLocationUseddoc updated: forSLANG_PARAMETER_CATEGORY_UNIFORM,registerIndexis a byte offset, andSLANG_E_NOT_AVAILABLEis the correct response when the queried space has no recorded entries (vs. fabricatingoutUsed=false). NewUniformParamUsagestruct +selectUniformParentBinding()helper resolve the parent CB/descriptor-slot key shared between the IR pass and the JSON emitter.IArtifactPostEmitMetadata::getUniformParamUsage()appended. End-to-end path: returnsSLANG_E_NOT_AVAILABLEfor Uniform queries whengetUniformParamUsage().count == 0.
IRArrayTypeLayout uniform-stride (source/slang/slang-ir-insts.h, slang-ir.cpp, slang-lower-to-ir.cpp, slang-ir-glsl-legalize.cpp, slang-ir-lower-combined-texture-sampler.cpp)
Builderconstructor now takes auniformStrideoperand;getUniformStride()reads it back. Existing call sites pass 0 where the layout is synthesized for varyings/resources. Pre-PR serialized modules return 0 (precision regression flagged inline).
JSON reflection emitter (source/slang/slang-reflection-json.cpp)
- For ConstantBuffer/ParameterBlock/Struct params, the emitter resolves the parent binding via
selectUniformParentBinding()(usinggetCategoryByIndexto decide presence), looks up the matchingUniformParamUsageentry, and writesusedByteRanges: [{offset, size}, …]orusedByteRangesUnavailable: true.
Documentation (docs/user-guide/09-reflection.md)
- New "Uniform Parameters" subsection documents the byte-offset reinterpretation of
registerIndex, the over-reporting boundary when multiple CBs share a space, and theusedByteRangesUnavailableJSON path.
Tests (tests/reflection/used-uniforms-*.slang × 12, tests/reflection/used-parameters.slang*, tools/slang-unit-test/unit-test-parameter-usage-reflection.cpp, tests/bindings/hlsl-to-vulkan-{global,shift}.hlsl.expected)
- Reflection tests cover: const/runtime-index arrays, array stride, function-call escape, matrix widening, multi-EP independence, nested struct narrowing, single CB, swizzle, untracked push-constant path, Vulkan CB at zero and non-zero descriptor sets. Two new C++ unit tests pin
SLANG_E_NOT_AVAILABLEsemantics for unknown space and untracked entry. Theused-parameters.slangexpected file is updated withusedflags on uniform fields and a new comment acknowledging that loose uniforms over-reportused=1because they share a register space with sibling CBs.
Findings (5 total)
| Severity | Location | Finding |
|---|---|---|
| 🟡 Gap | source/slang/slang-ir-insts.h:~1208 |
Older serialized modules silently lose per-element array narrowing because getUniformStride() returns 0; consider a serialization version bump or a stride-recompute fallback. |
| 🟡 Gap | source/compiler-core/slang-artifact-associated.h:~194 |
The IR pass and the JSON emitter use different APIs to decide "category present" before calling selectUniformParentBinding; consolidate or assert cross-consistency. |
| 🔵 Question | tests/reflection/used-parameters.slang:~44 |
Loose top-level uniforms over-report used=1 for unread fields. Is the planned fix to give the implicit CB its own synthesized space, or is this the long-term ceiling? |
| 🔵 Question | source/slang/slang-ir-byte-granularity-usage.cpp (Swizzle / vector subscript) |
Component byte size is derived as currentSize / elementCount rather than queried from the element scalar layout; document why that stays correct under all uniform-rule families. |
| 🔵 Question | source/slang/slang-end-to-end-request.cpp:~2293 |
Programs with zero parameter groups return SLANG_E_NOT_AVAILABLE on every Uniform query; intentional, or should the empty-but-ran case return SLANG_OK + outUsed=false? |
| // the same space (including UsedCB's). The expected file reflects | ||
| // this over reporting; UnusedUniform shows used=1 even though it is | ||
| // never read. Tracked for follow up; safe over reporting only, never | ||
| // missing a real usage. |
There was a problem hiding this comment.
🔵 Question: Loose uniforms always over-report used=1 even when never read
The new comment honestly documents a real over-reporting case: UnusedUniform is reported as used=1 in the expected output (used-parameters.slang.expected:386) because loose top-level uniforms aggregate into an implicit constant buffer that shares register space with other CBs, and the Uniform-category query does not take a parent-CB identity. This is acknowledged as "tracked for follow up" in the test comment.
The PR contract preserves the safety invariant (no false used:false), but for callers that strip unused uniforms based on this API, every loose top-level uniform in a multi-CB program becomes effectively non-strippable. Is the plan to give the implicit CB its own synthesized space so the Uniform query disambiguates it, or is the long-term position that loose uniforms simply cannot be tracked at field granularity? A one-line note in docs/user-guide/09-reflection.md capturing whichever answer is intended would help integrators reason about the expected ceiling on tracking precision.
| if (elementCount == 0) | ||
| { | ||
| outRanges.add({currentOffset, currentSize}); | ||
| break; |
There was a problem hiding this comment.
🔵 Question: Swizzle / vector subscript element-size derivation could under-record on padded vectors
Both the Swizzle and the vector branch of GetElement(Ptr) derive component byte size as elementSize = currentSize / elementCount, where currentSize is the uniform byte size of the containing vector layout. For naturally packed scalar vectors (float4, int2, etc.) this matches the natural packing — 16 / 4 == 4, 8 / 2 == 4 — and the recorded byte ranges are correct.
For vectors of types whose uniform size differs from their natural packed size (e.g. a bool4 whose uniform-rule size is 16 but whose stored components might be padded), currentSize / elementCount is still likely correct under the cbuffer rules, but the math is implicit. Is there a reason this stays as bare integer division rather than asking the type layout (or the element scalar layout) for an explicit per-component byte size? A short comment justifying why no scalar-component layout query is needed would harden the invariant against future layout-rule additions, and would mirror the explicit getUniformStride() rationale used for the array branch above.
Not flagging as a Gap because the existing tests (used-uniforms-swizzle.slang, used-uniforms-array-stride.slang) confirm the behavior matches the documented byte offsets for all currently shipped vector types.
| return {constantBufferSpace, constantBufferIndex, true}; | ||
| if (hasDescriptorTableSlot) | ||
| return {descriptorTableSlotSpace, descriptorTableSlotIndex, true}; | ||
| return {0, 0, false}; |
There was a problem hiding this comment.
🟡 Gap: Provider-of-truth for "category present" diverges between the IR pass and the JSON emitter even though they share selectUniformParentBinding
selectUniformParentBinding documents that "presence is decided by the caller (the layout actually carries an offset for that category)". The two callers decide presence with different APIs:
- IR pass (
slang-ir-byte-granularity-usage.cpp):cbAttr = varLayout->findOffsetAttr(LayoutResourceKind::ConstantBuffer)anddtAttr = varLayout->findOffsetAttr(LayoutResourceKind::DescriptorTableSlot)→ presence = pointer non-null. - JSON emitter (
slang-reflection-json.cpp): walksparam->getCategoryCount()/getCategoryByIndex()→ presence = a category-index loop hit.
These should agree on the same set of params, and on every test in this PR they do. But they are derived from different layers (IR layout attrs vs. the public reflection categories enumeration), and a future divergence (a category that surfaces in one but not the other) would silently make recorded byte ranges fail to match in the emitter — exactly the silent failure mode the helper's comment warns about.
Suggestion: Either (a) add a single helper next to selectUniformParentBinding that takes the param and returns (hasConstantBuffer, hasDescriptorTableSlot) from one source, used by both callers; or (b) add an SLANG_RELEASE_ASSERT in the JSON emitter cross-checking the picked (parentBindingSpace, parentBindingIndex) against the categories the param actually exposes, so a mismatch fails loudly during testing rather than producing missing usedByteRanges.
There was a problem hiding this comment.
Not a defect. getCategoryByIndex returns resourceInfos[index].kind and ParameterCategory is locked one to one onto LayoutResourceKind by an enforced macro, so the category loop and findOffsetAttr see the same kinds and cannot silently diverge.
Already guarded by used-uniforms-vulkan-cb-nonzero-set.slang, where set 2 pulls parentSpace 0 and parentBindingSpace 2 apart. It passes and emits usedByteRanges at space 2, proving both pick the same parent.
| { | ||
| Builder(IRBuilder* irBuilder, IRTypeLayout* elementTypeLayout) | ||
| : Super::Builder(irBuilder), m_elementTypeLayout(elementTypeLayout) | ||
| Builder(IRBuilder* irBuilder, IRTypeLayout* elementTypeLayout, UInt uniformStride) |
There was a problem hiding this comment.
🟡 Gap: Existing serialized IR modules silently regress to "stride unknown" after this change
The new operand on IRArrayTypeLayout makes getUniformStride() return 0 for "modules serialized before the stride was tracked" (per the docstring), and the byte-granularity pass then falls through to its widening fallback whenever it sees a literal index into one of those arrays:
if (!indexLit || !elemFinite || elemSize == 0 || elemStride == 0)
{
outRanges.add({currentOffset, currentSize});
break;
}So a precompiled module loaded from a slang-module file built before this PR loses per-element narrowing for arrays — every constant-indexed arr[k] widens to the whole array. The behavior is still safe (it's still a correct over-approximation), but it's a silent precision regression that won't show up in any test in this PR (all tests compile from source).
Suggestion: Either (a) bump the slang-module serialization version so older modules are recompiled rather than silently widening, or (b) when a non-uniform IRArrayTypeLayout operand is missing on deserialization, fall back to recomputing uniformStride from the element type-layout's uniform size — only widen on layouts where stride is genuinely unknowable. At minimum, the docstring's "modules serialized before the stride was tracked" branch deserves an explicit regression test or a note in the PR body about the expected precision drop on older artifacts.
There was a problem hiding this comment.
Not an issue. Array layout IR (stride operand included) is never serialized into a .slang-module. It is target specific and recomputed fresh on every compile via createIRModuleForLayout, so a reloaded module gets a freshly built stride no matter which compiler wrote it.
Verified end to end. I built slangc on master, emitted a module from a cbuffer array shader, then reloaded it into this branch. usedByteRanges still narrows to offsets 0 and 32 at size 16, identical to compiling from source, and dump-module confirms the serialized module carries zero layout IR.
So older artifacts cannot lose narrowing, because they never carried the layout in the first place. The getUniformStride 0 path for serialized modules is unreachable here.
| if (metadata->getUniformParamUsage().count == 0) | ||
| return SLANG_E_NOT_AVAILABLE; | ||
| } | ||
|
|
There was a problem hiding this comment.
🔵 Question: Empty-CB program returns SLANG_E_NOT_AVAILABLE on every Uniform query
Per the new check, when metadata->getUniformParamUsage().count == 0 the function returns SLANG_E_NOT_AVAILABLE for any Uniform query. This correctly disambiguates "tracking didn't run" from "tracked and unused" for normal programs.
For a program with no parameter groups at all (e.g. a compute shader using only resources, no cbuffers/parameter blocks/loose uniforms), the IR pass will produce an empty m_uniformParamUsage list, which the API path then reports as not-available. A caller asking "is byte 0 of space 0 used?" gets SLANG_E_NOT_AVAILABLE and (per the documented contract) must conservatively treat that as used and bind it.
That's a benign pessimism — there's nothing to bind — but it does mean a caller cannot use a SLANG_OK + outUsed=false response to confirm "this program has no uniform usage at all." Is that intentional, or should the entry-to-end path distinguish "tracking ran and produced zero entries" from "tracking didn't run" so the empty-CB program returns SLANG_OK + outUsed=false?
Track per-uniform-field "used" in reflection metadata
Previously,
spIsParameterLocationUsedreturnedSLANG_E_NOT_AVAILABLEfor categoryUniform, and slangc -reflection-json had no "used" field for uniform data. This change tracks usage at byte-offset granularity within each global parameter's uniform region.A new IR pass (slang-ir-uniform-usage.cpp) walks each uniform-bearing
IRGlobalParamforward throughIRFieldAddress/IRFieldExtract/IRGetElement(Ptr)/IRLoadchains and records the byte ranges that reach a read.CollectMetadatainserts those ranges intom_usedBindingsas Uniform-category entries, so both the C++ API and the JSON emitter can answer per-field queries. The JSON emitter gains a "fields" array under cbuffer/struct entries inentryPoints[].bindings[], with arrays decomposed per element when statically sized.Conservative widening where static analysis runs out. Dynamic array indices (
arr[runtimeValue]) mark the whole array as used, bounded to the array's byte extent so sibling fields are unaffected. Whole aggregate values escaping across a call boundary mark the full passed sub-region. Any IR op the walker doesn't peel triggers the same sub-region widening. The result is always a correct over-approximation never a falseused:false.Discussion about this in #11107
Created with the help of Opus 4.7.