Skip to content

[2/3] ByteAddressBuffer: decide wide-vs-scalarized + validate the alignment contract#11595

Open
nv-slang-bot[bot] wants to merge 3 commits into
masterfrom
fix/issue-11591
Open

[2/3] ByteAddressBuffer: decide wide-vs-scalarized + validate the alignment contract#11595
nv-slang-bot[bot] wants to merge 3 commits into
masterfrom
fix/issue-11591

Conversation

@nv-slang-bot

@nv-slang-bot nv-slang-bot Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Part 2 of 3 decomposing the original #11564 for #11545. Stacked on #11594 ([1/3], the constexpr contract); rebase down to master once [1/3] lands. Held as draft pending review.

Closes #11591. Closes #11592.

Motivation

With the alignment argument established as a compile-time (constexpr) contract in [1/3], the legalizer can stop re-deriving alignment from the load/store offset and instead make wide-vs-scalarized chunking a pure function of the promise. Two gaps remain that this PR closes:

Motivating case for the single-arg fix:

// float3 has a 12-byte natural stride; forwarding 12 (not a power of two) is wrong.
// The largest power of two dividing 12 is 4 — a valid promise.
RWByteAddressBuffer buf;
float3 v = buf.LoadAligned<float3>(location);   // must not be rejected

Proposed solution

  1. Predicate refactor (ByteAddressBuffer alignment: decide wide-vs-scalarized from the alignment promise alone (consistency) #11591). Replace the offset-re-deriving isAligned with a pure isWideAccessAligned(baseOffset, immediateOffset, promisedAlignment, accessSize). Per maintainer review, the promise is now SLANG_RELEASE_ASSERT(alignLit)-guaranteed to be a literal after [1/3]'s constexpr folding. The baseOffset operand is retained only for the alignment == 0 (no-promise) case — dropping it there would let 0 % accessSize == 0 mark every plain Load/Store wide-aligned at an unproven offset (a miscompile). For any non-zero promise, the decision is promise % accessSize == 0 && immediateOffset % accessSize == 0 and baseOffset is unused.

  2. Validation (ByteAddressBuffer alignment: power-of-two requirement, scalar-component relaxation, single-arg natural-alignment fix #11592) in validateExplicitAlignment: a non-zero promise must be a power of two (E41301) and at least the access type's scalar-component alignment (E41300, relaxed wording); a compile-time-constant location must be a multiple of an otherwise-valid promise (E41303). The custom 41302 diagnostic is removed — compile-time-constness is enforced by the constexpr parameter itself (front-end E40012 need-compile-time-constant), so no dedicated diagnostic is needed.

  3. Natural-alignment forwarding (ByteAddressBuffer alignment: power-of-two requirement, scalar-component relaxation, single-arg natural-alignment fix #11592). Introduce the getNaturalAlignment IR op (__naturalAlignmentOf<T> — the largest power of two dividing T's natural stride) and switch the implicit single-argument *Aligned accessors to forward it instead of the stride, so float3 forwards a valid 4.

These two issues are merged into one PR so the pre-existing alignment diagnostic transitions E41300 → E41301 in a single coherent commit — splitting them would leave byte-address-buffer-align-error.slang red in the window between the predicate refactor and the validation restore.

Change summary

Area File What
Predicate slang-ir-byte-address-legalize.cpp isAligned → pure isWideAccessAligned; validateExplicitAlignment (41301 / relaxed-41300 / 41303)
Diagnostics slang-diagnostics.lua add 41301, 41303; reword 41300; (41302 removed)
IR op slang-ir-insts.lua, slang-ir-insts-stable-names.lua, slang-ir.h, slang-ir-peephole.cpp, slang-ir-use-uninitialized-values.cpp getNaturalAlignment op + constant-fold
Stdlib hlsl.meta.slang, core.meta.slang single-arg *Aligned forwards __naturalAlignmentOf<T>
Tests byte-address-buffer-align-error.slang (E41300→E41301), byte-address-buffer-alignment-11545.slang (DIAGNOSTIC_TEST), byte-address-buffer-singlearg-float3-11592.slang (-cpu COMPARE_COMPUTE), *-aligned, *-array validation + single-arg coverage

Concepts and vocabulary

  • alignment promise — the constexpr alignment argument; after [1/3]'s folding it reaches the legalizer as an IRIntLit. 0 is the no-promise sentinel for plain Load/Store.
  • __naturalAlignmentOf<T> — largest power of two dividing T's natural stride (float4→16, float2/half4→8, float3→4). Distinct from __naturalStrideOf<T>, which can be a non-power-of-two (float3→12).

Process report

  • isWideAccessAligned is a pure predicate over (promise, immediateOffset, accessSize) — the input shape (a folded IRIntLit promise) is guaranteed by [1/3]'s constexpr + SCCP, so the SLANG_RELEASE_ASSERT(alignLit) is a fail-loud contract check, not a guard over malformed input. The baseOffset retention is the one principled exception (the alignment == 0 no-promise path has no literal promise to reason from), justified above.
  • 41302 removal answers the input-shape question at the right layer: compile-time-constness is a front-end property of the argument (constexpr, E40012), not something the IR legalizer should re-diagnose.
  • getNaturalAlignment is a new canonical IR op rather than an open-coded computation so the single-arg forwarding and any future consumer share one definition and constant-fold identically.

🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify.

Comment thread tests/compute/byte-address-buffer-consistency-11591.slang Outdated
Comment thread source/slang/slang-ir-byte-address-legalize.cpp Outdated
@jkwak-work jkwak-work self-assigned this Jun 13, 2026
nv-slang-bot Bot and others added 3 commits June 16, 2026 00:42
Mark the `alignment` parameter of `LoadAligned`/`Store` and the underlying
`__byteAddressBufferLoad`/`__byteAddressBufferStore` intrinsics `constexpr`, so
a non-compile-time-constant alignment is rejected by the language's built-in
constexpr enforcement (diagnostic 40012) instead of a bespoke check. Front-end
enforcement, so it is target-agnostic. Part of the #11545
ByteAddressBuffer alignment-contract decomposition (slice 1 of 4).

Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
…e alone

Replace the decide-and-diagnose `isAligned` with a pure, promise-first
`isWideAccessAligned`: a non-zero `constexpr` alignment promise is the sole
input to the wide-vs-scalarized decision, so a constant and a runtime
`location` carrying the same promise legalize identically. A
compile-time-constant base-offset fallback is retained only for the no-promise
(`alignment == 0`) plain `Load`/`Store` forms (dropping it verbatim would
wide-load every no-promise access at an unproven offset). Assert the alignment
operand is an `IRIntLit`, guaranteed by slice 1's `constexpr`. Part of
#11545 (slice 2 of 4).

NOTE (pending boundary decision): removing the in-predicate 41300 diagnostic
leaves tests/compute/byte-address-buffer-align-error.slang expecting an error
that slice 3 reintroduces (relaxed, as 41301). That test is transiently red in
this slice under a 4-way (Option X) split; a 2+3 merge (Option Y) avoids it.

Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
…gnment

Add the byte-address alignment value checks to the legalizer in
validateExplicitAlignment: a non-zero `alignment` promise must be a power of two
(41301) and at least the access type's scalar-component alignment (41300), and a
compile-time-constant `location` must be a multiple of an otherwise-valid
promise (41303). Compile-time-constness itself is enforced by the `constexpr`
`alignment` parameter (front-end diagnostic 40012), so no dedicated diagnostic
is needed here.

Introduce the `getNaturalAlignment` IR op (`__naturalAlignmentOf<T>`) — the
largest power of two dividing `T`'s natural stride — and switch the implicit
single-argument `*Aligned` accessors to forward it instead of the stride, so a
`float3` (12-byte stride, not a power of two) forwards a valid alignment of 4
and is no longer spuriously rejected by 41303.

This merges the consistency (point 3) and validation/natural-alignment (issue
#11592) work into one change so the pre-existing alignment diagnostic transitions
from E41300 to E41301 in a single coherent step.

Co-authored-by: Harsh Aggarwal <haaggarwal@nvidia.com>
@nv-slang-bot

nv-slang-bot Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

@jkwak-work — a sequencing question on the #11545 decomposition before I rework this slice.

This PR (Slice 2) currently carries diagnostic 41303 (validateExplicitAlignment) so the promise-first wide-vs-scalarized decision is self-contained. On closer analysis that placement is not sound at this layer: simplifyIR folds the single-argument accessors' __naturalStrideOf<T> operand to a literal before legalization runs, so a natural-stride promise is indistinguishable from an explicit one. 41303 then false-rejects valid single-argument loads — e.g. LoadAligned<float3>(16) / Load3Aligned(16) (rejected because uint3's stride 12 ∤ 16, though offset 16 is fine) — and the only way to keep the tests green here is a masking re-expression (Load3Aligned(16, 4)).

41303 really couples with the __naturalAlignmentOf producer fix (a true power-of-two alignment) planned for Slice 3 (#11592), which is where the IR can tell an explicit promise from a forwarded natural stride.

Recommendation: defer 41303 to Slice 3 and make this PR a pure consistency change (constant vs. runtime location with the same promise lower identically) that depends on #11592. This reverses the originally-confirmed Slice 2 → Slice 3 ordering, so I'd like your confirmation on the re-sequence before reworking. Happy either way — just want the slice ordering to be your call.


Update — CI triage on the current diff (8adee03c):


Note (2026-06-22): the shared base fix/issue-11591 currently reflects a stale combined alignment stack following a base-branch collision with the stacked PR #11596 (Slice 3, which is based on this branch). Work on this branch is paused pending @jkwak-work's #11545 decomposition decision; once the structure is chosen (one combined PR, or Slice 2 + Slice 3 as separate stacked PRs) the branch will be rebased onto current master accordingly. The Slice-2 consistency analysis above stands; only the branch state is stale.

🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify.

@nv-slang-bot nv-slang-bot Bot force-pushed the fix/issue-11591 branch from 8adee03 to eda63ff Compare June 22, 2026 16:29
@nv-slang-bot nv-slang-bot Bot changed the title ByteAddressBuffer alignment: decide wide-vs-scalarized from the alignment promise alone [2/3] ByteAddressBuffer: decide wide-vs-scalarized + validate the alignment contract Jun 22, 2026
@jkwak-work

Copy link
Copy Markdown
Collaborator

@nv-slang-bot , the first slice of this set of PRs is merged to ToT.
Please rebase to contain the 1/3 PR.

@nv-slang-bot

nv-slang-bot Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

On it — rebasing #11595 onto current ToT (2b14ffd06) to build on the merged 1/3 (#11594), dropping the now-redundant Slice-1 commit; SLANG_RELEASE_ASSERT is in. Will report the rebased state and how Slice-3 (#11596) relates — fold vs. keep stacked — for your confirmation.

🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify.

@jkwak-work jkwak-work marked this pull request as ready for review June 27, 2026 00:53
@jkwak-work jkwak-work requested a review from a team as a code owner June 27, 2026 00:53
@jkwak-work jkwak-work requested review from bmillsNV and removed request for a team June 27, 2026 00:53
@jkwak-work

Copy link
Copy Markdown
Collaborator

@nv-slang-bot , please resolve the merge conflict with master and rebase.

@nv-slang-bot

nv-slang-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor Author

On it, @jkwak-work — the rebase onto current ToT is staged (dropping the now-merged 1/3 commit, SLANG_RELEASE_ASSERT preserved). I'm re-pointing it at the latest master to clear the conflict and building it clean before I force-push the (still draft) branch. Will confirm here once pushed.

🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify.

@nv-slang-bot

nv-slang-bot Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor Author

@jkwak-work — the rebase is done and verified locally, but I hit a hard wall pushing it: this bot account is a GitHub App without workflows permission, and rebasing onto current master necessarily carries master's recent .github/workflows/* updates (29 workflow files changed in this window — publish-ci-container-images.yml added, falcor-*.yml removed, ci.yml updated, etc.). GitHub rejects the push — "refusing to allow a GitHub App to create or update workflow .github/workflows/ci.yml without workflows permission" — for both a force-update of this branch and a fresh branch. So I can't land it from this account.

What the rebase produced (verified against current ToT 60d15fef0):

  • Two commits on top of 60d15fef0: Decide ByteAddressBuffer wide-vs-scalarized from the alignment promise alone + Validate ByteAddressBuffer alignment contract and forward natural alignment. The now-merged 1/3 commit is dropped.
  • SLANG_RELEASE_ASSERT(alignLit) preserved (both sites).
  • Stable-name collision resolved append-only: my getNaturalAlignment id 864 (now taken by Abort) → 867.
  • Build green; byte-address compute/diagnostic suite 29/29 pass (15 GPU-target cases ignored on this no-GPU runner).
  • One 1-line test-only fix folded in: my byte-address-buffer-singlearg-float3-11592.slang CHECKed decimal 2.0/3.0/4.0, but -output-using-type renders the untyped RWByteAddressBuffer as raw uint words — the values round-trip correctly, so the CHECKs are now the IEEE-754 bit patterns 40000000/40400000/40800000 (no assertion weakened).

Could you either rebase/push this branch yourself (you have workflows permission), or let me know if there's a workflow-scoped token I should use? I can also hand the exact two commits as a git am-able patch for a one-step apply onto master, if that's easier. I left the PR's ready/draft state untouched and did not touch tests/bugs/gh-9931.slang (its Store<...>(4,h,8) is the separate 4 % 8 != 0 case under your E41303 call).

🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr: breaking change PRs with breaking changes

Projects

None yet

1 participant