[DNI] Add -fgl-remap-z to remap SV_Position.z for GLSL vertex output#11789
[DNI] Add -fgl-remap-z to remap SV_Position.z for GLSL vertex output#11789nv-slang-bot[bot] wants to merge 5 commits into
Conversation
Adds a new compiler flag `-fgl-remap-z` that remaps SV_Position.z from the OpenGL clip-space depth range [-w, w] to the standard [0, w] range via z' = (z + w) / 2, for shaders authored against the OpenGL [-1, 1] NDC depth convention. Scoped to the textual GLSL target on the vertex stage only; reuses the IRGLPositionOutputDecoration and mirrors the -fvk-invert-y position-fixup machinery. Closes #11599
- slang-emit.cpp: iterate getEntryPointIndices() and resolve each via getEntryPoint(index). getEntryPointCount() counts selected indices, so a bare position index was the wrong argument to getEntryPoint() for any non-first selected entry point. - slang-ir-vk-invert-y.cpp: remove the silent non-vector store skip in remapZOfPositionOutput; the position output always lowers to a full-float4 store, so _remapZOfVector's existing SLANG_ASSERT is the loud check (matching invertYOfPositionOutput), rather than masking an impossible shape. Document that the pass is store-only because the affine remap is not involutive. - gl-remap-z.slang: add a no-flag run asserting the position is written unmodified, locking the off-by-default opt-in.
remapZOfPositionOutput rewrites every position output in the linked module, so the scheduling gate must guarantee the module is a lone vertex shader. Whole-program GLSL codegen reaches linkAndOptimizeIR with getEntryPointCount() > 1 (emitEntryPointsSourceFromIR), where a mixed vertex+mesh/geometry request would previously pass the "any entry point is vertex" check and then remap non-vertex position outputs too. Tighten the gate to getEntryPointCount() == 1 && that single entry point is Stage::Vertex; leave mixed-stage whole-program requests untouched. Also tighten the FileCheck/COMPOSE patterns to assert the / 2.0 halving so the test cannot pass on a bare (z + w) without the scale factor.
jkwak-work
left a comment
There was a problem hiding this comment.
Looks good to me.
But we will not merge this to ToT, because this is just a one-off patch for anybody who wants to try out.
Slang doesn't support the legacy behavior of GLSL.
|
Understood, and thank you for the review and the clear direction. 👍 Acknowledging the disposition: this stays a one-off, cherry-pickable reference for anyone who wants to try the OpenGL Leaving the PR open as that reference unless you'd prefer it closed — happy to close on your word. 🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify. |
State the real precondition -- a gated GLSL vertex gl_Position is not read back within the shader after the output write -- as the justification for omitting invertYOfPositionOutput's IRLoad branch, instead of appealing to non-involutivity (which only explains why _remapZOfVector cannot be reused for a read-back, not why the omission itself is correct). Comment-only; no functional change.
|
/regenerate-cmdline-ref |
|
CI status note (for anyone cherry-picking this reference): The only failing check is One-line fix for anyone with a build: (or I couldn't run it here — no local build (disk-full, disclosed above), and the bot's 🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify. |
|
/regenerate-cmdline-ref |
|
🌈 Regenerated command line reference, please merge the changes from this PR |
Motivation
Shaders authored against the OpenGL clip-space convention emit
SV_Position.zin the
[-w, w]range (NDC depth[-1, 1]after the perspective divide). Thestandard convention used by Vulkan/SPIR-V/D3D/Metal — and the default the GLSL
backend targets — is
[0, w](NDC depth[0, 1]). When such a shader iscross-compiled to textual GLSL for a desktop-GL pipeline that was set up for
the
[0, 1]convention (e.g. viaglClipControl(... GL_ZERO_TO_ONE)or a hostthat does its own depth handling), the depth comes out wrong with no per-shader
knob to correct it.
This is not a DXC-parity gap (D3D, Vulkan, and Metal all already share the
[0, 1]depth range) — it is the one place desktop GL differs, so it needs aGLSL-scoped opt-in rather than a target-wide behavior change.
Concretely, given:
-target glsl ... -fgl-remap-znow emits, just before the write togl_Position:i.e.
z' = (z + w) / 2, mapping[-w, w]to[0, w].Proposed solution
Add a new compiler option
-fgl-remap-zthat runs a small IR fix-up pass on theposition output, directly mirroring the existing
-fvk-invert-y/-fvk-use-dx-position-wmachinery. The remap is affine and reads both the z(index 2) and w (index 3) components of the position vector, so unlike the
single-component y inversion it must operate on the full
float4.Scope is deliberately narrow, per maintainer direction on #11599:
isKhronosTarget, so thepass is gated on
target == CodeGenTarget::GLSLinside that block to keepSPIR-V/Vulkan untouched.
outputs, it runs only when codegen is for exactly one entry point and that
entry point is a vertex shader; any other case (including a mixed-stage
whole-program request) is skipped. The stage is checked at the scheduling site,
keeping the IR pass itself stage-agnostic and simple.
[-1, 1] -> [0, 1]only (z' = (z + w) / 2). The reverse andbidirectional cases are out of scope.
The pass reuses the existing
IRGLPositionOutputDecoration(no new decoration)and is off by default, so it is a pure opt-in with no effect on existing output.
Change summary
include/slang.hCompilerOptionName::GLSLRemapZ = 153immediately beforeCountOf(ABI-safe append; explicit next value).source/slang/slang-options.cpp-fgl-remap-zcommand-line option and route it through the existing boolean-option dispatch group.source/slang/slang-compiler-options.cppwriteCommandLineArgs(boolean group).source/slang/slang-ir-vk-invert-y.{h,cpp}remapZOfPositionOutputpass +_remapZOfVectorhelper computingz' = (z + w) / 2, modeled oninvertYOfPositionOutput/_invertYOfVector.source/slang/slang-emit.cpptests/cross-compile/gl-remap-z.slang-fvk-invert-y.Concepts and vocabulary
IRGLPositionOutputDecoration— marks the global that lowers togl_Position; both the invert-y and this remap pass find their store target byscanning for it (and following
getElementPtrchains).[0, w]<-> GL[-w, w]:z_gl = 2*z - w, inversez_std = (z + w) / 2. This PR implements only the GL->standard inverse.isKhronosTarget— true for both textual GLSL and SPIR-V; the innertarget == CodeGenTarget::GLSLcheck is what excludes SPIR-V here.Process report
include/slang.h— A new publicCompilerOptionNameis required to carrythe boolean. Appended as
GLSLRemapZ = 153(the next value afterTraceCoverageBoolean = 152) immediately before theCountOfsentinel, perthe ABI rule for this enum. The historical
CountOfParsableOptions = 111decoy sentinel is intentionally not touched.
source/slang/slang-options.cpp— Adds the option's help-table entry andthe
case OptionKind::GLSLRemapZ:in the boolean dispatch, alongsideVulkanInvertY/VulkanUseDxPositionW.OptionKindis a typedef ofCompilerOptionName, so this is the same enum.source/slang/slang-compiler-options.cpp— AddsGLSLRemapZto theboolean group in
writeCommandLineArgsso the option survives serialization,matching its siblings.
source/slang/slang-ir-vk-invert-y.cpp—_remapZOfVectorreads z and wwith
emitSwizzle, computes(z + w) / 2withemitAdd/emitDiv/getFloatValue, and writes z back withemitSwizzleSeton the full vector.remapZOfPositionOutputreusesinvertYOfPositionOutput's store/getElementPtrtraversal. Input-shape check: the position output is lowered as a single
full-
float4store, so the store value is always a vector; rather than guardingagainst a non-vector store (which would silently mask an impossible shape),
_remapZOfVectorSLANG_ASSERTs the vector invariant — identical to how_invertYOfVectoralready asserts it. The shape is therefore a genuine, validinput produced by lowering, and the producer is correct and left untouched.
Unlike y-negation, the remap is store-only:
z' = (z + w) / 2is not itsown inverse, and the scope is the written stage output, so (unlike invert-y)
there is intentionally no read-back/load branch.
source/slang/slang-emit.cpp—remapZOfPositionOutputrewrites everyposition output in the linked module, so the gate must guarantee the module is
a single vertex shader: the pass runs only when
target == CodeGenTarget::GLSL,the
GLSLRemapZoption is set,getEntryPointCount() == 1, and that singleentry point's stage is
Stage::Vertex. This is deliberately strict — amixed-stage whole-program GLSL request (which reaches this path with
getEntryPointCount() > 1, e.g.vertex + mesh) is left untouched rather thanrisk remapping a non-vertex stage's position. The GLSL/vertex gate lives here,
not in the IR pass, so the IR pass stays target/stage-agnostic.
tests/cross-compile/gl-remap-z.slang— Pins the GLSL emit shape (a tempwhose
.zis set from az/wexpression that is divided by 2.0, thenwritten to
gl_Position); a second run that composes-fvk-invert-y -fgl-remap-z, asserting (CHECK-DAG, order-independent) that y is additivelyinverted and z is
(z + w) / 2-remapped — the two transforms touchindependent components and compose cleanly; and a third run without the flag
that asserts the position is written unmodified (no
/ 2.0), locking theoff-by-default opt-in.
Notes for the reviewer
-fgl-remap-zis a proposal scoped to the GLSL target;happy to rename to whatever fits the option-naming convention — not blocking on it.
GLSL-vertex-only opt-in with no effect on existing output, intended to be easy
to cherry-pick per the request on the issue.
mesh/other stages are unaffected. Behavior for those stages is unchanged.
so this change was not compiled locally — final build and test verification
is left to CI. To compensate, every touched API was checked against HEAD
(
CodeGenContext::getEntryPointCount/getEntryPoint/getSingleEntryPointIndex,EntryPoint::getStage,Stage::Vertex,SLANG_PASS, theIRBuilderemitsignatures, the ABI append), and the GLSL swizzleSet emit shape the test pins
was sanity-checked against the existing
-fvk-invert-youtput from a prebuiltslangc. C++ changes passclang-formatlocally.Closes #11599