[DNI] Add -fgl-remap-z to remap SV_Position.z for GLSL vertex output #11789

Open: nv-slang-bot[bot] wants to merge 5 commits into master from fix/issue-11599

nv-slang-bot Bot (Contributor) commented Jun 26, 2026

Motivation

Shaders authored against the OpenGL clip-space convention emit SV_Position.z
in the [-w, w] range (NDC depth [-1, 1] after the perspective divide). The
standard convention used by Vulkan/SPIR-V/D3D/Metal — and the default the GLSL
backend targets — is [0, w] (NDC depth [0, 1]). When such a shader is
cross-compiled to textual GLSL for a desktop-GL pipeline that was set up for
the [0, 1] convention (e.g. via glClipControl(... GL_ZERO_TO_ONE) or a host
that does its own depth handling), the depth comes out wrong with no per-shader
knob to correct it.

This is not a DXC-parity gap (D3D, Vulkan, and Metal all already share the
[0, 1] depth range) — it is the one place desktop GL differs, so it needs a
GLSL-scoped opt-in rather than a target-wide behavior change.

Concretely, given:

struct VOutput { float4 v : SV_Position; }
VOutput main() { VOutput o; o.v = float4(1, 2, 3, 4); return o; }

-target glsl ... -fgl-remap-z now emits, just before the write to gl_Position:

vec4 _S1 = output_0.v_0;
_S1.z = (output_0.v_0.z + output_0.v_0.w) / 2.0;
gl_Position = _S1;

i.e. z' = (z + w) / 2, mapping [-w, w] to [0, w].
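
As a quick numeric check of the remap, here is a minimal standalone C++ sketch. It is not Slang compiler code: the `Vec4` struct and the `remapZ` and `checkRemapExamples` helpers are hypothetical names introduced only for illustration. It applies z' = (z + w) / 2 to the sample position above and to both ends of the [-w, w] range.

```cpp
#include <cassert>

// Hypothetical stand-in for a float4 clip-space position.
struct Vec4 { float x, y, z, w; };

// Remap z from the OpenGL range [-w, w] to [0, w]; x, y and w are unchanged.
inline Vec4 remapZ(Vec4 p) {
    p.z = (p.z + p.w) / 2.0f;
    return p;
}

// Worked values: the sample position, plus both ends of the input range.
inline void checkRemapExamples() {
    assert(remapZ(Vec4{1.0f, 2.0f, 3.0f, 4.0f}).z == 3.5f);   // (3 + 4) / 2
    assert(remapZ(Vec4{0.0f, 0.0f, -4.0f, 4.0f}).z == 0.0f);  // z = -w maps to 0
    assert(remapZ(Vec4{0.0f, 0.0f, 4.0f, 4.0f}).z == 4.0f);   // z = +w maps to w
}
```

Calling `checkRemapExamples()` from any `main` exercises all three cases; the values are chosen to be exactly representable in `float`, so the equality comparisons are safe here.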

Proposed solution

Add a new compiler option -fgl-remap-z that runs a small IR fix-up pass on the
position output, directly mirroring the existing -fvk-invert-y /
-fvk-use-dx-position-w machinery. The remap is affine and reads both the z
(index 2) and w (index 3) components of the position vector, so unlike the
single-component y inversion it must operate on the full float4.

Scope is deliberately narrow, per maintainer direction on #11599:

  • Target: textual GLSL only. SPIR-V also passes isKhronosTarget, so the
    pass is gated on target == CodeGenTarget::GLSL inside that block to keep
    SPIR-V/Vulkan untouched.
  • Stage: vertex only. Because the pass rewrites the whole module's position
    outputs, it runs only when codegen is for exactly one entry point and that
    entry point is a vertex shader; any other case (including a mixed-stage
    whole-program request) is skipped. The stage is checked at the scheduling site,
    keeping the IR pass itself stage-agnostic and simple.
  • Direction: [-1, 1] -> [0, 1] only (z' = (z + w) / 2). The reverse and
    bidirectional cases are out of scope.

The pass reuses the existing IRGLPositionOutputDecoration (no new decoration)
and is off by default, so it is a pure opt-in with no effect on existing output.
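
The scoping rules above can be sketched as a single predicate. This is a simplified illustration, not the code in slang-emit.cpp: `Target`, `Stage`, `CodegenRequest`, and `shouldRemapZ` are hypothetical stand-ins for the real Slang types and scheduling logic.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins; the real Slang types carry far more state.
enum class Target { GLSL, SPIRV, HLSL };
enum class Stage { Vertex, Fragment, Mesh };

struct CodegenRequest {
    Target target;
    bool remapZOption;                    // true when -fgl-remap-z was passed
    std::vector<Stage> entryPointStages;  // one stage per selected entry point
};

// Schedule the pass only for textual GLSL, with the option set, and when the
// request holds exactly one entry point that is a vertex shader.
inline bool shouldRemapZ(const CodegenRequest& req) {
    return req.target == Target::GLSL
        && req.remapZOption
        && req.entryPointStages.size() == 1
        && req.entryPointStages.front() == Stage::Vertex;
}

inline void checkGateExamples() {
    assert(shouldRemapZ({Target::GLSL, true, {Stage::Vertex}}));
    assert(!shouldRemapZ({Target::SPIRV, true, {Stage::Vertex}}));              // SPIR-V untouched
    assert(!shouldRemapZ({Target::GLSL, false, {Stage::Vertex}}));              // off by default
    assert(!shouldRemapZ({Target::GLSL, true, {Stage::Vertex, Stage::Mesh}}));  // mixed stages
}
```

Keeping every condition in the scheduling predicate is what lets the pass itself stay unaware of targets and stages.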

Change summary

| File | What |
| --- | --- |
| include/slang.h | Append CompilerOptionName::GLSLRemapZ = 153 immediately before CountOf (ABI-safe append; explicit next value). |
| source/slang/slang-options.cpp | Register the -fgl-remap-z command-line option and route it through the existing boolean-option dispatch group. |
| source/slang/slang-compiler-options.cpp | Round-trip the option in writeCommandLineArgs (boolean group). |
| source/slang/slang-ir-vk-invert-y.{h,cpp} | New remapZOfPositionOutput pass + _remapZOfVector helper computing z' = (z + w) / 2, modeled on invertYOfPositionOutput / _invertYOfVector. |
| source/slang/slang-emit.cpp | Schedule the pass for GLSL + vertex only, inside the Khronos/HLSL position-fixup block. |
| tests/cross-compile/gl-remap-z.slang | FileCheck test for the GLSL emit, including composition with -fvk-invert-y. |
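
To illustrate what "ABI-safe append" means for the enum row above, here is a hypothetical miniature. `OptionName` is not the real declaration from slang.h; only the two value names and the `CountOf` sentinel mirror names quoted in this PR, and whether the real sentinel is implicit is an assumption.

```cpp
// Hypothetical miniature of an append-only public enum.
enum class OptionName : int {
    // ... earlier values elided ...
    TraceCoverageBoolean = 152,
    GLSLRemapZ = 153,  // new value, appended with an explicit number
    CountOf,           // sentinel: one past the last real option
};

// Existing values keep their numbers, so code built against the older header
// still agrees with the library on every option it already knew about.
static_assert(static_cast<int>(OptionName::TraceCoverageBoolean) == 152, "unchanged");
static_assert(static_cast<int>(OptionName::GLSLRemapZ) == 153, "appended");
static_assert(static_cast<int>(OptionName::CountOf) == 154, "sentinel moved up by one");
```

Inserting the new value anywhere other than directly before the sentinel would renumber later options, which is the breakage the append rule avoids.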

Concepts and vocabulary

  • IRGLPositionOutputDecoration — marks the global that lowers to
    gl_Position; both the invert-y and this remap pass find their store target by
    scanning for it (and following getElementPtr chains).
  • Affine z remap — D3D/VK [0, w] <-> GL [-w, w]: z_gl = 2*z - w, inverse
    z_std = (z + w) / 2. This PR implements only the GL->standard inverse.
  • isKhronosTarget — true for both textual GLSL and SPIR-V; the inner
    target == CodeGenTarget::GLSL check is what excludes SPIR-V here.
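
For reference, both formulas in the remap entry follow from rescaling depth after the perspective divide. The short derivation below assumes w > 0; the symbols d_gl and d_std (NDC depth under each convention) are introduced here only for illustration.

```latex
\begin{aligned}
d_{\mathrm{gl}} &= \frac{z_{\mathrm{gl}}}{w} \in [-1, 1], \qquad
d_{\mathrm{std}} = \frac{z_{\mathrm{std}}}{w} \in [0, 1] \\
d_{\mathrm{std}} &= \frac{d_{\mathrm{gl}} + 1}{2}
\;\Longrightarrow\;
z_{\mathrm{std}} = w \cdot \frac{z_{\mathrm{gl}}/w + 1}{2} = \frac{z_{\mathrm{gl}} + w}{2} \\
z_{\mathrm{gl}} &= 2\,z_{\mathrm{std}} - w
\end{aligned}
```

The last line is the second line solved for z_gl, which is the direction this PR does not implement.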

Process report

  • include/slang.h: A new public CompilerOptionName is required to carry
    the boolean. Appended as GLSLRemapZ = 153 (the next value after
    TraceCoverageBoolean = 152) immediately before the CountOf sentinel, per
    the ABI rule for this enum. The historical CountOfParsableOptions = 111
    decoy sentinel is intentionally not touched.
  • source/slang/slang-options.cpp: Adds the option's help-table entry and
    the case OptionKind::GLSLRemapZ: label in the boolean dispatch, alongside
    VulkanInvertY / VulkanUseDxPositionW. OptionKind is a typedef of
    CompilerOptionName, so this is the same enum.
  • source/slang/slang-compiler-options.cpp: Adds GLSLRemapZ to the
    boolean group in writeCommandLineArgs so the option survives serialization,
    matching its siblings.
  • source/slang/slang-ir-vk-invert-y.cpp: _remapZOfVector reads z and w
    with emitSwizzle, computes (z + w) / 2 with emitAdd / emitDiv /
    getFloatValue, and writes z back with emitSwizzleSet on the full vector.
    remapZOfPositionOutput reuses invertYOfPositionOutput's store/getElementPtr
    traversal. Input-shape check: the position output is lowered as a single
    full-float4 store, so the store value is always a vector. Rather than guarding
    against a non-vector store (which would silently mask an impossible shape),
    _remapZOfVector SLANG_ASSERTs the vector invariant, exactly as
    _invertYOfVector already does. A full-vector store is therefore the only
    shape lowering produces, and the producer is correct and left untouched.
    Unlike y-negation, the remap is store-only: z' = (z + w) / 2 is not its
    own inverse, and the scope is the written stage output, so (unlike invert-y)
    there is intentionally no read-back/load branch.
  • source/slang/slang-emit.cpp: remapZOfPositionOutput rewrites every
    position output in the linked module, so the gate must guarantee the module is
    a single vertex shader: the pass runs only when target == CodeGenTarget::GLSL,
    the GLSLRemapZ option is set, getEntryPointCount() == 1, and that single
    entry point's stage is Stage::Vertex. This is deliberately strict: a
    mixed-stage whole-program GLSL request (which reaches this path with
    getEntryPointCount() > 1, e.g. vertex + mesh) is left untouched rather than
    risking a remap of a non-vertex stage's position. The GLSL/vertex gate lives
    here, not in the IR pass, so the IR pass stays target- and stage-agnostic.
  • tests/cross-compile/gl-remap-z.slang: Three FileCheck runs. The first pins
    the GLSL emit shape (a temp whose .z is set from a z/w expression that is
    divided by 2.0, then written to gl_Position). The second composes
    -fvk-invert-y -fgl-remap-z and asserts (CHECK-DAG, order-independent) that
    y is negated and z is remapped to (z + w) / 2; the two transforms touch
    independent components and compose cleanly. The third runs without the flag
    and asserts that the position is written unmodified (no / 2.0), locking in
    the off-by-default opt-in.
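
The claim that the two transforms compose cleanly can be checked with a small standalone sketch. It is illustrative only: `Vec4`, `invertY`, `remapZ`, `sameVec`, and `checkComposition` are hypothetical helpers, not the IR passes. It applies y negation and the z remap in both orders and compares the results.

```cpp
#include <cassert>

// Hypothetical stand-in for a float4 clip-space position.
struct Vec4 { float x, y, z, w; };

inline Vec4 invertY(Vec4 p) { p.y = -p.y; return p; }                   // touches y only
inline Vec4 remapZ(Vec4 p) { p.z = (p.z + p.w) / 2.0f; return p; }      // touches z only

inline bool sameVec(Vec4 a, Vec4 b) {
    return a.x == b.x && a.y == b.y && a.z == b.z && a.w == b.w;
}

inline void checkComposition() {
    Vec4 p{1.0f, 2.0f, 3.0f, 4.0f};
    Vec4 yThenZ = remapZ(invertY(p));
    Vec4 zThenY = invertY(remapZ(p));
    assert(sameVec(yThenZ, zThenY));                 // order does not matter
    assert(yThenZ.y == -2.0f && yThenZ.z == 3.5f);   // both transforms applied
    assert(yThenZ.x == 1.0f && yThenZ.w == 4.0f);    // x and w untouched
}
```

Because neither transform reads a component the other writes, an order-independent check such as CHECK-DAG is a natural fit for the composed run.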

Notes for the reviewer

  • Flag name is open. -fgl-remap-z is a proposal scoped to the GLSL target;
    happy to rename to whatever fits the option-naming convention — not blocking on it.
  • Cherry-pickable / self-contained. This is an additive, off-by-default,
    GLSL-vertex-only opt-in with no effect on existing output, intended to be easy
    to cherry-pick per the request on the issue.
  • Mesh shaders (#5761, "-fvk-invert-y doesn't work on Mesh shader position
    output"): out of scope here. The gate is vertex-stage only, so mesh and
    other stages are unaffected and their behavior is unchanged.
  • Build/verification disclosure: the local build environment is disk-full,
    so this change was not compiled locally — final build and test verification
    is left to CI. To compensate, every touched API was checked against HEAD
    (CodeGenContext::getEntryPointCount/getEntryPoint/getSingleEntryPointIndex,
    EntryPoint::getStage, Stage::Vertex, SLANG_PASS, the IRBuilder emit
    signatures, the ABI append), and the GLSL swizzleSet emit shape the test pins
    was sanity-checked against the existing -fvk-invert-y output from a prebuilt
    slangc. C++ changes pass clang-format locally.

Closes #11599

nv-slang-bot Bot added 3 commits June 26, 2026 21:17
Adds a new compiler flag `-fgl-remap-z` that remaps SV_Position.z from
the OpenGL clip-space depth range [-w, w] to the standard [0, w] range
via z' = (z + w) / 2, for shaders authored against the OpenGL [-1, 1]
NDC depth convention. Scoped to the textual GLSL target on the vertex
stage only; reuses the IRGLPositionOutputDecoration and mirrors the
-fvk-invert-y position-fixup machinery.

Closes #11599

- slang-emit.cpp: iterate getEntryPointIndices() and resolve each via
  getEntryPoint(index). getEntryPointCount() counts selected indices, so a
  bare position index was the wrong argument to getEntryPoint() for any
  non-first selected entry point.
- slang-ir-vk-invert-y.cpp: remove the silent non-vector store skip in
  remapZOfPositionOutput; the position output always lowers to a full-float4
  store, so _remapZOfVector's existing SLANG_ASSERT is the loud check
  (matching invertYOfPositionOutput), rather than masking an impossible shape.
  Document that the pass is store-only because the affine remap is not involutive.
- gl-remap-z.slang: add a no-flag run asserting the position is written
  unmodified, locking the off-by-default opt-in.

remapZOfPositionOutput rewrites every position output in the linked module,
so the scheduling gate must guarantee the module is a lone vertex shader.
Whole-program GLSL codegen reaches linkAndOptimizeIR with
getEntryPointCount() > 1 (emitEntryPointsSourceFromIR), where a mixed
vertex+mesh/geometry request would previously pass the "any entry point is
vertex" check and then remap non-vertex position outputs too. Tighten the
gate to getEntryPointCount() == 1 && that single entry point is Stage::Vertex;
leave mixed-stage whole-program requests untouched.

Also tighten the FileCheck/COMPOSE patterns to assert the / 2.0 halving so
the test cannot pass on a bare (z + w) without the scale factor.
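
The entry-point lookup fix in the second commit above (resolving each entry of getEntryPointIndices() instead of passing a loop position to getEntryPoint()) can be illustrated with a hypothetical miniature. `Program`, `selectedByPosition`, and `selectedByIndex` are invented for this sketch and do not reflect the real CodeGenContext API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature: a program with several entry points, of which only
// some are selected for the current codegen request.
struct Program {
    std::vector<std::string> entryPoints;  // all entry points, by program index
    std::vector<int> selectedIndices;      // program indices chosen for codegen
};

// Buggy pattern: treats the loop position 0..count-1 as a program index.
inline std::vector<std::string> selectedByPosition(const Program& p) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < p.selectedIndices.size(); ++i)
        out.push_back(p.entryPoints[i]);
    return out;
}

// Fixed pattern: resolves each selected index through the index list.
inline std::vector<std::string> selectedByIndex(const Program& p) {
    std::vector<std::string> out;
    for (int index : p.selectedIndices)
        out.push_back(p.entryPoints[static_cast<std::size_t>(index)]);
    return out;
}

inline void checkLookup() {
    Program p{{"vsMain", "fsMain", "csMain"}, {0, 2}};
    assert(selectedByPosition(p)[1] == "fsMain");  // wrong: fsMain was not selected
    assert(selectedByIndex(p)[1] == "csMain");     // right: second selected entry
}
```

With the selection {0, 2}, both patterns agree on the first entry point and diverge on the second, which matches the commit's note that only non-first selected entry points were affected.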
jkwak-work self-assigned this Jun 26, 2026
nv-slang-bot Bot added the "pr: non-breaking" label (PRs without breaking changes) Jun 26, 2026
jkwak-work marked this pull request as ready for review June 26, 2026 22:08
jkwak-work requested a review from a team as a code owner June 26, 2026 22:08
jkwak-work requested review from bmillsNV and removed request for a team June 26, 2026 22:08
jkwak-work changed the title from "Add -fgl-remap-z to remap SV_Position.z for GLSL vertex output" to "[DNI] Add -fgl-remap-z to remap SV_Position.z for GLSL vertex output" Jun 26, 2026

jkwak-work (Collaborator) left a comment

Looks good to me.
But we will not merge this to ToT, because this is just a one-off patch for anybody who wants to try it out.

Slang doesn't support the legacy behavior of GLSL.

nv-slang-bot Bot (Contributor, Author) commented Jun 26, 2026

Understood, and thank you for the review and the clear direction. 👍

Acknowledging the disposition: this stays a one-off, cherry-pickable reference for anyone who wants to try the OpenGL [-w, w] -> [0, w] depth remap on the textual-GLSL path, and it is not intended for ToT, since Slang does not support the legacy GLSL depth convention. The change is deliberately additive, off-by-default, and GLSL-vertex-only, so it has no effect on existing output and remains easy to cherry-pick as-is.

Leaving the PR open as that reference unless you'd prefer it closed — happy to close on your word.

🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify.

State the real precondition -- a gated GLSL vertex gl_Position is not read
back within the shader after the output write -- as the justification for
omitting invertYOfPositionOutput's IRLoad branch, instead of appealing to
non-involutivity (which only explains why _remapZOfVector cannot be reused
for a read-back, not why the omission itself is correct). Comment-only; no
functional change.
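
To make the involution point in this commit message concrete, here is a small standalone sketch. The helpers `invertY`, `remapZ`, `unremapZ`, and `checkInvolution` are hypothetical and are not the IR-level functions. It shows that y negation undoes itself when applied twice, while the z remap does not and would need a separate inverse for any read-back.

```cpp
#include <cassert>

// Hypothetical stand-in for a float4 clip-space position.
struct Vec4 { float x, y, z, w; };

inline Vec4 invertY(Vec4 p) { p.y = -p.y; return p; }
inline Vec4 remapZ(Vec4 p) { p.z = (p.z + p.w) / 2.0f; return p; }
// Inverse of remapZ: maps [0, w] back to [-w, w].
inline Vec4 unremapZ(Vec4 p) { p.z = 2.0f * p.z - p.w; return p; }

inline void checkInvolution() {
    Vec4 p{1.0f, 2.0f, 3.0f, 4.0f};
    assert(invertY(invertY(p)).y == 2.0f);   // negation is its own inverse
    assert(remapZ(remapZ(p)).z == 3.75f);    // (3.5 + 4) / 2, not the original 3
    assert(unremapZ(remapZ(p)).z == 3.0f);   // 2 * 3.5 - 4 recovers the original
}
```

This only demonstrates why the same helper could not serve a load path; as the commit message notes, the omission of that path rests on gl_Position not being read back after the output write.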
nv-slang-bot Bot (Contributor, Author) commented Jun 26, 2026

/regenerate-cmdline-ref

nv-slang-bot Bot (Contributor, Author) commented Jun 27, 2026

CI status note (for anyone cherry-picking this reference):

The only failing check is check-cmdline-ref (and its check-ci rollup) — not a code defect. It's that the auto-generated docs/command-line-slangc-reference.md wasn't regenerated after -fgl-remap-z was added to the slangc -help table, so the CI diff-check trips.

One-line fix for anyone with a build:

slangc -help-style markdown -h > docs/command-line-slangc-reference.md

(or /regenerate-cmdline-ref from a maintainer whose account has slash-command dispatch rights).

I couldn't run it here — no local build (disk-full, disclosed above), and the bot's /regenerate-cmdline-ref comment did not dispatch the workflow. I'm deliberately not hand-editing the auto-generated file, since without a build I can't diff-verify a byte-exact match and a near-miss would keep the check red. All correctness/build/test checks are unaffected; this is purely doc staleness.

🤖 Generated by an automated Slang coworker — may be inaccurate. A human maintainer should verify.

jkwak-work (Collaborator) commented

/regenerate-cmdline-ref

slangbot (Contributor) commented

🌈 Regenerated command line reference, please merge the changes from this PR

Labels: pr: non-breaking (PRs without breaking changes)

Projects: None yet

Development: successfully merging this pull request may close the issue "Clip-space z remapping option".

2 participants