<li>Comparisons ignore the sign of 0 (so +0 equals -0).</li>
353
353
<LI>The comparison NE, when either or both operands is NaN, returns TRUE. </li>
354
354
<LI>Comparisons of any non-NaN value against +/- INF return the correct result. </li>
355
+
<li>min(x,NaN) == min(NaN,x) == x (same for max). This used to be a deviation from IEEE 754 but now aligns with IEEE-754-2019's minimumNumber and maximumNumber operations.</li>
355
356
</ul>
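<p>The comparison and min/max rules listed above can be sketched as a small reference model. This is only an illustration under stated assumptions: the helper names <code>d3d_min</code> and <code>d3d_max</code> are hypothetical, Python floats are 64-bit rather than 32-bit, and the ordering of signed zeros inside min/max is not modeled.</p>

```python
import math


def d3d_min(a: float, b: float) -> float:
    """Hypothetical model of min(): a NaN operand is ignored if the other is a number."""
    if math.isnan(a):
        return b  # if b is also NaN, a NaN is returned
    if math.isnan(b):
        return a
    return a if a < b else b


def d3d_max(a: float, b: float) -> float:
    """Hypothetical model of max(), mirroring d3d_min()."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a > b else b


nan = float("nan")

# Comparisons ignore the sign of zero.
assert 0.0 == -0.0
# NE returns TRUE when either or both operands are NaN.
assert nan != 1.0 and nan != nan
# Non-NaN values compare correctly against +/- INF.
assert -math.inf < 1.0 < math.inf
# min(x, NaN) == min(NaN, x) == x (same for max).
assert d3d_min(1.0, nan) == 1.0 and d3d_min(nan, 1.0) == 1.0
assert d3d_max(1.0, nan) == 1.0 and d3d_max(nan, 1.0) == 1.0
```

<p>Python's built-in <code>min</code> and <code>max</code> are order-dependent when a NaN is involved, which is why the sketch uses explicit NaN checks instead.</p>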
356
357
<h4 id="FP32SpecialCases">Complete Listing of Deviations or Additional Requirements vs. IEEE-754</h4>
357
358
<ul>
@@ -370,20 +371,9 @@ <h4 id="FP32SpecialCases">Complete Listing of Deviations or Additional Requireme
370
371
<li>NaN input to an operation obviously always produces NaN on output, however the exact bit pattern
371
372
of the NaN is not required to stay the same (unless the operation is a raw "mov" instruction which
372
373
does not alter data at all.)</li>
373
-
<p>The IEEE-754R specification for floating point min and max operations states that if one of the inputs to min or max is a
374
-
"quiet" NaN, then the result of the operation is the other parameter. For example:</p>
375
-
<p>min(x,QNaN) == min(QNaN,x) == x (same for max) </p>
376
-
<p>A recent revision of the IEEE-754R specification seems to have adopted a different behavior
377
-
for min and max when one input is a "signaling" SNaN value vs if it was QNaN: </p>
378
-
<p>min(x,SNaN) == min(SNaN,x) == QNaN (same for max)</p>
379
-
<p>This latter change was not in place until after D3D10 had shipped, and even after the D3D11 specifications had become fairly mature and locked down.
380
-
So, even though the intent in general for D3D is to follow the standards for arithmetic: IEEE-754 and IEEE-754R, in this case there is a deviation.
381
-
Future D3D versions may consider relaxing the rules allow either behavior, although compatibility will be a concern in addition having to
382
-
justify the value of distinguishing QNaN vs SNaN in general. As for D3D11, it cannot change behavior here at this point, so it matches D3D10 as follows:</p>
383
-
<p>The arithmetic rules in D3D10+ do not make any distinctions between "quiet" and "signaling" NaN values (QNaN vs SNaN). All "NaN" values are handled the same way.
384
-
In the case of min() and max(), the D3D behavior for any NaN value is like how QNaN is handled in IEEE-754R definition above.
385
-
(For completeness - if both inputs are NaN, any NaN value is returned.)</p>
386
-
<li>Another new IEEE 754R rule is that min(-0,+0) == min(+0,-0) == -0,
374
+
<li>The arithmetic rules in D3D10+ do not make any distinctions between "quiet" and "signaling" NaN values (QNaN vs SNaN). All "NaN" values are handled the same way.</li>
375
+
<li>If both inputs to min() or max() are NaN, any NaN is returned.</li>
376
+
<li>An IEEE 754R rule is that min(-0,+0) == min(+0,-0) == -0,
387
377
and max(-0,+0) == max(+0,-0) == +0, which honor the sign, in contrast
388
378
to the comparison rules for signed zero (stated above). D3D11 recommends the
389
379
IEEE 754R behavior here, but it will not be enforced; it is permissible
This enables early depth culling and depth modification to be used together.</p>
13004
12995
<!--REM-->
13005
12996
<p>Enabling oDepth in a pixel shader disables early z culling. Early depth culling dramatically improves performance when there is medium to significant overdraw.
13006
-
Rather than having the pixel shader arbitrarily change the depth value, the shader could provide information on whether the output depth value is always less than
13007
-
or greater than the rasterizer depth value. In addition to providing the information of that oDepth is always "greater or equal to" or "less or equal to" the
13008
-
rasterizer depth, the shader compiler adds instructions to the shader to guarantee the direction indicated. This allows the depth value to be affected by the
12997
+
Rather than having the pixel shader arbitrarily change the depth value, the shader can make a promise that the output depth value is always less than
12998
+
or greater than the rasterizer depth value. This allows the depth value to be affected by the
13009
12999
shader and allows early depth culling when the declared conservative depth mode and depth comparison mode are compatible.</p>
13010
13000
<!--/REM-->
13011
13001
13012
13002
<p>If a Shader intends to use conservative depth writes, it must be <a href="#inst_ConservativeoDepthDCL">declared</a> statically in the Shader with parameters
13013
13003
<a href="#interpretedvalue_DEPTH_GREATER_EQUAL">SV_DepthGreaterEqual</a> or <a href="#interpretedvalue_DEPTH_LESS_EQUAL">SV_DepthLessEqual</a>.
13014
-
If the shader chooses SV_DepthGreaterEqual or SV_DepthLessEqual, then a guarantee is made that the shader never
13015
-
writes smaller or larger values (respectively) than the rasterizer depth value by inserting instructions that either max or min the desired output depth
13016
-
value with the rasterizer depth. If the desired output value would be in violation of the defined conservative depth type, then the rasterizer depth is used.</p>
13004
+
If the shader chooses SV_DepthGreaterEqual or SV_DepthLessEqual, then the shader promises that it never writes smaller or larger values (respectively) than the
13005
+
rasterizer depth value. Breaking this promise results in undefined behavior.</p>
13017
13006
13018
13007
<p>The valid range is identical to that for standard oDepth.</p>
<p>This instruction enforces the guarantee that the output depth value of the pixel shader is less than or equal to the rasterizer depth value.
13032
-
Now that the value is known to be equal to or in front of the depth values defined by the primitive, then early depth cull can be enabled when the
13033
-
depth comparison mode is "greater" or "greater or equal".</p>
13015
+
<p>If the shader declares the depth output as SV_DepthLessEqual, the system assumes it can enable early depth cull when the depth
13016
+
comparison mode is "greater" or "greater or equal".</p>
13034
13017
13035
13018
<p>Using SV_DepthGreaterEqual and SV_DepthLessEqual is valid with any depth mode, but the early depth cull will be disabled if the knowledge of
13036
-
GreaterEqual/LessEqual is not compatible with the early depth cull optimization. The min/max test against the rasterizer depth always occurs, but the benefits
13037
-
of the guarantee are only useful with the correct depth test mode.
13019
+
GreaterEqual/LessEqual is not compatible with the early depth cull optimization.
13038
13020
</p>
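<p>The compatibility rule between the declared conservative depth mode and the depth comparison mode can be sketched as a lookup. This is a minimal illustration, not how any driver decides: the names <code>ConservativeDepth</code> and <code>early_depth_cull_allowed</code> and the comparison-mode strings are hypothetical. The SV_DepthLessEqual row comes from the text above; the SV_DepthGreaterEqual row is an assumption made by symmetry.</p>

```python
from enum import Enum


class ConservativeDepth(Enum):
    NONE = "oDepth"  # unconstrained depth output, early depth cull disabled
    GREATER_EQUAL = "SV_DepthGreaterEqual"
    LESS_EQUAL = "SV_DepthLessEqual"


# Comparison modes for which a rejection based on the rasterizer depth
# stays valid for whatever depth the shader later writes.
_COMPATIBLE = {
    # Output <= rasterizer depth: a fragment that fails "greater" with the
    # rasterizer depth also fails with any smaller or equal output depth.
    ConservativeDepth.LESS_EQUAL: {"greater", "greater_equal"},
    # Assumed by symmetry; this excerpt does not state it.
    ConservativeDepth.GREATER_EQUAL: {"less", "less_equal"},
}


def early_depth_cull_allowed(mode: ConservativeDepth, depth_func: str) -> bool:
    """Return True if early depth cull may stay enabled for this combination."""
    return depth_func in _COMPATIBLE.get(mode, set())
```

<p>For example, <code>early_depth_cull_allowed(ConservativeDepth.LESS_EQUAL, "less")</code> is false in this model: the declaration is still valid, but it gives no benefit with that depth test.</p>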
13039
13021
13040
-
<h4>Rasterizer Depth Value Used in Clamp</h4>
13041
-
<p>For either clamp described above, RasterizerDepthValue is the centroid depth value if the shader is executing at pixel-frequency.
13042
-
It is enforced by the HLSL compiler that if the shader inputs depth and outputs one of the above clamped depth values,
13043
-
the input depth must be interpolated as linear_noperspective_centroid in pixel-frequency execution (if position is input at all).
13044
-
If the shader does not input position, for pixel-frequency execution the centroid depth is used for conservative depth clamping,
13045
-
and for sample-frequency execution the per-sample depth is used for per-sample conservative depth clamping.</p>
13022
+
<h4>Rasterizer Depth Value Implementations May Use to Clamp Conservative Depth</h4>
13023
+
<p>When the app breaks the promises described above, implementations may choose to provide a defined behavior by clamping to the
13024
+
rasterizer depth. The rasterizer depth value that would be used to clamp against is the centroid depth value if the shader is executing at pixel-frequency
13025
+
or sample depth at sample-frequency.
13026
+
To help enable this, it is enforced by the HLSL compiler that if the shader inputs depth and outputs one of the above depth values,
13027
+
the input depth must be interpolated as linear_noperspective_centroid or linear_noperspective_sample (if position is input at all).</p>
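<p>The optional clamp can be sketched as a max or min against the rasterizer depth. This is a hedged illustration only: the function name <code>clamp_conservative_depth</code> is hypothetical, and implementations are not required to behave this way.</p>

```python
def clamp_conservative_depth(mode: str, shader_depth: float,
                             rasterizer_depth: float) -> float:
    """Hypothetical clamp an implementation might apply to a conservative depth output.

    rasterizer_depth is taken to be the centroid depth at pixel frequency,
    or the per-sample depth at sample frequency.
    """
    if mode == "SV_DepthGreaterEqual":
        # Never let the output fall below the rasterizer depth.
        return max(shader_depth, rasterizer_depth)
    if mode == "SV_DepthLessEqual":
        # Never let the output rise above the rasterizer depth.
        return min(shader_depth, rasterizer_depth)
    # Plain oDepth: the shader value is used unchanged.
    return shader_depth
```

<p>With a rasterizer depth of 0.5, a shader that declares SV_DepthGreaterEqual but writes 0.2 would end up with 0.5 under this sketch, while a write of 0.7 passes through unchanged.</p>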
13028
+
13029
+
<!--REM-->
13030
+
<p>This clamping was originally intended to be performed by either the HLSL compiler or by the driver (spec was ambiguous), avoiding undefined behavior.
13031
+
But it appears tests weren't authored to verify clamping, so it turns out the compiler and many implementations haven't been clamping for years (and it isn't worth starting to clamp now that this was noticed in 2025).
13032
+
13033
+
The compiler enforcement of interpolation mode described here was always present on the assumption that clamping is happening, and it isn't being removed.</p>
13034
+
<!--/REM-->
13046
13035
13047
13036
<p>The purpose for requiring centroid in pixel-frequency execution is that it guarantees the clamp is done against a safe depth value
13048
13037
within the gamut of the covered samples, thus not violating any traditional depth optimizations. More ideal would have been to