This PR includes initial testing for the SPV_EXT_arithmetic_fence extension.
This test works by performing the calculation `big + small - big`, where `big` is a large floating-point number and `small` is a small floating-point number. Both the add and the subtract have the `FPFastMathMode=Fast` decoration, so absent any arithmetic fences, many implementations will reassociate this calculation to `big - big + small`, and further optimize it to just `small`.
The test puts an arithmetic fence around `big + small`, so the calculation is actually `arithmetic_fence(big + small) - big`. Since the arithmetic fence blocks reassociation, and there is not enough precision to represent `big + small`, the fenced calculation must produce `0`. If any of the calculations produces a nonzero result, then the arithmetic fence was probably ignored, and the test fails.
There are separate test cases for fp32, fp64, and fp16 arithmetic fences, depending on the floating-point types supported by the device. Each test case tests the arithmetic fence on scalars and all supported vector sizes.
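As a rough illustration of the arithmetic the test relies on, here is a minimal host-side sketch in Python. It is not the test code: the names `BIG`, `SMALL`, `fenced`, and `reassociated` are hypothetical, and the values are chosen for IEEE-754 double precision (Python's `float`) rather than taken from the PR.

```python
# Hypothetical illustration only: the real test runs device kernels,
# and these constants are assumptions, not the values used in the PR.
BIG = 2.0 ** 60   # adjacent doubles near 2**60 are 256 apart
SMALL = 1.0       # well under half that spacing, so BIG + SMALL rounds to BIG


def fenced(big: float, small: float) -> float:
    """Evaluate (big + small) - big in source order, as a fence requires."""
    total = big + small   # rounds back to big: small is lost here
    return total - big


def reassociated(big: float, small: float) -> float:
    """Evaluate (big - big) + small, as a fast-math optimizer might."""
    return (big - big) + small


print(fenced(BIG, SMALL))        # 0.0
print(reassociated(BIG, SMALL))  # 1.0
```

The two orderings disagree, which is what lets a nonzero result signal that the fence did not prevent reassociation.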
I'll keep this PR as a draft while we work out how to check whether the SPV_EXT_arithmetic_fence extension is supported.