INF-0006 - Long Vector Execution Test Plan

  • Authors: Alex Sepkowski (alsepkow)
  • Sponsors: Alex Sepkowski (alsepkow)
  • Status: Accepted
  • Impacted Projects: DXC

Introduction

This test plan covers testing all HLSL intrinsics that can take long vectors as parameters. More specifically, it covers only the test scenarios that get coverage from a graphics driver supporting DXIL.

These tests will verify that all DXIL opcodes and LLVM instructions which can be reached using valid HLSL in SM 6.9 can compile, run, and produce correct output when given long and native vector inputs. They will not verify that the generated DXIL is vectorized.

All tests are to be included in the HLK test binary which ships with the OS. This test binary is built only in the OS repo and is based on the ExecutionTests source code in the DXC repo. There is a script in the WinTools repo which generates and annotates the HLK tests.

We break coverage down into five test categories.

  1. Implement DXIL OpCode tests:

    • At the bottom of this document there are tables containing all HLSL operators (more on those in '3. HLSL Operator tests') and HLSL intrinsics that can be used with long vectors. The HLSL intrinsics tables have DXIL OpCode and LLVM Instruction columns. These columns contain each intrinsic's mapped DXIL OpCodes and LLVM instructions. All intrinsics have at least one DXIL OpCode or one LLVM instruction.

      Many intrinsics have trivial mappings. Atan is an example of an intrinsic with a trivial mapping. Other intrinsics have multiple DXIL OpCodes. Some intrinsics will use all listed DXIL OpCodes and/or LLVM instructions, while others will have additional logic which determines which OpCodes/Instructions are used. If an intrinsic relies on additional logic to determine which OpCodes/Instructions are used then the OpCode/Instructions will be enclosed in '[]' brackets. The sign intrinsic is an example of an intrinsic with additional logic. If an OpCode/Instruction is not enclosed in '[]' then it is used in all paths for that intrinsic.

  2. Implement LLVM Instruction tests:

    • These are the test cases for the LLVM Instructions listed in the table at the bottom of this document.
    • Because we will use HLSL intrinsics to get coverage for the DXIL OpCode tests, we expect to get most of the coverage needed for the LLVM Instruction tests as a side effect. After implementing the DXIL OpCode tests we should be able to do a coverage audit and amend test cases, or write simple additional ones, as needed.
    • Just as in '1. Implement DXIL OpCode tests', some cases list multiple instructions, and '[]' brackets are used in the same manner.
    • Additional OpCodes/Instructions are logic based (i.e. float or int specific).
  3. HLSL Operator tests:

    • The HLSL Operators table in this document lists the HLSL operators which can take long vectors as arguments.
    • Many of these operators can and will get coverage by default in the DXIL OpCode tests. But we will audit coverage and amend test cases, or write simple additional ones, as needed.
  4. Standard loading and storing of long vectors

    • These could be covered in test categories 1 and 2. But I propose we break out individual cases to ensure that we have more granular coverage.
    • Ensure we have some basic tests doing standard loading/storing of long vectors across the buffer types listed in 'Buffer types to test' and the element types listed in 'Vector element data types to test'.
    • Additionally, the above buffer types and data types should be tested by loading from a ResourceDescriptorHeap.
  5. 'Creative' test cases:

Buffer types to test

  • Raw Buffers (Byte Address Buffers)
  • Structured Buffers (StructuredBuffer<T>)

Vector element data types to test

Testing will cover the following vector element data types:

  • bool, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float16_t, float32_t, and float64_t.

Note on packed data types: packed_int16_t and packed_uint16_t do not require HLK coverage. Although tests for these types were originally planned (see DXC issue #7683), we determined that DXIL represents packed data types as 32-bit types. Because no distinct DXIL-level behavior exists for them, dedicated HLK coverage is unnecessary.

Vector sizes and alignments to test

General sizes to test are in the range [3, 1024]. It is worth noting that the new form of rawBufferLoad will be updated to vectorize sizes < 5.

Test sizes

  • vector<TYPE, 3> : Testing one below previous vector limit. Early testing found some issues here so it was added.
  • vector<TYPE, 4> : Previous limit.
  • vector<TYPE, 5> : Testing one above previous vector limit.
  • vector<TYPE, 16> : This size of 'vector' previously only appeared as matrices.
  • vector<TYPE, 17> : Larger than any vector previously possible.
  • vector<TYPE, 35> : Arbitrarily picked.
  • vector<TYPE, 100> : Arbitrarily picked.
  • vector<TYPE, 256> : Arbitrarily picked.
  • vector<TYPE, 1024> : The new max size of a vector.
  • These sizes will be tested across Vector element data types to test

Some notable alignment cases

  • 128-bit boundaries : Memory access in Shader Model 5.0 and earlier operates on 128-bit slots aligned on 128-bit boundaries. For example, vector<half, 7>, vector<half, 8>, and vector<half, 9> occupy 112 bits, 128 bits, and 144 bits respectively. This boundary will also be tested with 32-bit and 64-bit element types.

  • Most GPUs operate on at least 32 bits at once, so what happens when a vector uses 16-bit values and an odd number of elements? Could accessing the last element expose issues where the next variable is overwritten because alignment was assumed?

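The boundary arithmetic above can be checked with a few lines of C++. This is a minimal sketch rather than code from the DXC test sources: `VectorBits`, `EndsInPartial32BitWord`, `BoundaryCounts`, and `CountsAround128Bits` are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Total size in bits of a vector with `count` elements of `elementBits` each.
constexpr std::size_t VectorBits(std::size_t elementBits, std::size_t count) {
  return elementBits * count;
}

// True when the last element leaves a partially filled 32-bit word, which is
// the case for 16-bit elements with an odd element count.
constexpr bool EndsInPartial32BitWord(std::size_t elementBits,
                                      std::size_t count) {
  return VectorBits(elementBits, count) % 32 != 0;
}

// Element counts one below, exactly on, and one above the first 128-bit
// boundary for a given element width (hypothetical helper).
struct BoundaryCounts {
  std::size_t below;
  std::size_t on;
  std::size_t above;
};

inline BoundaryCounts CountsAround128Bits(std::size_t elementBits) {
  assert(elementBits > 0 && 128 % elementBits == 0);
  const std::size_t on = 128 / elementBits;
  return BoundaryCounts{on - 1, on, on + 1};
}

// The 16-bit example from the text: 7, 8, and 9 halves.
static_assert(VectorBits(16, 7) == 112, "one element short of the boundary");
static_assert(VectorBits(16, 8) == 128, "exactly on the boundary");
static_assert(VectorBits(16, 9) == 144, "one element past the boundary");
static_assert(EndsInPartial32BitWord(16, 7), "odd count of 16-bit elements");
static_assert(!EndsInPartial32BitWord(16, 8), "even count fills whole words");
```

For 32-bit elements `CountsAround128Bits` yields 3, 4, and 5 elements, which coincide with the three smallest entries in the test sizes list. For 64-bit elements it yields 1, 2, and 3.
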
High level test design

  1. The tests will leverage the XML infrastructure already used by the existing execution tests. That infrastructure follows a two-file design pattern:

    • 1st XML: Used to define shader source code and metadata about that shader code. This XML file is parsed using a private class. This private class helps facilitate creation of D3D resources and execution of the shader.
    • 2nd XML: Describes metadata about the specific test cases. Used by the TAEF infrastructure for TAEF Data Driven Testing.
  2. Test inputs will be hard-coded in a C++ header file. This was chosen over defining inputs in the second XML file because it is cleaner and easier to parse for different data types. The C++ header approach also avoids repeating the data set in the XML for each individual test case. Inputs will use 'value sets', which will typically be much smaller than the desired vector test size. Values will be repeated cyclically until the vector is full. For example, a value set {1, 2, 3} used to populate a vector<int, 1024> will produce the pattern <1, 2, 3, 1, 2, 3, ...>, repeating the sequence until all 1024 elements are filled. This approach provides predictable test data while keeping input definitions manageable.

  3. Expected outputs are computed for each test case at run time.

  4. All new long vector test code is factored out into its own files.

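The cyclic value-set fill and the run-time computation of expected outputs described in items 2 and 3 could look roughly like the following. This is a sketch under stated assumptions, not the actual test code: `FillFromValueSet` and `ComputeExpected` are hypothetical helper names, and a real harness would also handle multi-input operations and tolerance-based comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fills a vector of `size` elements by cycling through a
// smaller value set, e.g. {1, 2, 3} -> <1, 2, 3, 1, 2, 3, ...>.
template <typename T>
std::vector<T> FillFromValueSet(const std::vector<T> &valueSet,
                                std::size_t size) {
  assert(!valueSet.empty() && "value set must contain at least one value");
  std::vector<T> result;
  result.reserve(size);
  for (std::size_t i = 0; i < size; ++i)
    result.push_back(valueSet[i % valueSet.size()]);
  return result;
}

// Hypothetical helper: computes expected outputs at run time by applying a
// CPU reference function to every input element.
template <typename T, typename Fn>
std::vector<T> ComputeExpected(const std::vector<T> &input, Fn reference) {
  std::vector<T> expected;
  expected.reserve(input.size());
  for (const T &value : input)
    expected.push_back(reference(value));
  return expected;
}
```

Calling `FillFromValueSet<int>({1, 2, 3}, 1024)` produces 1024 elements whose last element is 1, because index 1023 is a multiple of 3. Computing expected values from the filled input, rather than storing them, keeps the header small and avoids maintaining a second data set per vector size.
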
Implementation phases

Do the test work in two simple phases.

  1. Implement and validate (locally against WARP) the tests for all test categories.
  2. HLK related work:
  • Add a SM 6.9 HLK requirement. Includes updating the HLK requirements doc.
  • Update mm_annotate_shader_op_arith_table.py to annotate the new test cases with HLK GUIDs and requirements.
  • Add new tests to HLK playlist

Shipping

Note that because DXC and the Agility SDK are both undocked from Windows it is our normal operating behavior for the HLK tests to become available with a later TBD OS release. The good news is that this doesn't prevent the tests from being available much earlier in the DXC repo. It just means that they are simply TAEF tests in the DXC repo. An HLK test includes an extra level of infrastructure for test gating, selection, and result submission for WHQL signing of drivers.

  1. Tests will be shared privately with IHVs along with the latest DXC and latest Agility SDK for testing and validation. IHVs will also be able to build and run the tests from the public DXC repo themselves. If needed Microsoft can share further instructions when the tests are available.

  2. The tests will ship with the HLK at a TBD date in a later OS release.

Test Validation Requirements

The following statements must be true and validated for this work to be considered completed.

  • All new test cases pass when run locally against a WARP device
  • All new test cases must verify applicable outputs for correctness.
  • All new test cases are confirmed to be present in HLK Studio and selectable to be run when a target device satisfies the HLK ShaderModel 6.9 requirement.
  • All new tests/test cases are added to the official WHQL HLK playlist for the OS release that the HLK tests will ship with.
  • Tests will be annotated to show which DXIL OpCodes, LLVM Instructions, and HLSL operators they are intended to get coverage for.

Notes

  • Private test binaries/collateral will be shared with IHVs for validation purposes. This will enable IHVs to verify long vector functionality without waiting for an OS/HLK release.

HLSL-Operators

✅ - Means there was an explicit test case implemented for the intrinsic.

☑️ - Means the intrinsic gets coverage via other intrinsics. For example 'exp2' just uses the DXIL Opcode for Exp.

HLSL Operators

These operators generate LLVM instructions which use vectors.

Operator table from Microsoft HLSL Operators

| Completed | Operator Name | Operator | Notes |
|---|---|---|---|
| | Addition | + | |
| | Subtraction | - | |
| | Multiplication | * | |
| | Additive and Multiplicative Operators | +, -, *, /, % | |
| | Array Operator | [i] | llvm:ExtractElementInst OR llvm:InsertElementInst |
| ☑️ | Assignment Operators | =, +=, -=, *=, /=, %= | |
| | Bitwise Operators | ~, <<, >>, &, \|, ^ | Only valid on int and uint vectors |
| ☑️ | Bitwise Assignment Operators | <<=, >>=, &=, \|=, ^= | Only valid on int and uint vectors |
| | Boolean Math Operators | &&, \|\|, ?: | |
| | Cast Operator | (type) | No direct operator, difference in GetElementPointer or load type |
| | Comparison Operators | <, >, ==, !=, <=, >= | |
| ☑️ | Prefix or Postfix Operators | ++, -- | |
| ☑️ | Unary Operators | !, -, + | |

Mappings of HLSL Intrinsics to DXIL OpCodes or LLVM Instructions

Trigonometry

| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | acos | Acos | | Unary | range: -1 to 1 |
| | asin | Asin | | Unary | range: -pi/2 to pi/2. Floating point types only. |
| | atan | Atan | | Unary | range: -pi/2 to pi/2. |
| | cos | Cos | | Unary | no range requirements. |
| | cosh | Hcos | | Unary | no range requirements. |
| | sin | Sin | | Unary | no range requirements. |
| | sinh | Hsin | | Unary | no range requirements. |
| | tan | Tan | | Unary | no range requirements. |
| | tanh | Htan | | Unary | no range requirements. |
| | atan2 | Atan | FDiv, FAdd, FSub, FCmpOLT, FCmpOEQ, FCmpOGE, FCmpOLT, And, Select | Binary | Not required. |

Math

| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | abs | [IMax], [FAbs] | | Unary | IMax for ints. FAbs for floats. |
| | ceil | Round_pi | | Unary | |
| | exp | Exp | | Unary | |
| | floor | Round_ni | | Unary | |
| | fma | Fma | | Ternary | All three inputs are of the same type. Any inputs that are long vectors must have the same number of dimensions. |
| | frac | Frc | | Unary | |
| | frexp | | FCmpUNE, SExt, BitCast, And, Add, AShr, SIToFP, Store, And, Or | Unary | Has a return value in addition to an output parameter. |
| ☑️ | ldexp | Exp | FMul | Binary | Not required. Covered by floating point multiplication and exp. |
| ☑️ | lerp | | FSub, FMul, FAdd | Ternary | Not required. FSub, FMul, and FAdd are all well covered. |
| | log | Log | FMul | Unary | |
| | mad | IMad | | Ternary | |
| | max | IMax | | Binary | |
| | min | IMin | | Binary | |
| ☑️ | pow | [Log, Exp] | [FMul], [FDiv] | Binary | Not required. Ops well covered by other tests. |
| ☑️ | rcp | | FDiv | Unary | Not required. Covered by floating point division. |
| | round | Round_ne | | Unary | |
| | rsqrt | Rsqrt | | Unary | |
| | sign | | ZExt, Sub, [ICmpSLT], [FCmpOLT] | Unary | |
| ☑️ | smoothstep | Saturate | FMul, FSub, FDiv | Ternary | |
| | sqrt | Sqrt | | Unary | |
| ☑️ | step | | FCmpOLT, Select | Binary | |
| | trunc | Round_z | | Unary | |
| ☑️ | clamp | FMax, FMin, [UMax, UMin], [IMax, IMin] | | Ternary | |
| ☑️ | exp2 | Exp | | Unary | Not required. Covered by exp. |
| ☑️ | log10 | Log | FMul | Unary | Not required. Covered by log. |
| ☑️ | log2 | Log | | Unary | Not required. Covered by log. |

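The sign entry above illustrates the '[]' notation: ZExt and Sub are used on every path, while the comparison is ICmpSLT for integers and FCmpOLT for floats. A CPU reference for computing expected values might mirror that shape. This is a sketch assuming a compare, zero-extend, and subtract lowering, which is one plausible reading of the listed instructions; `SignReference` and `SignReferenceVector` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Reference sign(): two "less than" comparisons, each widened to an integer
// (the ZExt step), followed by a subtraction (the Sub step). The element type
// T decides whether the comparison is an integer or a floating point compare,
// which corresponds to the bracketed [ICmpSLT] / [FCmpOLT] entries.
template <typename T>
int SignReference(T x) {
  static_assert(std::is_arithmetic<T>::value,
                "sign is defined for numeric types");
  const int isPositive = (T(0) < x) ? 1 : 0;
  const int isNegative = (x < T(0)) ? 1 : 0;
  // A NaN input fails both comparisons and therefore yields 0.
  return isPositive - isNegative;
}

// Element-wise application over a long vector's worth of inputs.
template <typename T>
std::vector<int> SignReferenceVector(const std::vector<T> &input) {
  std::vector<int> result;
  result.reserve(input.size());
  for (const T &value : input)
    result.push_back(SignReference(value));
  return result;
}
```

Because the float and int paths differ only in the comparison, a test that runs the same value set through both an integer and a floating point element type exercises both bracketed instructions.
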
Float Ops

| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | f16tof32 | LegacyF16ToF32 | | Unary | N/A. Legacy ops omitted from long vector support. |
| | f32tof16 | LegacyF32ToF16 | | Unary | N/A. Legacy ops omitted from long vector support. |
| | isfinite | IsFinite | | Unary | |
| ✅ | isinf | IsInf | | Unary | |
| | isnan | IsNan | | Unary | |
| | modf | Round_z | FSub, Store | Unary | Has a return value and an output value. |
| ☑️ | fmod | FAbs, Frc | FDiv, FNeg, FCmpOGE, Select, FMul | Binary | |

Bitwise Ops

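| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | saturate | Saturate | | Unary | |
| | reversebits | Bfrev | | Unary | |
| | countbits | Countbits | | Unary | |
| | firstbithigh | FirstbitSHi | | Unary | |
| | firstbitlow | FirstbitLo | | Unary | |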

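Since expected outputs are computed at run time, bit-manipulation intrinsics such as countbits and reversebits need CPU reference implementations. The following is a minimal sketch for 32-bit elements only; `CountBitsReference` and `ReverseBitsReference` are hypothetical names, and other element widths would need their own variants.

```cpp
#include <cassert>
#include <cstdint>

// Reference for countbits on a 32-bit element: number of set bits.
inline std::uint32_t CountBitsReference(std::uint32_t value) {
  std::uint32_t count = 0;
  while (value != 0) {
    count += value & 1u;
    value >>= 1;
  }
  return count;
}

// Reference for reversebits on a 32-bit element: bit 0 swaps with bit 31,
// bit 1 with bit 30, and so on.
inline std::uint32_t ReverseBitsReference(std::uint32_t value) {
  std::uint32_t reversed = 0;
  for (int bit = 0; bit < 32; ++bit) {
    reversed = (reversed << 1) | (value & 1u);
    value >>= 1;
  }
  return reversed;
}
```

Simple loops are used instead of compiler builtins so the reference stays portable and easy to audit against the shader results.
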
Logic Ops

| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | select | | Select, [ExtractElement, InsertElement] | Ternary | |
| | and | | And, [ExtractElement, InsertElement] | Binary | Not required. Covered by select. |
| | or | | Or, [ExtractElement, InsertElement] | Binary | Not required. Covered by select. |

Reductions

| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | all | | [FCmpUNE], [ICmpNE], [ExtractElement, And] | Unary | |
| | any | | [FCmpUNE], [ICmpNE], [ExtractElement, Or] | Unary | |
| | dot | | ExtractElement, Mul | Binary | |

Derivative and Quad Operations

| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | ddx | DerivCoarseX | | Unary | |
| | ddx_fine | DerivFineX | | Unary | |
| | ddy | DerivCoarseY | | Unary | |
| | ddy_fine | DerivFineY | | Unary | |
| ☑️ | fwidth | QuadReadLaneAt | | Unary | |
| | QuadReadLaneAcrossX | QuadOp | | Unary | |
| | QuadReadLaneAcrossY | QuadOp | | Unary | Uses different QuadOp parameters leading to different behavior. |
| | QuadReadLaneAcrossDiagonal | QuadOp | | Unary | Uses different QuadOp parameters leading to different behavior. |
| ☑️ | ddx_coarse | DerivCoarseX | | Unary | Not required. Covered by ddx. |
| ☑️ | ddy_coarse | DerivCoarseY | | Unary | Not required. Covered by ddy. |

WaveOps

| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | WaveActiveBitAnd | WaveActiveBit | | Binary | |
| | WaveActiveBitOr | WaveActiveBit | | Binary | |
| | WaveActiveBitXor | WaveActiveBit | | Binary | |
| | WaveActiveProduct | WaveActiveOp | | Binary | |
| | WaveActiveSum | WaveActiveOp | | Binary | |
| | WaveActiveMin | WaveActiveOp | | Binary | |
| | WaveActiveMax | WaveActiveOp | | Binary | |
| | WaveMultiPrefixBitAnd | WaveMultiPrefixOp | | Binary | |
| | WaveMultiPrefixBitOr | WaveMultiPrefixOp | | Binary | |
| | WaveMultiPrefixBitXor | WaveMultiPrefixOp | | Binary | |
| | WaveMultiPrefixProduct | WaveMultiPrefixOp | | Binary | |
| | WaveMultiPrefixSum | WaveMultiPrefixOp | | Binary | |
| | WavePrefixSum | WavePrefixOp | | Binary | |
| | WavePrefixProduct | WavePrefixOp | | Binary | |
| | WaveReadLaneAt | WaveReadLaneAt | | Binary | |
| | WaveReadLaneFirst | WaveReadLaneFirst | | Unary | |
| | WaveActiveAllEqual | WaveActiveAllEqual | | Unary | |
| | WaveMatch | WaveMatch | | Unary | |

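Wave intrinsics depend on the values held by other lanes, so their expected outputs have to be computed per lane as well as per element. The sketch below shows one way a reference for WavePrefixSum could be written, assuming every lane in the wave is active and each lane holds a vector of the same length. `WavePrefixSumReference` is a hypothetical name and the lane layout is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// lanes[l][i] is element i of the vector held by lane l. The result for lane
// l is the element-wise sum over all lanes with a smaller index, so lane 0
// receives all zeros (an exclusive prefix sum).
template <typename T>
std::vector<std::vector<T>>
WavePrefixSumReference(const std::vector<std::vector<T>> &lanes) {
  std::vector<std::vector<T>> result;
  if (lanes.empty())
    return result;

  const std::size_t width = lanes.front().size();
  std::vector<T> running(width, T(0));
  result.reserve(lanes.size());
  for (const std::vector<T> &lane : lanes) {
    assert(lane.size() == width && "all lanes must hold equally sized vectors");
    result.push_back(running); // sum of the lanes seen so far
    for (std::size_t i = 0; i < width; ++i)
      running[i] += lane[i];
  }
  return result;
}
```

With three lanes holding <1, 2>, <3, 4>, and <5, 6>, the reference yields <0, 0>, <1, 2>, and <4, 6>. Handling inactive lanes would require an additional mask input, which is left out here.
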
Type Casting Operations

| Completed | Intrinsic | DXIL OpCode | LLVM Instruction | Basic Op Type | Notes |
|---|---|---|---|---|---|
| | asdouble | MakeDouble | | Binary | |
| | asfloat | | BitCast | Unary | |
| | asfloat16 | | BitCast | Unary | |
| | asint | | BitCast | Unary | |
| | asint16 | | BitCast | Unary | |
| | asuint (from double) | SplitDouble | | Binary | Returns void. Has two output arguments. Converts double to two uints. |
| | asuint (bitcast) | | BitCast | Unary | Bitcast from float/int to uint. |
| | asuint16 | | BitCast | Unary | |
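
The asdouble and asuint (from double) rows are inverses of each other: one assembles a double from two 32-bit halves and the other splits it back. A CPU reference for both can be sketched as follows. `SplitDoubleReference`, `MakeDoubleReference`, and `DoubleHalves` are hypothetical names, and the sketch assumes the host uses a 64-bit IEEE 754 double.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The two 32-bit halves of a double's bit pattern.
struct DoubleHalves {
  std::uint32_t low;
  std::uint32_t high;
};

// Reference for asuint(double, out low, out high).
inline DoubleHalves SplitDoubleReference(double value) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  std::uint64_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits)); // bit-exact copy, no conversion
  return DoubleHalves{static_cast<std::uint32_t>(bits & 0xFFFFFFFFu),
                      static_cast<std::uint32_t>(bits >> 32)};
}

// Reference for asdouble(low, high).
inline double MakeDoubleReference(std::uint32_t low, std::uint32_t high) {
  const std::uint64_t bits = (static_cast<std::uint64_t>(high) << 32) | low;
  double value = 0.0;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}
```

As a spot check, 1.0 has the bit pattern 0x3FF0000000000000, so it splits into a low half of 0 and a high half of 0x3FF00000. Round-tripping a value through both functions should reproduce it exactly.
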