@@ -222,12 +222,12 @@ The matrix is loaded from a raw-buffer, **matrix resource**, starting at
222222** matrix offset** . The ** matrix interpretation** argument specifies the element
223223type of the matrix (see [ Type Interpretations] ), no conversion is performed.
224224The ** matrix M dimension** and ** matrix K dimension** arguments specify the
225- dimensions of the matrix. The** matrix layout** argument specifies the layout
225+ dimensions of the matrix. The ** matrix layout** argument specifies the layout
226226of the matrix (see [ Matrix Layouts] ). If the ** matrix transpose** is non-zero
227227then the matrix is transposed before performing the multiply (see
228228[ Matrix Transpose] ). For row-major and column-major layouts, ** matrix
229229stride** specifies the number of bytes to go from one row/column to the next.
230- For optimal layouts, ** matrix stride** is ignored .
230+ For optimal layouts, ** matrix stride** must be zero .
231231
232232Only non-packed interpretations are valid for matrices.
233233
@@ -279,7 +279,7 @@ that the input vector is an unsigned integer.
279279 following ` ComponentType ` s: ` I16 ` , ` U16 ` , ` I32 ` , ` U32 ` , ` F16 ` , ` F32 ` , ` U8 ` ,
280280 ` I8 ` , ` F8_E4M3 ` , ` F8_E5M2 ` ,
281281
282- ### Vector Outer Product
282+ ### Vector-Vector Outer Product and Accumulate
283283
284284#### Syntax
285285
@@ -312,8 +312,11 @@ The two input vectors are specified via **input vector 1** and **input vector
312312
313313The matrix is accumulated to the writeable raw-buffer specified by ** matrix
314314resource** , with ** matrix offset** , ** matrix interpretation** , ** matrix
315- layout** , and ** matrix stride** behaving as described
316- [ above] ( #matrix-vector-multiply-and-multiply-add-operations ) .
315+ layout** , and ** matrix stride** behaving as described
316+ [ above] ( #matrix-vector-multiply-and-multiply-add-operations ) .
317+
318+ Note that ** matrix layout** must be ` DXILMatrixLayout::OuterProductOptimal ` for
319+ this operation. ** matrix stride** must be 0 (for optimal layouts).
317320
318321The base address of ** matrix resource** and ** matrix offset** must be 128-byte
319322aligned. Also note that the size of the underlying allocation is guaranteed to
@@ -322,8 +325,6 @@ row/column of the matrix is valid memory. Implementations may write to the
322325contents of the padding between the end of the matrix and the 16-byte boundary,
323326so developers should not use this padding space for anything else.
324327
325- The ** matrix stride** is 16-byte aligned.
326-
327328Not all combinations of vector element type and matrix interpretations are
328329supported by all implementations. [ CheckFeatureSupport] can be used to
329330determine which combinations are supported. A list of combinations that are
@@ -336,6 +337,8 @@ guaranteed to be supported on all implementations can be found in
336337 following ` ComponentType ` s: ` I16 ` , ` U16 ` , ` I32 ` , ` U32 ` , ` F16 ` , ` F32 ` , ` U8 ` ,
337338 ` I8 ` , ` F8_E4M3 ` , ` F8_E5M2 ` ,
338339
340+ * ** matrix layout** must be ` DXILMatrixLayout::OuterProductOptimal `
341+
339342
340343### Vector Accumulate
341344
@@ -572,8 +575,9 @@ enum class DXILMatrixLayout : uint {
572575```
573576
574577Optimal layouts are opaque implementation specific layouts, the D3D call
575- `ConvertLinearAlgebraMatrix` can be used to convert the *Matrix* to an
576- optimal layout. Row-Major and Column-Major layouts are also supported.
578+ `ConvertLinearAlgebraMatrix` can be used to convert the *Matrix* to an optimal
579+ layout. Row-Major and Column-Major layouts are also supported. **matrix
580+ stride** must be zero for optimal layouts.
577581
578582
579583### Matrix Transpose
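
The constraints this change introduces (a zero **matrix stride** for optimal layouts, and `DXILMatrixLayout::OuterProductOptimal` as the only layout accepted by the outer product operation) can be sketched as host-side validation checks. This is a minimal sketch, not part of the spec: `MatrixLayout`, `isOptimalLayout`, `isValidStride`, and `isValidOuterProductTarget` are hypothetical names, and every enumerator other than `OuterProductOptimal`, along with all numeric values, is an assumption.

```cpp
#include <cstdint>

// Hypothetical mirror of DXILMatrixLayout. Only OuterProductOptimal is named
// in the text above; the other enumerators and all values are assumptions.
enum class MatrixLayout : uint32_t {
  RowMajor = 0,
  ColumnMajor = 1,
  MulOptimal = 2,
  OuterProductOptimal = 3,
};

// True for the opaque, implementation-specific layouts.
constexpr bool isOptimalLayout(MatrixLayout layout) {
  return layout == MatrixLayout::MulOptimal ||
         layout == MatrixLayout::OuterProductOptimal;
}

// Optimal layouts require a stride of zero. For row-major and column-major
// layouts the stride is a byte count, which this sketch does not constrain.
constexpr bool isValidStride(MatrixLayout layout, uint32_t strideBytes) {
  return !isOptimalLayout(layout) || strideBytes == 0;
}

// Outer product accumulation: the layout must be OuterProductOptimal, the
// stride must be zero, and the matrix offset must be 128-byte aligned.
// The base address of the resource must also be 128-byte aligned; that is
// not checked here because it is not visible at this level.
constexpr bool isValidOuterProductTarget(MatrixLayout layout,
                                         uint32_t strideBytes,
                                         uint64_t offsetBytes) {
  return layout == MatrixLayout::OuterProductOptimal && strideBytes == 0 &&
         offsetBytes % 128 == 0;
}
```

Under these assumptions, `isValidStride(MatrixLayout::MulOptimal, 64)` is rejected, while `isValidOuterProductTarget(MatrixLayout::OuterProductOptimal, 0, 256)` is accepted.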