Skip to content

[0035] LinAlg: Add runtime metadata details: SVI0, PSV0, and RDAT#832

Open
tex3d wants to merge 26 commits into
microsoft:mainfrom
tex3d:linalg-runtime-data
Open

[0035] LinAlg: Add runtime metadata details: SVI0, PSV0, and RDAT#832
tex3d wants to merge 26 commits into
microsoft:mainfrom
tex3d:linalg-runtime-data

Conversation

@tex3d
Copy link
Copy Markdown
Collaborator

@tex3d tex3d commented Mar 27, 2026

This change adds:

  • LinearAlgebra feature use flag to the FeatureInfo DXIL container part (SFI0)
  • extended feature use gathering process/rules and details for PSV0 and RDAT DXIL container parts for feature use beyond LinAlg Tier 1 minimum requirements.

Comment thread proposals/0035-linalg-matrix.md Outdated
Comment on lines +1536 to +1537
* If operation is outside minimum requirements, gather and merge detailed usage
information.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we should just always gather the detailed usage, and then only the runtime will need to be taught what the minimum requirements are?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that. It would end up with extra data that doesn't need to be checked, but we could consider this an optimization to reduce container size and perhaps a bit of extra runtime validation cost.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing these two optimizations help with is canonicalization of runtime data, which is important for validation. Serializing unneeded data is an opportunity for more variation in runtime data parts that wouldn't make a difference at runtime.

Currently, the validator does not independently check each element of the runtime data against the module to see whether it's valid. Instead, it expects the runtime data to exactly match what's generated by the validator (with some ordering independence). If we serialize unnecessary data, we have to either keep serializing that unnecessary data in the future or change the way we validate the runtime data to be much more sophisticated for both PSV0 and RDAT.

Comment thread proposals/0035-linalg-matrix.md Outdated
@damyanp damyanp requested a review from jenatali March 27, 2026 03:35
Comment thread proposals/0035-linalg-matrix.md Outdated
struct PSVLinAlgMatrixShapeArrayReference {
uint32_t ShapesIndex; // Index into SemanticIndexTable where array of indexes
// into LinAlgMatrixOperationShape table is located
uint32_t Count;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this Count signify?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The size of the array. For each component type combination, this lists the set of unique dimensions used. The ShapesIndex is an index into the SemanticIndex buffer, which contains indices into the shared Shapes table, and Count is the number of shapes to index.

Perhaps the confusion is the name ShapesIndex, since this is an index into an index array buffer, not directly into the shape records.

uint32_t WaveMatrixMultiplyCount;
uint32_t ThreadGroupMatrixMultiplyCount;
uint32_t OuterProductCount;
uint32_t AccumulateStoreCount;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we're better off encoding the use information as typed unions rather than a bunch of counts followed by separate structure definitions. I know that a typed union could be more data, but it is also trivially easy to generate and parse.

Comment thread proposals/0035-linalg-matrix.md Outdated
The `OperandType` and `ResultType` fields will encode one of the values defined
in the [`DXIL::ComponentType` enumeration](#dxil-enumerations).
```cpp
RDAT_ENUM_START(LinAlgThreadVectorMatrixMultiplyFlag, uint8_t)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure that a bunch of code using DXC's RDAT macros is something we should rely on for specification. This isn't meaningful outside the DXC codebase and really lacks a lot of context.

I'd prefer if we defined the new structures that we're encoding either C-structs (similar to how we've done with PSV0), or as a markdown table something more like:

LinAlgMatrixOperationShape: Size 12 bytes

Offset Type Contents
0 32-bit unsigned integer M dimension (rows in matrix A)
4 32-bit unsigned integer N dimension (columns in matrix B)
8 32-bit unsigned integer K dimension (columns in matrix A / rows in matrix B)

If instead you prefer to define these as C structs, we also shouldn't just have a blob of C struct definitions, we should include explanations of the structure members and what each structure means.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an intermediate step, I've provided macro definitions and comments to clarify the record definitions for now. One reason I don't yet want to switch to a completely different definition structure is that it will be much harder to keep it in sync with the real definitions being used to iterate and verify correctness.

Comment thread proposals/0035-linalg-matrix.md Outdated
Comment on lines +1569 to +1571
1. Do we need to gather more usage information for operations other than the
ones listed above when the matrix isn't captured by one of these operations?
2. Do we need to capture component conversions with CopyConvertMatrix?
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to require all drivers to be able to load and store all possible SM6.10 component types? I'm waffling on whether that seems trivial or onerous.

Copy link
Copy Markdown
Contributor

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few potential clarifications and consistency requests

Comment thread proposals/0035-linalg-matrix.md Outdated
Comment thread proposals/0035-linalg-matrix.md Outdated
* `ThreadVectorMatrixMultiply`
* Iterate `LinAlgMatVecMul`/`LinAlgMatVecMulAdd` calls and gather shapes and
types.
* Matrix operand must be `MatrixLoadFromDescriptor` for transposed flag.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. It sounds like the transposed flag takes a matrix operand and it has to be MatrixLoadFromDescriptor?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had misunderstood this. The Matrix operand for ThreadScope is always MatrixLoadFromDescriptor or Phi right?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to trace the matrix operand back to a MatrixLoadFromDescriptor, otherwise we can't determine the matrix layout, which indicates whether this matrix is considered loaded as transposed for this operation (via MulOptimalTranspose layout). We might need to disallow any complex paths obfuscating this def-use chain, such as phi, at least for now.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the wording. Let me know if it's still confusing.

Comment thread proposals/0035-linalg-matrix.md Outdated
Comment thread proposals/0035-linalg-matrix.md Outdated
Comment thread proposals/0035-linalg-matrix.md Outdated
Comment thread proposals/0035-linalg-matrix.md Outdated
Comment thread proposals/0035-linalg-matrix.md Outdated
Comment thread proposals/0035-linalg-matrix.md Outdated
Comment thread proposals/0035-linalg-matrix.md Outdated
Open questions:

1. Do we need to gather more usage information for operations other than the
ones listed above when the matrix isn't captured by one of these operations?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With Conversion of InputVectors separated out from MatVec, we'll probably have a runtime API for supported conversions and can capture the parameters of that operation here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this API is still TBD, I'll update this as necessary to supply that information. For now, there's an open question:

Do we need to capture component conversions with CopyConvertMatrix?

Which I think should cover the need to sort this out for vector conversions as well. At least that's the intent.

@tex3d tex3d force-pushed the linalg-runtime-data branch 2 times, most recently from caf21ca to 9a66beb Compare April 9, 2026 18:02
@tex3d tex3d force-pushed the linalg-runtime-data branch from 9a66beb to 2b0b5d6 Compare April 9, 2026 18:02
Copy link
Copy Markdown
Collaborator

@inbelic inbelic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits from a first pass through

> 2) Do we need both operand types, or should we expect the operands to be the
> same type?
> 3) What flags do we need?
* Another approach coulld be to follow the RDAT pattern where we have a list of
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
* Another approach coulld be to follow the RDAT pattern where we have a list of
* Another approach could be to follow the RDAT pattern where we have a list of


#define RDAT_ENUM_START(eTy, sTy) enum class eTy : sTy {
#define RDAT_ENUM_VALUE(name, value) name = value,
#define RDAT_ENUM_VALUE_ALIAS(name, value) name = value,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#define RDAT_ENUM_VALUE_ALIAS(name, value) name = value,

seems unused

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

8 participants