Integrated matrix extension #2

Closed
joseemoreira wants to merge 3 commits into riscv:integrated-matrix-extension from joseemoreira:integrated-matrix-extension

@joseemoreira
Collaborator

No description provided.

MUL = LMUL / λ², where λ is the K dimension given by the `lambda[2:0]` field in `vtype`.
The C register group may start at any vector register index.
Its register group multiplier CMUL is determined by the tile geometry:
CMUL = VLENE / λ², where λ is the K dimension given by the `lambda[2:0]` field in `vtype`.
Collaborator

I suppose you mean "VLEN" not "VLENE"?

Collaborator Author

I mean VLENE = VLEN/SEW. Isn't that the right terminology? The shape of the tiles is sigma x lambda, where sigma = VLENE/lambda. The multiplier for C is sigma/lambda, so VLENE/lambda^2. After all, for a given lambda, sigma increases as SEW decreases.

Collaborator

I can't find VLENE anywhere in the specification. It looks like the specification always uses VLEN/SEW to denote the number of elements per single register.

I'll merge with this fixup (i.e., replace VLENE with (VLEN/SEW)).

Collaborator Author

Indeed, we talk a lot about VLENE but it does not appear in the spec. VLEN/SEW (which is how I would define VLENE) is the preferred (only) form. Thanks for catching this and merging.
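
For readers following the thread, the relationship discussed above (σ = (VLEN/SEW)/λ and CMUL = (VLEN/SEW)/λ²) can be checked numerically. This is only a sketch: the function name `tile_geometry` and the sample values are illustrative assumptions, λ is taken as an already-decoded integer (the encoding of `lambda[2:0]` is not covered in this conversation), and `Fraction` is used so that no assumption is made about whether the result must be a whole number.

```python
from fractions import Fraction

def tile_geometry(vlen: int, sew: int, lam: int):
    """Return (sigma, cmul) using the formulas quoted in this thread.

    vlen, sew: register and element widths in bits.
    lam: the decoded lambda value (hypothetical input; decoding of the
         lambda[2:0] field is not shown here).
    """
    elems_per_reg = Fraction(vlen, sew)  # VLEN/SEW, elements per register
    sigma = elems_per_reg / lam          # sigma = (VLEN/SEW) / lambda
    cmul = sigma / lam                   # CMUL = (VLEN/SEW) / lambda^2
    return sigma, cmul

# Illustrative values only: for a fixed lambda, sigma grows as SEW shrinks.
for sew in (32, 16, 8):
    sigma, cmul = tile_geometry(vlen=256, sew=sew, lam=2)
    print(f"SEW={sew}: sigma={sigma}, CMUL={cmul}")
```

With these sample inputs, halving SEW doubles both σ and CMUL, which matches the remark that σ increases as SEW decreases for a given λ.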

* The _input matrices_ A and B are stored in vector register groups with element width determined by the instruction:
equal to SEW for non-widening variants, SEW/2 for widening, and SEW/4 for quad-widening variants.
The K dimension of the multiply equals λ for non-widening instructions, 2λ for widening, and 4λ for quad-widening; LMUL scales the A and B register groups along the K dimension only and does not affect C.
equal to SEW for non-packing variants, SEW/2 for double-packing, and SEW/4 for quad-packing variants.
Collaborator

Can we use "widening" to describe the abstract operations and have a separate subsection explaining the concept of "packing"? I.e., the instructions will be widening, but the storage format will be packed?

Collaborator Author

Yes, I think that works. The arithmetic is "widening", whereas the storage format is "packed". I saw your other email on that.
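
As a rough illustration of the widening/packed distinction agreed on here, the quoted spec text gives the input element width (SEW, SEW/2, SEW/4) and the K dimension (λ, 2λ, 4λ) per variant. The sketch below simply tabulates those two rules; the names `WIDENING_FACTOR` and `input_shape` are hypothetical, and λ is again assumed to be an already-decoded integer.

```python
# Factor by which the inputs are narrower than SEW, per the quoted text:
# non-widening -> 1, widening -> 2, quad-widening -> 4.
WIDENING_FACTOR = {"non-widening": 1, "widening": 2, "quad-widening": 4}

def input_shape(sew: int, lam: int, variant: str):
    """Return (input element width in bits, K dimension) for A and B."""
    factor = WIDENING_FACTOR[variant]
    input_width = sew // factor  # SEW, SEW/2, or SEW/4
    k_dim = lam * factor         # lambda, 2*lambda, or 4*lambda
    return input_width, k_dim

# Illustrative values only.
for variant in WIDENING_FACTOR:
    width, k = input_shape(sew=32, lam=2, variant=variant)
    print(f"{variant}: input width={width} bits, K={k}, width*K={width * k}")
```

Under these two rules the product of input width and K stays at SEW·λ bits for every variant, which is one way to read "packed": the narrower inputs fill the same storage while the arithmetic widens to SEW.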

@ptomsich force-pushed the integrated-matrix-extension branch from d5538c4 to 735ba1d on February 23, 2026 11:47
@ptomsich
Collaborator

Manually merged as 305f6aa

@ptomsich closed this on Feb 28, 2026
2 participants