Integer multiply-add instructions

Unlike the floating-point case, where the fused-vs-unfused distinction creates complications (PR #79), the integer case has no downside to using a single-instruction multiply-add. These instructions are vital to getting above 50% of peak performance in key use cases such as matrix multiplication.

In general, these instructions will support different combinations of bit-widths for the accumulator and for the multiplication operands.
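To make the bit-width point concrete, here is a minimal scalar sketch of what one such combination could compute: 16-bit operands widened and accumulated into 32-bit lanes. The function name `mul_add_i16_to_i32`, the lane count, and the wrapping-on-overflow behavior are all assumptions for illustration, not anything this issue specifies.

```c
#include <stdint.h>

#define LANES 4 /* hypothetical lane count, for illustration only */

/* Hypothetical reference model of a widening multiply-add:
   acc[i] += a[i] * b[i], with 16-bit operands and 32-bit accumulators.
   The product of two int16_t values always fits in int32_t. The add is
   done in unsigned arithmetic so it wraps modulo 2^32 instead of being
   signed-overflow UB (wrapping is an assumption; the conversion back to
   int32_t is two's-complement on mainstream compilers). */
static void mul_add_i16_to_i32(int32_t acc[LANES],
                               const int16_t a[LANES],
                               const int16_t b[LANES]) {
    for (int i = 0; i < LANES; ++i) {
        int32_t product = (int32_t)a[i] * (int32_t)b[i];
        acc[i] = (int32_t)((uint32_t)acc[i] + (uint32_t)product);
    }
}
```

Other combinations (for example 8-bit operands with 16-bit or 32-bit accumulators) would follow the same pattern with different element types.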

The dot-product instructions discussed in PR #127 are a variant of this. We need both those dot-product instructions and general element-wise integer multiply-add.
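The difference between the two forms can be sketched in scalar C. This is only an illustration under assumed shapes: the names `mul_add_elementwise` and `dot_add_pairs`, the 16-bit operand and 32-bit accumulator types, and the pairwise reduction are hypothetical, and the exact semantics proposed in PR #127 may differ.

```c
#include <stdint.h>

/* Wrapping 32-bit accumulate via unsigned arithmetic, which avoids
   signed-overflow UB (wrapping behavior is an assumption here). */
static int32_t wrap_add_i32(int32_t x, int64_t y) {
    return (int32_t)((uint32_t)x + (uint32_t)y);
}

/* Element-wise form: each lane accumulates only its own product,
   so 4 operand lanes feed 4 accumulator lanes. */
static void mul_add_elementwise(int32_t acc[4],
                                const int16_t a[4],
                                const int16_t b[4]) {
    for (int i = 0; i < 4; ++i) {
        acc[i] = wrap_add_i32(acc[i], (int64_t)a[i] * b[i]);
    }
}

/* Dot-product form: products of adjacent operand lanes are summed
   together, so 4 operand lanes feed only 2 accumulator lanes. */
static void dot_add_pairs(int32_t acc[2],
                          const int16_t a[4],
                          const int16_t b[4]) {
    for (int i = 0; i < 2; ++i) {
        int64_t sum = (int64_t)a[2 * i] * b[2 * i]
                    + (int64_t)a[2 * i + 1] * b[2 * i + 1];
        acc[i] = wrap_add_i32(acc[i], sum);
    }
}
```

The element-wise form keeps every lane independent, while the dot-product form folds a horizontal reduction into the same operation; neither can be expressed cheaply in terms of the other, which is why both are requested.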

Note that these instructions are often used in kernels that use nearly all available SIMD registers. That is why not exposing multiply-add instructions in WebAssembly, and relying on the compiler to transform code to use them, would often result in unwanted register spills. In fact, the source code is often tailored to use a specific number of SIMD registers in the first place; not offering a multiply-add instruction to the source, and requiring it to use separate Mul and Add with intermediate registers, would hinder that.

Integer multiply-add instructions #224
