This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Integer multiply-add instructions #224

Open
@bjacob

Unlike the float case, where the fused-vs-unfused issue creates complications (PR #79), the integer case has no downside to using a single-instruction multiply-add. These instructions are vital to getting above 50% of peak performance in key use cases such as matrix multiplication.

In general, these instructions will support different combinations of bit-widths for the accumulator and for the multiplication operands.
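To make the bit-width point concrete, here is a scalar reference sketch of one possible combination: 16-bit operands accumulated into 32-bit lanes. The function name, the lane count, and the choice of wraparound on accumulator overflow are assumptions made for illustration, not a proposed instruction definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lane count: eight 16-bit operands per vector. */
enum { LANES = 8 };

/* Scalar model of an element-wise widening multiply-add:
   acc[i] += a[i] * b[i], with 16-bit operands and 32-bit accumulators. */
void mul_add_i16_to_i32(int32_t acc[LANES],
                        const int16_t a[LANES],
                        const int16_t b[LANES]) {
    for (size_t i = 0; i < LANES; ++i) {
        /* Widen before multiplying so the product keeps all of its bits;
           the product of two int16 values always fits in int32. */
        int32_t product = (int32_t)a[i] * (int32_t)b[i];
        /* Add in unsigned arithmetic so that accumulator overflow wraps
           (an assumption) instead of being undefined behavior in C. */
        acc[i] = (int32_t)((uint32_t)acc[i] + (uint32_t)product);
    }
}
```

Other combinations (for example 8-bit operands with 16-bit or 32-bit accumulators) would follow the same shape with different types.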

A variant of this is the dot-product instructions discussed in PR #127. We need both these dot-product instructions, and general element-wise integer multiply-add.
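For contrast with the element-wise form, a dot-product-style variant can be modeled in scalar code as follows. This is a rough sketch: the pairing of adjacent lanes, the name `dot_add_i16x8_to_i32x4`, and the wraparound behavior are assumptions for illustration and are not taken from PR #127.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shapes: eight 16-bit inputs reduce to four 32-bit lanes. */
enum { IN_LANES = 8, OUT_LANES = 4 };

/* Scalar model of a dot-product multiply-add: each output lane accumulates
   the sum of products of one adjacent pair of input lanes. */
void dot_add_i16x8_to_i32x4(int32_t acc[OUT_LANES],
                            const int16_t a[IN_LANES],
                            const int16_t b[IN_LANES]) {
    for (size_t i = 0; i < OUT_LANES; ++i) {
        int32_t p0 = (int32_t)a[2 * i] * (int32_t)b[2 * i];
        int32_t p1 = (int32_t)a[2 * i + 1] * (int32_t)b[2 * i + 1];
        /* Unsigned addition so overflow wraps (an assumption). */
        acc[i] = (int32_t)((uint32_t)acc[i] + (uint32_t)p0 + (uint32_t)p1);
    }
}
```

The element-wise form keeps one product per lane, while this form reduces across neighboring lanes, which is why the two are not interchangeable in a kernel.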

Note that these instructions are often used in kernels that use nearly all available SIMD registers. That is why not exposing multiply-add instructions in WebAssembly, and relying on the compiler to transform code to use them, would often result in unwanted register spills. In fact, the source code is often tailored to use a specific number of SIMD registers in the first place; not offering a multiply-add instruction to the source, and requiring it to use separate Mul and Add with intermediate registers, would hinder that.
