
[WebGPU EP] Add EINSUM implementation #24358


Open · wants to merge 16 commits into main

Conversation


@feich-ms feich-ms commented Apr 9, 2025

Description

This PR adds a native implementation of the einsum operator, based on and expanded from the existing einsum.ts. All test cases in einsum_test.cc pass.

The equation attribute of the einsum op is a string consisting of a left-hand side (LHS) and, optionally, a right-hand side (RHS), separated by '->'. Examples:

  • "ij->ji" matrix transpose
  • "ii->i" diagonal elements of a square matrix
  • "ij->" sum over all elements of a matrix
  • "ij,jk->ik" explicit matrix multiplication
  • "ij,jk" implicit matrix multiplication
  • "ij,jk->" matrix multiplication followed by a sum over all elements
  • "ij,jk,kl->il" chained multiplication of three matrices
  • "...ij,...jk->...ik" batched matmul with broadcasting
  • ",...i->...i" element-wise multiplication of a tensor by a scalar
  • "abc,cd->abc" keep the original abc shape, but multiply by cd and sum along d
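The equations above can be checked against numpy's einsum, which this operator mirrors; a minimal sketch for a few of them:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)

# "ij->ji": matrix transpose
assert np.array_equal(np.einsum("ij->ji", a), a.T)

# "ij->": sum over all elements of a matrix
assert np.einsum("ij->", a) == a.sum()

# "ij,jk->ik": explicit matrix multiplication
assert np.array_equal(np.einsum("ij,jk->ik", a, b), a @ b)

# "...ij,...jk->...ik": batched matmul with broadcasting over leading dims
x = np.random.rand(5, 2, 3)
y = np.random.rand(5, 3, 4)
assert np.allclose(np.einsum("...ij,...jk->...ik", x, y), x @ y)
```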

The LHS consists of a sequence of terms separated by commas; each term corresponds to one input tensor.
Each symbol in a term corresponds to a dimension of that input: a letter 'a' to 'z' or 'A' to 'Z', '...' to represent an arbitrary number of dimensions, or an empty term to represent a scalar.

A missing or empty RHS is handled differently in implicit vs. explicit mode.

  • Implicit mode - the equation contains no arrow. For example, "ij,jk" is equivalent to "ij,jk->ik", i.e. a matrix multiplication.
  • Explicit mode - the equation contains an arrow. For example, "ij,jk->" involves two steps: first a matrix multiplication, as in implicit mode, then a reduction of the resulting matrix to a scalar by summing all its elements.
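The two modes can be demonstrated side by side with numpy; a small sketch:

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones((3, 4))

# Implicit mode: no arrow. The repeated index j is summed over, and the
# surviving indices form the output in alphabetical order, giving "ik".
implicit = np.einsum("ij,jk", a, b)
assert np.array_equal(implicit, a @ b)

# Explicit mode with an empty RHS: the matmul result is further reduced
# to a scalar by summing all elements.
explicit = np.einsum("ij,jk->", a, b)
assert explicit == (a @ b).sum()
```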

For all the test cases, please refer to einsum_test.cc.

@satyajandhyala satyajandhyala added the ep:WebGPU ort-web webgpu provider label Apr 9, 2025
@feich-ms feich-ms force-pushed the user/feich-ms/migrate_einsum_op_to_native branch from ce7ce7f to 62da03b Compare April 11, 2025 01:27
@feich-ms feich-ms marked this pull request as ready for review April 18, 2025 03:22
@feich-ms
Author

@satyajandhyala @xiaofeihan1 @qjia7 @guschmue @fs-eire, please help to review, thanks.

input_tensors.push_back(context.Input<Tensor>(i));
}

EinsumEquation equation(input_tensors, equation_);
Contributor

The equation_ is an attribute, which means it is constant and does not change across calls to Einsum::ComputeInternal in this context. Is there a way to move some of the preparation steps into the kernel constructor so that we don't have to redo them on every call?

Author

Thanks for the suggestion. I dived deep into the logic, and it looks like it's not easy to move the equation parsing into the kernel constructor. The reason is that the parsing heavily depends on the input shapes, in particular on the input ranks, to handle equations with "..." such as "...ij,...jk->...ik": when parsing the equation from left to right, we need the input ranks to fill in the ellipsis dimensions. Another input-dependent case is an equation like "ij,jk", where the output shape has to be filled in implicitly. What we could move into the kernel constructor is only the term-string parsing, which is low cost; we would still have to loop over the parsed terms and process each one in ComputeInternal. So I think the change brings little improvement for the effort involved. What do you think?
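The rank dependence described above can be illustrated with a small sketch. The helper below is hypothetical (not from the PR); it only shows why the number of axes covered by "..." in a term cannot be known until the input's rank is available at compute time:

```python
def resolve_ellipsis(term: str, rank: int) -> list:
    """Expand the '...' in one einsum term into synthetic batch labels.

    The count of batch axes is rank minus the number of explicit letters,
    so it can only be computed once the input shape is known.
    (Hypothetical illustration; not the PR's actual parsing code.)
    """
    letters = [c for c in term if c != '.']
    n_batch = rank - len(letters)
    batch = [f"batch{i}" for i in range(n_batch)]
    if "..." in term:
        head, tail = term.split("...")
        return list(head) + batch + list(tail)
    return letters

# The same term "...ij" expands differently depending on the input rank:
assert resolve_ellipsis("...ij", 4) == ["batch0", "batch1", "i", "j"]
assert resolve_ellipsis("...ij", 2) == ["i", "j"]
```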

3 participants