[WebGPU EP] Add EINSUM implementation #24358
Conversation
@satyajandhyala @xiaofeihan1 @qjia7 @guschmue @fs-eire, please help to review, thanks.
  input_tensors.push_back(context.Input<Tensor>(i));
}

EinsumEquation equation(input_tensors, equation_);
The equation_ is an attribute, which means it is constant and will not change across calls to Einsum::ComputeInternal in this context. Is there a way to move some of the preparation steps into the kernel constructor so that we don't have to repeat this work on every call?
Thanks for the suggestion. I dug into the logic, and it looks like it's not easy to move the equation parsing into the kernel constructor. The reason is that the parsing heavily depends on the input shapes, in particular on the input ranks, to handle equations containing "...", like ...ij,...jk->...ik. When parsing the equation from left to right, we need to know each input's rank to fill up the ellipsis dimensions. Another case that depends heavily on the inputs is an equation like "ij,jk", where the output shape must be filled in implicitly. The only part we could move into the kernel constructor is the term string parsing, which is low cost; we would still have to loop over the parsed term strings and process each term in ComputeInternal. So I think these changes bring little improvement. What do you think?
Description
This PR adds a native implementation of the einsum operator, based on and expanded from the existing einsum.ts. All the test cases in einsum_test.cc pass.
The equation attribute of the einsum op is a string consisting of a left-hand side (LHS) and, optionally, a right-hand side (RHS), separated by '->'.
LHS consists of a sequence of terms separated by commas; each term corresponds to an input variable.
Each symbol in a term corresponds to a dimension in that input variable. A symbol can be a letter ('a' to 'z' or 'A' to 'Z'), '...' to represent arbitrary dimensions, or empty to represent a scalar.
An empty RHS is handled differently in implicit vs. explicit mode.
For all the test cases, please refer to einsum_test.cc.