Add quick-tune support for Attention #2169
base: develop
Conversation
mlir/include/mlir/Dialect/Rock/Tuning/QuickTuningPerfconfigs.inc
nit: should the attention/gemm+gemm entries get their own file?
#ifdef Attn_LOOKUP_TABLE_GEN
{"gfx900_attention_f32", {PopulateParamsAttn::initParametersAttentionGfx900, PopulateParamsAttn::nInitParametersAttentionGfx900}},
I guess gemm+gemm kernels will temporarily use the attention quick-tuning list, but eventually we'll have a dedicated list for gemm_gemm (and potentially conv+gemm)? We already have a tier1-gemmgemm list.
Yes, the ticket is in progress https://github.com/ROCm/rocMLIR-internal/issues/2019
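To make the shape of the generated table concrete, here is a minimal, self-contained C++ sketch of the lookup pattern suggested by the entry above: a "<arch>_<op>_<dtype>" key maps to a pointer/count pair over a static array of initial perf-configs. The field names, table type, and fallback behavior are illustrative assumptions, not rocMLIR's actual definitions.

```cpp
#include <cstddef>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for the per-config payload; real perf-configs
// carry more tuning fields than this.
struct InitParamsAttn {
  int mPerBlock, nPerBlock, kPack;
};

// Stand-in for PopulateParamsAttn::initParametersAttentionGfx900.
static const InitParamsAttn initParamsAttnGfx900[] = {
    {64, 64, 4},
    {128, 64, 8},
};

// Mirrors the {pointer, count} pair in the generated table entry.
struct QuickTuneList {
  const InitParamsAttn *params;
  std::size_t nParams;
};

static const std::map<std::string, QuickTuneList> quickTuneTable = {
    {"gfx900_attention_f32",
     {initParamsAttnGfx900,
      sizeof(initParamsAttnGfx900) / sizeof(InitParamsAttn)}},
};

int main() {
  auto it = quickTuneTable.find("gfx900_attention_f32");
  if (it == quickTuneTable.end())
    return 1; // no quick-tune list: the caller would fall back to full tuning
  for (std::size_t i = 0; i < it->second.nParams; ++i)
    std::cout << it->second.params[i].mPerBlock << "x"
              << it->second.params[i].nPerBlock << "\n";
  return 0;
}
```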
- Retire InitParamsNonAccel and InitParamsAccel
- Combine MfmaGemmParamsAttr and WmmaGemmParamsAttr into AccelGemmParamsAttr (see the sketch below)
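As a hedged sketch of that merge: rocMLIR's actual AccelGemmParamsAttr is an MLIR attribute (defined in TableGen), so this C++ struct only illustrates the idea of one parameter set shared by the MFMA and WMMA lowering paths. All field names here are assumptions, not the real attribute definition.

```cpp
#include <cstdint>

// Hypothetical discriminator: which accelerator instruction family the
// lowering should emit for this GEMM.
enum class AccelKind { Mfma, Wmma };

// One parameter struct for every accelerated GEMM path, replacing the
// separate MFMA- and WMMA-specific types; the lowering selects MFMA or
// WMMA instructions from `kind` rather than from the parameter type.
struct AccelGemmParams {
  AccelKind kind;
  uint32_t mPerBlock;
  uint32_t nPerBlock;
  uint32_t kpackPerBlock;
  uint32_t mPerWave;
  uint32_t nPerWave;
  bool forceUnroll;
};
```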
Motivation
Implement proper quick tuning support for Attention operations.
Technical Details
Resolves https://github.com/ROCm/rocMLIR-internal/issues/1887
Resolves https://github.com/ROCm/rocMLIR-internal/issues/1447
Test Plan
Test Result
Submission Checklist