xe: grp_gemm: Add microkernel based implementation of grouped GEMM #4505
base: mzhukova/main/poc-grouped-mem
Only supports power of 2 shapes and f16xf16:f32 types
- Adjust global work size calculation for accurate group partitioning
- Correct m_all calculation to divide by num_groups for per-group M
@@ -0,0 +1,168 @@
#include "dnnl_test_common.hpp"
What is missing from benchdnn that makes this test necessary?
Nothing is missing. I want more control over how the data is initialized so that it's easier to test and debug new configurations. I will probably remove this in the future.
> I want more control over how the data is initialized
Could you clarify, so that I could possibly make it better/easier?
> I will probably remove this in the future
I agree, we would need this removed prior to merging if it is covered by benchdnn.
It's difficult to initialize the input data with known values to identify incorrect results. For example, when I get an incorrect value at a certain location in the destination buffer, I want to find out why. With my own tests I can set my inputs to known values to debug. I don't know of a way to do that with benchdnn.
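As an illustration of the debugging approach described above, here is a minimal host-side sketch. It is not code from this PR and does not use any oneDNN or benchdnn API; the helper names (`make_indexed`, `make_identity`, `ref_gemm`, `first_mismatch`) are hypothetical. The assumption is that one group is an M x K by K x N product with K == N, so multiplying by an identity matrix should reproduce A in the destination and any wrong element can be traced back to exactly one source element.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: row-major rows x cols matrix where each element
// encodes its own flat index. Values wrap at 2048 so they stay exactly
// representable if they are later converted to f16.
std::vector<float> make_indexed(std::size_t rows, std::size_t cols) {
    std::vector<float> m(rows * cols);
    for (std::size_t i = 0; i < m.size(); ++i)
        m[i] = static_cast<float>(i % 2048);
    return m;
}

// Hypothetical helper: row-major n x n identity matrix.
std::vector<float> make_identity(std::size_t n) {
    std::vector<float> m(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i)
        m[i * n + i] = 1.0f;
    return m;
}

// Naive reference GEMM with f32 accumulation: C[m x n] = A[m x k] * B[k x n].
std::vector<float> ref_gemm(const std::vector<float> &a,
        const std::vector<float> &b, std::size_t m, std::size_t n,
        std::size_t k) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
    return c;
}

// Returns the flat index of the first element where the two buffers differ
// by more than tol, or -1 if they match (sizes must be equal).
std::ptrdiff_t first_mismatch(const std::vector<float> &expected,
        const std::vector<float> &actual, float tol = 0.0f) {
    assert(expected.size() == actual.size());
    for (std::size_t i = 0; i < expected.size(); ++i)
        if (std::fabs(expected[i] - actual[i]) > tol)
            return static_cast<std::ptrdiff_t>(i);
    return -1;
}
```

In a real test the `actual` buffer would come from the kernel under test rather than from `ref_gemm`; dividing the returned flat index by N gives the row and the remainder gives the column of the first bad destination element.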
Does --buffer-prefix work?
be4fed9 to 0ea335d
This pull request introduces support for a new "grouped micro GEMM" (General Matrix Multiply) implementation, along with corresponding kernel code, and enhancements to the testing utilities to support grouped memory.