add int8/tf32 transpose A copy traits #319

Open
taozha2 wants to merge 7 commits into base: sycl-develop from zt/transA

taozha2
Collaborator

@taozha2 taozha2 commented Apr 21, 2025

No description provided.

@taozha2
Collaborator Author

taozha2 commented Apr 21, 2025

@aacostadiaz @joeatodd This PR adds int8 and tf32 transpose copy traits support for both A and B. All transpose cases for bf16/fp16/int8/uint8/tf32 in https://github.com/taozha2/cutlass-fork/blob/zt/gemm_layout_data_type/examples/sycl/pvc/pvc_gemm.cpp#L432 pass (this MUST be run on the latest driver, https://ubit-gfx.intel.com/build/21406574 or later).
However, as I discussed with @aacostadiaz before, the latest code base has a regression in int8/tf32 transpose GEMM support. I think all transpose cases should pass once that regression is fixed.
Please review this PR and merge it.

@taozha2 taozha2 changed the title from "add int8/tf32 transpose A intrinsic" to "add int8/tf32 transpose A copy traits" Apr 21, 2025
@taozha2
Collaborator Author

taozha2 commented May 13, 2025

@aacostadiaz @mehdi-goli can you merge this PR?

@taozha2 taozha2 force-pushed the zt/transA branch 2 times, most recently from ac617ff to 7a9d570 May 19, 2025 03:21
@t4c1 t4c1 mentioned this pull request May 27, 2025
@t4c1
Collaborator

t4c1 commented May 28, 2025

@taozha2 can you merge in the sycl-develop branch?

@taozha2 taozha2 force-pushed the zt/transA branch 2 times, most recently from b0f693b to 23e2f44 May 28, 2025 09:29
@taozha2 taozha2 marked this pull request as draft May 28, 2025 09:30
@taozha2 taozha2 force-pushed the zt/transA branch 3 times, most recently from f157825 to 174e71e May 29, 2025 01:45
@taozha2 taozha2 marked this pull request as ready for review May 29, 2025 01:52
@taozha2 taozha2 requested review from t4c1 and mehdi-goli May 29, 2025 01:52
@taozha2
Collaborator Author

taozha2 commented May 29, 2025

> @taozha2 can you merge in the sycl-develop branch?

@mehdi-goli @aacostadiaz @t4c1 I have rebased the PR; it is ready for review.

Collaborator

@joeatodd joeatodd left a comment

I would prefer to avoid the _cacheopts_ variant since we don't use it elsewhere, but if it's needed for some reason then 👍


7 participants