
webgpu: two optimizations for the subgroup Gemm/MatMul kernels: #29271

Draft
xhcao wants to merge 1 commit into microsoft:main from xhcao:double-shared-memory


xhcao (Contributor) commented Jun 26, 2026

1. Double-buffer the B tile in workgroup memory when the element type is float16.
2. Load A as vec4 when K % 4 == 0.
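
For readers unfamiliar with the two techniques, here is a minimal CPU-side sketch in Python. It is not the code from this PR, which targets WGSL shaders. The function name `matmul_tiled`, the `tile_k` parameter, and the helper names are all hypothetical, and the sketch ignores subgroups and the float16 condition entirely. It only illustrates the control flow: two buffer slots for B tiles used in ping-pong fashion, and a 4-wide load path for A that is taken only when K is divisible by 4 (this sketch additionally assumes the tile depth is a multiple of 4).

```python
def matmul_tiled(a, b, m, n, k, tile_k=4):
    """Compute C = A @ B on row-major flat lists (illustrative only).

    a has m*k entries, b has k*n entries. Emulates, sequentially:
      * ping-pong (double) buffering of the B tile, and
      * 4-wide loads of A when K % 4 == 0.
    """
    # The 4-wide path needs every tile to start on a multiple of 4 in A's row.
    use_vec4 = (k % 4 == 0) and (tile_k % 4 == 0)
    num_tiles = (k + tile_k - 1) // tile_k

    def load_b_tile(t):
        # Copy rows [t*tile_k, (t+1)*tile_k) of B; rows past K are zero-padded.
        tile = []
        for kk in range(t * tile_k, (t + 1) * tile_k):
            tile.append(b[kk * n:(kk + 1) * n] if kk < k else [0.0] * n)
        return tile

    def load_a(row, k0, count):
        # Return `count` consecutive values of A from `row`, starting at column k0.
        base = row * k + k0
        if use_vec4 and count % 4 == 0:
            out = []
            for i in range(0, count, 4):
                out.extend(a[base + i:base + i + 4])  # stands in for one vec4 load
            return out
        return [a[base + i] for i in range(count)]    # scalar fallback

    c = [0.0] * (m * n)
    # Two slots stand in for the two workgroup-memory copies of the B tile.
    buffers = [load_b_tile(0), None]
    for t in range(num_tiles):
        cur = t % 2
        if t + 1 < num_tiles:
            # Fill the idle slot with the next tile while `cur` is in use.
            buffers[1 - cur] = load_b_tile(t + 1)
        k0 = t * tile_k
        count = min(tile_k, k - k0)
        for row in range(m):
            a_vals = load_a(row, k0, count)
            for kk in range(count):
                b_row = buffers[cur][kk]
                for col in range(n):
                    c[row * n + col] += a_vals[kk] * b_row[col]
    return c
```

On a GPU, the point of the second slot is that writes for tile t+1 go to memory that no invocation is still reading for tile t, which generally allows fewer barriers per iteration and lets loads overlap with compute. On the CPU the sketch runs sequentially, so it shows only the indexing, not the speedup.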

Description

Motivation and Context

