[LLVMCPU] Lower explicit workgroup-local allocs through dispatch ABI #24279

Open
benvanik wants to merge 1 commit into main from users/benvanik/batteries-5

@benvanik
Collaborator

Add #iree_codegen.workgroup_local as a codegen memory-space attribute for authored LLVMCPU dispatch allocations that should use HAL workgroup local memory instead of the thread stack.

This deliberately matches the GPU-side semantic split: workgroup/shared memory is represented as memref.alloc in a workgroup memory space, while memref.alloca remains private stack scratch. LLVMCPU uses an IREE-specific memory-space attribute instead of #gpu.address_space<workgroup>, but keeps the same alloc/dealloc ownership model.

The assignment pass is intentionally narrow: it only handles entry-block memref.alloc operations in HAL executable exports, computes byte ranges with the target data layout and ABI alignment, rejects dynamic or unsupported layouts, and refuses to overwrite existing local-memory assignments or predeclared export requirements.
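To make the byte-range computation concrete, here is a minimal standalone sketch of packing static allocations into one local-memory region with per-allocation alignment. It is not the pass from this PR: `LocalAlloc`, `ByteRange`, `alignTo`, and `assignRanges` are hypothetical names, element sizes and alignments are assumed to come from the data layout, and the checks for existing assignments or predeclared export requirements are not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one entry-block allocation.
struct LocalAlloc {
  int64_t elementCount;  // negative stands in for a dynamic shape
  int64_t elementBytes;  // element size, assumed to come from the data layout
  int64_t abiAlignment;  // ABI alignment in bytes, must be a power of two
};

// Byte range assigned to one allocation inside workgroup local memory.
struct ByteRange {
  int64_t offset;
  int64_t length;
};

// Rounds value up to the next multiple of a power-of-two alignment.
inline int64_t alignTo(int64_t value, int64_t alignment) {
  assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Assigns consecutive aligned ranges in declaration order.
// Returns false (and assigns nothing) if any allocation is dynamic or has
// an invalid element size; otherwise fills ranges and totalBytes.
inline bool assignRanges(const std::vector<LocalAlloc>& allocs,
                         std::vector<ByteRange>& ranges,
                         int64_t& totalBytes) {
  std::vector<ByteRange> result;
  int64_t cursor = 0;
  for (const LocalAlloc& alloc : allocs) {
    if (alloc.elementCount < 0 || alloc.elementBytes <= 0) return false;
    cursor = alignTo(cursor, alloc.abiAlignment);
    const int64_t length = alloc.elementCount * alloc.elementBytes;
    result.push_back({cursor, length});
    cursor += length;
  }
  ranges = result;
  totalBytes = cursor;
  return true;
}
```

In this sketch, a 3-byte allocation followed by four 4-byte elements lands at offsets 0 and 4, for a total requirement of 20 bytes; a dynamic shape rejects the whole assignment rather than falling back to another strategy.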

ConvertToLLVM consumes the assigned range by building memref descriptors from the HAL workgroup local-memory pointer. Matching memref.dealloc operations are erased because the storage is owned by the dispatch frame.
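As a rough illustration of what building a descriptor from the local-memory pointer amounts to, the sketch below mirrors the field order of the standard MLIR memref-to-LLVM descriptor (allocated pointer, aligned pointer, offset, sizes, strides) for a rank-1 `f32` buffer. `MemRefDescriptor1D` and `makeLocalMemRef` are hypothetical names, and the real lowering emits LLVM IR rather than C++.

```cpp
#include <cstdint>

// Hypothetical rank-1 f32 descriptor with the same field order as the
// standard MLIR memref descriptor.
struct MemRefDescriptor1D {
  float* allocated;  // pointer as originally obtained
  float* aligned;    // pointer used for element access
  int64_t offset;    // element offset from the aligned pointer
  int64_t size;      // number of elements
  int64_t stride;    // stride in elements
};

// Builds a descriptor that views elementCount floats starting byteOffset
// bytes into the workgroup local-memory region. No allocation happens here:
// the storage is assumed to be owned by the dispatch frame, which is why a
// matching dealloc has nothing to free.
inline MemRefDescriptor1D makeLocalMemRef(uint8_t* localMemory,
                                          int64_t byteOffset,
                                          int64_t elementCount) {
  float* base = reinterpret_cast<float*>(localMemory + byteOffset);
  return {base, base, 0, elementCount, 1};
}
```

The assigned byte offset is folded into the base pointers here, leaving the descriptor's element offset at zero; that is a choice made for the sketch, not something the PR description states.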

@benvanik benvanik added the codegen/llvm LLVM code generation compiler backend label Apr 28, 2026
@benvanik benvanik marked this pull request as ready for review April 28, 2026 03:49
@benvanik benvanik removed request for Max191 and qedawkins April 28, 2026 03:50