[LLVMCPU] Lower explicit workgroup-local allocs through dispatch ABI by benvanik · Pull Request #24279 · iree-org/iree

benvanik · 2026-04-28T03:24:18Z

Add #iree_codegen.workgroup_local as a codegen memory-space attribute for authored LLVMCPU dispatch allocations that should use HAL workgroup local memory instead of the thread stack.

This deliberately matches the GPU-side semantic split: workgroup/shared memory is represented as memref.alloc in a workgroup memory space, while memref.alloca remains private stack scratch. LLVMCPU uses an IREE-specific memory-space attribute instead of #gpu.address_space<workgroup>, but keeps the same alloc/dealloc ownership model.

The assignment pass is intentionally narrow: it only handles entry-block memref.alloc operations in HAL executable exports, computes byte ranges with the target data layout and ABI alignment, rejects dynamic or unsupported layouts, and refuses to overwrite existing local-memory assignments or predeclared export requirements.
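The byte-range computation described above can be sketched roughly as follows. This is a standalone illustration, not the pass itself: `LocalAlloc`, `ByteRange`, `alignUp`, and `assignRanges` are hypothetical names, and the element size and ABI alignment are assumed to have already been queried from the target data layout.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical marker for a dynamic dimension (illustration only).
constexpr int64_t kDynamic = -1;

// Hypothetical stand-in for one entry-block memref.alloc in workgroup-local
// memory. Sizes and alignment are assumed to come from the data layout.
struct LocalAlloc {
  std::vector<int64_t> shape;  // static dimensions, or kDynamic
  uint64_t elementByteSize;    // byte size of one element
  uint64_t abiAlignment;       // ABI alignment in bytes, a power of two
};

// Byte range assigned to one allocation within the local-memory block.
struct ByteRange {
  uint64_t offset;
  uint64_t length;
};

// Rounds value up to the next multiple of a power-of-two alignment.
inline uint64_t alignUp(uint64_t value, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Packs allocations in order, aligning each start offset. Returns nullopt
// when any shape is dynamic, standing in for the pass rejecting the input.
inline std::optional<std::vector<ByteRange>> assignRanges(
    const std::vector<LocalAlloc>& allocs, uint64_t& totalBytes) {
  std::vector<ByteRange> ranges;
  uint64_t cursor = 0;
  for (const LocalAlloc& alloc : allocs) {
    uint64_t count = 1;
    for (int64_t dim : alloc.shape) {
      if (dim < 0) return std::nullopt;  // dynamic layout: reject
      count *= static_cast<uint64_t>(dim);
    }
    cursor = alignUp(cursor, alloc.abiAlignment);
    const uint64_t length = count * alloc.elementByteSize;
    ranges.push_back({cursor, length});
    cursor += length;
  }
  totalBytes = cursor;  // total local memory the export would require
  return ranges;
}
```

With a 3-byte allocation followed by a 4x4 allocation of 4-byte elements aligned to 4, this sketch places the second range at offset 4 and reports 68 bytes in total. The checks for existing assignments and predeclared export requirements are not modeled here.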

ConvertToLLVM consumes the assigned range by building memref descriptors from the HAL workgroup local-memory pointer. Matching memref.dealloc operations are erased because the storage is owned by the dispatch frame.
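As a rough picture of what building a descriptor from the local-memory pointer means, the sketch below mirrors the standard MLIR memref descriptor fields (allocated pointer, aligned pointer, offset, sizes, strides) in plain C++. `MemRefDescriptor` and `makeLocalDescriptor` are hypothetical names, and the choices to fold the byte offset into the base pointer and to use row-major strides are assumptions for illustration, not a statement of what the conversion emits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified model of a ranked memref descriptor (illustration only).
template <typename T, size_t Rank>
struct MemRefDescriptor {
  T* allocated;
  T* aligned;
  int64_t offset;
  std::array<int64_t, Rank> sizes;
  std::array<int64_t, Rank> strides;
};

// Builds a descriptor that views an assigned byte range of the workgroup
// local-memory block. No allocation happens here: the storage belongs to the
// dispatch frame, which is why a matching dealloc has nothing to free.
template <typename T, size_t Rank>
MemRefDescriptor<T, Rank> makeLocalDescriptor(
    std::byte* localMemory, uint64_t byteOffset,
    const std::array<int64_t, Rank>& sizes) {
  assert(localMemory != nullptr);
  T* base = reinterpret_cast<T*>(localMemory + byteOffset);
  MemRefDescriptor<T, Rank> desc;
  desc.allocated = base;
  desc.aligned = base;
  desc.offset = 0;  // assumption: byte offset is folded into the pointer
  desc.sizes = sizes;
  // Row-major (identity layout) strides, innermost dimension first.
  int64_t stride = 1;
  for (size_t i = Rank; i-- > 0;) {
    desc.strides[i] = stride;
    stride *= sizes[i];
  }
  return desc;
}
```

For example, a 4x4 float allocation assigned to byte offset 64 would get a base pointer 64 bytes into the local-memory block and strides of 4 and 1.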


benvanik requested review from MaheshRavishankar and hanhanW April 28, 2026 03:24

benvanik added the codegen/llvm (LLVM code generation compiler backend) label April 28, 2026

benvanik marked this pull request as ready for review April 28, 2026 03:49

benvanik requested review from Max191 and qedawkins as code owners April 28, 2026 03:49

benvanik removed request for Max191 and qedawkins April 28, 2026 03:50

benvanik wants to merge 1 commit into main from users/benvanik/batteries-5