
Conversation

@wenxie-amd (Contributor) commented Nov 4, 2025

Add two parameters, both disabled by default:

HSA_KERNARG_POOL_SIZE: With many small operators, the default kernel-argument (kernarg) pool can run out of space, forcing kernel launches to wait and significantly increasing launch latency. Raising the pool size helps most in workloads that launch a large number of small kernels.
ENABLE_NUMA_BINDING: When enabled, binds each GPU's host process to the CPU socket with direct affinity to that GPU, improving CPU efficiency and memory locality. A sketch of how a launcher might apply this binding follows the snippet below.

```bash
# Increase the HSA kernarg pool size to 12 MB (12582912 bytes) for models
# that launch a lot of kernels; disabled (commented out) by default
# export HSA_KERNARG_POOL_SIZE=${HSA_KERNARG_POOL_SIZE:-12582912}

# Enable NUMA binding for better memory locality (may improve stability for large models)
export ENABLE_NUMA_BINDING=${ENABLE_NUMA_BINDING:-0}
```
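For illustration, here is a minimal sketch of how a launcher could honor ENABLE_NUMA_BINDING. This is not the PR's actual implementation: LOCAL_RANK, the rank-to-card mapping, and train.sh are all assumptions made for the example. It reads the GPU's NUMA node from the standard PCI sysfs attribute and, if one is reported, re-execs the trainer under numactl pinned to that node.

```bash
# Hypothetical wrapper sketch (not this PR's implementation): when
# ENABLE_NUMA_BINDING=1, pin the current rank's process to the NUMA node
# closest to its GPU before launching the trainer.
if [ "${ENABLE_NUMA_BINDING:-0}" = "1" ]; then
    GPU_ID=${LOCAL_RANK:-0}  # assumption: rank N drives /dev/dri/cardN
    # numa_node is a standard PCI sysfs attribute; -1 means "no affinity"
    NUMA_NODE=$(cat "/sys/class/drm/card${GPU_ID}/device/numa_node" 2>/dev/null)
    if [ -n "$NUMA_NODE" ] && [ "$NUMA_NODE" -ge 0 ]; then
        exec numactl --cpunodebind="$NUMA_NODE" --membind="$NUMA_NODE" \
            bash train.sh "$@"
    fi
fi
exec bash train.sh "$@"  # fall back to an unbound launch
```

On ROCm systems the GPU-to-NUMA mapping can also be cross-checked with rocm-smi's topology output, which is one way to verify each rank is being pinned to the right socket.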

@wenxie-amd force-pushed the dev/wenx/numa_binding branch from fba890e to aca932a on November 4, 2025 at 10:44
@limou102 (Contributor) left a comment:

Do these two newly added environment variables definitely speed up training? Why aren't they enabled by default, especially since the NUMA setting feels a bit tricky?

@wenxie-amd (Contributor, Author) replied:

> Do these two newly added environment variables definitely speed up training? Why aren't they enabled by default, especially since the NUMA setting feels a bit tricky?

The main branch is currently undergoing testing for the new Docker release, and we're a bit cautious about these two environment variables potentially causing side effects, so we haven't enabled them yet. They're being included as opt-in capabilities, and we'll consider turning them on for testing if specific models need them in the future.

@wenxie-amd merged commit 8be6a6e into main on Nov 4, 2025
3 checks passed