@Xiaoming-AMD
Collaborator

  • Set HSA_NO_SCRATCH_RECLAIM=1 to disable scratch memory reclaim, so scratch allocations are kept across kernel launches instead of being freed and reallocated. This reduces allocation overhead and improves kernel launch performance.
  • Set HSA_ENABLE_SDMA=1 to enable the SDMA engines, so memory transfers between host and device run on dedicated DMA hardware without occupying compute resources.
  • These settings help improve stability and performance for training and inference workloads on AMD GPUs.
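
For readers who launch jobs from Python rather than from a shell script, the two settings above could be applied roughly as follows. This is a minimal sketch, not the change made in this PR: the helper name `apply_hsa_defaults` and the `HSA_RUNTIME_DEFAULTS` table are hypothetical, and it assumes the variables must be in the environment before the ROCm runtime is initialised (for example, before a GPU framework is imported).

```python
import os

# Values taken from the PR description; the table and helper are hypothetical.
HSA_RUNTIME_DEFAULTS = {
    "HSA_NO_SCRATCH_RECLAIM": "1",  # keep scratch memory allocated between launches
    "HSA_ENABLE_SDMA": "1",         # use SDMA engines for host/device copies
}


def apply_hsa_defaults(env=None):
    """Set the HSA variables unless the caller has already chosen a value.

    `env` defaults to os.environ; any mutable mapping can be passed for testing.
    Returns the effective values of the variables after applying defaults.
    """
    target = os.environ if env is None else env
    for name, value in HSA_RUNTIME_DEFAULTS.items():
        # setdefault leaves an explicit user override untouched.
        target.setdefault(name, value)
    return {name: target[name] for name in HSA_RUNTIME_DEFAULTS}


if __name__ == "__main__":
    # Call this before importing any library that initialises the GPU runtime.
    print(apply_hsa_defaults())
```

Using `setdefault` rather than plain assignment is a design choice: a value exported by the user (for example `HSA_ENABLE_SDMA=0` while debugging transfers) takes precedence over the defaults.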

Contributor

@wenxie-amd wenxie-amd left a comment

LGTM

@wenxie-amd wenxie-amd merged commit 4e7c52b into main Apr 22, 2025
2 checks passed
@Xiaoming-AMD Xiaoming-AMD deleted the dev/xiaoming/env_optimize branch April 23, 2025 00:35
@Xiaoming-AMD Xiaoming-AMD changed the title from "[Feat] tune ROCm runtime with HSA_NO_SCRATCH_RECLAIM and HSA_ENABLE_SDMA" to "feature(HSA ENV): tune ROCm runtime with HSA_NO_SCRATCH_RECLAIM and HSA_ENABLE_SDMA" Jun 4, 2025
@Xiaoming-AMD Xiaoming-AMD changed the title from "feature(HSA ENV): tune ROCm runtime with HSA_NO_SCRATCH_RECLAIM and HSA_ENABLE_SDMA" to "feat(HSA ENV): tune ROCm runtime with HSA_NO_SCRATCH_RECLAIM and HSA_ENABLE_SDMA" Jun 25, 2025
3 participants