[AMDGPU][COMGR] Install gfx125x entry trampolines with hotswap #3000

Draft
harsh-amd wants to merge 8 commits into ROCm:amd-staging from harsh-amd:hotswap-entry-trampolines



harsh-amd commented Jun 21, 2026


ISSUE ID: SWDEV-499597

Summary

  • Add gfx125x kernel-entry trampoline rewriting to COMGR hotswap behind AMD_COMGR_HOTSWAP_ENTRY_TRAMPOLINES.
  • Keep the entry-trampoline pass separated from the B0-to-A0 instruction patch dispatcher in comgr-hotswap-entry-trampoline.cpp.
  • Use llvm::object::ELFFile<> for grown-image ELF symbol traversal when adjusting symbol values, matching the COMGR agent conventions.
  • Keep COMGR as the rewrite engine/API only: this PR no longer builds or installs libamd_comgr_hotswap_tool.so.
  • Use explicit hotswap-local gfx1250 stepping features so B0-to-A0 instruction patches only run for explicit B0-to-A0 requests, while entry trampolines remain opt-in.
  • Update COMGR hotswap docs to point runtime users at rocm-systems libhsa-hotswap.so.
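To make the opt-in gate and the symbol-value adjustment concrete, here is a minimal standalone sketch. It is not the code in this PR: the `Symbol` struct, `entryTrampolinesEnabled`, and `shiftSymbols` are hypothetical names, the rule that any value other than empty or `0` enables the feature is an assumption, and the real pass walks the symbol table through `llvm::object::ELFFile<>` rather than a plain vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string_view>
#include <vector>

// Hypothetical, simplified stand-in for an ELF symbol table entry.
// The real pass would read Elf64_Sym records via llvm::object::ELFFile<>.
struct Symbol {
  std::uint64_t value;  // st_value: address or offset of the symbol
  std::uint16_t shndx;  // st_shndx: section index the symbol is defined in
};

constexpr std::uint16_t kShnUndef = 0;      // SHN_UNDEF in the ELF spec
constexpr std::uint16_t kShnAbs = 0xfff1;   // SHN_ABS in the ELF spec

// Assumption: the feature is on when the variable is set to anything
// other than the empty string or "0". The PR does not state the parsing rule.
inline bool entryTrampolinesEnabled() {
  const char *raw = std::getenv("AMD_COMGR_HOTSWAP_ENTRY_TRAMPOLINES");
  if (raw == nullptr)
    return false;
  std::string_view text(raw);
  return !text.empty() && text != "0";
}

// After `growth` bytes are inserted at `insertPoint`, every section-relative
// symbol at or past that point must move by the same amount. Undefined and
// absolute symbols are left alone. Returns how many symbols were shifted.
inline std::size_t shiftSymbols(std::vector<Symbol> &symbols,
                                std::uint64_t insertPoint,
                                std::uint64_t growth) {
  std::size_t moved = 0;
  for (Symbol &sym : symbols) {
    if (sym.shndx == kShnUndef || sym.shndx == kShnAbs)
      continue;
    if (sym.value >= insertPoint) {
      sym.value += growth;
      ++moved;
    }
  }
  return moved;
}
```

A caller would check `entryTrampolinesEnabled()` first and only then grow the image and call `shiftSymbols` with the insertion offset and the number of bytes added, so that images are left untouched unless the user opts in.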

Paired PRs

Testing

  • git diff --check
  • ASAN_OPTIONS=use_sigaltstack=0 cmake --build build-comgr-asan --target amd_comgr HotswapElfTests HotswapMCTests hotswap-rewrite --parallel 32
  • ASAN_OPTIONS=use_sigaltstack=0 build-comgr-asan/test-unit/HotswapElfTests
  • ASAN_OPTIONS=use_sigaltstack=0 build-comgr-asan/test-unit/HotswapMCTests --gtest_filter="BuildKernelEntryTrampoline.*"
  • ASAN_OPTIONS=use_sigaltstack=0 build-llvm/bin/llvm-lit -sv build-comgr-asan/test-lit/hotswap-elf-growth.s build-comgr-asan/test-lit/hotswap-kernel-entry-trampoline.s build-comgr-asan/test-lit/hotswap-kernel-entry-trampoline-multi.s build-comgr-asan/test-lit/hotswap-barrier-isfirst.s
  • ASAN_OPTIONS=use_sigaltstack=0 cmake --build build-comgr-asan --target check-comgr --parallel 32

ASAN check-comgr passed on mi350-4 at e199f875ac611dfa82e4ce7ffb1f4c89e008fa44: lit reported 70 passed / 11 unsupported, CTest reported 31/31 passed, and HotswapElfTests reported 6/6 passed.

harsh-amd added a commit to harsh-amd/TheRock that referenced this pull request Jun 21, 2026
ISSUE ID: ROCm/llvm-project#3000

Build on PR ROCm#6007 by pinning amd-llvm to a commit that includes the COMGR hotswap kernel-entry trampoline implementation and rocm-systems to the paired loader-side revert.
harsh-amd force-pushed the hotswap-entry-trampolines branch from 4905511 to 02c967b on June 21, 2026 15:52
harsh-amd force-pushed the hotswap-entry-trampolines branch from 02c967b to fe6deae on June 21, 2026 16:10
harsh-amd force-pushed the hotswap-entry-trampolines branch from fe6deae to 3c1d9b2 on June 21, 2026 16:41
harsh-amd force-pushed the hotswap-entry-trampolines branch from 3c1d9b2 to 9eba7e5 on June 21, 2026 17:39