feat: add ROCm GPU support for AMD hardware #3757

Open
Geramy wants to merge 31 commits into ggml-org:master from lemonade-sdk:master

@Geramy Geramy commented Apr 17, 2026

Summary

Adds ROCm (Radeon Open Compute) backend build support to whisper.cpp, enabling inference acceleration on AMD GPUs. This brings AMD GPU support in whisper.cpp to parity with the existing CUDA support for NVIDIA GPUs.

Features

  • Multi-platform support — Linux and Windows CI workflows
  • AMD GPU targets — Supports gfx1151 (RCN), gfx1150 (RCZ), gfx1100 (NAVI31), gfx110X (NAVI3x), gfx120X (RDNA4)
  • Artifact bundling — Automatic packaging of shared libraries for portable distribution

Build Instructions

cmake -B build \
  -DGGML_HIP=ON \
  -DGPU_TARGETS=gfx1151 \
  -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
  -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++

cmake --build build --config Release

Geramy and others added 30 commits January 27, 2026 14:51
Add ROCm support and CI improvements

ci: automate library bundling in ROCm build workflow

Replace manual copying of ROCm libraries and shared objects with an automated CMake-based bundling step using GET_RUNTIME_DEPENDENCIES. This ensures all linked libraries (e.g., libamdhip64, librocm_sysdeps) are recursively detected and bundled into build/bin, filtering out system libs like libc.so, while improving portability and reducing maintenance for dependency management.

build: enhance library bundling to handle symlinks for portable distribution

- Updated CMake script to resolve and copy real targets of symlinks, then recreate symlinks in build dir
- Modified chmod to only affect real .so* files, ignoring symlinks
- Removed outdated comments and improved script clarity for better portability of whisper-cli binaries

Replaced file(CREATE_LINK): We now use execute_process(COMMAND ln -sf ...) which is standard on Linux
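
The symlink handling described in this commit can be sketched roughly as follows. This is an illustrative sketch rather than the script from the PR: the function name `copy_with_symlink` and the `u+w` permission are assumptions, and it relies on `readlink -f` being available (GNU coreutils).

```shell
#!/usr/bin/env bash
# Sketch only: copy_with_symlink is a hypothetical helper, not code from the PR.

# Copy a library into dest. If src is a symlink, copy the real file it
# resolves to and recreate the link next to it with ln -sf.
copy_with_symlink() {
  local src="$1" dest="$2" real real_name link_name
  mkdir -p "$dest"
  if [ -L "$src" ]; then
    real=$(readlink -f "$src")          # resolve the whole symlink chain
    real_name=$(basename "$real")
    link_name=$(basename "$src")
    cp "$real" "$dest/$real_name"
    chmod u+w "$dest/$real_name"        # chmod touches the real file only (assumed mode)
    # Recreate the link as a relative symlink, unless both names are equal,
    # in which case linking would overwrite the file that was just copied.
    if [ "$real_name" != "$link_name" ]; then
      ln -sf "$real_name" "$dest/$link_name"
    fi
  else
    cp "$src" "$dest/"
    chmod u+w "$dest/$(basename "$src")"
  fi
}
```

A call such as `copy_with_symlink /opt/rocm/lib/libamdhip64.so build/bin` (path shown for illustration only) would leave both the versioned file and its link in `build/bin`.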

Change the Bundle Linked Libraries step to use Linux ldd instead of CMake. The goal is to keep changes to whisper.cpp as small as possible, without modifying existing files such as the CMake files and without adding more files.
Add copying of ROCm system dependency libraries (e.g., elf, drm, numa) to the build bundle in CI so that the shared libraries ROCm needs at runtime are included. Also create the build directory if it doesn't exist, to avoid copy failures.
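
An ldd-based bundling step along these lines might look like the sketch below. The function names and the list of excluded system libraries are assumptions made for illustration; the PR's actual workflow step may filter differently. It assumes glibc-style `ldd` output of the form `name => /path (address)`.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the exclusion list are assumptions.

# Return 0 when the library name looks like a core system library that
# should stay on the host instead of being bundled.
is_system_lib() {
  case "$1" in
    libc.so*|libm.so*|libdl.so*|libpthread.so*|librt.so*|ld-linux*) return 0 ;;
    *) return 1 ;;
  esac
}

# Copy every resolved, non-system dependency of a binary into dest.
bundle_deps() {
  local binary="$1" dest="$2"
  mkdir -p "$dest"   # create the target first so cp cannot fail on a missing directory
  ldd "$binary" \
    | awk '$2 == "=>" && $3 ~ /^\// { print $3 }' \
    | while read -r lib; do
        if is_system_lib "$(basename "$lib")"; then
          continue
        fi
        cp -L "$lib" "$dest/"   # -L copies the file a symlink points to
      done
}
```

For example, `bundle_deps build/bin/whisper-cli build/bin` would place the non-system dependencies beside the binary.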
- Add self-hosted runner test jobs (test-rocm-linux, test-rocm-windows) for gfx1151/gfx1150
- Add cleanup composite actions for Linux and Windows runners
- Add runner heartbeat monitoring workflow
- Configure ci/run.sh with ROCm environment (HIP_PLATFORM, LD_LIBRARY_PATH, cmake flags)
- Add Windows ROCm build support to build.yml
- Fix conditional expression syntax warnings in build.yml
… should_build outputs to be specific. I have removed outputs.rocm_version from both CI steps and extracted resolve_rocm into a shared script that both jobs use. Fixed the matrix and removed the FGGML_ROCM=1 flag from both ubuntu-rocm and windows-rocm, since it isn't a real flag. Also commented out the heartbeat runners.
…[^<]*\)<\/Key>.*/\1/gp'. This works on both Linux and Windows Git Bash.
- Fix alpha/RC version ordering bug in resolve-rocm-version.sh and build.yml
  (alpha was incorrectly treated as newer than RC)
- Fix NULL check bug on ndim validation in ruby_whisper_context.c
  (ndim check was incorrectly guarded by format != NULL)
- Add ${{ }} wrapper on if: conditionals at lines 615 and 1422 in build.yml
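
One way to make the alpha/RC ordering explicit is to rank pre-release tags numerically instead of relying on string comparison. The sketch below is hypothetical: the tag spellings (`-alpha1`, `-rc1`) and the function names are assumptions, not the format used by resolve-rocm-version.sh.

```shell
#!/usr/bin/env bash
# Sketch only: the version suffixes and function names are assumptions.

# Map a version string to a rank: alpha < rc < final release.
prerelease_rank() {
  case "$1" in
    *alpha*) echo 0 ;;
    *rc*)    echo 1 ;;
    *)       echo 2 ;;
  esac
}

# Print the newest candidate among builds that share one base version.
newest_of() {
  local best="" best_rank=-1 v r
  for v in "$@"; do
    r=$(prerelease_rank "$v")
    if [ "$r" -gt "$best_rank" ]; then
      best="$v"
      best_rank="$r"
    fi
  done
  printf '%s\n' "$best"
}
```

With this ranking, `newest_of 7.12.0-alpha1 7.12.0-rc1` selects the RC build rather than the alpha.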
Replace duplicated ~55 lines of PowerShell version resolution logic in
windows-rocm job with a call to ci/resolve-rocm-version.sh via Git Bash.
This eliminates code duplication and ensures both Linux and Windows use
the same version resolution logic.
Replace PCRE non-greedy .*? with ERE-compatible [^0-9]* in Bash
regex patterns. Bash [[ =~ ]] uses POSIX ERE which does not support
.*? non-greedy quantifier. On Windows Git Bash this fails strictly,
leaving latest_file empty and causing 'Failed to extract ROCm version'
error.

Also adds:
- File count validation with S3 response debug output
- Empty latest_file check showing candidate files
- Empty file line skip to prevent false regex matches
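
The regex change described above can be illustrated with a small sketch. The function name and the file-name shape are hypothetical; the point is only that `[^0-9]*` plays the role the non-greedy `.*?` had, using syntax that POSIX ERE defines.

```shell
#!/usr/bin/env bash
# Sketch only: extract_version and the file-name shape are hypothetical.

# Print the first dotted x.y.z version that follows "rocm" in a file name.
extract_version() {
  local name="$1"
  # [^0-9]* skips everything up to the first digit, so no non-greedy
  # quantifier is needed. Keeping the pattern in a variable avoids
  # quoting problems on the right-hand side of =~.
  local re='rocm[^0-9]*([0-9]+\.[0-9]+\.[0-9]+)'
  if [[ "$name" =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1   # the caller decides how to report a failed match
  fi
}
```

Returning a non-zero status on a failed match lets the caller stop early instead of continuing with an empty value.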
- Rewrite resolve-rocm-version.sh to use AMD's official tarball repo
  (repo.amd.com/rocm/tarball/) instead of scanning Amazon S3
- Remove 'latest' auto-detection logic which failed on Windows Git Bash
  due to PCRE vs ERE regex incompatibility
- Add version format validation and clear error messages
- Update build.yml workflow_dispatch to use concrete ROCm versions
  (7.12.0, 7.2.1) with choice options instead of 'latest'
Remove type:choice restriction so users can type any ROCm version
while keeping 7.12.0 as default and linking to available versions
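
Because the version is now free-form input, a format check before any download is useful. The following is a minimal sketch assuming a plain `major.minor.patch` format; the function name, the accepted pattern, and the error text are assumptions, and the real script may also accept pre-release suffixes.

```shell
#!/usr/bin/env bash
# Sketch only: the accepted format and the message wording are assumptions.

# Return 0 for versions like 7.12.0; otherwise print an error to stderr.
validate_rocm_version() {
  local version="$1"
  local re='^[0-9]+\.[0-9]+\.[0-9]+$'
  if [[ "$version" =~ $re ]]; then
    return 0
  fi
  printf 'Invalid ROCm version "%s": expected a form such as 7.12.0\n' "$version" >&2
  return 1
}
```

A workflow step could then call `validate_rocm_version "$ROCM_VERSION" || exit 1`, where `ROCM_VERSION` stands in for whatever variable holds the user's input.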
…rocm into geramy/rocm-build-tests-matrix

# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/runner_heartbeat.yml
#	ci/run.sh

Co-authored-by: Geramy <264964+Geramy@users.noreply.github.com>
Add AMD ROCm GPU build and test CI infrastructure

Geramy commented Apr 17, 2026

@ggerganov

Geramy commented Apr 17, 2026

This PR has already had several reviews and fixes applied. It originated here: lemonade-sdk#1
