Add AMD ROCm GPU build and test CI infrastructure #1

Merged
Geramy merged 16 commits into master from geramy/rocm-build-tests-matrix
Apr 17, 2026

Geramy commented Mar 3, 2026

  • Add ROCm chipsets to the build matrix and create per-chipset builds and CI tests
  • Add self-hosted runner test jobs (test-rocm-linux, test-rocm-windows) for gfx1151/gfx1150
  • Add cleanup composite actions for Linux and Windows runners
  • Add runner heartbeat monitoring workflow
  • Configure ci/run.sh with ROCm environment (HIP_PLATFORM, LD_LIBRARY_PATH, cmake flags)
  • Add Windows ROCm build support to build.yml
  • Fix conditional expression syntax warnings in build.yml

Geramy self-assigned this Mar 3, 2026
Geramy requested a review from ramkrishna2910 March 3, 2026 18:50

ramkrishna2910 left a comment

Added comments.

Comment thread .github/workflows/build.yml
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Comment thread ci/run.sh Outdated
Comment thread .github/workflows/runner_heartbeat.yml Outdated
Geramy added 8 commits March 9, 2026 20:55
- Add self-hosted runner test jobs (test-rocm-linux, test-rocm-windows) for gfx1151/gfx1150
- Add cleanup composite actions for Linux and Windows runners
- Add runner heartbeat monitoring workflow
- Configure ci/run.sh with ROCm environment (HIP_PLATFORM, LD_LIBRARY_PATH, cmake flags)
- Add Windows ROCm build support to build.yml
- Fix conditional expression syntax warnings in build.yml
… should_build outputs to be specific. I have removed outputs.rocm_version from both CI steps and extracted resolve_rocm to a shared script that both jobs use. Fixed the matrix and removed the FGGML_ROCM=1 flag from both ubuntu-rocm and windows-rocm, since it isn't a real flag and doesn't apply. Also commented out the heartbeat runners.
…[^<]*\)<\/Key>.*/\1/gp'. This works on both Linux and Windows Git Bash.
Geramy force-pushed the geramy/rocm-build-tests-matrix branch from 6f435d0 to cd3b5fc March 10, 2026 03:55

ramkrishna2910 left a comment

Code review from local testing: 2 bugs and 2 medium issues found. The build succeeds on gfx1151 (Radeon 8060S) with ROCm 7.1, and inference works correctly.

Comment thread ci/resolve-rocm-version.sh Outdated
Comment thread bindings/ruby/ext/ruby_whisper_context.c
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml
ramkrishna2910 commented

@iswaryaalex @Geramy can we close on this? Been open for a while :D

Geramy commented Apr 1, 2026

> @iswaryaalex @Geramy can we close on this? Been open for a while :D

Yeah I'll have time this weekend

- Fix alpha/RC version ordering bug in resolve-rocm-version.sh and build.yml
  (alpha was incorrectly treated as newer than RC)
- Fix NULL check bug on ndim validation in ruby_whisper_context.c
  (ndim check was incorrectly guarded by format != NULL)
- Add ${{ }} wrapper on if: conditionals at lines 615 and 1422 in build.yml
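
The alpha/RC ordering fix above can be illustrated with a small sketch. The script's actual comparison is not shown in this thread, so the function names `prerelease_rank` and `newer_prerelease` and the `-alpha`/`-rc` tag layout are assumptions made purely for illustration. The idea is to rank pre-release tags explicitly (alpha < beta < rc < final) instead of relying on string order:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): map a version string to a rank
# so that alpha < beta < rc < final release.
prerelease_rank() {
    # Lowercase the input so "RC1" and "rc1" are treated the same.
    local v
    v="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
    case "$v" in
        *alpha*) echo 0 ;;
        *beta*)  echo 1 ;;
        *rc*)    echo 2 ;;
        *)       echo 3 ;;  # no pre-release tag: treat as a final release
    esac
}

# Print the newer of two versions that share the same numeric prefix.
# Only the pre-release tag is compared; numeric parts are assumed equal.
newer_prerelease() {
    local a="$1" b="$2"
    if [ "$(prerelease_rank "$a")" -ge "$(prerelease_rank "$b")" ]; then
        echo "$a"
    else
        echo "$b"
    fi
}
```

For example, `newer_prerelease 7.1.0-alpha2 7.1.0-rc1` prints `7.1.0-rc1`, so an RC is never outranked by an alpha of the same release.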

Geramy commented Apr 16, 2026

Action is being run to test the latest changes.
https://github.com/lemonade-sdk/whisper.cpp-rocm/actions/runs/24538896530

Geramy added 6 commits April 16, 2026 16:13
Replace duplicated ~55 lines of PowerShell version resolution logic in
windows-rocm job with a call to ci/resolve-rocm-version.sh via Git Bash.
This eliminates code duplication and ensures both Linux and Windows use
the same version resolution logic.
Replace PCRE non-greedy .*? with ERE-compatible [^0-9]* in Bash
regex patterns. Bash [[ =~ ]] uses POSIX ERE, which does not support
the .*? non-greedy quantifier. On Windows Git Bash this fails strictly,
leaving latest_file empty and causing the 'Failed to extract ROCm version'
error.

Also adds:
- File count validation with S3 response debug output
- Empty latest_file check showing candidate files
- Empty file line skip to prevent false regex matches
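
A minimal sketch of the ERE-compatible pattern described in this commit, assuming a hypothetical helper `extract_version` and a made-up filename layout (the real artifact names and regexes are not shown in this thread). A negated character class such as `[^0-9]*` stops at the first digit by construction, so no non-greedy quantifier is needed:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): pull MAJOR.MINOR.PATCH out of a name.
extract_version() {
    local name="$1"
    # POSIX ERE only: [^0-9]* consumes the non-digit prefix, then the
    # capture group matches the first dotted three-part number.
    local re='^[^0-9]*([0-9]+\.[0-9]+\.[0-9]+)'
    if [[ "$name" =~ $re ]]; then
        echo "${BASH_REMATCH[1]}"
    else
        return 1  # no version found; let the caller report the error
    fi
}
```

Keeping the pattern in a variable and expanding it unquoted on the right-hand side of `=~` avoids quoting surprises. Note that this sketch expects the version to be the first run of digits in the name; a name with an earlier number would need a different pattern.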
- Rewrite resolve-rocm-version.sh to use AMD's official tarball repo
  (repo.amd.com/rocm/tarball/) instead of scanning Amazon S3
- Remove 'latest' auto-detection logic which failed on Windows Git Bash
  due to PCRE vs ERE regex incompatibility
- Add version format validation and clear error messages
- Update build.yml workflow_dispatch to use concrete ROCm versions
  (7.12.0, 7.2.1) with choice options instead of 'latest'
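
The version format validation mentioned in this commit might look roughly like the sketch below. The function name `validate_rocm_version`, the exact pattern, and the error wording are assumptions, not the script's actual code; the sketch only assumes that concrete versions have the MAJOR.MINOR.PATCH shape of the examples given above (7.12.0, 7.2.1):

```bash
#!/usr/bin/env bash
# Hypothetical validator (not from the PR): accept only MAJOR.MINOR.PATCH.
validate_rocm_version() {
    local version="$1"
    local re='^[0-9]+\.[0-9]+\.[0-9]+$'
    if [[ ! "$version" =~ $re ]]; then
        # Write a clear message to stderr so CI logs show what went wrong.
        echo "error: invalid ROCm version '$version' (expected MAJOR.MINOR.PATCH, e.g. 7.12.0)" >&2
        return 1
    fi
}
```

A caller could run `validate_rocm_version "$ROCM_VERSION" || exit 1` before building a download URL, so that a value such as `latest` is rejected early with a readable message. `ROCM_VERSION` is likewise a placeholder variable name.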
Remove type:choice restriction so users can type any ROCm version
while keeping 7.12.0 as default and linking to available versions

Geramy commented Apr 17, 2026

@copilot resolve the merge conflicts in this pull request

…rocm into geramy/rocm-build-tests-matrix

# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/runner_heartbeat.yml
#	ci/run.sh

Co-authored-by: Geramy <264964+Geramy@users.noreply.github.com>

Copilot AI commented Apr 17, 2026

> @copilot resolve the merge conflicts in this pull request

Merge conflicts have been resolved in commit cf3281c. Three files had conflicts: build.yml (18 blocks), runner_heartbeat.yml (2 blocks), and ci/run.sh (1 block). All were resolved keeping the improvements from this PR (shared scripts, fixed conditionals, concrete ROCm version, commented-out self-hosted runners).

Geramy left a comment

I already answered all comments in their own threads, outside of the review.

Geramy merged commit 0a23b5a into master Apr 17, 2026
66 of 72 checks passed

Geramy commented Apr 17, 2026

PR into upstream: ggml-org#3757
