fix(gpu): support wildcards in GPU detection logic#2295

Open
jtlayton wants to merge 2 commits into
lemonade-sdk:mainfrom
jtlayton:detect-wildcard
Conversation

@jtlayton

This patch fixes ROCm detection for me for the GPUs covered by wildcard strings.

The identify_rocm_arch_from_name() function converts KFD gfx_target_version values (e.g. 110003, 120001) into specific family strings like gfx1103 and gfx1201. However, the RECIPE_DEFS table uses X-suffixed wildcards (gfx110X, gfx120X) to represent entire architecture families.
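
To make the conversion concrete, here is a minimal Python sketch of how a KFD gfx_target_version value breaks down into an arch string. It assumes the usual KFD encoding (major * 10000 + minor * 100 + stepping, with minor and stepping printed as hex digits). The function name `gfx_name_from_target_version` is hypothetical; this is not the project's C++ code.

```python
def gfx_name_from_target_version(version: int) -> str:
    """Decode a KFD gfx_target_version integer into a gfx arch string.

    Assumed encoding: major * 10000 + minor * 100 + stepping.
    """
    if version <= 0:
        raise ValueError(f"invalid gfx_target_version: {version}")
    major, rest = divmod(version, 10000)
    minor, stepping = divmod(rest, 100)
    # Minor and stepping are rendered as single hex digits in gfx names.
    return f"gfx{major}{minor:x}{stepping:x}"


if __name__ == "__main__":
    for value in (110003, 120001):
        print(value, "->", gfx_name_from_target_version(value))
```

The result is always a specific arch such as gfx1103, never a family string such as gfx110X, which is why an exact comparison against the wildcard entries can never succeed.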

device_matches_constraint() did an exact string comparison, so gfx1103 != gfx110X and gfx1201 != gfx120X, causing valid AMD GPUs (RDNA3/RDNA4) detected via KFD sysfs to be reported as "Unsupported GPU" for ROCm backends.

Fix device_matches_constraint() to treat a trailing X in allowed family strings as a prefix wildcard match.
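
The matching rule described above can be sketched in Python as follows. This is an illustration of the idea only, not the C++ change itself: the names `family_matches` and `matches_any_family` are hypothetical, and the real device_matches_constraint() may take different arguments.

```python
from typing import Iterable


def family_matches(device_family: str, allowed: str) -> bool:
    """Return True if device_family satisfies one allowed family string.

    A trailing "X" in `allowed` is treated as a prefix wildcard;
    any other string must match exactly.
    """
    if len(allowed) > 1 and allowed.endswith("X"):
        return device_family.startswith(allowed[:-1])
    return device_family == allowed


def matches_any_family(device_family: str, allowed_families: Iterable[str]) -> bool:
    """Return True if any entry in the allowlist accepts the device."""
    return any(family_matches(device_family, a) for a in allowed_families)


if __name__ == "__main__":
    allowlist = ["gfx110X", "gfx120X", "gfx1151"]
    for family in ("gfx1103", "gfx1201", "gfx1151", "gfx1032"):
        print(family, matches_any_family(family, allowlist))
```

Note that a plain prefix match also accepts longer strings that merely start with the prefix; whether that matters depends on which family strings the detection code can actually produce.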

@github-actions github-actions Bot added bug Something isn't working runtime::rocm AMD ROCm runtime labels Jun 18, 2026

@fl0rianr fl0rianr left a comment

Collaborator

Thanks, this makes sense to me. Good work.

It would be nice to add a tiny regression test for:

  • gfx1103 -> gfx110X
  • gfx1201 -> gfx120X
  • exact matches like gfx1151 or gfx1152
  • non-matches like gfx1151 vs gfx110X

Not blocking, but helpful.
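
The cases listed above could be written as a small table-driven check, sketched here in Python. The `family_matches` helper is a hypothetical stand-in for the C++ logic, so a test like this documents the expected behavior rather than exercising the real implementation.

```python
def family_matches(device_family: str, allowed: str) -> bool:
    # Stand-in for the C++ check: a trailing "X" means "match this prefix".
    if len(allowed) > 1 and allowed.endswith("X"):
        return device_family.startswith(allowed[:-1])
    return device_family == allowed


# (detected family, allowed family, expected result)
CASES = [
    ("gfx1103", "gfx110X", True),   # wildcard match
    ("gfx1201", "gfx120X", True),   # wildcard match
    ("gfx1151", "gfx1151", True),   # exact match
    ("gfx1152", "gfx1152", True),   # exact match
    ("gfx1151", "gfx110X", False),  # different family
    ("gfx1151", "gfx1152", False),  # exact entries stay exact
]


def failing_cases():
    """Return every case whose actual result differs from the expectation."""
    return [c for c in CASES if family_matches(c[0], c[1]) != c[2]]


if __name__ == "__main__":
    failures = failing_cases()
    for device, allowed, expected in failures:
        print(f"FAIL: {device} vs {allowed}, expected {expected}")
    print("all cases passed" if not failures else f"{len(failures)} failure(s)")
```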

@fl0rianr fl0rianr linked an issue Jun 18, 2026 that may be closed by this pull request
@soulafein83

I have the same problem on my RX 6600 (gfx1032).

@ckuethe

ckuethe commented Jun 18, 2026

Contributor

PR fixes my GFX1100

@ckuethe

ckuethe commented Jun 18, 2026

Contributor

Not sure if this is relevant but there's a gfx110x build of whisper.cpp at https://github.com/lemonade-sdk/whisper.cpp-rocm/releases but lemond is looking for a gfx1100 build

ckuethe@ryzen:/usr/bin$ lemonade backends
Recipe              Backend     Status          Message/Version                               Action
----------------------------------------------------------------------------------------------------------------------------------------------------
flm                 npu         unsupported     Requires AMD XDNA 2 AMD NPU                    -
kokoro              cpu         installed       b17                                            -
                    metal       unsupported     Requires macOS                                 -
llamacpp            cpu         installable     Backend is supported but not installed.        lemonade backends install llamacpp:cpu
                    cuda        unsupported     Unsupported GPU                                -
                    metal       unsupported     Requires macOS                                 -
                    rocm        installed       b9710                                          -
                    system      unsupported     llama-server not found in PATH                 -
                    vulkan      installed       b9704                                          -
moonshine           cpu         installable     Backend is supported but not installed.        lemonade backends install moonshine:cpu
ryzenai-llm         npu         unsupported     Requires Windows                               -
sd-cpp              cpu         installable     Backend is supported but not installed.        lemonade backends install sd-cpp:cpu
                    cuda        unsupported     Unsupported GPU                                -
                    metal       unsupported     Requires macOS                                 -
                    rocm        installed       master-672-1f9ee88                             -
                    vulkan      installed       master-709-92a3b73                             -
vllm                rocm        update_required Backend update is required before use.         lemonade backends install vllm:rocm
whispercpp          cpu         installed       v1.8.4                                         -
                    metal       unsupported     Requires macOS                                 -
                    npu         unsupported     Requires Windows                               -
                    rocm        installable     Backend is supported but not installed.        lemonade backends install whispercpp:rocm
                    vulkan      installed       v1.8.4                                         -
----------------------------------------------------------------------------------------------------------------------------------------------------
ckuethe@ryzen:/usr/bin$ lemonade backends install whispercpp:rocm
Installing backend: whispercpp:rocm
[1/1] whisper-v1.8.4-linux-rocm-gfx1100.tar.gz (0.0 MB)
  Progress: 100% (0.0/0.0 MB)    Error: Failed to download whisper-server from: https://github.com/lemonade-sdk/whisper.cpp-rocm/releases/download/v1.8.4/whisper-v1.8.4-linux-rocm-gfx1100.tar.gz - Download failed after 6 attempts.
Last error: HTTP error 404 for URL: https://github.com/lemonade-sdk/whisper.cpp-rocm/releases/download/v1.8.4/whisper-v1.8.4-linux-rocm-gfx1100.tar.gz

@jtlayton

Author

> It would be nice to add a tiny regression test for:

Good idea. Added.

@jtlayton jtlayton force-pushed the detect-wildcard branch 4 times, most recently from ed518d6 to 2e91ae1 June 19, 2026 11:44
jtlayton and others added 2 commits June 19, 2026 07:49
The identify_rocm_arch_from_name() function converts KFD gfx_target_version
values (e.g. 110003, 120001) into specific family strings like gfx1103 and
gfx1201. However, the RECIPE_DEFS table uses X-suffixed wildcards (gfx110X,
gfx120X) to represent entire architecture families.

device_matches_constraint() did an exact string comparison, so gfx1103 !=
gfx110X and gfx1201 != gfx120X, causing valid AMD GPUs (RDNA3/RDNA4) detected
via KFD sysfs to be reported as "Unsupported GPU" for ROCm backends.

Fix device_matches_constraint() to treat a trailing X in allowed family
strings as a prefix wildcard match.

Co-authored-by: Big Pickle <big-pickle@opencode.ai>

This adds a unit test to verify the `device_matches_constraint` logic.
It ensures that families with a trailing 'X' (e.g., "gfx110X") are
correctly recognized as wildcards that match specific models (e.g., "gfx1103").

Co-authored-by: opencode:Gemma-4-12B-it-GGUF
@jtlayton

Author

Sort of added anyway...

I had an LLM cook up this test, but all it does is replicate the C++ logic in Python and run it. If someone breaks the C++ code, this won't catch it. I'm not clear on how we'd add a test case here without adding some sort of LD_PRELOAD shim or something that spoofs fake GPUs for lemond to detect. Thoughts?
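
One lighter-weight alternative to an LD_PRELOAD shim, sketched here in Python under stated assumptions, is to make the topology root injectable and point the detection at a fake directory tree. On Linux the KFD topology lives under /sys/class/kfd/kfd/topology, with one `properties` file per node. Whether lemond's detection code can be given an alternate root is an assumption, and both function names below are hypothetical.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def make_fake_topology(root: Path, versions: list[int]) -> Path:
    """Write a KFD-style tree with one node per gfx_target_version value."""
    for index, version in enumerate(versions):
        node = root / "nodes" / str(index)
        node.mkdir(parents=True)
        (node / "properties").write_text(
            f"cpu_cores_count 0\ngfx_target_version {version}\n"
        )
    return root


def read_gfx_target_versions(topology_root: Path) -> list[int]:
    """Collect non-zero gfx_target_version values (zero marks a CPU node)."""
    versions = []
    for props in sorted(topology_root.glob("nodes/*/properties")):
        for line in props.read_text().splitlines():
            key, _, value = line.partition(" ")
            if key == "gfx_target_version" and int(value) != 0:
                versions.append(int(value))
    return versions


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = make_fake_topology(Path(tmp), [0, 110003, 120001])
        print(read_gfx_target_versions(root))
```

A test built this way would still need the C++ detection to accept a configurable root, so it trades the shim for a small change to the production code.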

@fl0rianr

Collaborator

Thanks for adding the regression coverage. I agree that this kind of Python replica test is not a true implementation-level test, and not really a regression test at all. It can still test the intent of a code change, which helps humans and LLMs alike, provided the Python code is updated whenever the C++ logic changes. At the moment CI does not have all of these cards (and no Nvidia hardware at all), so this approach is consistent with the existing CUDA arch mapping test style and is useful as lightweight "expected-behavior" coverage.

I would not go for a more complex test at this point.

The actual code change is small and directly fixes the gfxNNNN vs gfxNNNX matching issue, so this looks good to me. I will wait to see what CI does and then approve.

@fl0rianr fl0rianr left a comment

Collaborator

It's fine, no need to change anything. It's our job to get CI back on track, given that the macOS whisper job is failing.

@ianbmacdonald

Collaborator

👋 I closed my duplicate (#2324) in favor of this — the wildcard is the cleaner approach. One small forward-compat note from comparing the two, take it or leave it:

The wildcard makes the support-set match (device_matches_constraint) tolerant of a future same-family arch: a hypothetical gfx1104 would match gfx110X automatically, which is the nice property this PR adds. But the download-target lookup is an exact match: backend_utils.cpp (~L800) does url_mapping.contains(arch) against backend_versions.json, which is keyed by the specific arches (gfx1100, gfx1103, …). So a new same-family GPU would be reported as supported via the wildcard but would then miss url_mapping and fail to resolve a ROCm download variant.

It works for every GPU today (all current RDNA2/3/4 arches are enumerated in url_mapping) — just flagging that the forward-compat the wildcard buys at the match layer isn't mirrored at the download layer, in case it's worth a follow-up. Not a blocker.
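
The gap between the two layers can be illustrated with a short Python sketch. Everything here is a stand-in: `URL_MAPPING` only imitates the per-arch keys described for backend_versions.json, its values are placeholders rather than real asset names, and `support_status` is a hypothetical helper, not a function in the codebase.

```python
# Placeholder table keyed by specific arches, as described for url_mapping.
URL_MAPPING = {
    "gfx1100": "placeholder-variant-gfx1100",
    "gfx1103": "placeholder-variant-gfx1103",
}

ALLOWED_FAMILIES = ["gfx110X"]


def family_matches(device_family: str, allowed: str) -> bool:
    # Match layer: a trailing "X" is a prefix wildcard.
    if len(allowed) > 1 and allowed.endswith("X"):
        return device_family.startswith(allowed[:-1])
    return device_family == allowed


def support_status(arch: str) -> str:
    """Classify an arch by checking the match layer, then the download layer."""
    if not any(family_matches(arch, a) for a in ALLOWED_FAMILIES):
        return "unsupported"
    # Download layer: exact key lookup, like url_mapping.contains(arch).
    if arch not in URL_MAPPING:
        return "supported_but_no_download"
    return "supported"


if __name__ == "__main__":
    for arch in ("gfx1100", "gfx1104", "gfx1151"):
        print(arch, support_status(arch))
```

A check along these lines could surface the mismatch as a clear message instead of a failed download, without deciding yet how a follow-up should resolve a variant for an arch that is not enumerated.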

@ianbmacdonald

Collaborator

For repo hygiene: this PR looks like it resolves a cluster of three open issues reporting the same root cause (ROCm backends marked unsupported because the detected gfx arch no longer matches the family allowlist after 2a7aa18c). Might be worth adding Fixes: lines for them so they auto-close on merge.

They're already cross-linked to each other (#2296, #2302, #2319); this PR is the fix none of them point to yet. I closed my own duplicate attempt (#2324) in favor of this approach. Happy to help verify: I reproduced #2319 on an RX 7900 XT (gfx1100) and confirmed family matching restores ROCm detection.

Labels

bug Something isn't working priority::😎warm runtime::rocm AMD ROCm runtime

Development

Successfully merging this pull request may close these issues.

gfx1201 unsupported with rocm (9070xt)

5 participants