feat(rocm): add gfx908 (MI100) and gfx90a (MI210) GPU support#2092

Open
kenvandine wants to merge 3 commits into main from kenvandine/gfx90a_gfx908

Conversation

@kenvandine
Member

Summary

Adds AMD Instinct MI100 (CDNA1/gfx908) and MI200/MI210 (CDNA2/gfx90a) to the lemonade backend, following the llamacpp-rocm nightly build support added in lemonade-sdk/llamacpp-rocm#103.

  • Add gfx908 and gfx90a to ROCM_ARCH_MAPPING with both the direct arch string (used via HSA/WSL path) and KFD-computed variants (gfx9008, gfx9010) produced by the native Linux digit-only parsing path
  • Extend the gfx arch regex from \d{4} to [0-9a-f]{3,4} so it also matches three-character and hex-suffixed arch strings such as gfx908 and gfx90a
  • Add MI100/MI200/MI210/Arcturus/Aldebaran marketing name recognition as a fallback in identify_rocm_arch_from_name
  • Register gfx908 and gfx90a as supported families for llamacpp rocm and sd-cpp rocm backends
  • Add human-readable device family names for both new architectures
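
For reviewers, the first three bullets can be read together as one detection path. The following is only a minimal sketch under stated assumptions: the names kArchMappingSketch, extract_gfx_token, arch_from_name, and detect_arch_sketch are hypothetical, the map is an illustrative subset of ROCM_ARCH_MAPPING, and the order in which the real code tries the arch string and the marketing name is assumed, not taken from the diff.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <optional>
#include <regex>
#include <string>

// Illustrative subset of the mapping (hypothetical name). Keys are detected
// strings, values are the canonical arch; an empty value would mean
// "no ROCm binary for this ISA".
const std::map<std::string, std::string> kArchMappingSketch = {
    {"gfx908", "gfx908"},   // direct arch string (HSA/WSL path)
    {"gfx9008", "gfx908"},  // KFD-computed variant on native Linux
    {"gfx90a", "gfx90a"},   // direct arch string (HSA/WSL path)
    {"gfx9010", "gfx90a"},  // KFD-computed variant on native Linux
};

// Pull a "gfx..." token out of a longer string using the widened pattern.
std::optional<std::string> extract_gfx_token(const std::string& text) {
    static const std::regex pattern(R"(gfx([0-9a-f]{3,4}))");
    std::smatch match;
    if (!std::regex_search(text, match, pattern)) {
        return std::nullopt;
    }
    return "gfx" + match[1].str();
}

// Fallback: recognise the marketing or code names listed in the PR.
std::optional<std::string> arch_from_name(const std::string& device_name) {
    std::string lower = device_name;
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (lower.find("mi100") != std::string::npos ||
        lower.find("arcturus") != std::string::npos) {
        return std::string("gfx908");
    }
    if (lower.find("mi200") != std::string::npos ||
        lower.find("mi210") != std::string::npos ||
        lower.find("aldebaran") != std::string::npos) {
        return std::string("gfx90a");
    }
    return std::nullopt;
}

// Assumed ordering: try the arch token first, then fall back to the name.
std::optional<std::string> detect_arch_sketch(const std::string& device_string) {
    if (auto token = extract_gfx_token(device_string)) {
        auto it = kArchMappingSketch.find(*token);
        if (it != kArchMappingSketch.end() && !it->second.empty()) {
            return it->second;
        }
    }
    return arch_from_name(device_string);
}
```

Under these assumptions, detect_arch_sketch("gfx9010") and detect_arch_sketch("AMD Instinct MI210") both resolve to gfx90a. One thing to watch with the widened pattern: because it accepts hex letters, a token directly followed by a-f characters would be captured up to four characters long.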

Dependencies

Feature request

Test plan

  • Verify lemonade backends reports rocm as supported on a system with an AMD Instinct MI100 (gfx908)
  • Verify lemonade backends reports rocm as supported on a system with an AMD Instinct MI200/MI210 (gfx90a)
  • Verify lemonade pull + lemonade run works end-to-end with a model on both GPU targets once llamacpp-rocm#103 builds land
  • Confirm existing RDNA2/3/3.5/4 detection is unaffected
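
The last check could be partly automated for the regex change alone. The sketch below assumes the old and new patterns are exactly the ones quoted in the summary and uses representative RDNA2/3/3.5/4 arch strings (gfx1030, gfx1100, gfx1151, gfx1201); the function name is hypothetical and the list is not taken from lemonade's supported set.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns true when the old and the widened pattern capture the same token
// for every representative RDNA arch string.
bool widened_regex_preserves_rdna_matches() {
    const std::regex old_pattern(R"(gfx(\d{4}))");
    const std::regex new_pattern(R"(gfx([0-9a-f]{3,4}))");
    const std::vector<std::string> rdna_archs = {
        "gfx1030",  // RDNA2
        "gfx1100",  // RDNA3
        "gfx1151",  // RDNA3.5
        "gfx1201",  // RDNA4
    };
    for (const auto& arch : rdna_archs) {
        std::smatch old_match;
        std::smatch new_match;
        if (!std::regex_search(arch, old_match, old_pattern)) return false;
        if (!std::regex_search(arch, new_match, new_pattern)) return false;
        if (old_match[1].str() != new_match[1].str()) return false;
    }
    return true;
}
```

This only covers the pattern change; it does not replace verifying detection on real hardware.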

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kenvandine kenvandine marked this pull request as draft June 3, 2026 18:38
@kenvandine kenvandine requested a review from superm1 June 3, 2026 22:17
@kenvandine kenvandine marked this pull request as ready for review June 3, 2026 23:52
// Empty string means "no ROCm binary for this ISA" — skip for get_rocm_arch / install filenames.
const std::map<std::string, std::string> ROCM_ARCH_MAPPING = {
    // CDNA1 - AMD Instinct MI100 (Arcturus)
    {"gfx908", "gfx908"}, // Direct arch string (from HSA/WSL path)

what is this

    {"gfx9008", "gfx908"}, // KFD-computed string on native Linux (90008 → gfx9008)

    // CDNA2 - AMD Instinct MI200/MI210 (Aldebaran)
    {"gfx90a", "gfx90a"}, // Direct arch string (from HSA/WSL path)

what is this

Comment on lines +1908 to +1920
    // CDNA1 GPUs (gfx908 architecture) - AMD Instinct MI100
    if (device_lower.find("mi100") != std::string::npos ||
        device_lower.find("arcturus") != std::string::npos) {
        return "gfx908";
    }

    // CDNA2 GPUs (gfx90a architecture) - AMD Instinct MI200/MI210
    if (device_lower.find("mi200") != std::string::npos ||
        device_lower.find("mi210") != std::string::npos ||
        device_lower.find("aldebaran") != std::string::npos) {
        return "gfx90a";
    }

where did all this come from?

@superm1 superm1 left a comment

don't you need lemonade/llama.cpp too?

@kenvandine
Member Author

Yes, lemonade-sdk/llama.cpp#15

@superm1
Member

superm1 commented Jun 4, 2026

> Yes, lemonade-sdk/llama.cpp#15

That's for openvino though. We need rocm change.

@kenvandine
Member Author

> > Yes, lemonade-sdk/llama.cpp#15
>
> That's for openvino though. We need rocm change.

Sorry, crossed the streams there. Our llama.cpp already includes these in the gpu_targets.

@github-actions github-actions Bot added the engine::llamacpp, runtime::rocm, and enhancement labels Jun 6, 2026
Labels

engine::llamacpp: llama.cpp backend (LlamaCppServer); GPU/CPU LLM inference (Vulkan, ROCm, Metal)
enhancement: New feature or request
runtime::rocm: AMD ROCm runtime

Development

Successfully merging this pull request may close these issues.

MI-100 and MI-210 GPU support

2 participants