feat: add Linux ARM64 CPU and Vulkan llamacpp backend support by kenvandine · Pull Request #2081 · lemonade-sdk/lemonade

kenvandine · 2026-06-02T20:30:33Z

Summary

- Extend RECIPE_DEFS in system_info.cpp to allow the arm64 CPU family for the llamacpp cpu, vulkan, and system backends, so they appear as supported/installable on ARM64 Linux.
- Download ARM64-specific binaries from upstream llama.cpp releases when compiled for aarch64:
  - cpu: llama-{version}-bin-ubuntu-arm64.tar.gz
  - vulkan: llama-{version}-bin-ubuntu-vulkan-arm64.tar.gz
- Fix the get_device_dict() catch block to always set the CPU family field via compile-time macros. Without this, an exception in get_cpu_device() left family missing from the JSON, causing backend matching to fail even after the RECIPE_DEFS change (manifested as "Requires ARM64 processors CPU" despite being on an ARM64 system).
- Update docs/guide/configuration/llamacpp.md and README.md to document ARM64 Linux support for the cpu and vulkan backends.
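
The archive naming described in the summary can be sketched as follows. This is a minimal illustration, not the code in llamacpp_server.cpp: the helper names `linux_arch_suffix` and `linux_asset_name` are hypothetical, and the vulkan x64 name is assumed to follow the same pattern as the arm64 one.

```cpp
#include <string>

// Hypothetical helper: picks the architecture suffix at compile time.
// The PR states the selection happens at compile time; the exact macros
// used in the real source are an assumption here.
constexpr const char* linux_arch_suffix() {
#if defined(__aarch64__) || defined(_M_ARM64)
    return "arm64";
#else
    return "x64";
#endif
}

// Hypothetical helper: builds the upstream archive name for a backend.
// `version` is a release tag such as "b9482"; `backend` is "cpu" or "vulkan".
// `arch` defaults to the compile-time architecture but can be overridden.
std::string linux_asset_name(const std::string& version,
                             const std::string& backend,
                             const std::string& arch = linux_arch_suffix()) {
    std::string name = "llama-" + version + "-bin-ubuntu-";
    if (backend == "vulkan") {
        name += "vulkan-";
    }
    return name + arch + ".tar.gz";
}
```

Keeping `arch` as an overridable parameter makes the naming rule checkable on any host, while the default keeps the compile-time behavior the PR describes.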

Validated against llama.cpp release assets at b9253 and b9482 — both ship bin-ubuntu-arm64.tar.gz and bin-ubuntu-vulkan-arm64.tar.gz. No version bump to backend_versions.json needed.

On ARM64 Linux (e.g., Qualcomm X Elite), vulkan is preferred over cpu by the existing RECIPE_DEFS preference order.
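
A rough sketch of how an ordered backend table can yield "vulkan preferred over cpu" follows. The struct, the table contents, and the `needs_gpu` flag are all hypothetical stand-ins; the real RECIPE_DEFS layout in system_info.cpp is not shown in this PR page.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical shape of one backend entry (not the real RECIPE_DEFS type).
struct BackendDef {
    std::string backend;                 // e.g. "vulkan", "cpu"
    std::set<std::string> cpu_families;  // CPU families the backend supports
    bool needs_gpu;                      // assumption: vulkan needs a usable GPU
};

// Entries are listed in preference order: earlier entries win.
const std::vector<BackendDef> kLlamacppDefs = {
    {"vulkan", {"x86_64", "arm64"}, true},
    {"cpu", {"x86_64", "arm64"}, false},
};

// Returns the first backend whose requirements match the detected system,
// or std::nullopt when no entry supports this CPU family.
std::optional<std::string> preferred_backend(const std::string& cpu_family,
                                             bool has_vulkan_gpu) {
    for (const auto& def : kLlamacppDefs) {
        if (def.cpu_families.count(cpu_family) == 0) continue;
        if (def.needs_gpu && !has_vulkan_gpu) continue;
        return def.backend;
    }
    return std::nullopt;
}
```

Under these assumptions, adding "arm64" to the supported families is enough for the existing ordering to apply unchanged on ARM64.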

Test plan

- Restart lemond on an ARM64 Linux system and run lemonade recipes: llamacpp:cpu and llamacpp:vulkan should show as installable
- lemonade backends install llamacpp:vulkan downloads bin-ubuntu-vulkan-arm64.tar.gz and runs inference
- lemonade backends install llamacpp:cpu downloads bin-ubuntu-arm64.tar.gz and runs inference
- Existing x86_64 Linux behavior unchanged (still downloads x64 variants)
- macOS and Windows builds unaffected

🤖 Generated with Claude Code

- Download arm64 binaries (cpu and vulkan) from ggml-org/llama.cpp releases when compiled for aarch64 Linux
- Extend RECIPE_DEFS to allow arm64 CPU family for llamacpp cpu, vulkan, and system backends so they appear as installable on ARM64
- Fix get_device_dict() catch block to always set the cpu family via compile-time macros; without this, an exception in get_cpu_device() left the family field missing, causing backend matching to fail even after the RECIPE_DEFS change

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
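
The catch-block fix in this commit can be illustrated with a small sketch. It assumes a plain `std::map` as a stand-in for the JSON object and an injected `probe` callable in place of get_cpu_device(); the function names, the "available" and "error" keys, and the macro list are illustrative, not taken from system_info.cpp.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for the JSON device object (assumption for this sketch).
using DeviceDict = std::map<std::string, std::string>;

// Derives the CPU family from compiler-defined macros, so it is known
// even when runtime detection fails.
constexpr const char* compile_time_cpu_family() {
#if defined(__aarch64__) || defined(_M_ARM64)
    return "arm64";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86_64";
#else
    return "unknown";
#endif
}

// Runs the (possibly throwing) probe. On failure, the fallback dictionary
// still carries "family", so later backend matching has something to match on.
DeviceDict cpu_device_dict(const std::function<DeviceDict()>& probe) {
    try {
        return probe();
    } catch (const std::exception& e) {
        DeviceDict fallback;
        fallback["available"] = "false";
        fallback["error"] = e.what();
        fallback["family"] = compile_time_cpu_family();
        return fallback;
    }
}
```

The point of the pattern is that the error path and the success path both populate the key that backend matching reads.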

Adds two new jobs to cpp_server_build_test_release.yml:
- build-lemonade-linux-arm64: compiles lemond and lemonade on the GitHub-provided ubuntu-24.04-arm runner, confirming the ARM64 code path builds cleanly on every PR.
- test-cli-endpoints-linux-arm64: runs the cli, endpoints, ollama, and streaming-errors test suites against the built ARM64 binary. Omits llamacpp-system (no system llama-server), env-vars (requires .deb path), and Vulkan inference (no GPU on GitHub-hosted ARM64 runners).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fl0rianr

Thanks for bringing this in @kenvandine! I figured it might be welcome if I took a look here as well.

Non-blocking: after the ARM64 server startup issue is fixed, it might be useful to add one small ARM64-specific smoke test for the core change in this PR.

The PR changes the llama.cpp asset names to bin-ubuntu-arm64.tar.gz / bin-ubuntu-vulkan-arm64.tar.gz, but this matrix currently only runs the generic CLI/endpoint/Ollama tests. At least lemonade backends install llamacpp:cpu should be testable on the ARM64 runner and would cover the new CPU asset path directly. Vulkan may be harder without GPU access.

Replace the generic "family" JSON key in device dictionaries with specific names that communicate what the field represents:
- CPU devices: "cpu_isa" (e.g. "x86_64", "arm64")
- GPU devices (AMD, NVIDIA, Metal): "gpu_isa" (e.g. "gfx1151", "sm_89", "metal")
- NPU devices: "npu_isa" (e.g. "XDNA2")

Addresses PR feedback from r3349563880.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
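
As a small illustration of the key naming in this commit, the sketch below maps a device kind to its JSON key. The `DeviceKind` enum and both helper functions are hypothetical; only the three key names and the example values come from the commit message.

```cpp
#include <string>

// Hypothetical device classification for this sketch.
enum class DeviceKind { Cpu, Gpu, Npu };

// Returns the JSON key that holds the ISA string for a device kind.
std::string isa_key(DeviceKind kind) {
    switch (kind) {
        case DeviceKind::Cpu: return "cpu_isa";
        case DeviceKind::Gpu: return "gpu_isa";
        case DeviceKind::Npu: return "npu_isa";
    }
    return "";  // not reached for valid enum values
}

// Renders a single-field JSON object such as {"gpu_isa": "gfx1151"}.
// No escaping is done; the inputs are assumed to be plain identifiers.
std::string isa_entry(DeviceKind kind, const std::string& isa) {
    return "{\"" + isa_key(kind) + "\": \"" + isa + "\"}";
}
```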

The ARM64 test job was missing the server startup step, causing all tests to fail with "Server is not running on port 13305". Add a "Start lemond server" step that sets XDG_RUNTIME_DIR, launches ./build/lemond in the background, and polls /live for up to 60 seconds before timing out with a log dump. Addresses PR feedback from r3349545074. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
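
The "poll /live for up to 60 seconds" logic is implemented as a workflow shell step in the PR. The same wait-with-deadline pattern is sketched here in C++ for illustration only: `wait_until_live` is a hypothetical name, and the readiness check is injected as a callable rather than performing a real HTTP request.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Repeatedly calls `is_live` until it returns true or the timeout elapses.
// Returns true when the service became live, false on timeout (the point
// at which the CI step would dump the server log and fail).
bool wait_until_live(const std::function<bool()>& is_live,
                     std::chrono::milliseconds timeout = std::chrono::seconds(60),
                     std::chrono::milliseconds interval = std::chrono::seconds(1)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (true) {
        if (is_live()) {
            return true;
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
        std::this_thread::sleep_for(interval);
    }
}
```

Using `steady_clock` keeps the deadline immune to wall-clock adjustments during the wait.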

superm1

Doesn't the llama.cpp uprev job need to be changed too?

kenvandine · 2026-06-03T18:18:27Z

No change needed to the uprev job. The ARM64 and x86 Linux builds come from the same upstream llama.cpp release tag — llamacpp_server.cpp selects the right archive name at compile time (bin-ubuntu-arm64.tar.gz vs bin-ubuntu-x64.tar.gz), but both pull from the same release. So when the uprev job bumps llamacpp.cpu and llamacpp.vulkan in backend_versions.json, the updated version applies to both architectures automatically.

The one gap is that the validate job only tests on Windows self-hosted runners and won't exercise the Linux ARM64 download paths, but that requires an ARM64 self-hosted runner and is out of scope here.
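
To make the "same release, different archive" point concrete, here is a sketch that builds download URLs for both architectures from one version tag. The function name is hypothetical, and the URL layout assumes GitHub's standard release-asset path under ggml-org/llama.cpp; the actual URL construction in llamacpp_server.cpp may differ.

```cpp
#include <string>

// Hypothetical helper: one version tag (e.g. from backend_versions.json)
// maps to a per-architecture archive inside the same release.
std::string release_asset_url(const std::string& version,
                              const std::string& backend,
                              const std::string& arch) {
    const std::string asset = "llama-" + version + "-bin-ubuntu-" +
                              (backend == "vulkan" ? "vulkan-" : "") +
                              arch + ".tar.gz";
    // Assumed layout: GitHub release assets live under /releases/download/<tag>/.
    return "https://github.com/ggml-org/llama.cpp/releases/download/" +
           version + "/" + asset;
}
```

Because only the trailing asset name varies by architecture, bumping the version string moves both builds to the new release together.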

Ensures the server process inherits HF_HOME so model/cache paths resolve correctly during tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fl0rianr · 2026-06-03T19:31:09Z

From my side this is ready for merge, if the super fast tests are running successfully.

sd-cpp has no ARM64 Linux binary (cpu backend is x86_64 only), so image generation tests fail with 500 on the ARM64 CI runner. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

sd-cpp has no ARM64 Linux binary, so fall through to llamacpp the same way the test already does on macOS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

kenvandine and others added 3 commits June 2, 2026 15:23

docs: update llamacpp backend docs for Linux ARM64 CPU and Vulkan support

eeae7e2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

kenvandine requested a review from jeremyfowers June 2, 2026 20:37

Merge branch 'main' into kenvandine/arm64

ee3b8fc

jeremyfowers reviewed Jun 3, 2026

Comment thread .github/workflows/cpp_server_build_test_release.yml

Merge branch 'main' into kenvandine/arm64

a19af0b

superm1 reviewed Jun 3, 2026

Comment thread src/cpp/server/system_info.cpp Outdated

fl0rianr reviewed Jun 3, 2026

Comment thread .github/workflows/cpp_server_build_test_release.yml

kenvandine and others added 3 commits June 3, 2026 13:56

Merge branch 'main' into kenvandine/arm64

880abbe

superm1 reviewed Jun 3, 2026

fl0rianr reviewed Jun 3, 2026

Comment thread .github/workflows/cpp_server_build_test_release.yml Outdated

ci: set HF_HOME before starting lemond on ARM64 test job

5ef3a94

Merge branch 'main' into kenvandine/arm64

e8d68be

kenvandine requested a review from jeremyfowers June 4, 2026 00:53

kenvandine and others added 3 commits June 4, 2026 15:26

test: skip sd-cpp tests on Linux ARM64

c93ed48

test: use llamacpp recipe in pull_multi test on Linux ARM64

8e82061

Merge branch 'main' into kenvandine/arm64

2c36484

jeremyfowers added this to the Lemonade v10.7 milestone Jun 5, 2026

kenvandine added 3 commits June 5, 2026 23:54

Merge branch 'main' into kenvandine/arm64

8d15427

Merge branch 'main' into kenvandine/arm64

5c8f451

Merge branch 'main' into kenvandine/arm64

c703874

github-actions Bot added labels Jun 6, 2026:
- engine::llamacpp: llama.cpp backend (LlamaCppServer); GPU/CPU LLM inference (Vulkan, ROCm, Metal)
- runtime::vulkan: Vulkan runtime / GPU backend
- enhancement: New feature or request
- documentation: Improvements or additions to documentation

kenvandine and others added 2 commits June 6, 2026 23:48

Merge branch 'main' into kenvandine/arm64

4e12aab

Merge branch 'main' into kenvandine/arm64

33605b1

kenvandine wants to merge 18 commits into main from kenvandine/arm64

4 participants
