feat: add llama.cpp OpenVINO backend for Linux by kenvandine · Pull Request #2085 · lemonade-sdk/lemonade

kenvandine · 2026-06-03T00:21:31Z

Adds support for the upstream ggml-org/llama.cpp OpenVINO backend on Linux. OpenVINO enables inference on Intel CPUs, iGPUs, dGPUs, and NPUs via Intel's optimization runtime.

- backend_versions.json: pin openvino to b9253 (same build as cpu/vulkan/metal)
- llamacpp_server.cpp: add is_llamacpp_openvino_backend() helper; handle openvino in get_install_params() (Linux-only, ggml-org/llama.cpp release asset llama-{ver}-bin-ubuntu-openvino-x64.tar.gz); enable context-shift and LD_LIBRARY_PATH setup for OpenVINO like CUDA/Vulkan
- system_info.cpp: register llamacpp/openvino in RECIPE_DEFS for Linux x86_64 (between Vulkan and ROCm in preference order)
- defaults.json: add openvino_args and openvino_bin defaults to llamacpp section
- config_file.cpp: add LEMONADE_LLAMACPP_OPENVINO_ARGS and LEMONADE_LLAMACPP_OPENVINO_BIN env variable mappings
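
To illustrate what "between Vulkan and ROCm in preference order" could mean in practice, here is a minimal sketch of a preference-ordered backend pick. It is not the RECIPE_DEFS code from system_info.cpp; `kLinuxX64Preference` and `pick_backend` are hypothetical names, and only the relative order of the three entries is taken from the description above.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Hypothetical preference list. Only the relative order
// vulkan -> openvino -> rocm comes from the PR description;
// the real list may contain other backends.
static const std::vector<std::string> kLinuxX64Preference = {
    "vulkan", "openvino", "rocm"};

// Returns the first backend in preference order that also appears in
// `supported`, or std::nullopt when none of them is supported.
std::optional<std::string> pick_backend(
    const std::vector<std::string>& preference,
    const std::vector<std::string>& supported) {
    for (const auto& name : preference) {
        if (std::find(supported.begin(), supported.end(), name) !=
            supported.end()) {
            return name;
        }
    }
    return std::nullopt;
}
```

With this ordering, a machine that supports both OpenVINO and ROCm would get OpenVINO, while a machine that also supports Vulkan would get Vulkan.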

https://claude.ai/code/session_01A3rK6yxjK9h7pe4ikrAvqj

Copilot

Pull request overview

This pull request adds a new llama.cpp OpenVINO backend option for Linux to the Lemonade server, including version pinning, backend selection support, and configuration defaults/env-var wiring so users can install and run the upstream OpenVINO release assets.

Changes:

- Registers llamacpp/openvino as a supported backend on Linux x86_64 and adds it into the backend preference ordering.
- Extends the llama.cpp backend installer/runtime wiring to fetch the OpenVINO Linux asset and set LD_LIBRARY_PATH appropriately.
- Adds config defaults and environment-variable mappings for openvino_args and openvino_bin, plus a backend version pin.
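
The LD_LIBRARY_PATH step mentioned above can be sketched as a small string helper. This is an illustration under assumptions, not the code in llamacpp_server.cpp: `prepend_library_path` is a hypothetical name, and the directory passed in stands for wherever the extracted tarball keeps its shared libraries.

```cpp
#include <string>

// Prepends `lib_dir` to an existing LD_LIBRARY_PATH value.
// When the current value is empty, no ':' is added, because a
// zero-length entry makes the dynamic loader search the current
// working directory.
std::string prepend_library_path(const std::string& lib_dir,
                                 const std::string& current) {
    if (current.empty()) {
        return lib_dir;
    }
    return lib_dir + ":" + current;
}
```

The result would then be exported for the child process, for example with POSIX `setenv("LD_LIBRARY_PATH", value.c_str(), 1)` before launching the server binary.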

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `src/cpp/server/system_info.cpp` | Adds `llamacpp/openvino` to the supported backend matrix for Linux x86_64 and places it in the preference order. |
| `src/cpp/server/config_file.cpp` | Adds env-var mappings for `LEMONADE_LLAMACPP_OPENVINO_ARGS` and `LEMONADE_LLAMACPP_OPENVINO_BIN`. |
| `src/cpp/server/backends/llamacpp_server.cpp` | Adds OpenVINO install asset selection, context-shift enabling, and `LD_LIBRARY_PATH` setup for the OpenVINO tarball layout. |
| `src/cpp/resources/defaults.json` | Introduces default `llamacpp.openvino_args` and `llamacpp.openvino_bin` values. |
| `src/cpp/resources/backend_versions.json` | Pins `llamacpp.openvino` to the same build tag as other upstream llama.cpp backends. |

superm1 · 2026-06-03T11:23:21Z

    {"LEMONADE_LLAMACPP_ROCM_ARGS",      "llamacpp",  "rocm_args"},
    {"LEMONADE_LLAMACPP_VULKAN_ARGS",    "llamacpp",  "vulkan_args"},
    {"LEMONADE_LLAMACPP_CPU_ARGS",       "llamacpp",  "cpu_args"},
+    {"LEMONADE_LLAMACPP_OPENVINO_ARGS",  "llamacpp",  "openvino_args"},


isn't this migration code? We didn't have OpenVINO support before, so how can you migrate?

These aren't migration-only mappings — migrate_from_env() is called on every fresh install to bootstrap config.json from env vars. All backends use the same mechanism (see LEMONADE_LLAMACPP_VULKAN_ARGS, LEMONADE_LLAMACPP_CUDA_BIN, etc.). Adding OpenVINO entries here means users who configure via env vars get them picked up on first run, consistent with the existing pattern.
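
For readers unfamiliar with the pattern being discussed, the following is a rough sketch of a table-driven env-to-config bootstrap. It is not the actual migrate_from_env() implementation: `EnvMapping`, `Config`, and `apply_env_mappings` are hypothetical names, and the three-field row simply mirrors the `{env name, section, key}` shape visible in the diff above.

```cpp
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// One row per variable: environment name, config section, config key.
struct EnvMapping {
    std::string env_name;
    std::string section;
    std::string key;
};

// Hypothetical in-memory stand-in for config.json: section -> key -> value.
using Config = std::map<std::string, std::map<std::string, std::string>>;

// Copies every environment variable that is set into the config under its
// mapped section/key. Unset variables are skipped, so they add no entries.
void apply_env_mappings(const std::vector<EnvMapping>& mappings,
                        Config& config) {
    for (const auto& m : mappings) {
        if (const char* value = std::getenv(m.env_name.c_str())) {
            config[m.section][m.key] = value;
        }
    }
}
```

Under this design, supporting a new variable only requires a new table row, which is why the OpenVINO entries in the diff are one-line additions.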

I could have sworn there was a discussion somewhere about axing them.

@jfowers comments please

Please see #2106. I'm getting rid of cruft, don't add more.

- backend_versions.json: update openvino build to b9488 (matches first upstream release with OpenVINO asset), add openvino.runtime_version=2026.0 to encode the OpenVINO runtime version embedded in the asset filename
- llamacpp_server.cpp: add get_openvino_runtime_version() helper (mirrors get_therock_version()); use it to construct the correct asset filename llama-{build}-bin-ubuntu-openvino-{runtime}-x64.tar.gz
- llamacpp_server.cpp: consolidate duplicate CUDA + OpenVINO LD_LIBRARY_PATH blocks into a single combined branch

https://claude.ai/code/session_01A3rK6yxjK9h7pe4ikrAvqj
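
The filename pattern in this commit message can be sketched as a simple string builder. `openvino_asset_name` is a hypothetical helper written for illustration; it is not the get_install_params() or get_openvino_runtime_version() code from the PR, and it only reproduces the pattern quoted above.

```cpp
#include <string>

// Builds the release asset name following the pattern quoted in the
// commit message:
//   llama-{build}-bin-ubuntu-openvino-{runtime}-x64.tar.gz
std::string openvino_asset_name(const std::string& build,
                                const std::string& runtime_version) {
    return "llama-" + build + "-bin-ubuntu-openvino-" + runtime_version +
           "-x64.tar.gz";
}
```

With the values named in the commit message, `openvino_asset_name("b9488", "2026.0")` yields `llama-b9488-bin-ubuntu-openvino-2026.0-x64.tar.gz`.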

Switch the OpenVINO backend download source from ggml-org/llama.cpp to lemonade-sdk/llama.cpp, which bundles the OpenVINO runtime libs in the tarball so no system-wide OpenVINO install is required. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

superm1 · 2026-06-05T18:29:05Z

+  "openvino": {
+    "runtime_version": "2026.0"


does the runtime need to get installed somehow?

claude and others added 2 commits June 2, 2026 23:23

Merge branch 'main' into claude/llama-cpp-openvino-linux-0n3HI

becb691

kenvandine requested a review from Copilot June 3, 2026 03:42

Copilot started reviewing on behalf of kenvandine June 3, 2026 03:42

Copilot AI reviewed Jun 3, 2026

superm1 reviewed Jun 3, 2026

Comment thread src/cpp/server/backends/llamacpp_server.cpp Outdated

superm1 reviewed Jun 3, 2026

Comment thread src/cpp/server/backends/llamacpp_server.cpp

superm1 reviewed Jun 3, 2026

kenvandine and others added 3 commits June 3, 2026 17:19

Merge branch 'main' into claude/llama-cpp-openvino-linux-0n3HI

9df3afb

superm1 reviewed Jun 4, 2026

Comment thread src/cpp/server/system_info.cpp

kenvandine added 2 commits June 4, 2026 16:09

Merge branch 'main' into claude/llama-cpp-openvino-linux-0n3HI

ae2b453

Merge branch 'main' into claude/llama-cpp-openvino-linux-0n3HI

4d70b52

superm1 reviewed Jun 5, 2026

github-actions bot added the engine::llamacpp, runtime::cpu, and enhancement labels Jun 6, 2026

kenvandine wants to merge 7 commits into lemonade-sdk:main from kenvandine:claude/llama-cpp-openvino-linux-0n3HI
