feat(vendor): add llama.cpp vendor plugin by xb17 · Pull Request #2131 · danielmiessler/Fabric

xb17 · 2026-05-28T19:09:10Z

What

Adds a dedicated vendor plugin for llama.cpp server.

Background

Issue #2072 requested llama.cpp support. Currently users have to route through the openai_compatible driver pointed at http://localhost:8080/v1, which works but requires manual setup and doesn't expose llama.cpp-specific behaviour.

Why a dedicated driver

llama.cpp's server is OpenAI-compatible but differs in a few meaningful ways:

| | `openai_compatible` | `llama.cpp` |
|---|---|---|
| Default URL | varies | `http://localhost:8080/v1` |
| API key | required field | optional (local server) |
| `cache_prompt` | not sent | ✓ sent (reuses KV cache for repeated system prompts) |
| SDK dependency | OpenAI Go SDK | none (hand-rolled HTTP) |

The cache_prompt: true field is particularly valuable for Fabric usage patterns, where the same system prompt (pattern) is sent repeatedly across requests. llama.cpp reuses the KV cache for the matching prefix, significantly reducing time-to-first-token on subsequent requests.

Implementation

Follows the same hand-rolled HTTP approach as the LM Studio plugin. Implements ListModels, SendStream, and Send. API key is optional — the Authorization header is only set when a key is configured.

New and changed files:

- internal/plugins/ai/llamacpp/llamacpp.go
- i18n keys in internal/i18n/locales/en.json
- Registration in internal/core/plugin_registry.go

Closes #2072

Adds a dedicated vendor plugin for llama.cpp server (https://github.com/ggml-org/llama.cpp).

llama.cpp exposes an OpenAI-compatible REST API but has behaviour that differs from the generic openai_compatible driver:

- Default base URL: http://localhost:8080/v1 (llama.cpp default port)
- No API key required (auth header is sent only when a key is configured)
- Supports cache_prompt to reuse the KV cache across requests that share a common prefix (e.g. the same system prompt), reducing latency
- Does not use LM Studio-specific extensions (chat_template_kwargs, etc.)

The driver follows the same hand-rolled HTTP pattern as the LM Studio plugin and implements ListModels, SendStream, and Send.

Closes danielmiessler#2072

xb17 wants to merge 1 commit into danielmiessler:main from xb17:feat/llamacpp-vendor
