refactor(backends): self-describing WrappedServer backends (#2287) #2320
jeremyfowers wants to merge 5 commits into
Make each inference backend describe itself with a plain-data descriptor plus a server class, and rewrite the scattered `if (recipe == "...")` sites to read a registry built from those descriptors. Adding a backend becomes one LEMON_BACKENDS line plus a descriptor + factory file, with no router, CLI, docs, or support-matrix edits.
- Descriptor types (BackendDescriptor/BackendOption/SlotPolicy) + a CLI-safe data registry and a server-only factory registry, generated from the LEMON_BACKENDS list at CMake configure time.
- All 9 backends carry a descriptor (device, slot policy, options, support matrix, labels, binary) and a create().
- Descriptor-driven: router creation, NPU/slot eviction, device type, recipe options/CLI flags, config-section identity, support matrix, recipe labels, cloud availability.
- /system-info recipes enriched with display_name/selectable_backend/options/support; the app reads recipe display names from it instead of hardcoded TS.
- docs/tools/gen_backend_docs.py generates docs/dev/backends-reference.md from /system-info; a CI step fails on drift. Authoring guide in docs/dev/adding-a-backend.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
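As a rough illustration of how a registry lookup can stand in for a chain of `if (recipe == "...")` branches, here is a minimal C++17 sketch. Only the type names `BackendDescriptor` and `SlotPolicy` come from this PR; the field names, the `SlotPolicy` enumerators, the `find_descriptor` and `needs_exclusive_npu` helpers, and the two example entries (including their display names) are assumptions made for illustration.

```cpp
#include <array>
#include <string_view>

// Hypothetical shapes: only the type names come from the PR.
enum class SlotPolicy { Shared, ExclusiveNpu };

struct BackendDescriptor {
    std::string_view recipe;        // key used to select the backend
    std::string_view display_name;  // label for UIs (illustrative values)
    std::string_view device;        // e.g. "gpu" or "npu"
    SlotPolicy slot_policy;
};

// Stand-in for the generated, CLI-safe data registry.
inline constexpr std::array<BackendDescriptor, 2> kRegistry{{
    {"llamacpp", "Llama.cpp", "gpu", SlotPolicy::Shared},
    {"flm", "FastFlowLM", "npu", SlotPolicy::ExclusiveNpu},
}};

// One lookup replaces every per-recipe branch.
constexpr const BackendDescriptor* find_descriptor(std::string_view recipe) {
    for (const auto& d : kRegistry) {
        if (d.recipe == recipe) return &d;
    }
    return nullptr;  // unknown recipe
}

// Example call site: eviction logic asks the descriptor, not the recipe name.
constexpr bool needs_exclusive_npu(std::string_view recipe) {
    const BackendDescriptor* d = find_descriptor(recipe);
    return d != nullptr && d->slot_policy == SlotPolicy::ExclusiveNpu;
}
```

With this shape, a call site such as NPU eviction needs no edit when a backend is added; only the registry grows.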
CI status
All cross-platform builds pass (MSVC, AppleClang, GCC, Arch, openSUSE, Fedora rpm), validating the descriptor aggregate-init, CMake
The single red: Test CLI/Endpoints (windows-latest) →
This PR touches backend construction, not inference,
Restructure the self-describing backends to the layout the issue #2287 plan specified (one folder per backend) instead of the flat file layout I used before. This also folds the earlier _descriptor/_factory split into the spec's cleaner shape: the descriptor is a header-only `inline const` and create() lives with the server class.

Each backend now lives in its own folder, in namespace lemon::backends::<stem>:
  include/lemon/backends/<stem>/<stem>.h           inline const descriptor (CLI-safe data)
  include/lemon/backends/<stem>/<stem>_server.h    WrappedServer subclass + create() decl
  server/backends/<stem>/<stem>_server.cpp         implementation + create() def

Shared registry/util files stay at the top of backends/. The CMake foreach over LEMON_BACKENDS compiles each <stem>/<stem>_server.cpp and generates the registry headers from the folder paths. Removes the per-backend *_descriptor.{h,cpp} and *_factory.{h,cpp} files. Behavior is unchanged (same descriptors, same create()).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
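The per-folder layout above could be sketched in a single file as follows. This is a minimal C++17 illustration, not the PR's code: the `BackendDescriptor` fields, the `example` stem, `ExampleServer`, `FactoryEntry`, `factory_registry`, and `make_server` are all hypothetical names, and the real registry headers are generated by CMake rather than written by hand.

```cpp
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical minimal stand-ins for the real types named in the PR.
struct BackendDescriptor {
    std::string recipe;
    std::string binary;
};

class WrappedServer {
public:
    virtual ~WrappedServer() = default;
    virtual std::string name() const = 0;
};

// --- <stem>.h: CLI-safe data only. A C++17 inline variable has a single
// definition program-wide even when the header is included in many TUs.
namespace lemon::backends::example {
inline const BackendDescriptor descriptor{"example", "example-server"};
}

// --- <stem>_server.h / <stem>_server.cpp: server class plus create().
namespace lemon::backends::example {
class ExampleServer : public WrappedServer {
public:
    std::string name() const override { return descriptor.recipe; }
};
inline std::unique_ptr<WrappedServer> create() {
    return std::make_unique<ExampleServer>();
}
}  // namespace lemon::backends::example

// --- Server-only factory registry: binds each descriptor to its create().
struct FactoryEntry {
    const BackendDescriptor* descriptor;
    std::unique_ptr<WrappedServer> (*create)();
};

inline const std::vector<FactoryEntry>& factory_registry() {
    static const std::vector<FactoryEntry> entries{
        {&lemon::backends::example::descriptor, &lemon::backends::example::create},
    };
    return entries;
}

inline std::unique_ptr<WrappedServer> make_server(std::string_view recipe) {
    for (const auto& e : factory_registry()) {
        if (std::string_view(e.descriptor->recipe) == recipe) return e.create();
    }
    return nullptr;  // unknown recipe
}
```

Keeping the descriptor in a header with no dependency on the server class is what would let a CLI include the data without linking the server implementation.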
Make the existing curated docs generate from the backend descriptors instead of
just shipping a separate reference file — closing appendix rows 14 and 22.
- Expand the descriptor with the editorial fields the curated docs need:
`modality`, `experimental`, `web_display_name`, and a per-support-row
`device_summary` (RecipeBackendDef). These keep the descriptor the single
source of truth.
- /system-info exposes them plus a registry `order` index and `slot_policy`.
- gen_backend_docs.py now targets multiple docs and renders:
* README.md "Supported Configurations" HTML matrix (grouped by modality,
merged rows, rowspans, experimental tag) — wrapped in GENERATED markers;
* docs/guide/configuration/multi-model.md NPU-exclusivity list.
The backend-docs-drift CI job's --check now covers all three docs.
The generated README matrix is also more complete than the hand-written one
(it now includes whispercpp rocm/metal, kokoro metal, sd-cpp metal). Footnotes
and prose outside the markers are preserved.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wrap cli.md's "Recipe-Specific Options" tables in GENERATED markers and render them from the descriptor options. This also fixes pre-existing drift: the section documented `--steps`/`--cfg-scale`/`--width`/`--height` flags that the CLI no longer registers, and omitted the moonshine and vllm recipes. Now covered by the backend-docs-drift CI check.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add inline-marker support to the generator and wrap the `--recipe` "Common values" list in custom-models.md so it renders from the descriptor recipe set (plus collection.omni). Now covered by the backend-docs-drift CI check.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
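The marker-based regeneration and drift check described in the commits above can be sketched as follows. The actual generator is the Python script `docs/tools/gen_backend_docs.py`; this C++ version only illustrates the technique, and the function names `render_region` and `has_drift`, as well as the exact marker strings a caller would pass, are assumptions.

```cpp
#include <optional>
#include <string>

// Replace the text between the `begin` and `end` markers with `generated`,
// keeping the markers and everything outside them untouched.
// Returns std::nullopt if either marker is missing or out of order.
std::optional<std::string> render_region(const std::string& doc,
                                         const std::string& begin,
                                         const std::string& end,
                                         const std::string& generated) {
    const std::size_t b = doc.find(begin);
    if (b == std::string::npos) return std::nullopt;
    const std::size_t body_start = b + begin.size();
    const std::size_t e = doc.find(end, body_start);
    if (e == std::string::npos) return std::nullopt;
    return doc.substr(0, body_start) + "\n" + generated + "\n" + doc.substr(e);
}

// A `--check`-style drift test: the doc is clean when re-rendering is a no-op.
bool has_drift(const std::string& doc, const std::string& begin,
               const std::string& end, const std::string& generated) {
    const auto rendered = render_region(doc, begin, end, generated);
    return !rendered.has_value() || *rendered != doc;
}
```

Because only the span between the markers is rewritten, hand-written prose and footnotes outside them survive regeneration, which matches the behavior the commits describe.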
Implements the plan in #2287: each inference backend describes itself with a plain-data descriptor plus a server class, and every scattered `if (recipe == "...")` site is rewritten to read a registry built from those descriptors.

What changed
Adding a backend is now one `LEMON_BACKENDS` line + a `<stem>_descriptor.cpp` (data) + a `<stem>_factory.cpp` (`create()`). No router, CLI, docs, or support-matrix edits; those are all derived.

- Descriptor types: `BackendDescriptor`/`BackendOption`/`SlotPolicy` (`backend_descriptor.h`); `RecipeBackendDef` moved to a shared header.
- Registries generated from `LEMON_BACKENDS` at CMake configure time: a CLI-safe data registry (descriptors only, links into both `lemonade` and `lemond`) and a server-only factory registry (binds each descriptor to its class's `create()`). This split is what lets the CLI read recipe options/flags from descriptors without linking server classes.
- All 9 backends carry a descriptor and a `create()`.
- Descriptor-driven: router creation, NPU/slot eviction (`SlotPolicy`), device type, recipe options / CLI flags / defaults, config-section identity, support matrix (`RECIPE_DEFS` deleted), recipe→label inference, cloud availability.
- `/system-info` `recipes` entries enriched with `display_name`/`selectable_backend`/`uses_ctx_size`/`options`/`support`. The desktop app now reads recipe display names from `/system-info` instead of hardcoded TypeScript.
- `docs/tools/gen_backend_docs.py` boots `lemond`, reads `/system-info` + `server_models.json`, and rewrites marker-delimited regions of `docs/dev/backends-reference.md`. A new CI job (`backend-docs-drift`) fails on drift. Authoring guide: `docs/dev/adding-a-backend.md`.

Corner cases / cleanups
- `ryzenai` everywhere (was inconsistent between `s_backend_names` and `recipe_to_config_section`).
- `ModelInfo::extras` map (from unknown `server_models.json` keys) so new backends add per-model fields without editing shared structs.

Verification
Local (this machine): `lemond` + `lemonade` CLI + web-app build green; `tsc` clean. Passing suites: server_endpoints (69), server_pinning (6), app-regression (37), test_model_name_normalization, test_cuda_arch_mapping. CLI `run --help` shows all descriptor-derived flags; `/system-info` carries the enriched fields; docs `--check` is clean.

Pre-existing failures unrelated to this change (reproduced identically on `main`): `test_flm_status` (stale message expectations, 16), `test_llamacpp_system_backend` (HIP plugin required on AMD-GPU hosts), `test_multi_checkpoint_completeness` (model pull/network), `server_eviction` (references `phi-3-mini-4k-instruct-q4`, absent from the registry), `server_cli2` test_020 (built-in model name "Lite Collection" with a space breaks the test's whitespace parser). Relying on CI for clean-environment + cross-platform validation.

Notes for reviewers
- `recipeOptionsConfig.ts` (the deeply TypeScript-typed per-recipe option forms) is intentionally left to maintainers per AGENTS.md; the schema is now exposed via `/system-info` for a future dynamic migration.
- `BackendSpec` (install params are class-side behavior); the descriptor supplies the binary name.

🤖 Generated with Claude Code
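For reviewers curious about the `ModelInfo::extras` cleanup, the idea of routing unknown `server_models.json` keys into a side map might look like the sketch below. It is an assumption-laden illustration: the trimmed `ModelInfo` fields, the `parse_model` helper, the choice of `checkpoint` and `recipe` as the known keys, and the flattening of values to strings are all hypothetical and not taken from the PR's code.

```cpp
#include <map>
#include <string>

// Hypothetical, trimmed-down ModelInfo: two typed fields plus the extras map.
struct ModelInfo {
    std::string checkpoint;
    std::string recipe;
    std::map<std::string, std::string> extras;  // unknown keys land here
};

// `entry` stands in for one parsed server_models.json object, with values
// already flattened to strings for the sake of the sketch.
ModelInfo parse_model(const std::map<std::string, std::string>& entry) {
    ModelInfo info;
    for (const auto& [key, value] : entry) {
        if (key == "checkpoint") {
            info.checkpoint = value;
        } else if (key == "recipe") {
            info.recipe = value;
        } else {
            info.extras.emplace(key, value);  // preserved, not dropped
        }
    }
    return info;
}
```

A backend that needs a new per-model field would then read it from `extras` by key, leaving the shared struct unchanged.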