Two small, related UX improvements to the Model Manager that make it clearer what your machine can run and what each model is for. Both are renderer-side and need no API changes. I have a working proof-of-concept from #1390, which I've just closed in favour of focused, separately reviewed pieces, and I'm happy to implement whichever direction you prefer. I'm opening this issue first because the change touches the UI and is worth a design discussion.
1. Backend status chips (system state, in the options popover)
Today there's no indication of what Lemonade could run on a given machine. On a Ryzen AI Max+ "Strix Halo", for example, models might run on llama.cpp (Vulkan/ROCm), vLLM (ROCm), or the NPU backends — but unless you already know that and have each backend set up, the capability is invisible; you find out by trial and error.
The proposal: in the Model Manager's options popover (alongside By recipe / Downloaded only), show a compact row of chips reflecting this system's backend readiness — one chip per backend, colour-coded:
| Chip | Meaning |
|---|---|
| 🟢 Green | installed: ready to use |
| 🟡 Yellow | installable: supported here, not installed yet |
| 🔴 Red | supported by your hardware, but not configured/available |
| ⚪ Grey | not supported on this hardware (informational) |
(The mock is only to convey the idea — the labels, layout, and placement would all be refined; it could be a lot cleaner.)
The red vLLM chip is the whole point: vLLM is a viable path on this hardware, but nothing surfaces that today. A compact system-state row turns "what can I actually do on this box?" into a glance — and nudges users toward backends worth enabling.
These states are computable entirely client-side from data the API already returns: /system-info recipes[].backends[].state enumerates every recipe — including ones unsupported on the current hardware (they render grey). No backend change is needed. (#1390 already computed a 3-state version of this — installed/available/unsupported; the 4-state split here is a small refinement of the existing backend.state values, e.g. separating installable from action_required.)
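To make the client-side computation concrete, here is a rough sketch of the state-to-chip mapping. It assumes /system-info returns recipes as an array, each with a backends array carrying name and state, and that the state strings are installed, installable, action_required, and unsupported. The real response shape and state names may differ, and every type and function name below is hypothetical.

```typescript
type ChipColour = "green" | "yellow" | "red" | "grey";

// Assumed shape of the relevant part of the /system-info response.
interface BackendInfo {
  name: string;
  state: string;
}
interface RecipeInfo {
  name: string;
  backends: BackendInfo[];
}
interface SystemInfo {
  recipes: RecipeInfo[];
}

interface Chip {
  label: string;
  colour: ChipColour;
}

// Assumed state strings; adjust to whatever backend.state actually reports.
const STATE_TO_COLOUR: Record<string, ChipColour> = {
  installed: "green",
  installable: "yellow",
  action_required: "red",
  unsupported: "grey",
};

// Build one chip per (recipe, backend) pair, in the order the API lists them.
function backendChips(info: SystemInfo): Chip[] {
  const chips: Chip[] = [];
  for (const recipe of info.recipes) {
    for (const backend of recipe.backends) {
      chips.push({
        label: `${recipe.name}/${backend.name}`,
        // Unknown states fall back to grey so a new server-side state
        // never shows up as a misleading "ready" chip.
        colour: STATE_TO_COLOUR[backend.state] ?? "grey",
      });
    }
  }
  return chips;
}
```

Falling back to grey for unrecognised states is a design choice in this sketch, not something the API requires.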
2. Hugging Face model-card links
Separately: link each model to its Hugging Face model card. Curated/baked-in models currently surface little about what they're for beyond raw tags, and there's a real limit to how much that can be conveyed with badges/icons (a crab for one thing, 1010 for "coding", …) before it stops scaling.
Each model's name becomes a link to its Hugging Face card (hovering shows the URL), opening in a new tab. One click takes a user from the model row to the source, where the real research lives:
That click-through is the payoff. For example, an unsloth card links out to a how-to for building purpose-built recipes for different agentic workflows (the -it / -general / -coding variants), and its quant-accuracy graphs help pick a quantization, so the user sees more than just "what ships." It also helps explain why Lemonade included a given model. And practically, the model card is where updates and llama.cpp version dependencies tend to get posted, so a link to the card stays current for free, whereas links into Lemonade's own docs would need maintaining.
Scope: curated/baked-in models and HF search results. The card URL is derivable from the existing checkpoint field (https://huggingface.co/{checkpoint} with any :quant suffix stripped); an optional url field in the model schema could cover non-derivable/non-HF cases. (User-registered models would be a follow-up.)
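The derivation could look roughly like the sketch below. It assumes a checkpoint is an owner/name repo id with an optional :quant suffix, and that the optional url field, if it were added to the schema, takes precedence. The function name, the parameter names, and the sample checkpoint in the usage note are all hypothetical.

```typescript
const HF_BASE = "https://huggingface.co/";

// Returns the model-card URL, or null when no Hugging Face URL can be derived.
function modelCardUrl(checkpoint: string, url?: string): string | null {
  // An explicit url (the proposed optional schema field) wins when present.
  if (url) {
    return url;
  }
  // Strip any ":quant" suffix: keep everything before the first colon.
  const colon = checkpoint.indexOf(":");
  const repo = (colon === -1 ? checkpoint : checkpoint.slice(0, colon)).trim();
  // Only "owner/name" repo ids are treated as derivable; local paths and
  // other formats return null so the UI can render a plain, unlinked name.
  if (!/^[\w.-]+\/[\w.-]+$/.test(repo)) {
    return null;
  }
  return HF_BASE + repo;
}
```

For instance, with a made-up checkpoint, modelCardUrl("some-org/some-model-GGUF:Q4_K_M") yields https://huggingface.co/some-org/some-model-GGUF. Returning null instead of a guessed URL keeps non-HF models from getting a broken link.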
Proof of concept
Both were built and running in a working branch (now closed PR #1390), on a much older version of Lemonade — so this is a proven idea, not a paper design, though it would be rebuilt fresh on current main. Backend status as it looked there (an earlier, larger take — bigger chips than the compact ones proposed above):

Would you want this?
If this is a direction you'd like in the Model Manager, I'm happy to refine it and put up focused PRs (card-links and chips separately, off current main). Keen to hear your thoughts.
Thanks!