Skip to content

fix(server): echo persisted recipe_options in the /load response#1706

Open
ianbmacdonald wants to merge 2 commits into
mainfrom
fix/persisted-recipe-options-recall
Open

fix(server): echo persisted recipe_options in the /load response#1706
ianbmacdonald wants to merge 2 commits into
mainfrom
fix/persisted-recipe-options-recall

Conversation

@ianbmacdonald

@ianbmacdonald ianbmacdonald commented Apr 22, 2026

Copy link
Copy Markdown
Collaborator

What & why

save_options: true on POST /load replaces the model's stored recipe options with the set sent in the request (omitting a key clears it). As @bitgamma pointed out, that replace-on-save behavior is intentional — sending recipe options means sending the complete set, which is how a caller unsets one. Agreed; this PR no longer touches those semantics.

The problem it addresses is narrower: that contract is invisible to a client driving the API programmatically. The /load response reports a bare success, so an agent iteratively tuning a recipe (load → measure → adjust → save) can't tell that an option it didn't resend was just dropped — short of a follow-up GET /models or reading the API docs.

Change

Echo the model's persisted recipe_options in the /load success response, mirroring what GET /models already returns (model_info_to_json serializes the same recipe_options.to_json()). Replace-on-save semantics are unchanged — the resulting persisted state is simply made observable.

The echoed set is the persisted one (identical to /models), not the per-request effective options: runtime options sent without save_options are applied for that load but are not reflected here. This is documented explicitly.

  • src/cpp/server/server.cpp — echo recipe_options in both the per-model and collection /load branches (uniform response shape)
  • test/server_endpoints.pytest_012m: asserts the response carries recipe_options and that a subsequent partial save surfaces the cleared key's absence (real RED→GREEN teeth — an unpatched server returns no recipe_options)
  • docs/api/lemonade.md — corrected the stale /load response example and documented the field + persisted-vs-effective semantics

Validation

Recipe-options endpoint suite 6/6 (test_010/011/012/012a/012b/012m) on real hardware:

  • lemonade 10.8.0 (dev .deb), Ubuntu 26.04, kernel 7.0.0-22-generic
  • AMD Radeon RX 7900 XT (Navi 31, gfx1100, 24 GB), backend llamacpp / vulkan

RED→GREEN confirmed against a live server: pre-patch /load returns {status, model_name, checkpoint, recipe} (no recipe_options); post-patch it includes recipe_options, and a partial save visibly drops the omitted key.


Note: this supersedes the earlier merge-on-partial-save approach (which would have changed the replace-on-save contract). The branch was reset to a single clean commit on top of main.

@jeremyfowers

Copy link
Copy Markdown
Member

Please ensure tests are passing and then request review from @bitgamma

@github-actions github-actions Bot added enhancement New feature or request cpp labels Jun 6, 2026
@ianbmacdonald ianbmacdonald force-pushed the fix/persisted-recipe-options-recall branch from 7376f63 to 81dd2bd Compare June 16, 2026 00:39
@ianbmacdonald

Copy link
Copy Markdown
Collaborator Author

Rebased onto current main and reworked to fit the recipe-options changes that landed since this PR was opened (canonical-ID keying via cache_key_to_canonical_id, the build_recipe_options snapshot signature, and the backend-aware two-pass load resolution in Router::load_model). Summary of what's here now:

Core fix (unchanged intent): load-time resolution re-reads recipe_options.json so an out-of-band edit is honored on the next load/auto-load without restarting lemond; keys are canonical throughout (built-ins as builtin.<name>).

Thread safety: the load-time refresh reads the file into a local (it does not mutate the shared recipe_options_ member), and the resolution that reads server_models_/user_models_ now runs under models_cache_mutex_ — so this change adds no new unsynchronized access vs main. A broader unification of all recipe_options_ access under a single lock (so /models listings also reflect out-of-band edits immediately) is intentionally left as a separate concurrency follow-up to keep this PR focused.

save_options=true semantics: the request is now merged over the model's prior saved options only (Layer 3), not the fully-resolved effective set. This preserves earlier user saves without baking image_defaults/model-JSON defaults into recipe_options.json — so a future change to a model's registry defaults is no longer shadowed by a stale saved copy.

Tests: added regressions for auto-load recall, user.* alias arg preservation, delete/recreate cleanup, and the out-of-band disk-edit case; numbered to avoid collision with the merged pull/variants and cloud tests. Note: touching test/server_endpoints.py triggered Black (v26.1.0) to normalize some pre-existing formatting drift in unrelated regions of that file — those hunks are formatting-only.

@ianbmacdonald ianbmacdonald requested a review from bitgamma June 16, 2026 00:40
@ianbmacdonald ianbmacdonald force-pushed the fix/persisted-recipe-options-recall branch from 81dd2bd to b3adcfb Compare June 16, 2026 13:34
@ianbmacdonald

Copy link
Copy Markdown
Collaborator Author

Rebased onto current main (95c5f651) to clear the merge conflict that arose after #2235 (auto context-size), #2226 (safe model pinning), and #2119 (downloaded-only filter) landed in the same model-manager/router/server files.

The only textual conflict was an import block in test/server_endpoints.py (both sides added helpers from test_models); resolved as the union. The C++ changes auto-merged — they sit in the recipe-option resolution path (top of load_model and the handle_load save branch), disjoint from the new eviction/pinning logic.

Re-validated on real hardware (RX 7900 XT, llama.cpp/vulkan): the test_010test_012* recipe-options suite (including test_012o, the on-disk recipe_options.json-honored-without-restart case, and test_012b, evict+reload on option change) all pass against a live lemond. Build is clean.

@bitgamma

Copy link
Copy Markdown
Member

except if modifying the recipe_options.json manually (which is not really supported behavior) I fail to reproduce the issue this PR is trying to solve. Do you have a step by step reproduction?

@ianbmacdonald ianbmacdonald force-pushed the fix/persisted-recipe-options-recall branch from b3adcfb to 61fe7f5 Compare June 18, 2026 22:48
@ianbmacdonald ianbmacdonald changed the title fix: harden persisted recipe option recall fix: preserve prior saved recipe options on partial save Jun 18, 2026
@ianbmacdonald

Copy link
Copy Markdown
Collaborator Author

@bitgamma you're right that the out-of-band recipe_options.json edit isn't a supported flow — I've dropped that part. The remaining bug is a clobber on save, reproducible purely over the API with no manual file edits:

  1. POST /api/v1/load {"model_name": M, "ctx_size": 8192, "save_options": true} → saved {ctx_size: 8192}
  2. POST /api/v1/load {"model_name": M, "llamacpp_args": "--top-k 20", "save_options": true} → saved becomes {llamacpp_args: "--top-k 20"}ctx_size is dropped
  3. POST /api/v1/load {"model_name": M} → effective ctx_size is back to the default instead of 8192

handle_load rebuilt the saved entry from only the current request's recipe-option fields, so the second save replaced the first. The fix merges each save_options=true request over the model's prior saved options. test_012p covers exactly this sequence (asserted via GET /models/{id} and the option-free reload).

I've also narrowed the PR accordingly: the load-time disk-refresh from the earlier revision is gone, since — as you note — it only affected manual edits, and lemond keeps its in-memory map authoritative on every API save. Rebased on current main and validated 9/9 on the recipe-options suite (RX 7900 XT, Vulkan).

@bitgamma

Copy link
Copy Markdown
Member
1. POST /api/v1/load {"model_name": M, "ctx_size": 8192, "save_options": true} → saved {ctx_size: 8192}
2. POST /api/v1/load {"model_name": M, "llamacpp_args": "--top-k 20", "save_options": true} → saved becomes 
3. {llamacpp_args: "--top-k 20"} — ctx_size is dropped
POST /api/v1/load {"model_name": M} → effective ctx_size is back to the default instead of 8192

this behavior is intentional, sending recipe options means sending all of them. Otherwise there is no way to unset them

A save_options:true load *replaces* the model's stored recipe options with
the set sent in the request (omitting a key clears it). That contract is
documented but invisible to a client driving the API programmatically: the
/load response reports a bare success, so a caller incrementally tuning a
recipe cannot tell that an option it did not resend was dropped without a
follow-up /models query or reading the API docs.

Echo the model's persisted recipe_options in the /load success response
(both the per-model and collection branches), mirroring the existing
/models representation (model_info_to_json already serializes
recipe_options.to_json()). The replace-on-save semantics are unchanged; the
resulting persisted state is simply made observable. The echoed set is the
persisted one (== /models), not the per-request effective options; this is
documented in docs/api/lemonade.md.

Adds test_012m_load_response_echoes_recipe_options: asserts the response
carries recipe_options and that a subsequent partial save surfaces the
cleared key's absence. Validated on ai3 (RX 7900 XT, llamacpp/vulkan):
recipe-options suite 6/6 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: glm-5.2 <noreply@zhipuai.cn>
Co-Authored-By: gpt-5.5 <noreply@openai.com>
@ianbmacdonald ianbmacdonald force-pushed the fix/persisted-recipe-options-recall branch from 61fe7f5 to 5e58f75 Compare June 19, 2026 21:33
@ianbmacdonald ianbmacdonald changed the title fix: preserve prior saved recipe options on partial save fix(server): echo persisted recipe_options in the /load response Jun 19, 2026
@ianbmacdonald

Copy link
Copy Markdown
Collaborator Author

Thanks @bitgamma — that's a fair call, and you're right. The replace-on-save behavior is documented (docs/api/lemonade.md: "Any previously stored value for model_name is replaced"), and the omit-to-unset mechanism makes sense — there has to be some way to clear an option, and "send the complete set" is a reasonable way to get it. So I'm not arguing the default semantics; I've reverted the merge-on-save change entirely.

I do think there's a narrower, real problem underneath, and it's not "merge vs replace" — it's discoverability. A client driving the API programmatically (e.g. an agent that loads a model, measures, adjusts an option, and saves to iterate on a recipe) has no way to learn the replace-all rule except by reading that one line in the docs. The API itself never surfaces it: POST /load with save_options: true accepts the partial set, persists it, and returns a bare success — so a caller that saves {ctx_size: 8192} and later saves {llamacpp_args: "..."} to refine gets back "success" while ctx_size is silently dropped. No error, no echo.

So I've re-pointed this PR away from changing the default and toward making the contract observable:

Echo the model's persisted recipe_options in the /load response — the same set GET /models already returns (model_info_to_json serializes the identical recipe_options.to_json()). Your replace semantics are untouched; the result is just visible, so a client can confirm what a save persisted (including that a key it didn't resend was cleared) without a follow-up /models round-trip or out-of-band doc knowledge.

The echoed field is the persisted set (== /models), not the per-request effective options — I documented that distinction in docs/api/lemonade.md and also fixed the stale /load response example there.

The diff is down to server.cpp + one test + the doc. test_012m exercises the save → partial-save → cleared-key sequence end-to-end through the new echo. Validated on an RX 7900 XT (llama.cpp/vulkan): recipe-options suite 6/6.

Does that direction work for you?

@bitgamma

Copy link
Copy Markdown
Member

I don't think returning persisted options on /load is desirable as that might be confusing. Without reading the docs (which we have to assume to be the default case), you would assume returned options would be the effective one and not the stored values.

@ianbmacdonald

ianbmacdonald commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator Author

I don't think returning persisted options on /load is desirable as that might be confusing. Without reading the docs (which we have to assume to be the default case), you would assume returned options would be the effective one and not the stored values.

You are talking about a human or an agent? That is a lot of wasted context if you are letting a Qwen3.6 model on a strix halo orchestrate lemonade.

the original use case was a project for a client, where crewai (a long time ago) was persisting its own recipes (before we made recipes better) and just could not get it right.. stuck in the trap of trying new options and getting inconsistent results. The PR grew a bit (my fault) but its right back at the narrow scope that makes agents burn calories or fail at optimizing lemonade recipes on their own.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cpp enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants