Brings in PR #865 (Gemma 4 E4B universal default + native tool_calls
priority + Lemonade v10.1.0 port flip 8000→13305) on top of our
existing gaia-lite work. The two PRs are complementary: #865 sets the
global LLM/VLM default; this branch sets gaia-lite's preset.

Conflict resolution
-------------------

src/gaia/llm/lemonade_manager.py
  Both branches changed the module header. Ours added a re-export
  pattern (__all__ + ``from lemonade_client import DEFAULT_CONTEXT_SIZE``)
  so DEFAULT_CONTEXT_SIZE has a single source of truth. Main flipped
  the default port 8000→13305 for Lemonade v10.1.0. Resolution keeps
  both: the re-export pattern is preserved and the port is updated.
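
A minimal sketch of the re-export pattern described above, not the actual module header. The module names `fake_lemonade_client` and `fake_lemonade_manager` are hypothetical stand-ins built in memory so the example runs on its own; only `DEFAULT_CONTEXT_SIZE`, `__all__`, and the value 32768 come from this message.

```python
import sys
import types

# Stand-in for lemonade_client: the one place the constant is defined.
client = types.ModuleType("fake_lemonade_client")
client.DEFAULT_CONTEXT_SIZE = 32768
sys.modules["fake_lemonade_client"] = client

# Stand-in for lemonade_manager: it imports the constant and lists it in
# __all__ instead of redefining it, so there is no second copy to drift.
manager_source = '''
from fake_lemonade_client import DEFAULT_CONTEXT_SIZE

__all__ = ["DEFAULT_CONTEXT_SIZE"]
'''
manager = types.ModuleType("fake_lemonade_manager")
exec(manager_source, manager.__dict__)
sys.modules["fake_lemonade_manager"] = manager

# Callers can keep importing the constant from the manager module.
from fake_lemonade_manager import DEFAULT_CONTEXT_SIZE

print(DEFAULT_CONTEXT_SIZE == client.DEFAULT_CONTEXT_SIZE)
```

Under this arrangement, changing the value in the client module is the only edit needed; the manager module picks it up on import.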

Test updates required by stricter pre-flight semantics
------------------------------------------------------

This branch's _maybe_load_expected_model + _ensure_model_loaded changes
tightened two behaviours that the existing test suite expressed against
the old loose semantics:

tests/unit/test_chat_preflight.py
  Pre-flight now requires the EXPECTED model to be active (not just
  any LLM) and its ctx_size ≥ 32K. Updated _model() helper to accept
  name + ctx_size kwargs; the four "skips load" tests now load a
  health entry whose model_name matches the expected model_id and
  whose ctx_size is at the 32K floor. Without this, the test fixture
  "test-llm" mismatched the expected "Qwen3.5-35B-A3B-GGUF" and tripped
  the wrong-model reload branch our PR added.
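
The tightened pre-flight condition can be sketched roughly as follows. This is an illustration under assumptions, not the code in _maybe_load_expected_model: `LoadedModel`, `can_skip_load`, and `MIN_CTX_SIZE` are hypothetical names; the 32K floor and the two model names come from this message.

```python
from dataclasses import dataclass
from typing import Optional

MIN_CTX_SIZE = 32768  # the 32K floor named in this message


@dataclass
class LoadedModel:
    """Hypothetical shape of the active-model entry in a health report."""
    model_name: str
    ctx_size: Optional[int]


def can_skip_load(active: Optional[LoadedModel], expected_model_id: str) -> bool:
    """Return True only when the expected model is active with enough context."""
    if active is None:
        return False  # nothing loaded yet
    if active.model_name != expected_model_id:
        return False  # some other LLM is active: the wrong-model branch
    # Right model, but the context window must also meet the floor.
    return active.ctx_size is not None and active.ctx_size >= MIN_CTX_SIZE


expected = "Qwen3.5-35B-A3B-GGUF"
print(can_skip_load(LoadedModel("test-llm", 32768), expected))  # old fixture name
print(can_skip_load(LoadedModel(expected, 32768), expected))    # updated fixture
print(can_skip_load(LoadedModel(expected, 4096), expected))     # context too small
```

Under the old loose semantics only the first check would have applied, which is why a fixture named "test-llm" used to pass.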

tests/unit/test_lemonade_model_loading.py
  _ensure_model_loaded now defaults ctx_size to DEFAULT_CONTEXT_SIZE
  (32768) for unknown models, fixing the silent-empty-stream regression
  where Lemonade's default 4096 ctx truncated ChatAgent's >7K-token
  system prompt. Two assertions updated from ctx_size=None → 32768.

All affected tests pass; lint clean.
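
The ctx_size defaulting described for _ensure_model_loaded above might look something like the sketch below. `resolve_ctx_size` and the `known_ctx_sizes` mapping are hypothetical; in particular, how known models get their context size is an assumption, since this message only states the behaviour for unknown ones.

```python
from typing import Mapping, Optional

DEFAULT_CONTEXT_SIZE = 32768  # value stated in this message


def resolve_ctx_size(
    model_id: str,
    known_ctx_sizes: Mapping[str, int],
    requested: Optional[int] = None,
) -> int:
    """Pick the context size to load a model with (hypothetical helper)."""
    if requested is not None:
        return requested  # assumption: an explicit request wins
    # Unknown models fall back to the shared default rather than None,
    # so the server's own smaller default is never used implicitly.
    return known_ctx_sizes.get(model_id, DEFAULT_CONTEXT_SIZE)


known = {"Qwen3.5-35B-A3B-GGUF": 32768}
print(resolve_ctx_size("some-unknown-model", known))
print(resolve_ctx_size("some-unknown-model", known, requested=65536))
```

Returning a concrete integer instead of None matches the two updated assertions (ctx_size=None → 32768).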

cpp/README.md (5 additions, 5 deletions)

@@ -28,23 +28,23 @@ Included demos:

 The agent connects to an OpenAI-compatible LLM server at `http://localhost:8000/api/v1` by default. The reference backend is [Lemonade Server](https://github.com/lemonade-sdk/lemonade), which runs models locally on AMD hardware.

-Download and install Lemonade Server v10.0.0, then start it:
+Download and install Lemonade Server v10.2.0, then start it: