Commit 9ab2974
fix(setup): start genie-ai-runtime before memory-heavy services (#76)
Fixes #75. Issue evidence on Jetson Orin Nano 8 GB shows that the runtime's auto-clamp behavior is order-sensitive: the same `-c 4096` request fits only ~1.7k tokens of context when the full stack is already resident, but 4k / 6k / 8k contexts fit cleanly when `genie-ai-runtime` loads first. This PR pins the startup order so the runtime claims its KV cache before memory-heavy services occupy DRAM.
Three coordinated changes:
1. `Before=genie-whisper.service genie-whisper-warmup.service homeassistant.service genie-core.service` on `genie-ai-runtime.service`. This systemd ordering directive makes the LLM unit finish starting before the other memory-heavy services start. Combined with PR #72's existing `After=genie-ai-runtime.service genie-llm.service` on `genie-core`, the ordering is now declared from both sides. `Before=` is a no-op for units that aren't installed on this host, so this is safe for installs that don't ship homeassistant or whisper.
2. `GENIEPOD_AI_RUNTIME_CONTEXT` default bumped from `2048` to `8192` — the largest context the issue verified loads cleanly with `--int8-kv` on Orin Nano 8 GB. The env knob stays settable via systemd drop-in for smaller Jetsons.
3. `deploy/scripts/start_all.sh` reorders the `UNITS=(...)` array so the configured LLM unit + warmup run before `homeassistant`, `genie-whisper`, `genie-whisper-warmup` in the manual lifecycle path too, mirroring the systemd `Before=`.
Tests added to `tool_dispatch_test.rs` lock both invariants:
- `start_all_uses_configured_llm_backend` asserts `$configured_llm_unit` appears before `homeassistant.service` and `genie-whisper.service` in the `UNITS=` array.
- `genie_ai_runtime_service_preserves_model_page_cache` asserts `GENIEPOD_AI_RUNTIME_CONTEXT=8192` and that the new `Before=` clause is present.
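The two invariants above can be illustrated with a minimal, self-contained sketch. It is not the code in `tool_dispatch_test.rs`: the helper names `appears_before` and `before_units` are hypothetical, and the script and unit-file excerpts are made-up stand-ins for the real files.

```rust
/// Returns true when `first` occurs in `text` and `second` occurs somewhere after it.
fn appears_before(text: &str, first: &str, second: &str) -> bool {
    match text.find(first) {
        Some(i) => text[i + first.len()..].contains(second),
        None => false,
    }
}

/// Collects the unit names listed across all `Before=` lines of a unit file.
fn before_units(unit_file: &str) -> Vec<&str> {
    unit_file
        .lines()
        .filter_map(|line| line.trim().strip_prefix("Before="))
        .flat_map(|value| value.split_whitespace())
        .collect()
}

fn main() {
    // Hypothetical excerpts; the real files are not reproduced here.
    let script = r#"UNITS=(
  "$configured_llm_unit"
  homeassistant.service
  genie-whisper.service
)"#;
    let unit = "[Unit]\nBefore=genie-whisper.service genie-whisper-warmup.service \
                homeassistant.service genie-core.service\n";

    // The LLM unit must be listed before the memory-heavy services.
    assert!(appears_before(script, "$configured_llm_unit", "homeassistant.service"));
    assert!(appears_before(script, "$configured_llm_unit", "genie-whisper.service"));
    // The unit file must carry the new ordering clause.
    assert!(before_units(unit).contains(&"genie-core.service"));
    println!("ordering invariants hold");
}
```

A plain substring check like this is deliberately simple: it fails loudly if someone reorders the array or drops the clause, without needing a shell or systemd parser in the test suite.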
Compatibility with PR #70 (warm page cache across restart) is preserved: `Before=` only orders units that are started together, as at boot, and does not affect a standalone `systemctl restart genie-ai-runtime`.
End-user verified on the same Jetson the issue was filed against.
Worth a follow-up: PR #74's `GENIE_RUNTIME_MAX_BODY_BYTES = 4 KB` body-compaction threshold now leaves performance on the table at the new 8192-token runtime context (the client compacts prompts the runtime could now handle). The right path is to make the threshold a function of `GENIEPOD_AI_RUNTIME_CONTEXT` or to probe runtime capacity at connection time. Not blocking this PR.
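One way the follow-up could look, as a rough sketch only: derive the body budget from the configured context instead of hard-coding it. The helper names, the 4-bytes-per-token figure, and the 1024-token reply reserve are all assumptions for illustration, not values from the codebase or measurements; whether the client can read `GENIEPOD_AI_RUNTIME_CONTEXT` at all is also assumed.

```rust
/// Assumed average bytes per token; a rough heuristic, not a measured value.
const ASSUMED_BYTES_PER_TOKEN: usize = 4;

/// Hypothetical helper: derive a request-body budget from the runtime context
/// size, reserving `reserved_tokens` for the model's reply.
fn max_body_bytes(context_tokens: usize, reserved_tokens: usize) -> usize {
    context_tokens
        .saturating_sub(reserved_tokens)
        .saturating_mul(ASSUMED_BYTES_PER_TOKEN)
}

/// Reads GENIEPOD_AI_RUNTIME_CONTEXT, falling back to `default` when the
/// variable is unset or not a valid number.
fn context_from_env(default: usize) -> usize {
    std::env::var("GENIEPOD_AI_RUNTIME_CONTEXT")
        .ok()
        .and_then(|v| v.trim().parse().ok())
        .unwrap_or(default)
}

fn main() {
    let context = context_from_env(8192);
    // Reserve an assumed 1024 tokens for the reply.
    let budget = max_body_bytes(context, 1024);
    println!("context={context} tokens -> body budget={budget} bytes");
}
```

The saturating arithmetic keeps the budget at zero rather than underflowing when a small Jetson is configured with a context below the reserve. Probing runtime capacity at connection time would avoid the bytes-per-token guess entirely.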
All 7 CI checks green on `c1cae29` (fmt, clippy, test, aarch64 cross-compile, shellcheck, ruff, `--no-default-features`).
1 parent 8204be0 commit 9ab2974
3 files changed
Lines changed: 38 additions & 10 deletions
File tree
- crates/genie-core/tests
- deploy
- scripts
- systemd