fix(models): platform-aware activation + download cancel (#893)
Lightheartdevs left a comment:
Five bugs + a feature in one PR is larger than I'd normally like, but every piece is tightly coupled to model activation/download, so I'll let it stand. Each bug is legitimate:
Bug 1 (Critical, macOS): the silent-success activation bug is particularly nasty — _compose_restart_llama_server "succeeded" because replicas:0 in docker-compose.macos.yml made the compose call a no-op, while the native llama-server process kept the old model loaded. User sees "success," asks a question, gets the old model's response. Worst kind of bug.
Fix is correct: the if gpu_backend == "apple": branch before the container/compose paths stops the native process via PID file (with ps-verify to avoid PID reuse — good paranoia), launches fresh via subprocess.Popen with Metal args matching bootstrap-upgrade.sh:431-438. Shared _launch_native_llama_server() helper is the right extraction.
Bug 2: writing LLAMA_SERVER_IMAGE to .env on the apple backend is meaningless and could corrupt the env. Skip is correct.
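The guard can be illustrated with a minimal sketch. The `.env` helpers and the `GPU_BACKEND` / `GGUF_FILE` / `LLAMA_SERVER_IMAGE` key handling here are hypothetical simplifications (the real agent presumably preserves comments and ordering); only the skip-on-apple logic is the point.

```python
from pathlib import Path


def read_env(env_path: Path) -> dict:
    """Parse a simple KEY=VALUE .env file (simplified: comments and blank
    lines are skipped, quoting is not handled)."""
    env = {}
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def update_env_for_activation(env_path: Path, gguf_file: str, image: str) -> None:
    env = read_env(env_path)
    env["GGUF_FILE"] = gguf_file
    # On the apple backend llama-server is a native binary, not a container,
    # so writing LLAMA_SERVER_IMAGE would be a meaningless (and risky) write.
    if env.get("GPU_BACKEND") != "apple":
        env["LLAMA_SERVER_IMAGE"] = image
    env_path.write_text("".join(f"{k}={v}\n" for k, v in env.items()))
```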
Bug 3: localhost → 127.0.0.1 on the host-native health check. Consistent with #977 (dreamforge/perplexica healthcheck) and #975 (native llama-server binds). Good.
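A minimal health-check helper pinned to `127.0.0.1` might look like the sketch below. The `/health` path and the timeout are illustrative (llama.cpp's server does expose `/health`, but the agent's actual probe may differ).

```python
import urllib.error
import urllib.request


def llama_server_healthy(port: int, timeout: float = 2.0) -> bool:
    """Probe the llama-server health endpoint on the loopback interface.

    127.0.0.1 is used explicitly: on IPv6-enabled hosts 'localhost' can
    resolve to ::1 first, while Docker publishes ports on 127.0.0.1 only,
    so a 'localhost' probe can hang or fail even though the server is up.
    """
    url = f"http://127.0.0.1:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```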
Bug 4: stop + up -d over restart — same pattern as #935. restart doesn't re-read .env; up -d does. The named-volume analysis (lemonade-cache, lemonade-llama, lemonade-recipe survive) correctly addresses the "does this nuke the cached binary" concern.
Bug 5: path traversal protection on activate matches the existing delete handler pattern — good consistency.
Cancel feature: the subprocess.Popen + threading.Event + TOCTOU-safe local-ref pattern is correct. _model_download_cancel.wait(5) replacing time.sleep(5) for fast cancel is a nice touch. Dashboard UI button is scoped correctly (only renders when helpers.cancelDownload is available).
Merge order (per author): last, after #906, #905, #900, #908, #902. That's a 5-deep dependency chain — please bundle these into a merge train or the leaf PRs will keep needing rebases. Ship.
Cross-PR coordination note from a deeper re-read. The new native launch path conflicts with the direction of draft PR #975, which changes how all native launches (macOS installer, CLI, host agent, bootstrap-upgrade, Windows) are started. Either works; just needs coordination between you and yasin so one doesn't silently undo the other. Not blocking — still approving.
Addresses five bugs in the model management system and adds a download cancel feature. Builds on the model action infrastructure from PR Light-Heart-Labs#886.

- **macOS native llama-server restart**: Adds a `gpu_backend == "apple"` branch in `_do_model_activate` that stops the existing native process via PID file (SIGTERM → 10s wait → SIGKILL, with ps-verify to avoid PID reuse accidents), then re-launches via `subprocess.Popen` with Metal args. Previously `_compose_restart_llama_server` ran docker commands that silently succeeded on macOS (`replicas: 0` in `docker-compose.macos.yml`) while the native process kept running the old model.
- **LLAMA_SERVER_IMAGE apple guard**: Reads `gpu_backend` before updating `.env` and skips `LLAMA_SERVER_IMAGE` on apple. Previously the image was written unconditionally — meaningless on macOS where llama-server is a native binary, not a container.
- **Health check uses 127.0.0.1**: On IPv6-enabled Linux hosts, `localhost` resolves to `::1` first, but Docker binds to `127.0.0.1` only. Previously the health check timed out after 5 minutes and triggered a false rollback even when the model loaded successfully.
- **`_compose_restart_llama_server` uses `resolve_compose_flags()`**: Replaced the inline `.compose-flags` read with `resolve_compose_flags()`, which falls back to running `resolve-compose-stack.sh` dynamically when the cached file is absent. Also changed both AMD and NVIDIA paths to stop + `up -d` (was: `restart` for AMD, `start` for no-compose-flags NVIDIA) so that updated `.env` values are always picked up by the new container.
- **Path traversal protection in activate handler**: Added `.resolve()` + `.is_relative_to()` check on the GGUF target path, matching the existing delete handler pattern.

Cancel feature:

- Adds `POST /v1/model/download/cancel` host agent endpoint.
- Uses `threading.Event` (`_model_download_cancel`) + `_model_download_proc` global. The download thread checks the cancel flag at the start of each part loop and in the `_poll_progress` thread (which kills the curl `Popen`). On cancel: kills curl, cleans up the `.part` file, writes cancelled status.
- Changed `subprocess.run()` to `subprocess.Popen()` + `proc.wait()` so the curl process can be killed from the cancel handler or poll thread.
- Cancel handler captures a local reference to `_model_download_proc` to avoid a TOCTOU race where the download thread nulls it mid-check.
- Dashboard-api proxies via `POST /api/models/download/cancel`.
- Frontend `useDownloadProgress` exposes `cancelDownload()`. `Models.jsx` renders a red Cancel button in the progress bar.
- Also handles 'cancelled' status in `useDownloadProgress` (was only 'failed' and 'error').

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three non-blocking improvements from Critique Guardian review:

- Extract `_launch_native_llama_server()` helper to remove ~40 lines of duplicate code between the forward and rollback paths in `_do_model_activate`. Both paths now call the same function, which reads `GGUF_FILE` / `CTX_SIZE` / `LLAMA_REASONING` from `.env` after the caller has written the update.
- Replace `_time.sleep(5)` in the retry loop with `_model_download_cancel.wait(5)` so a cancel request is honored immediately during the retry delay instead of waiting up to 5 seconds.
- Hoist `import time` to the top of `_do_model_activate` instead of scattering inline imports inside conditional blocks.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
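The `_model_download_cancel.wait(5)` swap is worth a tiny demonstration: `Event.wait(timeout)` returns as soon as the event is set, so a cancel arriving during the retry delay is honored almost immediately, whereas `time.sleep(5)` always burns the full five seconds. An illustrative sketch:

```python
import threading
import time

cancel = threading.Event()


def request_cancel_soon(delay: float) -> None:
    time.sleep(delay)
    cancel.set()


# Simulate a cancel request arriving 0.2 s into a 5 s retry delay.
threading.Thread(target=request_cancel_soon, args=(0.2,), daemon=True).start()

start = time.monotonic()
interrupted = cancel.wait(5)   # returns True the moment cancel is set
elapsed = time.monotonic() - start
```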
Replace the empty catch block in `cancelDownload` with `console.error` so failed cancel requests are visible in devtools instead of disappearing silently. Consistent with project error-handling rules. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Addresses five bugs in the model management system and adds a download cancel feature. Builds on the model action infrastructure already in main.
What — Bugs fixed
- `_compose_restart_llama_server` silently succeeded (`replicas: 0` in `docker-compose.macos.yml`) while the native llama-server process kept the old model loaded
- `LLAMA_SERVER_IMAGE` written to `.env` even on the `apple` backend — meaningless write that could corrupt the env on macOS
- `localhost` resolved to `::1` on IPv6-enabled hosts — 5-minute false rollback even when the model loaded successfully
- `_compose_restart_llama_server` read `.compose-flags` directly (file may not exist) and used `docker compose restart` for AMD — restart does not re-read `.env`

Plus: new download cancel feature — users can abort in-progress downloads.
How — Implementation
dream-server/bin/dream-host-agent.py

Bug 1 — macOS native restart:
Added an `if gpu_backend == "apple":` branch in `_do_model_activate` before the `_in_container` / `_compose_restart` paths. Stops the existing native process via PID file (ps-verify to avoid PID reuse accidents → SIGTERM → 10s wait → SIGKILL), then re-launches via `subprocess.Popen` with Metal args matching `scripts/bootstrap-upgrade.sh:431-438`. The rollback path mirrors the forward path. Extracted a shared `_launch_native_llama_server()` helper to avoid duplicating the ~20-line launch block in both directions.

Bug 2 — apple guard on LLAMA_SERVER_IMAGE:
Reads `gpu_backend` from `.env` before the update block and skips `LLAMA_SERVER_IMAGE` when `gpu_backend == "apple"`.

Bug 3 — 127.0.0.1 vs localhost:
Changed `llama_host = "localhost"` to `llama_host = "127.0.0.1"` on the host-native health check path. Docker binds to `127.0.0.1` explicitly; on IPv6-enabled Linux hosts, `localhost` resolves to `::1` first.

Bug 4 — `_compose_restart_llama_server` rework:
- Replaced the inline `.compose-flags` read with `resolve_compose_flags()`, which already falls back to running `resolve-compose-stack.sh` when the cache file is absent
- Both AMD and NVIDIA paths now use stop + `up -d` (was: `restart` for AMD, `docker start` for no-compose-flags NVIDIA) — `up -d` re-reads `.env`; `restart` and `start` do not
- Named volumes (`lemonade-cache`, `lemonade-llama`, `lemonade-recipe`) survive stop + `up -d`, so there is no Lemonade binary re-cache penalty

Bug 5 — path traversal:
Added a `.resolve()` + `.is_relative_to()` check on the GGUF target path in `_do_model_activate`, matching the existing delete handler pattern.

Cancel feature:
- New `_model_download_proc: subprocess.Popen | None` and `_model_download_cancel: threading.Event` globals
- Changed `subprocess.run()` to `subprocess.Popen()` + `proc.wait()` in the download loop so the process can be killed from outside the thread
- `_poll_progress` thread checks `_model_download_cancel` and kills the active curl proc
- `_model_download_cancel.wait(5)` replaces `time.sleep(5)` in the retry delay so cancel is immediate
- New `POST /v1/model/download/cancel` host agent endpoint
- Cancel handler captures a local reference to `_model_download_proc` to avoid a TOCTOU race

dream-server/extensions/services/dashboard-api/routers/models.py
- New `POST /api/models/download/cancel` route proxying to the host agent via the existing `_call_agent_model` helper

dream-server/extensions/services/dashboard/src/hooks/useDownloadProgress.js
- `cancelDownload()` async function exposed from the hook
- Handles the `cancelled` status (alongside the existing `failed` / `error`)
- Failed cancel requests are logged via `console.error` instead of swallowed silently

dream-server/extensions/services/dashboard/src/pages/Models.jsx
- Cancel button in the `DownloadProgressBar` component, rendered only when `helpers.cancelDownload` is available

Testing
Automated (all pass):
- `py_compile` — clean
- `ruff check` — clean

Manual (Apple Silicon, local install):
- `.llama-server.pid` updated
- cancel with no active download returns `{"status": "no_download"}`

Platform impact
- `127.0.0.1` health check (IPv6 fix); stop + `up -d` re-reads `.env`
- stop + `up -d` re-reads `.env`; named volumes preserve the cached binary
- `_recreate_llama_server` path unchanged

🤖 Generated with Claude Code