bug: list_models reports split-part model as 'downloaded' when only first part exists — misleads user, activation fails #316

@yasinBursali

Bug Report: list_models reports split-part model as 'downloaded' when only first part exists — misleads user, activation fails

Severity: Medium
Category: Validation
Platform: All (macOS, Linux, Windows/WSL2)
Confidence: Confirmed

Description

list_models in routers/models.py determines a split model's download status by checking only whether the first part file exists on disk. If a multi-part download is interrupted after the first part completes but before the remaining parts arrive, the endpoint still reports the model as "downloaded", i.e. ready to activate. The host agent's download check was already fixed (commit 3f1e18d9) to require ALL parts, but the dashboard router's status computation was not updated. A user who sees "downloaded" and tries to activate the model gets a failed activation, because llama-server cannot load an incomplete split model.

Affected File(s)

  • dream-server/extensions/services/dashboard-api/routers/models.py (L112–121, list_models)

Root Cause

# routers/models.py L112-121:
parts = model.get("gguf_parts", [])
first_part = parts[0]["file"] if parts else gguf_file   # only checks part 1

if gguf_file and gguf_file == active_gguf:
    status = "loaded"
elif first_part and first_part in downloaded:           # BUG: only part 1 checked
    status = "downloaded"
else:
    status = "available"

The host agent already has the correct check (commit 3f1e18d9):

# dream-host-agent.py L1107-1110:
all_downloaded = all((models_dir / fn).exists() for fn, _ in download_plan)

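The two components are now inconsistent: the host agent requires every part to be present, while the dashboard router requires only the first.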

Platform Analysis

  • macOS: Affected — model files stored on macOS host filesystem; same status logic applies.
  • Linux: Affected — same. Interrupted downloads leave orphaned part 1 files.
  • Windows/WSL2: Affected — same; partial downloads are possible on all platforms.

Reproduction

  1. Start downloading a split-file model (e.g., a model with gguf_parts of 2+ files).
  2. Interrupt the download after the first part completes.
  3. Call GET /api/models.
  4. Expected: status "available" (not all parts present).
  5. Actual: status "downloaded" (only first part checked).
  6. Try to activate the model — activation fails with health check timeout because llama-server cannot find the missing parts.
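
To confirm the on-disk state after step 2, a small script along these lines can list which parts are missing. This is a sketch only: missing_parts is a hypothetical helper, the part filenames are invented for illustration, and the entry shape (gguf_parts as a list of dicts with a "file" key) is inferred from the snippet under Root Cause.

```python
import tempfile
from pathlib import Path


def missing_parts(model: dict, models_dir: Path) -> list[str]:
    """Return the filenames listed in gguf_parts that are not on disk."""
    return [
        part["file"]
        for part in model.get("gguf_parts", [])
        if not (models_dir / part["file"]).exists()
    ]


# Simulate an interrupted two-part download: only part 1 was written.
with tempfile.TemporaryDirectory() as tmp:
    models_dir = Path(tmp)
    model = {
        "gguf_parts": [
            {"file": "example-00001-of-00002.gguf"},  # hypothetical filename
            {"file": "example-00002-of-00002.gguf"},  # hypothetical filename
        ]
    }
    (models_dir / "example-00001-of-00002.gguf").touch()
    result = missing_parts(model, models_dir)
    print(result)
```

A non-empty result means the model should not be treated as downloaded, whatever GET /api/models reports.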

Impact

Operators waste 5+ minutes in the health check loop trying to activate an incomplete model before they see a rollback. The misleading "downloaded" status gives false confidence. On slow connections, where large model downloads are frequently interrupted, this is a common scenario.

Suggested Approach

In list_models, when gguf_parts is non-empty, check that ALL part filenames are present in the downloaded dict before setting status = "downloaded". This mirrors the logic already in the host agent's _handle_model_download.
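
One possible shape for that check, written as a standalone function so it can be tested in isolation. This is a sketch under assumptions, not the actual patch: model_status is a hypothetical name, reading the filename from a "gguf_file" key on the model entry is assumed (the snippet above only shows the gguf_file variable), and downloaded is treated as any collection that supports membership tests on filenames.

```python
from typing import Collection, Optional


def model_status(
    model: dict,
    downloaded: Collection[str],
    active_gguf: Optional[str],
) -> str:
    """Compute 'loaded', 'downloaded', or 'available' for one model entry."""
    gguf_file = model.get("gguf_file")  # assumed key name
    parts = model.get("gguf_parts") or []

    # Files that must all be present: every part for a split model,
    # otherwise the single GGUF file.
    if parts:
        required = [part["file"] for part in parts]
    else:
        required = [gguf_file] if gguf_file else []

    if gguf_file and gguf_file == active_gguf:
        return "loaded"
    if required and all(name in downloaded for name in required):
        return "downloaded"
    return "available"


# Hypothetical split model with only part 1 on disk.
split = {
    "gguf_file": "example-00001-of-00002.gguf",
    "gguf_parts": [
        {"file": "example-00001-of-00002.gguf"},
        {"file": "example-00002-of-00002.gguf"},
    ],
}
partial = {"example-00001-of-00002.gguf": 1}
complete = {"example-00001-of-00002.gguf": 1, "example-00002-of-00002.gguf": 1}

print(model_status(split, partial, None))   # available
print(model_status(split, complete, None))  # downloaded
```

Single-file models keep their current behaviour, because with no gguf_parts the required list contains only gguf_file.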


Filed by automated Python auditor after full-sweep review of Python changes merged 2026-04-06 → 2026-04-11 on upstream/main @ c0600ca.

Labels: bug