security: model download integrity — SHA256 absent for 74% of library models and skipped entirely for split-file downloads #318

@yasinBursali

Bug Report: Model download integrity — SHA256 missing for most models and skipped for all split-file downloads

Severity: Medium
Category: Security
Platform: All (Linux, macOS, Windows/WSL2)
Confidence: Confirmed

Description

The new model download feature (PR feat(models)) introduced in dream-host-agent.py verifies SHA256 checksums only for single-file downloads, and only when a checksum is present in the library catalog. This leaves two gaps:

  1. 14 of 19 library models have no gguf_sha256 in config/model-library.json, so verification is silently skipped for all of them.
  2. Split-file downloads (gguf_parts) never verify SHA256, whether or not checksums are present, because the code explicitly gates verification on len(download_plan) == 1.

Affected File(s)

  • dream-server/bin/dream-host-agent.py lines 1191–1205 (SHA256 verification block)
  • dream-server/config/model-library.json (checksum coverage)

Root Cause

# dream-host-agent.py lines 1191–1205
# Verify SHA256 if provided (single-file only)
if gguf_sha256 and len(download_plan) == 1:
    ...
    if actual != gguf_sha256:
        final_target.unlink(missing_ok=True)
        _write_model_status(...)
        return

The len(download_plan) == 1 guard explicitly skips verification for split models. The gguf_sha256 truthiness check in the same condition means single-file models without a catalog checksum are skipped as well. In both cases the download proceeds without any integrity check.
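To make the two skip paths concrete, the sketch below reproduces only the guard condition from the snippet above and evaluates it for the three catalog situations described in this report. The function name verification_runs and the placeholder inputs are illustrative and are not taken from dream-host-agent.py.

```python
def verification_runs(gguf_sha256, download_plan):
    """Return True when the quoted guard would enter the SHA256 block."""
    return bool(gguf_sha256) and len(download_plan) == 1


# Placeholder inputs (hypothetical): a checksum string or None, and a
# download plan with one entry per file to fetch.
cases = {
    "single file, checksum in catalog": ("deadbeef", ["model.gguf"]),
    "single file, no checksum": (None, ["model.gguf"]),
    "split model, checksum in catalog": ("deadbeef", ["part-1.gguf", "part-2.gguf"]),
}

results = {name: verification_runs(sha, plan) for name, (sha, plan) in cases.items()}
for name, verified in results.items():
    print(f"{name}: {'verified' if verified else 'NOT verified'}")
```

Only the first case reaches the hash comparison; the other two fall through to the rename step unchecked.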

Evidence

From config/model-library.json (confirmed with script output):

Model                 SHA256  Split
qwen3.5-4b-q4                 No
qwen3.5-9b-q4                 No
qwen3.5-27b-q4                No
qwen3-30b-a3b-q4              No
qwen3-coder-next-q4           No
phi4-mini-q4                  No
qwen3.5-2b-q4                 No
gemma4-e2b-q4                 No
deepseek-r1-7b-q4             No
gemma4-e4b-q4                 No
phi4-q4                       No
deepseek-r1-14b-q4            No
gemma4-26b-a4b-q4             No
gemma4-31b-q4                 No
deepseek-r1-32b-q4            No
qwen3.5-35b-a3b-q4            No
deepseek-r1-70b-q4            No
llama4-scout-q4               Yes (2 parts)
qwen3.5-122b-a10b-q4          Yes (3 parts)
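The coverage figures can be re-derived with a short audit over the catalog. The sketch below is not the script that produced the table above. It assumes each catalog entry is an object with an id key, an optional gguf_sha256, and an optional gguf_parts list whose entries may carry a sha256 field (the field proposed in this report). The id key and the sample entries are guesses for illustration, not the real schema or data.

```python
def models_without_full_checksums(models):
    """Return ids of models where at least one downloaded file has no checksum."""
    unverified = []
    for model in models:
        parts = model.get("gguf_parts")
        if parts:
            # Split model: every part needs its own checksum.
            covered = all(part.get("sha256") for part in parts)
        else:
            # Single-file model: one catalog-level checksum is enough.
            covered = bool(model.get("gguf_sha256"))
        if not covered:
            unverified.append(model["id"])
    return unverified


# Hypothetical in-memory sample; real entries would be loaded from the catalog.
sample_models = [
    {"id": "single-with-hash", "gguf_sha256": "ab" * 32},
    {"id": "single-no-hash"},
    {"id": "split-partial", "gguf_parts": [{"sha256": "cd" * 32}, {}]},
]

missing = models_without_full_checksums(sample_models)
print(f"{len(missing)} of {len(sample_models)} models lack full checksum coverage: {missing}")
```

Against the real catalog, the list would come from json.load on config/model-library.json; how the model entries are nested inside that file is not shown in this report.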

Platform Analysis

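  • All platforms: Model files are downloaded via curl over HTTPS. TLS protects the transfer against a plain network MITM (DNS poisoning, rogue router) only while certificate validation is intact, and it does not protect against tampering at or behind the endpoint. A compromised CDN edge node, a poisoned cache, or a TLS-intercepting proxy with a trusted certificate can swap the file content, and without hash verification the swap goes undetected. HuggingFace serves model files through a CDN with edge caching, so CDN compromise is a relevant supply chain risk.
  • macOS: Issue #279 (bootstrap-upgrade.sh: SHA256 verification skipped on macOS, missing shasum fallback) already tracks the missing shasum fallback for bootstrap-upgrade.sh; this is a separate code path (host agent /v1/model/download).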

Reproduction

  1. Trigger a model download (e.g., POST /api/models/llama4-scout-q4/download).
  2. Intercept one of the split-file URLs mid-download and serve a different GGUF payload.
  3. The host agent renames the .part file to the final filename without verifying integrity.
  4. The tampered model file is activated and loaded by llama-server.

Impact

  • A MITM or supply chain attacker can replace model weights with a malicious GGUF file.
  • llama.cpp has had GGUF parsing vulnerabilities in the past, so a crafted GGUF file could lead to code execution inside the llama-server container.
  • Split models are the largest files (Llama-4-Scout: ~10 GB, Qwen3.5-122B: ~70 GB) and thus most exposed to long download windows.

Suggested Approach

  1. Add SHA256 checksums to model-library.json for the 14 models that lack them, including a sha256 field inside each entry of gguf_parts.
  2. Remove the len(download_plan) == 1 gate and apply SHA256 verification to every individual part after it is downloaded and before it is renamed from .part to its final filename.
  3. For models genuinely lacking upstream-published checksums, document the gap and consider fetching the model card hash from the HuggingFace API at download time.
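Steps 1 and 2 could look roughly like the following. This is a minimal sketch under stated assumptions, not the host agent's code: sha256_of_file and finalize_part are hypothetical helpers, and rejecting a part that has no expected checksum is an assumed policy that the maintainers may prefer to relax for the models covered by step 3.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-GB parts never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def finalize_part(part_path, final_path, expected_sha256):
    """Verify one downloaded .part file, then rename it into place.

    Returns True if the file was verified and renamed. On a missing or
    mismatched checksum the .part file is deleted and False is returned.
    """
    part_path = Path(part_path)
    if not expected_sha256:
        # Assumed policy: fail closed when the catalog has no checksum.
        part_path.unlink(missing_ok=True)
        return False
    if sha256_of_file(part_path) != expected_sha256.strip().lower():
        part_path.unlink(missing_ok=True)
        return False
    part_path.replace(final_path)
    return True
```

Calling finalize_part once per entry in the download plan would apply the same check to single-file and split models, which removes the need for a length-based gate.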

Filed by automated security auditor after a full sweep of changes merged 2026-04-06 → 2026-04-11 on upstream/main @ c0600ca.
