
feat(ai-chat): allow cancelling in-progress model downloads#701

Open
chriscrosstalk wants to merge 1 commit into dev from feat/ai-model-download-cancel

Conversation

@chriscrosstalk (Collaborator)

Summary

Adds a cancel button to in-progress Ollama model downloads in the AI Settings page, mirroring the content download cancel pattern from PR #554. Also unifies the "Active Model Downloads" visual layout with the "Active Downloads" card used for ZIMs, maps, and pmtiles — same byte counts, progress bar, live speed, and status indicator.

Before this PR, if a user started a 40 GB model by accident or picked the wrong variant, their only options were to wait it out, restart the admin container, or docker exec nomad_ollama pkill ollama. The Active Model Downloads card also showed only a percentage with no file size, speed, or status.

Closes #676.

What changed

Backend (3 files):

  • OllamaService.downloadModel() — accepts an optional AbortSignal and jobId. Passes signal to the axios pull stream, detects ERR_CANCELED and returns a non-retryable failure. Tracks per-digest progress in a Map and aggregates total bytes across all blobs so the UI shows a single monotonically-increasing total instead of a percent that jumps around as each blob is pulled. Throttles Transmit broadcasts to every 500 ms to prevent SSE flooding from Ollama's rapid progress events.
  • DownloadModelJob.handle() — mirrors the RunDownloadJob abort pattern: in-memory abortControllers map, signalCancel() static method, 2-second Redis poll loop (key nomad:download:model-cancel:{jobId}), and UnrecoverableError on user-initiated cancel. Without the UnrecoverableError, BullMQ's attempts: 40 config would retry the cancelled download for 40 minutes. Stores downloadedBytes/totalBytes in job data alongside percent.
  • DownloadService.cancelJob() — refactored to check the model queue as a fall-through when the job isn't found in the file queue. Extracted _pollForTerminalState() and _removeJobWithLockFallback() helpers to share logic between both branches. On model cancel, broadcasts a percent: -2, status: 'cancelled' event so the frontend hook can clear the entry.
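The abort pattern the bullets above describe can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `DownloadModelJob` and `signalCancel()` names come from the PR, but the bodies here are assumptions, and the Redis poll loop, BullMQ wiring, and Transmit broadcasts are omitted.

```typescript
// Hypothetical sketch of the in-memory cancel pattern described in the PR.
class DownloadModelJob {
  // One AbortController per in-flight download, keyed by BullMQ job id.
  private static abortControllers = new Map<string, AbortController>()

  // Called when a cancel request lands in the same process as the job.
  // (The real job also polls a Redis key so cancels reach other workers.)
  static signalCancel(jobId: string): boolean {
    const controller = this.abortControllers.get(jobId)
    if (!controller) return false
    controller.abort()
    return true
  }

  async handle(jobId: string, pull: (signal: AbortSignal) => Promise<void>) {
    const controller = new AbortController()
    DownloadModelJob.abortControllers.set(jobId, controller)
    try {
      // The signal is forwarded to the axios pull stream.
      await pull(controller.signal)
    } catch (err: any) {
      if (controller.signal.aborted || err?.code === 'ERR_CANCELED') {
        // In the real job this is BullMQ's UnrecoverableError, so the
        // `attempts: 40` retry policy does not re-run a cancelled pull.
        throw new Error('Download cancelled')
      }
      throw err
    } finally {
      DownloadModelJob.abortControllers.delete(jobId)
    }
  }
}
```

Throwing a non-retryable error on user-initiated cancel is the key design choice: without it, the queue's retry policy would treat the cancel as a transient failure and restart the pull.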

No new controller, route, or API client method is needed — POST /api/downloads/jobs/:jobId/cancel and api.cancelDownloadJob(jobId) already existed from PR #554 and now handle both queues.

Frontend (2 files):

  • useOllamaModelDownloads — adds jobId, downloadedBytes, totalBytes to the OllamaModelDownload type, handles percent === -2 (cancelled) with a 2-second auto-clear, exposes a removeDownload(model) function for optimistic cleanup.
  • ActiveModelDownloads — rewritten to match the ActiveDownloads card layout: bold title + ollama badge + X GB / Y GB + progress bar with percent overlay + "Downloading..." indicator with pulsing green dot and live speed (5-sample moving average of byte deltas, identical pattern to the content download component). Cancel X opens a StyledModal confirmation that explains partial data behavior. Guards against missing jobId (defensive for stale broadcasts during hot upgrades).
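The live-speed display mentioned above is a 5-sample moving average of byte deltas. A minimal sketch of that calculation, with a hypothetical `SpeedTracker` name (the PR does not name this helper, and the real logic lives inside the component):

```typescript
// Hypothetical helper showing a 5-sample moving average of download speed.
type Sample = { bytes: number; at: number } // cumulative bytes, timestamp in ms

class SpeedTracker {
  private samples: Sample[] = []

  // Record the latest downloadedBytes from a progress broadcast.
  push(bytes: number, at: number): void {
    this.samples.push({ bytes, at })
    if (this.samples.length > 5) this.samples.shift() // keep the last 5 samples
  }

  // Average bytes/second across the retained window (0 until 2+ samples).
  bytesPerSecond(): number {
    if (this.samples.length < 2) return 0
    const first = this.samples[0]
    const last = this.samples[this.samples.length - 1]
    const elapsedMs = last.at - first.at
    if (elapsedMs <= 0) return 0
    return ((last.bytes - first.bytes) / elapsedMs) * 1000
  }
}
```

Averaging over a short window keeps the displayed speed stable even though the backend's 500 ms throttled broadcasts arrive at slightly uneven intervals.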

Bonus: the generic ActiveDownloads component (used in Easy Setup) automatically picks up cancel-for-model-downloads support since it already lists model jobs from /api/downloads/jobs.

Visual consistency

The AI model download card now matches the Content Explorer download card:

| Element | Before | After |
| --- | --- | --- |
| Title | Bold model name | Bold model name |
| Type badge | `ollama-model` (via HorizontalBarChart) | `ollama` (matches zim badge style) |
| Progress text | 20.5% / 100% | 2.3 GB / 24.6 GB |
| Progress bar | Percent only | Percent overlay on progress bar |
| Status | (none) | ● Downloading... 111.1 MB/s |
| Cancel | (none) | X button → confirmation modal |

Partial blob behavior — documented in the modal

Ollama saves model blobs incrementally during a pull but only writes the local manifest after ALL blobs are downloaded. This means:

  • Cancelling a pull leaves partial blobs on disk (verified: ~5 GB for a cancelled qwen2.5:14b at ~40%)
  • ollama list does NOT show the partial model (no manifest)
  • ollama rm <model> fails with model not found (nothing to remove via Ollama's API)
  • Re-pulling the same model reuses the partial blobs and completes in a fraction of the time (verified: qwen2.5:14b cancelled at 40%, re-pulled to 91% in 12 seconds)

There is currently no Ollama API to clean up orphaned partial blobs, and manually deleting blob files is unsafe because blobs are content-addressed and may be shared with other installed models.

The confirmation modal explains this trade-off directly to the user:

"Any data already downloaded will remain on disk. If you re-download this model later, it will resume from where it left off rather than starting over."

A follow-up "Reclaim disk space" feature (blob GC that walks installed manifests and removes unreferenced blobs) could be added in a future PR if disk reclamation becomes a common request.
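The core of such a GC could be a pure "mark" step over installed manifests. The sketch below assumes Ollama's on-disk layout (manifest JSON listing `config` and `layers[]` digests of the form `sha256:<hex>`, blob files named `sha256-<hex>`); that layout should be re-verified against the deployed Ollama version before anything is actually deleted, and the function names here are hypothetical.

```typescript
// Sketch of the "mark" phase of a hypothetical orphaned-blob GC.
type Manifest = { config?: { digest: string }; layers: { digest: string }[] }

// A digest "sha256:abc…" corresponds to a blob file named "sha256-abc…".
const digestToBlobName = (digest: string): string => digest.replace(':', '-')

// Return blob filenames not referenced by any installed manifest.
// Blobs are content-addressed and may be shared across models, so only
// blobs referenced by NO manifest are safe candidates for removal.
function unreferencedBlobs(manifests: Manifest[], blobNames: string[]): string[] {
  const referenced = new Set<string>()
  for (const manifest of manifests) {
    if (manifest.config) referenced.add(digestToBlobName(manifest.config.digest))
    for (const layer of manifest.layers) {
      referenced.add(digestToBlobName(layer.digest))
    }
  }
  return blobNames.filter((name) => !referenced.has(name))
}
```

The "sweep" phase (walking the manifests directory, listing the blobs directory, and deleting the returned names) is deliberately left out of this sketch, since it is the destructive part that needs the layout verified first.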

Test plan

All verified on NOMAD3 (192.168.200.183, Ollama v0.20.4) via hot-patched JS:

  • Happy path — phi3:mini, qwen2.5:14b, mistral:7b, gpt-oss:20b all pull and install cleanly through new signal-aware code
  • New visual layout — mixtral:8x7b showed 2.3 GB / 24.6 GB, 9%, Downloading... 111.1 MB/s with pulsing green dot
  • Cancel mid-stream via UI (X → modal → Cancel Download → spinner → entry clears)
  • Cancel mid-stream via API directly (POST /api/downloads/jobs/:jobId/cancel)
  • No retry storm — single Job failed: Download cancelled log, zero retry attempts after 15s wait
  • BullMQ cleanup — job and lock keys both removed from Redis
  • Idempotent cancel — calling cancel on a non-existent jobId returns {success: true, message: 'Job not found (may have already completed)'}
  • Partial blob resume — re-pulling a cancelled model reuses partial blobs and completes quickly
  • Transmit throttling — 500ms cap prevents SSE flood, speed calculation stable
  • Modal copy correctly explains the disk-retention behavior
  • "Keep Downloading" dismisses the modal without cancelling
  • Browser console — zero errors, zero warnings

@chriscrosstalk (Collaborator, Author)

TL;DR: this brings the AI model active downloads display in line with the Content Explorer's more robust downloads display.
