Users with their own server want a quick path to download + run the
Sparrow Engine HTTP server in Docker. Pre-existing docs were misleading:
README had no Docker section at all; user-manual §2.4 row 5 said
'docker pull sparrow-engine:cpu (or :gpu)' but no such pull works
(release.yml does not push to a container registry today).
Adds:
- README.md § Alternative install paths → new 'Docker image (server
deployments)' subsection with image inventory, Zenodo download
(Option A), build-from-source (Option B), docker run + compose
examples.
- docs/user-manual.md § 2.4 row 5: replace misleading 'docker pull'
command with reference to new §2.8.
- docs/user-manual.md § 2.8 'Docker image deployment': comprehensive
operator-grade section covering image inventory, why-no-pull
rationale (with SW-1 cross-ref), both download paths with caveats,
Dockerfile multi-stage detail, run-the-server invocations for CPU +
GPU, bundled docker-compose.yml walkthrough, operator env-var table,
cross-references to §7 (HTTP API) + §11 (cold-start) + sparrow
companion repo's sync_sparrow_engine.sh.
The Zenodo download path uses sparrow companion repo's
download_sparrow_engine_images.sh script which knows the current
pinned record + expected SHA-256 digests + handles docker load +
canonical retag. Documented caveat that Zenodo lags source HEAD;
build-from-source is the bleeding-edge path.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
README.md: 66 additions, 0 deletions
@@ -125,6 +125,72 @@ environment. Check the installed version with
See [§6 of the user manual](docs/user-manual.md#6-python-package--sparrow-engine)
for the full API surface and GPU sidecar options.

### Docker image (server deployments)

Sparrow Engine ships as a self-contained HTTP server in two Docker flavors. Both expose `/v1/detect`, `/v1/classify`, `/v1/detect_audio`, `/healthz`, and `/openapi.json` on port 8080.

| Image | Size | GPU |
|---|---|---|
| `sparrow-engine-server:sparrow-combined` | ~170 MB | CPU only |
| `sparrow-engine-server-gpu:sparrow-combined` | ~3.7 GB | CUDA 12 + cuDNN bundled; requires NVIDIA Container Toolkit on the host |

There are two ways to get the image. Neither pulls from a container registry: there is no public `docker pull sparrow-engine` because CI does not push images to any registry (intentional today; see [user manual §2.8](docs/user-manual.md#28-docker-image-deployment) for the rationale).

**Option A: download pre-built tarballs from Zenodo** (~3 min on a decent link, no build toolchain needed). This uses the sparrow companion repo's downloader script, which knows the current Zenodo record and the expected SHA-256 digests:

```bash
# run from the root of the sparrow companion repo
./scripts/download_sparrow_engine_images.sh             # CPU + GPU
./scripts/download_sparrow_engine_images.sh --cpu-only  # CPU only (~43 MB compressed)
./scripts/download_sparrow_engine_images.sh --gpu-only  # GPU only (~1.5 GB compressed)
```

The script verifies the SHA-256 digests, runs `docker load` on each tarball, and retags the images as `sparrow-engine-server[-gpu]:sparrow-combined`. **Caveat**: the Zenodo record is refreshed manually per release, not on every commit, so the published tarballs may lag the latest source by one or more releases. The current record's pin commit is recorded in `sparrow/sparrow-engine/sparrow-engine.version` after the download; if you need the latest fixes, use Option B.

**Option B: build from source** (~10 min the first time; cached layers on subsequent builds; always reflects the current source tree):

**Or use the bundled `docker-compose.yml`** (resource limits, healthcheck, log rotation, and a read-only filesystem are all pre-configured):

```bash
cd Pytorch-Wildlife/sparrow-engine/docker
docker compose --profile cpu up -d     # CPU
docker compose --profile gpu up -d     # GPU
docker compose --profile cpu logs -f   # tail logs
docker compose --profile cpu down      # stop
```

The Compose file mounts `${SPARROW_ENGINE_MODEL_DIR:-./models}` read-only into the container; set the env var or place models at `sparrow-engine/docker/models/` before bringing the stack up. Models can also be downloaded via the [Model zoo](#model-zoo) section below.

For full HTTP API documentation, request shapes, response schemas, and operator-grade env-var reference: [§7 of the user manual](docs/user-manual.md#7-http-api-server--sparrow-engine-server).

---

> 📖 **[Read the full user manual →](docs/user-manual.md)**

docs/user-manual.md

| GitHub Releases binary | End users; production | `bash installer/sparrow-engine-install.sh --cli` |
| pip install Python wheel | Notebook + script users (Python API only, no `spe` CLI binary; use the Homebrew, installer, or tarball rows above for the CLI) | `pip install sparrow-engine` (CPU) or `pip install sparrow-engine-gpu` |
| Docker image | Server deployments | Build locally from `sparrow-engine/docker/Dockerfile.{cpu,gpu}` or download pre-built tarballs from Zenodo via `sparrow/scripts/download_sparrow_engine_images.sh`; see §2.8 for the full flow. There is no `docker pull` from a registry today. |

### 2.8 Docker image deployment

Sparrow Engine ships as a self-contained HTTP server in two Docker flavors. Operators running a sparrow stack, or anyone who wants the engine on a remote server, typically use this path.

| Image | Compressed | Uncompressed | GPU |
|---|---|---|---|
| `sparrow-engine-server:sparrow-combined` | ~43 MB | ~170 MB | CPU only |
| `sparrow-engine-server-gpu:sparrow-combined` | ~1.5 GB | ~3.7 GB | CUDA 12 + cuDNN bundled; requires NVIDIA Container Toolkit on the host |

Both flavors expose the same 15-route axum HTTP API on port 8080: `/v1/detect`, `/v1/detect/batch`, `/v1/classify`, `/v1/pipeline`, `/v1/detect_audio`, plus `/v1/catalog`, `/v1/models`, `/v1/manifest`, `/healthz`, `/v1/health`, `/openapi.json`, and the inference-log + drift endpoints from Phase 4. See §7 for the full request / response schemas.

#### Why no `docker pull`?

There is no `docker pull sparrow-engine:cpu` or equivalent: `release.yml` does not push images to a container registry (GHCR, Docker Hub, etc.) today. The rationale is that the audience that needs Docker is operators deploying the sparrow webapp stack, and that audience already runs the sparrow companion repo, which provides a Zenodo-backed download script. Registry publishing is tracked at `sparrow-engine-dev:docs/ideas.md § Sparrow Studio Web Integration follow-ups → SW-1` and will land alongside the cross-repo CI auto-PR work.

#### Option A: download pre-built tarballs from Zenodo

This is the fastest path: ~3 min on a decent link, with no build toolchain needed. It uses the sparrow companion repo's downloader script, which knows the current Zenodo record and the expected SHA-256 digests, and handles the `docker load` and canonical retag steps.

```bash
# run from the root of the sparrow companion repo
./scripts/download_sparrow_engine_images.sh             # CPU + GPU (~1.55 GB compressed)
./scripts/download_sparrow_engine_images.sh --cpu-only  # CPU only (~43 MB compressed)
./scripts/download_sparrow_engine_images.sh --gpu-only  # GPU only (~1.5 GB compressed)
./scripts/download_sparrow_engine_images.sh --help      # full flag list
```

The script:

1. Downloads `sparrow-engine-{cpu,gpu}-prior-pin-<sha>.tar.zst` from the pinned Zenodo record into `./.sparrow-engine-cache/`
2. Verifies SHA-256 against the digests recorded in `sparrow-engine/sparrow-engine.version`
3. `docker load`s each tarball
4. Retags the loaded image as the canonical `sparrow-engine-server[-gpu]:sparrow-combined` so `docker-compose.yml` finds it

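For readers who want to see what steps 2 to 4 amount to, the following is a minimal sketch of doing them by hand. It is not the downloader script itself: the function names are hypothetical, and the tarball path, expected digest, and the tag the image carries after `docker load` are all placeholders the caller must supply.

```bash
#!/usr/bin/env bash
# Hand-rolled sketch of "verify, load, retag". Function names and all
# arguments are illustrative assumptions, not taken from the real script.

# Return 0 when the SHA-256 of FILE equals EXPECTED (hex, any case).
verify_sha256() {
  local file="$1" expected actual
  expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
  actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Verify one .tar.zst tarball, load it, and apply the canonical tag.
# Needs zstd and a running Docker daemon.
load_and_retag() {
  local tarball="$1" expected="$2" loaded_tag="$3" canonical_tag="$4"
  if ! verify_sha256 "$tarball" "$expected"; then
    echo "digest mismatch: $tarball" >&2
    return 1
  fi
  # docker load reads the image archive from stdin when -i is not given.
  zstd -dc "$tarball" | docker load || return 1
  docker tag "$loaded_tag" "$canonical_tag"
}
```

A call would look like `load_and_retag <tarball> <sha256> <loaded-tag> sparrow-engine-server:sparrow-combined`, with the first three values taken from the Zenodo record and the output of `docker load`.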
**Pin caveat**: the Zenodo record is refreshed manually per release, not on every commit. The record hardcoded in the downloader script reflects whatever sparrow's `sparrow-engine.version` was pinned to when the script last shipped. Before trusting that the tarballs include the latest fixes, compare the current pin SHA against this repo's HEAD; if you need the bleeding edge, use Option B.

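The pin check can be done with a small helper such as the one below. This is a sketch under assumptions: `pin_matches_head` is a hypothetical name, and because the layout of `sparrow-engine.version` is not described here, the pin SHA is passed in by hand rather than parsed from the file.

```bash
#!/usr/bin/env bash
# Return 0 when PIN (full or abbreviated SHA) is a prefix of HEAD_SHA.
# An empty pin never matches. Both values are assumed to be lowercase hex.
pin_matches_head() {
  local pin="$1" head_sha="$2"
  [ -n "$pin" ] || return 1
  case "$head_sha" in
    "$pin"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage, from inside a checkout of this repo (PIN_SHA copied by hand from
# sparrow's sparrow-engine.version):
#   if pin_matches_head "$PIN_SHA" "$(git rev-parse HEAD)"; then
#     echo "tarballs were built from HEAD"
#   else
#     echo "tarballs lag HEAD; consider Option B"
#   fi
```

A prefix comparison is used so that an abbreviated pin SHA still matches the full 40-character output of `git rev-parse HEAD`.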
#### Option B: build from source

The first build takes ~10 min; subsequent builds reuse cached layers. The result always reflects the current source tree at HEAD, so this is the recommended path when you need fixes that post-date the latest Zenodo refresh.

```bash
docker build -f docker/Dockerfile.gpu -t sparrow-engine-server-gpu:sparrow-combined .   # GPU only
```

The Dockerfiles are multi-stage:

- `Dockerfile.cpu`: builder stage = `rust:bookworm`; runtime stage = `debian:bookworm-slim` + bundled `libonnxruntime.so.1.25.1`. No CUDA dependencies. Outputs a ~170 MB image.
- `Dockerfile.gpu`: builder stage = `rust:bookworm`; runtime stage = `nvidia/cuda:12.6.3-cudnn-runtime-ubuntu24.04` + bundled `libonnxruntime.so.1.25.1` + CUDA provider sidecars. Requires NVIDIA Container Toolkit at run time. Outputs a ~3.7 GB image.

The ORT version is centralized in `docker/.ort-version` as the single source of truth: the default `ARG ORT_VERSION` in both Dockerfiles agrees with it, and the CI gate at `release.yml § Compare ORT_VERSION` enforces the 3-way agreement.

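To make the 3-way agreement concrete, here is one way such a check could be written locally. It is not the CI gate from `release.yml`; the function names are hypothetical, and it assumes each Dockerfile declares the default on a single line of the form `ARG ORT_VERSION=<value>`.

```bash
#!/usr/bin/env bash
# Print the default value of `ARG ORT_VERSION=<value>` from a Dockerfile
# (first match only). Assumes an unquoted value on one line.
ort_arg_default() {
  sed -n 's/^ARG[[:space:]]\{1,\}ORT_VERSION=\(.*\)$/\1/p' "$1" | head -n 1
}

# Return 0 when the version file and every listed Dockerfile agree.
check_ort_agreement() {
  local version_file="$1" pinned dockerfile found
  shift
  pinned="$(tr -d '[:space:]' < "$version_file")"
  for dockerfile in "$@"; do
    found="$(ort_arg_default "$dockerfile")"
    if [ "$found" != "$pinned" ]; then
      echo "mismatch: $dockerfile has '$found', expected '$pinned'" >&2
      return 1
    fi
  done
}

# Usage, from the directory that contains docker/:
#   check_ort_agreement docker/.ort-version docker/Dockerfile.cpu docker/Dockerfile.gpu
```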
#### Run the server

Run these after either Option A or Option B. The container expects models mounted read-only at `/models` (see [Model zoo](#model-zoo) for the download path).

```bash
# CPU: minimal
docker run -d --rm --name sparrow-engine -p 8080:8080 \
  -v "$HOME/.sparrow-engine/models:/models:ro" \
  -e SPARROW_ENGINE_DEVICE=cpu \
  sparrow-engine-server:sparrow-combined

# GPU: requires NVIDIA Container Toolkit installed on the host
# (the arguments after --gpus all are assumed to mirror the CPU invocation)
docker run -d --rm --name sparrow-engine-gpu -p 8080:8080 --gpus all \
  -v "$HOME/.sparrow-engine/models:/models:ro" \
  -e SPARROW_ENGINE_DEVICE=cuda:0 \
  sparrow-engine-server-gpu:sparrow-combined

# Send a test image to the detection endpoint
curl -fsS -X POST -F "image=@test.jpg" "http://localhost:8080/v1/detect?model=MDV6-yolov10-e"
```

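Because the container is started detached, a request sent immediately afterwards can arrive before the server is listening. A small retry helper, sketched below, is one way to wait for readiness. `wait_until` is a hypothetical name, and the usage line assumes `/healthz` answers with a success status once the server is up.

```bash
#!/usr/bin/env bash
# Run a probe command up to ATTEMPTS times, sleeping DELAY seconds between
# tries. Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage: wait up to roughly 30 s before sending the first real request.
#   wait_until 30 1 curl -fsS http://localhost:8080/healthz \
#     || { echo "server did not become healthy" >&2; exit 1; }
```

Passing the probe as arguments keeps the helper independent of `curl`, so the same loop works with any readiness command.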
#### Or use the bundled `docker-compose.yml`

The bundled file includes Docker Compose best-practice defaults: resource limits (4 GB / 4 CPU for CPU; 8 GB / 4 CPU + GPU reservation for GPU), `init: true` for proper signal handling, `restart: unless-stopped`, a `read_only: true` filesystem, `no-new-privileges: true`, JSON log rotation (50 MB × 5 files), and a 30 s graceful stop.

```bash
cd Pytorch-Wildlife/sparrow-engine/docker
docker compose --profile cpu up -d     # CPU
docker compose --profile gpu up -d     # GPU (requires nvidia-container-toolkit)
docker compose --profile cpu logs -f   # tail logs
docker compose --profile cpu down      # stop
```

The Compose file mounts `${SPARROW_ENGINE_MODEL_DIR:-./models}` read-only into the container. Set the env var to point at an absolute models path, or place the models under `sparrow-engine/docker/models/` before bringing the stack up. Both flavors bind the same host port 8080 and are selected via `profiles:`, so only one can run at a time per host.

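The `${SPARROW_ENGINE_MODEL_DIR:-./models}` expression uses the same fallback rule as the shell: the default applies when the variable is unset or empty. The sketch below mirrors that rule and adds a pre-flight check; both function names are hypothetical, and the check assumes it is run from the directory that holds `docker-compose.yml`, since a relative `./models` resolves against that directory.

```bash
#!/usr/bin/env bash
# Print the directory Compose would mount: the env var when it is set and
# non-empty, otherwise the ./models default.
resolve_model_dir() {
  printf '%s\n' "${SPARROW_ENGINE_MODEL_DIR:-./models}"
}

# Fail early when that directory does not exist.
check_model_dir() {
  local dir
  dir="$(resolve_model_dir)"
  if [ ! -d "$dir" ]; then
    echo "model directory not found: $dir" >&2
    return 1
  fi
}

# Usage:
#   check_model_dir && docker compose --profile cpu up -d
```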
#### Operator env vars

| Variable | Default | Notes |
|---|---|---|
| `SPARROW_ENGINE_DEVICE` | `cpu` / `cuda:0` | Image-dependent: the CPU image uses `cpu`; the GPU image uses `cuda:0`. Override for multi-GPU hosts. |
| `SPARROW_ENGINE_MODEL_DIR` | `/models` (inside container) | The Compose file maps the host directory to this path. |
| `SPARROW_ENGINE_LOG_FORMAT` | `pretty` (Compose) / `json` (raw `docker run`) | `json` for production log aggregation; `pretty` for dev. |
| `SPARROW_ENGINE_BIND_ADDR` | `0.0.0.0:8080` | Override for non-standard ports. |
| `SPARROW_ENGINE_LOG_LEVEL` | `info` | `debug` for boot-trace + per-request tracing. |

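When several of these variables are overridden at once, keeping them in a file and passing it with `docker run --env-file` can be tidier than repeating `-e`. The helper below is only an illustration: `write_engine_env` and the `engine.env` file name are hypothetical, and the values shown are examples drawn from the table.

```bash
#!/usr/bin/env bash
# Write a Docker env file (one VAR=value per line) with operator overrides.
write_engine_env() {
  local path="$1" device="$2" log_format="$3" log_level="$4"
  {
    printf 'SPARROW_ENGINE_DEVICE=%s\n' "$device"
    printf 'SPARROW_ENGINE_LOG_FORMAT=%s\n' "$log_format"
    printf 'SPARROW_ENGINE_LOG_LEVEL=%s\n' "$log_level"
  } > "$path"
}

# Usage:
#   write_engine_env engine.env cpu json debug
#   docker run -d --rm --name sparrow-engine -p 8080:8080 \
#     -v "$HOME/.sparrow-engine/models:/models:ro" \
#     --env-file engine.env \
#     sparrow-engine-server:sparrow-combined
```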
#### Cross-references

- Full HTTP API + request/response schemas: §7
- Server boot lifecycle + cold-start characteristics: §11
- Sparrow Studio Web stack consumes these images via digest pin: `sparrow/sparrow-engine/sparrow-engine.version` + `sparrow/scripts/sync_sparrow_engine.sh` in the companion repo