Commit b710048

zhmiao and Copilot committed
docs: add Docker image deployment section to README + user-manual §2.8
Users with their own server want a quick path to download + run the Sparrow Engine HTTP server in Docker. Pre-existing docs were misleading: README had no Docker section at all; user-manual §2.4 row 5 said 'docker pull sparrow-engine:cpu (or :gpu)' but no such pull works (release.yml does not push to a container registry today).

Adds:
- README.md § Alternative install paths → new 'Docker image (server deployments)' subsection with image inventory, Zenodo download (Option A), build-from-source (Option B), docker run + compose examples.
- docs/user-manual.md § 2.4 row 5: replace misleading 'docker pull' command with reference to new §2.8.
- docs/user-manual.md § 2.8 'Docker image deployment': comprehensive operator-grade section covering image inventory, why-no-pull rationale (with SW-1 cross-ref), both download paths with caveats, Dockerfile multi-stage detail, run-the-server invocations for CPU + GPU, bundled docker-compose.yml walkthrough, operator env-var table, cross-references to §7 (HTTP API) + §11 (cold-start) + sparrow companion repo's sync_sparrow_engine.sh.

The Zenodo download path uses sparrow companion repo's download_sparrow_engine_images.sh script which knows the current pinned record + expected SHA-256 digests + handles docker load + canonical retag. Documented caveat that Zenodo lags source HEAD; build-from-source is the bleeding-edge path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent b074060 commit b710048

2 files changed

Lines changed: 178 additions & 1 deletion

README.md

Lines changed: 66 additions & 0 deletions
@@ -125,6 +125,72 @@ environment. Check the installed version with
See [§6 of the user manual](docs/user-manual.md#6-python-package--sparrow-engine)
for the full API surface and GPU sidecar options.

### Docker image (server deployments)

Sparrow Engine ships as a self-contained HTTP server in two Docker flavors. Both expose `/v1/detect`, `/v1/classify`, `/v1/detect_audio`, `/healthz`, and `/openapi.json` on port 8080.

| Image | Size | GPU |
|---|---|---|
| `sparrow-engine-server:sparrow-combined` | ~170 MB | CPU only |
| `sparrow-engine-server-gpu:sparrow-combined` | ~3.7 GB | CUDA 12 + cuDNN bundled; requires NVIDIA Container Toolkit on the host |

There are two ways to get the image. Neither pulls from a container registry: there is no public `docker pull sparrow-engine` because no central registry push is wired into CI (intentional today; see [user manual §2.8](docs/user-manual.md#28-docker-image-deployment) for the rationale).

**Option A: download pre-built tarballs from Zenodo** (~3 min on a decent link, no build toolchain needed). This uses the sparrow companion repo's downloader script, which knows the current Zenodo record and the expected SHA-256 digests:

```bash
git clone https://github.com/Clamps251/sparrow.git
cd sparrow
# Pick the invocation you need:
./scripts/download_sparrow_engine_images.sh             # CPU + GPU
./scripts/download_sparrow_engine_images.sh --cpu-only  # CPU only (~43 MB compressed)
./scripts/download_sparrow_engine_images.sh --gpu-only  # GPU only (~1.5 GB compressed)
```

The script verifies the SHA-256 digests, runs `docker load` on each tarball, and retags the images as `sparrow-engine-server[-gpu]:sparrow-combined`. **Caveat**: the Zenodo record is refreshed manually per release, not on every commit, so the published tarballs may lag the latest source by one or more releases. The current record's pin commit is recorded in `sparrow/sparrow-engine/sparrow-engine.version` after the download; if you need the absolute latest fixes, use Option B.

**Option B: build from source** (~10 min the first time; cached layers on subsequent builds; always reflects the current source tree):

```bash
git clone --branch sparrow-engine-dev https://github.com/microsoft/Pytorch-Wildlife.git
cd Pytorch-Wildlife/sparrow-engine
docker build -f docker/Dockerfile.cpu -t sparrow-engine-server:sparrow-combined .      # CPU
docker build -f docker/Dockerfile.gpu -t sparrow-engine-server-gpu:sparrow-combined .  # GPU
```

**Run the server** (after either Option A or B). The container expects models mounted read-only at `/models`:

```bash
# CPU
docker run -d --rm --name sparrow-engine -p 8080:8080 \
  -v "$HOME/.sparrow-engine/models:/models:ro" \
  -e SPARROW_ENGINE_DEVICE=cpu \
  sparrow-engine-server:sparrow-combined

# GPU (requires NVIDIA Container Toolkit on the host).
# Both commands publish host port 8080, so run one or the other.
docker run -d --rm --name sparrow-engine-gpu -p 8080:8080 --gpus all \
  -v "$HOME/.sparrow-engine/models:/models:ro" \
  -e SPARROW_ENGINE_DEVICE=cuda:0 \
  sparrow-engine-server-gpu:sparrow-combined

# Verify
curl -fsS http://localhost:8080/healthz
curl -fsS http://localhost:8080/openapi.json | jq '.paths | keys'
```
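
Before starting either container, it can help to confirm that the host models directory exists and is not empty: with `-v`, Docker creates a missing host path as an empty directory, so the container would simply see an empty `/models`. The sketch below is a hypothetical preflight helper; the name `models_dir_ready` is an assumption for illustration and is not part of Sparrow Engine.

```bash
# Hypothetical preflight helper (not shipped with Sparrow Engine): succeeds
# only when the given host directory exists and contains at least one entry.
models_dir_ready() {
  local dir="$1"
  [ -d "$dir" ] || return 1
  # `ls -A` lists everything except . and .. ; empty output means an empty dir.
  [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example use before `docker run`:
#   models_dir_ready "$HOME/.sparrow-engine/models" \
#     || { echo "no models found; download them first" >&2; exit 1; }
```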

**Or use the bundled `docker-compose.yml`** (resource limits, healthcheck, log rotation, and a read-only filesystem are all pre-configured):

```bash
cd Pytorch-Wildlife/sparrow-engine/docker
docker compose --profile cpu up -d     # CPU
docker compose --profile gpu up -d     # GPU
docker compose --profile cpu logs -f   # tail logs
docker compose --profile cpu down      # stop
```

The Compose file mounts `${SPARROW_ENGINE_MODEL_DIR:-./models}` read-only into the container; set the env var or place models at `sparrow-engine/docker/models/` before bringing the stack up. Models can also be downloaded via the [Model zoo](#model-zoo) section below.

For full HTTP API documentation, request shapes, response schemas, and the operator-grade env-var reference, see [§7 of the user manual](docs/user-manual.md#7-http-api-server--sparrow-engine-server).

---

> 📖 **[Read the full user manual →](docs/user-manual.md)**

docs/user-manual.md

Lines changed: 112 additions & 1 deletion
@@ -254,7 +254,7 @@ Under the stdin-pipe form the wrapper detects that `$0` is the shell name and sk
| Clean-room from-source build | Developers; reproducibility | `cd sparrow-engine && ./scripts/build_all_flavors.sh` (workspace root) |
| GitHub Releases binary | End users; production | `bash installer/sparrow-engine-install.sh --cli` |
| pip install Python wheel | Notebook + script users (Python API only, no `spe` CLI binary; use the Homebrew, installer, or tarball rows above for the CLI) | `pip install sparrow-engine` (CPU) or `pip install sparrow-engine-gpu` |
-| Docker image pull | Server deployments | `docker pull sparrow-engine:cpu` (or `:gpu`) |
+| Docker image | Server deployments | Build locally from `sparrow-engine/docker/Dockerfile.{cpu,gpu}` or download pre-built tarballs from Zenodo via `sparrow/scripts/download_sparrow_engine_images.sh`; see §2.8 for the full flow. There is no `docker pull` from a registry today. |

**Cite**: `docs/install.md § Per-consumer install paths` (lines 163-220); `sparrow-engine/scripts/build_all_flavors.sh`; `installer/homebrew/{sparrow-engine,sparrow-engine-gpu}.rb` + `installer/homebrew/README.md` (Homebrew tap source-of-truth).

@@ -348,6 +348,117 @@ ONLINE machine: OFFLINE machine:

---

### 2.8 Docker image deployment

Sparrow Engine ships as a self-contained HTTP server in two Docker flavors. Operators running a sparrow stack, or anyone who wants the engine on a remote server, typically use this path.

#### Image inventory

| Image | Compressed download | Loaded size | GPU |
|---|---|---|---|
| `sparrow-engine-server:sparrow-combined` | ~43 MB | ~170 MB | CPU only |
| `sparrow-engine-server-gpu:sparrow-combined` | ~1.5 GB | ~3.7 GB | CUDA 12 + cuDNN bundled; requires NVIDIA Container Toolkit on the host |

Both flavors expose the same 15-route axum HTTP API on port 8080: `/v1/detect`, `/v1/detect/batch`, `/v1/classify`, `/v1/pipeline`, `/v1/detect_audio`, plus `/v1/catalog`, `/v1/models`, `/v1/manifest`, `/healthz`, `/v1/health`, `/openapi.json`, and the inference-log + drift endpoints from Phase 4. See §7 for the full request / response schemas.

#### Why no `docker pull`?

There is no `docker pull sparrow-engine:cpu` or equivalent: `release.yml` does not push images to a container registry (GHCR / Docker Hub / etc.) today. Rationale: the audience that needs Docker is operators deploying the sparrow webapp stack, and that audience already runs the sparrow companion repo, which provides a Zenodo-backed download script. Registry publish is tracked at `sparrow-engine-dev:docs/ideas.md § Sparrow Studio Web Integration follow-ups → SW-1` and will land alongside the cross-repo CI auto-PR work.

#### Option A: download pre-built tarballs from Zenodo

The fastest path: ~3 min on a decent link, with no build toolchain needed. It uses the sparrow companion repo's downloader script, which knows the current Zenodo record and the expected SHA-256 digests, and handles the `docker load` and canonical retag steps.

```bash
git clone https://github.com/Clamps251/sparrow.git
cd sparrow
# Pick the invocation you need:
./scripts/download_sparrow_engine_images.sh             # CPU + GPU (~1.55 GB compressed)
./scripts/download_sparrow_engine_images.sh --cpu-only  # CPU only (~43 MB compressed)
./scripts/download_sparrow_engine_images.sh --gpu-only  # GPU only (~1.5 GB compressed)
./scripts/download_sparrow_engine_images.sh --help      # full flag list
```

The script:
1. Downloads `sparrow-engine-{cpu,gpu}-prior-pin-<sha>.tar.zst` from the pinned Zenodo record into `./.sparrow-engine-cache/`
2. Verifies SHA-256 against the digests recorded in `sparrow-engine/sparrow-engine.version`
3. Runs `docker load` on each tarball
4. Retags the loaded image as the canonical `sparrow-engine-server[-gpu]:sparrow-combined` so `docker-compose.yml` finds it
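
The verification in step 2 amounts to comparing a computed digest with a recorded one. The sketch below illustrates that idea only; it is not the downloader script's actual code. The helper name `verify_sha256` is hypothetical, and it assumes GNU `sha256sum` is available (on macOS, `shasum -a 256` plays the same role).

```bash
# Hypothetical helper: succeed only if FILE's SHA-256 equals EXPECTED (hex).
verify_sha256() {
  local file="$1" expected="$2" actual
  # sha256sum prints "<digest>  <path>"; keep only the digest column.
  actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example use (tarball path and digest are placeholders you supply):
#   verify_sha256 "$tarball" "$expected_digest" && zstd -dc "$tarball" | docker load
```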

**Pin caveat**: the Zenodo record is refreshed manually per release, not on every commit. The downloader script's hardcoded record reflects whatever sparrow's `sparrow-engine.version` was pinned to when the script last shipped. Check the current pin SHA against this repo's HEAD before trusting that the tarballs include the latest fixes; if you need the bleeding edge, use Option B.
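
One way to make the pin-versus-HEAD check concrete is to count the commits that landed after the pin. This is a sketch under assumptions: the helper name `commits_since_pin` is hypothetical, the pin SHA is passed in by hand (the layout of `sparrow-engine.version` is not described here, so no parsing is attempted), and the repository path is a local clone of the engine source.

```bash
# Hypothetical helper: count commits reachable from REF (default HEAD) but not
# from PIN_SHA, in the git repository at REPO_DIR. 0 means the pin is current.
commits_since_pin() {
  local repo_dir="$1" pin_sha="$2" ref="${3:-HEAD}"
  git -C "$repo_dir" rev-list --count "${pin_sha}..${ref}"
}

# Example use (path and SHA are placeholders):
#   commits_since_pin ./Pytorch-Wildlife "<pin-sha>"
```

A non-zero count does not say whether the newer commits touch the engine; it only signals that Option B may be worth considering.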

#### Option B: build from source

~10 min the first time; cached layers on subsequent builds. Always reflects the current source tree at HEAD. Recommended when you need fixes that post-date the latest Zenodo refresh.

```bash
git clone --branch sparrow-engine-dev https://github.com/microsoft/Pytorch-Wildlife.git
cd Pytorch-Wildlife/sparrow-engine
docker build -f docker/Dockerfile.cpu -t sparrow-engine-server:sparrow-combined .      # CPU
docker build -f docker/Dockerfile.gpu -t sparrow-engine-server-gpu:sparrow-combined .  # GPU (only needed for the GPU flavor)
```

The Dockerfiles are multi-stage:
- `Dockerfile.cpu`: builder stage = `rust:bookworm`; runtime stage = `debian:bookworm-slim` + bundled `libonnxruntime.so.1.25.1`. No CUDA dependencies. Outputs a ~170 MB image.
- `Dockerfile.gpu`: builder stage = `rust:bookworm`; runtime stage = `nvidia/cuda:12.6.3-cudnn-runtime-ubuntu24.04` + bundled `libonnxruntime.so.1.25.1` + CUDA provider sidecars. Requires NVIDIA Container Toolkit at run time. Outputs a ~3.7 GB image.

The ORT version is centralized at `docker/.ort-version` (single source of truth; both Dockerfiles' default `ARG ORT_VERSION` values agree with it, and a CI gate at `release.yml § Compare ORT_VERSION` enforces the 3-way agreement).
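
The 3-way agreement can also be checked locally before pushing. The sketch below is not the CI gate's implementation; it assumes each Dockerfile declares its default on a line of the form `ARG ORT_VERSION=<value>` and that `.ort-version` contains only the version string. Both helper names are hypothetical.

```bash
# Hypothetical: print the default from the first `ARG ORT_VERSION=<value>` line.
dockerfile_ort_default() {
  sed -n 's/^ARG ORT_VERSION=//p' "$1" | head -n 1
}

# Hypothetical: succeed only if every Dockerfile default matches the version file.
# Usage: ort_versions_agree <version-file> <Dockerfile>...
ort_versions_agree() {
  local version_file="$1"; shift
  local want got f
  want="$(tr -d '[:space:]' < "$version_file")"
  for f in "$@"; do
    got="$(dockerfile_ort_default "$f")"
    if [ "$got" != "$want" ]; then
      echo "mismatch in $f: '$got' != '$want'" >&2
      return 1
    fi
  done
}
```

From `sparrow-engine/`, a call might look like `ort_versions_agree docker/.ort-version docker/Dockerfile.cpu docker/Dockerfile.gpu`.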

#### Run the server

Do this after either Option A or B. The container expects models mounted read-only at `/models` (see [Model zoo](#model-zoo) for the download path).

```bash
# CPU (minimal)
docker run -d --rm --name sparrow-engine -p 8080:8080 \
  -v "$HOME/.sparrow-engine/models:/models:ro" \
  -e SPARROW_ENGINE_DEVICE=cpu \
  sparrow-engine-server:sparrow-combined

# GPU (requires NVIDIA Container Toolkit installed on the host).
# Both commands publish host port 8080, so run one or the other.
docker run -d --rm --name sparrow-engine-gpu -p 8080:8080 --gpus all \
  -v "$HOME/.sparrow-engine/models:/models:ro" \
  -e SPARROW_ENGINE_DEVICE=cuda:0 \
  sparrow-engine-server-gpu:sparrow-combined

# Verify
curl -fsS http://localhost:8080/healthz
curl -fsS http://localhost:8080/openapi.json | jq '.paths | keys | length'   # 15
curl -fsS -X POST -F "image=@test.jpg" "http://localhost:8080/v1/detect?model=MDV6-yolov10-e"
```
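
Because the containers start detached (`-d`), the first `curl` can race the server's startup. A small retry loop avoids that; the sketch below is a generic, hypothetical helper (`wait_until` is not part of Sparrow Engine) that reruns any command until it succeeds or a timeout expires.

```bash
# Hypothetical helper: retry a command until it succeeds or the deadline passes.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
  local timeout="$1" interval="$2" deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  until "$@" >/dev/null 2>&1; do
    # Give up once the deadline has passed.
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep "$interval"
  done
}

# Example use against the health endpoint shown above:
#   wait_until 60 2 curl -fsS http://localhost:8080/healthz
```

The 60-second budget in the example is an arbitrary choice; §11 covers the cold-start characteristics that would inform a real value.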

#### Or use the bundled `docker-compose.yml`

The file ships with defaults that follow Docker Compose best practices: resource limits (4 GB / 4 CPU for CPU, 8 GB / 4 CPU + GPU reservation for GPU), `init: true` for proper signal handling, `restart: unless-stopped`, `read_only: true` filesystem, `no-new-privileges: true`, JSON log rotation (50 MB × 5 files), and a 30s graceful stop.

```bash
cd Pytorch-Wildlife/sparrow-engine/docker
docker compose --profile cpu up -d     # CPU
docker compose --profile gpu up -d     # GPU (requires nvidia-container-toolkit)
docker compose --profile cpu logs -f   # tail logs
docker compose --profile cpu down      # stop
```

The Compose file mounts `${SPARROW_ENGINE_MODEL_DIR:-./models}` read-only into the container. Set the env var to point at an absolute models path, or place the models under `sparrow-engine/docker/models/` before bringing the stack up. Both flavors publish host port 8080 and are selected via `profiles:`, so only one can run at a time per host.
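
Compose's `${VAR:-default}` interpolation follows the same rule as the shell form: the default applies when the variable is unset or empty. The sketch below mirrors that rule in plain shell so the resolved path can be inspected before bringing the stack up; the function name `resolve_model_dir` is hypothetical.

```bash
# Hypothetical helper mirroring `${SPARROW_ENGINE_MODEL_DIR:-./models}`:
# use the env var when it is set and non-empty, otherwise fall back to ./models.
resolve_model_dir() {
  printf '%s\n' "${SPARROW_ENGINE_MODEL_DIR:-./models}"
}

# Example use from sparrow-engine/docker/ (the host path is a placeholder):
#   SPARROW_ENGINE_MODEL_DIR=/srv/sparrow/models docker compose --profile cpu up -d
```

Relative paths such as `./models` are resolved against the directory containing the Compose file, which is why the fallback lands in `sparrow-engine/docker/models/`.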

#### Operator env vars

| Variable | Default | Notes |
|---|---|---|
| `SPARROW_ENGINE_DEVICE` | `cpu` / `cuda:0` | Image-dependent: the CPU image uses `cpu`; the GPU image uses `cuda:0`. Override for multi-GPU hosts. |
| `SPARROW_ENGINE_MODEL_DIR` | `/models` (inside container) | The Compose file maps the host directory to this path. |
| `SPARROW_ENGINE_LOG_FORMAT` | `pretty` (Compose) / `json` (raw `docker run`) | `json` for production log aggregation; `pretty` for dev. |
| `SPARROW_ENGINE_BIND_ADDR` | `0.0.0.0:8080` | Override for non-standard ports. |
| `SPARROW_ENGINE_LOG_LEVEL` | `info` | `debug` for boot-trace + per-request tracing. |
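
If `SPARROW_ENGINE_BIND_ADDR` is changed, the container port in `-p host:container` has to follow it. The sketch below derives the publish argument from a bind address; it assumes the variable holds a `host:port` string as in the table's default, and both function names are hypothetical.

```bash
# Hypothetical helper: extract the port from a host:port bind address,
# e.g. the SPARROW_ENGINE_BIND_ADDR default of 0.0.0.0:8080.
bind_port() {
  printf '%s\n' "${1##*:}"
}

# Hypothetical helper: build the matching `-p host:container` publish argument.
# Usage: publish_arg <host_port> [bind_addr]
publish_arg() {
  local host_port="$1" bind_addr="${2:-0.0.0.0:8080}"
  printf '%s:%s\n' "$host_port" "$(bind_port "$bind_addr")"
}
```

For example, `publish_arg 80 0.0.0.0:9090` yields the value to pass to `-p` when the server is told to listen on 9090 inside the container while the host exposes port 80.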

#### Cross-references
- Full HTTP API + request/response schemas: §7
- Server boot lifecycle + cold-start characteristics: §11
- Sparrow Studio Web stack consumes these images via digest pin: `sparrow/sparrow-engine/sparrow-engine.version` + `sparrow/scripts/sync_sparrow_engine.sh` in the companion repo

**Cite**: `sparrow-engine/docker/{Dockerfile.cpu,Dockerfile.gpu,docker-compose.yml,.ort-version}`; `sparrow/scripts/download_sparrow_engine_images.sh` + `sparrow/sparrow-engine/sparrow-engine.version`.

---

## 3. Hardware + system requirements

### Section overview
