You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Don't fail Supervisor setup when an app image is missing (#6816)
* Don't fail Supervisor setup when an app image is missing
A missing builder image (docker:<version>-cli) during a build-required app
load aborted Supervisor setup entirely, leaving the system stuck in setup
state where every subsequent operation was blocked by the not-healthy
guard. Triggered in practice when the host's Docker patch version had no
matching `-cli` tag published on Docker Hub.
Two issues compounded the failure: `images.pull` in `run_command` leaked a
raw `aiodocker.DockerError` past the `@Job` decorator, which rewrapped it
as `JobException` and bypassed the `suppress(DockerError, ...)` guard in
`addon.load()`; and the load path treated all Docker errors the same
whether the image was simply missing or the daemon itself was misbehaving.
Wrap the pull error in `run_command` so it propagates as Supervisor's
`DockerError` (a `HassioError`) and is preserved by the decorator.
Distinguish 404s in `attach()` and `check_image()` by raising
`DockerNotFound`/`DockerAPIError` instead of generic `DockerError`. In
`addon.load()`, only the `DockerNotFound` path is treated as "image
missing": for build-required apps we skip the inline build and surface a
`MISSING_IMAGE` repair so the resolution autofix loop handles it off the
critical path; for pull-based apps we still attempt install during load
and create the repair on failure. Other `DockerError`s (daemon trouble or
a failed internal install in `check_image`) are logged at CRITICAL — which
the Sentry logging integration captures — and the addon is left detached
rather than masked as a misleading missing-image repair.
In the autofix path, swallow `DockerBuildError`, `DockerNoSpaceOnDevice`,
`DockerRegistryAuthError`, and `DockerRegistryRateLimitExceeded` as
`ResolutionFixupError` so they don't generate Sentry events on every
retry. The repair stays available for manual retry once the underlying
cause (registry tag published, disk freed, credentials fixed, rate limit
expired) is resolved.
* Clarify outer DockerError comment in App.load()
The comment claimed "a future load will reattempt and surface a
MISSING_IMAGE repair if appropriate", but App.load() is only called at
Supervisor startup, on fresh install, and on backup restore — there is no
automatic retry mechanism. Reword to match reality: the CRITICAL log
captures the issue for diagnostics (Sentry), and the user can trigger a
manual repair once the daemon is healthy.
* Clarify comment about user interaction
0 commit comments