fix(resolver): dedupe override.yml; apply gpu_backends filter to user-extensions #1029

Closed
yasinBursali wants to merge 1 commit into Light-Heart-Labs:main from yasinBursali:fix/compose-resolver-dedup-gpu-filter

@yasinBursali
Contributor

What

Two resolver fixes bundled:

  1. Stop appending docker-compose.override.yml twice to COMPOSE_FLAGS.
  2. Apply the same gpu_backends filter to user-installed extensions that built-in extensions already receive.

Why

Defect 1: installers/lib/compose-select.sh called the resolver (which appends docker-compose.override.yml internally) and then appended the same file again at the bash layer. Docker Compose therefore merges the file twice, which is idempotent for scalars but a foot-gun for anchors, extends:, and list-merge semantics.

Defect 2: scripts/resolve-compose-stack.sh read the manifest and filtered by gpu_backends for built-in extensions but included every data/user-extensions/*/compose.yaml unconditionally. A user-library extension declaring gpu_backends: [nvidia] would be merged on AMD/Apple and fail at container start — or worse, block any depends_on chain.

How

  • compose-select.sh: delete the now-redundant if [[ -f ... override.yml ]] block after load_env_from_output.
  • resolve-compose-stack.sh (inline Python): mirror the built-in loop's filter verbatim into the user-extension loop. Same manifest lookup order (.yaml -> .yml -> .json), same schema_version == "dream.services.v1" gate, same gpu_backends default ["amd","nvidia"], same "all"/"none" sentinels, same narrow yaml/json/structure exception dispatch honouring skip_broken.
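
The filter described above can be sketched roughly as follows. This is a minimal illustration, not the code in resolve-compose-stack.sh: the function names, the `manifest.*` file names, and the placement of `gpu_backends` under a `service` key are assumptions, and treating the "none" sentinel as "include everywhere" is a guess (the PR only confirms that behaviour for "all"). Parsing and the skip_broken error handling are omitted.

```python
from pathlib import Path
from typing import Optional

SCHEMA_VERSION = "dream.services.v1"
DEFAULT_GPU_BACKENDS = ["amd", "nvidia"]
# Assumed file names; the PR only states the extension order.
MANIFEST_NAMES = ("manifest.yaml", "manifest.yml", "manifest.json")


def find_manifest(ext_dir: Path) -> Optional[Path]:
    """Return the first manifest found, checking .yaml, then .yml, then .json."""
    for name in MANIFEST_NAMES:
        candidate = ext_dir / name
        if candidate.is_file():
            return candidate
    return None


def backend_allowed(manifest: dict, gpu_backend: str) -> bool:
    """Decide whether an already-parsed manifest is compatible with gpu_backend."""
    # Schema gate: manifests with any other schema_version are rejected.
    if manifest.get("schema_version") != SCHEMA_VERSION:
        return False
    # Assumed layout: gpu_backends nested under a "service" mapping.
    service = manifest.get("service") or {}
    backends = service.get("gpu_backends", DEFAULT_GPU_BACKENDS)
    # "all" matches every backend (per the PR); "none" is treated the
    # same way here purely as an assumption.
    if "all" in backends or "none" in backends:
        return True
    return gpu_backend in backends
```

With these assumptions, a manifest declaring `gpu_backends: [nvidia]` is rejected for `"amd"` and accepted for `"nvidia"`, while a manifest that omits the key falls back to the `["amd", "nvidia"]` default.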

The intentional scope limit: user-extension loop still hardcodes compose.yaml and does not apply compose.local.yaml / compose.multigpu.yaml overlays — those gaps pre-dated this PR and are left for a separate follow-up to keep blast radius small.

Testing

  • bash -n, shellcheck (zero net-new warnings), python3 -m py_compile on the extracted heredoc — all pass.
  • make lint — passes.
  • tests/test-tier-map.sh — 82/0.
  • Functional:
    • Override dedup: old code emits 2x override.yml, new code emits 1x.
    • gpu_backends: [nvidia] user-ext filtered out on AMD, included on NVIDIA.
    • gpu_backends: [all] included on every backend.
    • Missing manifest -> skipped (matches built-in).
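
The override-dedup check above could be automated with something like the sketch below. It assumes COMPOSE_FLAGS is a shell-style string of `-f <file>` pairs; the helper name and the `docker-compose.base.yml` file in the usage example are hypothetical.

```python
import shlex


def count_compose_file(compose_flags: str, filename: str) -> int:
    """Count how many times filename is passed via -f/--file in a flags string."""
    tokens = shlex.split(compose_flags)
    count = 0
    # Walk consecutive (flag, value) pairs and compare on the base name only,
    # so absolute and relative paths to the same file name both match.
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("-f", "--file") and value.rsplit("/", 1)[-1] == filename:
            count += 1
    return count


# Hypothetical flag strings representing the old and new behaviour.
old_flags = "-f docker-compose.base.yml -f docker-compose.override.yml -f docker-compose.override.yml"
new_flags = "-f docker-compose.base.yml -f docker-compose.override.yml"
```

Calling `count_compose_file` on the two strings with `"docker-compose.override.yml"` distinguishes the duplicated append from the single one.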

Review

Critique Guardian APPROVED (no required changes).

Platform Impact

  • macOS / Linux / Windows (WSL2): user-ext filter is pure Python using pathlib; no BSD/GNU divergence. compose-select.sh change is Linux-installer-only (matches file's scope).

…-extensions

Two resolver-layer defects fixed together because they cluster in
the same call chain.

1. installers/lib/compose-select.sh double-appended
   docker-compose.override.yml to COMPOSE_FLAGS. scripts/resolve-
   compose-stack.sh already appends it once when its env output is
   consumed by compose-select.sh's load_env_from_output. Removing
   the redundant bash-level append leaves the resolver as the single
   source of truth.

2. scripts/resolve-compose-stack.sh included every
   data/user-extensions/<id>/compose.yaml unconditionally, bypassing
   the gpu_backends compatibility check that the built-in extension
   loop already performs. An NVIDIA-only user extension installed on
   AMD/Apple would be merged into the stack and fail at container
   start. This adds the same manifest read + schema_version +
   gpu_backends filter the built-in loop uses, verbatim, including
   "all"/"none" sentinels and the narrow-dispatch error handling
   gated by skip_broken.

@Lightheartdevs
Collaborator

Audit follow-up: not merge-ready as-is.

The resolver dedupe and manifest-declared GPU filtering direction is good, but this branch silently drops legacy/custom user extensions that have compose.yaml without a manifest. Current main still includes those. Please keep the legacy compose.yaml fallback, then apply manifest-only filters only when a manifest exists.
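
A rough sketch of the requested behaviour, under stated assumptions: the function name is hypothetical, the manifest is taken as an already-parsed dict (or `None` when no manifest file exists), `gpu_backends` is assumed to sit under a `service` key, and the schema_version gate and "none" sentinel are left out for brevity.

```python
from typing import Optional

DEFAULT_GPU_BACKENDS = ["amd", "nvidia"]


def should_include(compose_exists: bool, manifest: Optional[dict], gpu_backend: str) -> bool:
    """Decide whether a user extension's compose.yaml joins the stack."""
    if not compose_exists:
        return False
    # Legacy fallback: no manifest means no filter, so the extension is kept.
    if manifest is None:
        return True
    # Manifest present: apply the GPU-backend filter (assumed layout).
    service = manifest.get("service") or {}
    backends = service.get("gpu_backends", DEFAULT_GPU_BACKENDS)
    return "all" in backends or gpu_backend in backends
```

The key design point is the ordering: the manifest-less case returns before any filter runs, so legacy extensions keep their current behaviour on main.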

@yasinBursali
Contributor Author

Closing as superseded by #1051.

The maintainer audit on this PR (2026-04-28) flagged that the manifest-only filter silently drops legacy/custom user extensions whose compose.yaml exists without a manifest. PR #1051 (fix/resolver-python-hygiene) was created as the cleaner replacement that:

  • Preserves the legacy compose.yaml fallback (manifest-less compat carve-out),
  • Applies the GPU-backend filter only when a manifest exists,
  • Adds the apple-guard for compose.local.yaml overlays (caught by CG re-review),
  • Has been smoke-matrix verified across 3 fixtures × 4 backends.

The maintainer's own audit on #1051 said it was "better than #1029". Closing #1029 to remove the duplicate from the queue.
