fix(compose): drop CPU overlay AUDIO_STT_MODEL literal; make AMD memory limit env-driven by yasinBursali · Pull Request #1067 · Light-Heart-Labs/DreamServer

yasinBursali · 2026-04-30T23:22:37Z

What

Two related compose-overlay drift fixes:

dream-server/docker-compose.cpu.yml: drop the open-webui: block, whose only content was the literal that overrode AUDIO_STT_MODEL to turbo
dream-server/docker-compose.amd.yml: replace literal memory: 110G with ${LLAMA_SERVER_MEMORY_LIMIT:-110G}

Why

CPU overlay: PR #985 made AUDIO_STT_MODEL env-driven (${AUDIO_STT_MODEL:-Systran/faster-whisper-base} in docker-compose.base.yml). The cpu overlay's residual literal AUDIO_STT_MODEL: "deepdml/faster-whisper-large-v3-turbo-ct2" overrode that interpolation, so on CPU-only Linux installs:

Phase 06 of the installer correctly writes AUDIO_STT_MODEL=Systran/faster-whisper-base to .env
Phase 12 pre-downloads only the base model
Open WebUI's container env is forced to turbo by the overlay literal
Whisper 404s on every transcription

The NVIDIA overlay's identical literal is intentional — for NVIDIA, Phase 06 also picks turbo, so the literal is a redundant safety net (per #985's design). Only the cpu overlay drifted.
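
The precedence problem described above can be modelled in plain shell, because Compose's ${VAR:-default} interpolation follows the same rule as POSIX parameter expansion (the default applies when the variable is unset or empty). This is only a sketch: effective_stt_model is a hypothetical helper, not part of the repository, and it assumes a later overlay file's environment key replaces the same key from the base file.

```shell
#!/bin/sh
# Hypothetical model of how open-webui's AUDIO_STT_MODEL is resolved.
#   $1 = value from .env ("" means unset or empty)
#   $2 = literal in the overlay's environment block ("" means no such key)
effective_stt_model() {
    env_value=$1
    overlay_literal=$2
    # Base file: ${AUDIO_STT_MODEL:-Systran/faster-whisper-base}
    base_value=${env_value:-Systran/faster-whisper-base}
    # Assumption: an overlay key, when present, replaces the base value.
    printf '%s\n' "${overlay_literal:-$base_value}"
}

TURBO="deepdml/faster-whisper-large-v3-turbo-ct2"

# Before the fix: the cpu overlay literal wins regardless of .env.
effective_stt_model "Systran/faster-whisper-base" "$TURBO"
# After the fix: no overlay key, so .env (or the base default) decides.
effective_stt_model "Systran/faster-whisper-base" ""
effective_stt_model "" ""
```

The first call shows the mismatch: .env and the pre-downloaded model say base, while the container is told to request turbo.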

AMD overlay: 5 of the 6 backend overlays use ${LLAMA_SERVER_MEMORY_LIMIT:-N} (NVIDIA 64G, Apple 32G, Intel/Arc 24G, CPU 6G). .env.schema.json registers the variable as a tunable string. The amd overlay's literal memory: 110G made AMD the only platform where users couldn't tune llama-server's memory cap.

tier0.yml's literal memory: 4G is a deliberate hard cap for <8 GB RAM hosts and was left alone.

How

cpu.yml: removed the entire open-webui: block (3 lines plus the leading blank line). The container now inherits the base file's env-var interpolation.
amd.yml: 1-line substitution. The 110G default is preserved as the AMD-specific fallback so existing AMD installs without the env var keep current byte-for-byte behavior.
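
A minimal sketch of the fallback behaviour the amd.yml substitution relies on, written in shell since ${VAR:-default} means the same thing there as in Compose interpolation. resolve_memory_limit is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: mimics memory: ${LLAMA_SERVER_MEMORY_LIMIT:-110G}
#   $1 = LLAMA_SERVER_MEMORY_LIMIT as read from .env ("" = unset or empty)
resolve_memory_limit() {
    LLAMA_SERVER_MEMORY_LIMIT=$1
    # ":-" falls back when the variable is unset OR empty.
    printf '%s\n' "${LLAMA_SERVER_MEMORY_LIMIT:-110G}"
}

resolve_memory_limit ""      # prints 110G (existing AMD installs are unchanged)
resolve_memory_limit "32G"   # prints 32G  (user override is honored)
```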

Testing

YAML parse, make lint, docker compose config --quiet, pre-commit: all PASS
Functional substitution checks:
- cpu env unset → resolves Systran/faster-whisper-base
- cpu env override → propagates correctly
- amd env unset → resolves 110G (fallback preserved)
- amd env override LLAMA_SERVER_MEMORY_LIMIT=32G → resolves 32G
Cross-overlay non-regression sweep (nvidia, intel, arc, apple, multigpu, tier0): all PASS

Manual:

Linux CPU: docker exec dream-open-webui env | grep AUDIO_STT_MODEL should equal Systran/faster-whisper-base post-install
Linux AMD: set LLAMA_SERVER_MEMORY_LIMIT=32G in .env, dream restart, then docker inspect dream-llama-server --format '{{.HostConfig.Memory}}' should equal 34359738368 (32 GiB) — was 118111600640 (110 GiB) before
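
The byte values in the AMD check follow from Docker treating the G suffix as GiB (1024^3 bytes). A small sketch to reproduce them, assuming a shell with 64-bit arithmetic; gib_to_bytes is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: convert a GiB count to the byte value that
# docker inspect reports in .HostConfig.Memory.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 32    # prints 34359738368
gib_to_bytes 110   # prints 118111600640
```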

Review

Independent verification confirmed both bugs are real and isolated to these two overlays; cross-overlay scan found no other drift on AUDIO_STT_MODEL or LLAMA_SERVER_MEMORY_LIMIT.

Known Considerations

Schema default LLAMA_SERVER_MEMORY_LIMIT="64G" and AMD's compose fallback :-110G differ. This matches the codebase's existing pattern: schema default applies to installer env-generation; per-backend overlay fallback applies at compose-time when .env is silent. Same divergence already exists for nvidia/apple/intel/arc/cpu — not a regression.
A potential follow-up (separate PR, not this one) would have installer phases write LLAMA_SERVER_MEMORY_LIMIT to .env per-backend so the schema default becomes load-bearing. That's a design decision, not a bug.

Platform Impact

Linux CPU: STT now works on first install (was 404 on every transcription).
Linux AMD: LLAMA_SERVER_MEMORY_LIMIT env var now honored (was silently ignored).
Linux NVIDIA / Intel / Arc / multigpu / tier0: no behavior change.
macOS Apple Silicon: no behavior change (llama-server.replicas: 0, no AUDIO_STT_MODEL override).
Windows AMD: no behavior change (llama-server.replicas: 0, no override).

…ry limit env-driven

Two related compose-overlay drift bugs.

CPU overlay: docker-compose.cpu.yml hardcoded AUDIO_STT_MODEL=turbo, overriding the env-driven base default (${AUDIO_STT_MODEL:-Systran/faster-whisper-base}) that PR Light-Heart-Labs#985 established. On CPU-only Linux installs Phase 06 writes the base model to .env and Phase 12 pre-downloads it, but open-webui ended up requesting turbo (never cached) -> STT 404. Drop the now-empty open-webui block in cpu.yml so the container inherits base's interpolation. NVIDIA's identical literal is preserved per Light-Heart-Labs#985's intentional safety-net design.

AMD overlay: docker-compose.amd.yml hardcoded memory: 110G on llama-server while every other GPU overlay reads ${LLAMA_SERVER_MEMORY_LIMIT:-N}. AMD users tuning the documented env var found it silently ignored. Replace with the env-driven pattern preserving 110G as the AMD-specific fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lightheartdevs

Two correct surgical fixes: (1) drop CPU overlay's NVIDIA-specific AUDIO_STT_MODEL literal — wrong overlay; (2) make AMD memory limit env-driven via ${LLAMA_SERVER_MEMORY_LIMIT:-110G}. Both compose configs validate. Ship after rebase.

Lightheartdevs

Re-audited for merge. CPU overlay now inherits the base AUDIO_STT_MODEL, and AMD memory remains env-driven with the 110G fallback. Local CPU and AMD compose configs passed with required secret placeholders. Approving for squash merge.

yasinBursali force-pushed the fix/cpu-amd-overlay-env-driven-limits branch from 77651c2 to 9bf34b2 Compare April 30, 2026 23:28

yasinBursali marked this pull request as ready for review May 1, 2026 22:44

Lightheartdevs approved these changes May 2, 2026

Lightheartdevs mentioned this pull request May 2, 2026

fix(nvidia-overlay): allow .env override of AUDIO_STT_MODEL default #1073

Merged

Lightheartdevs approved these changes May 2, 2026

Lightheartdevs merged commit f1d74de into Light-Heart-Labs:main May 2, 2026
33 of 34 checks passed
