fix(security): route .env writes through host agent, restore :ro mount #908

Merged
Lightheartdevs merged 2 commits into Light-Heart-Labs:main from yasinBursali:fix/env-mount-ro-host-agent-update
Apr 18, 2026

yasinBursali commented Apr 11, 2026

Merge order: Merge after #906, #905, and #900 — touches dream-host-agent.py and docker-compose.base.yml.

What

Restores the .env mount for the dashboard-api container to :ro and routes env-editor writes through a new host-agent endpoint POST /v1/env/update, mirroring the existing host-agent-owned write path used by model activation.

Why (security regression)

Commit c7ffea39 (settings environment editor) changed the dashboard-api .env mount from :ro to writable so the new PUT /api/settings/env endpoint could write .env with _write_text_atomic(env_path, raw_text) from inside the container.

The endpoint itself is API-key-gated, but the filesystem-level :rw mount is a container escape risk: any RCE in the dashboard-api container (dependency CVE, SSRF chain, malicious extension manifest during install) now has write access to .env at the filesystem level, bypassing the API key check entirely.

Before this regression, container RCE meant credential read. After the regression, it meant credential overwrite — an attacker could plant a known DREAM_AGENT_KEY, reset DASHBOARD_API_KEY, overwrite cloud API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY), then reach the host agent on port 7710 for full container lifecycle control.

How

  1. docker-compose.base.yml — restore - ./.env:/dream-server/.env:ro.
  2. dream-host-agent.py — new POST /v1/env/update endpoint + _handle_env_update method that:
    • authenticates via check_auth
    • enforces its own MAX_ENV_BODY = 65536 (the default MAX_BODY = 16384 would truncate real .env files, which routinely exceed 16 KB)
    • validates the raw body parses as JSON with a raw_text string
    • loads .env.schema.json from INSTALL_DIR, uses properties keys as the allowlist
    • validates every line: unknown keys → 400, malformed lines → 400, values containing ASCII control characters (other than tab) → 400
    • acquires _model_activate_lock non-blocking (409 on contention) to avoid racing concurrent _do_model_activate calls, which also read-modify-write .env
    • backs up the current .env under DATA_DIR/config-backups/.env.backup.<timestamp>
    • writes atomically via tempfile + os.replace
    • returns the relative backup path in the response
    • logs every reject path with client IP for audit trail
  3. dashboard-api/main.py — new _call_agent_env_update(raw_text) helper mirroring _call_agent_core_recreate. api_settings_env_save still runs the existing _prepare_env_save validation for UX, then delegates the write to the host agent via the helper, handling HTTPError/URLError/OSError into 503/500.
  4. Delete three now-orphaned helpers from main.py: _write_text_atomic, _resolve_env_backup_root, _display_backup_path.
  5. Update test_settings_env.py fixture settings_env_fixture with a monkeypatch for _call_agent_env_update that fakes the agent response (fake also writes the target file so existing read-back tests pass).
  6. Add TestHandleEnvUpdate to test_host_agent.py — 9 tests covering happy path, 413 oversize, 400 unknown key / malformed line / control char (\x00 and \x1b), tab allowed, 409 lock contention, 500 missing schema.
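
The per-line checks listed in step 2 can be sketched roughly as follows. This is a minimal illustration, not the code in dream-host-agent.py: the function name `validate_env_text`, the handling of blank lines and comments, and the treatment of DEL (0x7F) as a control character are assumptions. The key pattern is the one quoted in the review.

```python
import re

# Key pattern quoted in the review comment.
KEY_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def validate_env_text(raw_text: str, allowed_keys: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the text passes.

    Hypothetical helper: the real handler answers 400 on the first problem.
    """
    problems = []
    for lineno, line in enumerate(raw_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # assumption: blank lines and comments pass through
        key, sep, value = line.partition("=")
        key = key.strip()
        if not sep or not KEY_RE.match(key):
            problems.append(f"line {lineno}: malformed line")
            continue
        if key not in allowed_keys:
            problems.append(f"line {lineno}: unknown key {key}")
        # Reject ASCII control characters other than tab (DEL included here).
        if any((ord(ch) < 0x20 and ch != "\t") or ord(ch) == 0x7F for ch in value):
            problems.append(f"line {lineno}: control character in value")
    return problems
```

In this sketch the allowlist would come from the `properties` keys of `.env.schema.json`, for example `set(json.load(f)["properties"])`.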

Testing

  • `pytest dashboard-api/tests/test_host_agent.py dashboard-api/tests/test_settings_env.py` → 41 passed (10 settings env + 31 host agent)
  • `python3 -m py_compile` on both modified .py files → clean
  • YAML valid: `python3 -c "import yaml; yaml.safe_load(open('docker-compose.base.yml'))"`
  • Manual test needed:
    1. Install cleanly, open dashboard Settings → Environment editor.
    2. Change a field (e.g. `OLLAMA_PORT`), click Save. Expected: 200 + backup path returned.
    3. Verify `data/config-backups/.env.backup.*` file exists with previous contents.
    4. Verify new `.env` on host has the new value.
    5. Inspect the running dashboard-api container: `docker exec dream-dashboard-api mount | grep .env` — should show `ro`.
    6. From inside the container: `docker exec dream-dashboard-api sh -c 'echo test > /dream-server/.env'` — must fail with "Read-only file system".
    7. Try an oversize body: oversized curl POST to `/v1/env/update` → expect 413.
    8. Try an unknown key: curl with `{"raw_text": "NOT_IN_SCHEMA=foo"}` → expect 400.
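
Manual steps 7 and 8 could also be scripted. The sketch below only builds the two requests and does not send them. The host, the bearer-token header, and the helper name `build_env_update_request` are assumptions; only the path, the port, and the `raw_text` body shape come from this PR.

```python
import json
import urllib.request

# Port 7710 is named in the PR text; the loopback host is an assumption.
AGENT_URL = "http://127.0.0.1:7710/v1/env/update"


def build_env_update_request(raw_text: str, agent_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST carrying the raw .env text as JSON."""
    body = json.dumps({"raw_text": raw_text}).encode("utf-8")
    return urllib.request.Request(
        AGENT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real header may differ.
            "Authorization": f"Bearer {agent_key}",
        },
    )


# Step 7: a body larger than MAX_ENV_BODY (65536 bytes), expected to draw a 413.
oversize = build_env_update_request("OLLAMA_PORT=" + "1" * 70000, "example-key")

# Step 8: a key absent from the schema, expected to draw a 400 per the plan.
unknown = build_env_update_request("NOT_IN_SCHEMA=foo", "example-key")
```

Sending either request with `urllib.request.urlopen` and catching `urllib.error.HTTPError` exposes the status in the exception's `code` attribute.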

Platform Impact

  • macOS: affected — the host agent runs natively via launchd. `.env` lives under `INSTALL_DIR` on the host filesystem, backups go under `DATA_DIR/config-backups/`. `:ro` mount works identically on Docker Desktop.
  • Linux: affected — host agent runs natively via systemd. Same behavior.
  • Windows/WSL2: affected — host agent runs inside the WSL2 distro. `.env` lives on the WSL2 filesystem. `:ro` mount works identically on Docker Desktop for Windows.

Known residual risk

An attacker with a valid `DREAM_AGENT_KEY` can still set any schema-allowed key — this is the intended threat model for the endpoint. The raw-text-blob API (versus structured key-value JSON) is an architectural tradeoff that is worth revisiting in a future hardening pass, but is out of scope for this security regression fix.

Multi-line value injection via JSON-embedded `\n`: `splitlines()` will decompose the attacker's raw_text into multiple lines and each line re-enters full key + control-char validation. A smuggled line whose key is ALSO in `.env.schema.json` would be written — but such a write requires an already-authenticated caller, so the attacker has already passed the API-key gate and could use the legitimate API to set the same keys. This is the accepted tradeoff of the raw-text API.
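
A short demonstration of the decomposition described above, using only the standard library. The key names and values are illustrative.

```python
import json

# A JSON body whose raw_text smuggles a second assignment after an escaped \n.
body = r'{"raw_text": "OLLAMA_PORT=11434\nDASHBOARD_API_KEY=planted"}'
raw_text = json.loads(body)["raw_text"]

# After JSON decoding the escape is a real newline, so splitlines() yields
# two separate lines, and each one goes through key validation on its own.
lines = raw_text.splitlines()

# str.splitlines() also treats some control characters as line boundaries
# (form feed, for example) but not others (ESC, NUL).
split_on_form_feed = "A=1\x0cB=2".splitlines()
kept_with_escape = "A=1\x1bB=2".splitlines()
```

Here `lines` holds two entries, `split_on_form_feed` is split into `A=1` and `B=2`, and `kept_with_escape` stays a single line whose value still contains ESC, which is the case the per-value control-character check exists to catch.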

Follow-up items (deferred, not blocking)

  • Consider migrating the endpoint to structured `{"values": {"KEY": "value", ...}}` body shape in a future PR — eliminates the multi-line parsing ambiguity at the API layer.
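
One possible shape for that follow-up, sketched under assumptions: the function name `render_env` and the choice to reject (rather than escape) multi-line values are hypothetical and not part of this PR.

```python
def render_env(values: dict[str, str]) -> str:
    """Render a structured {"KEY": "value"} mapping as .env text.

    Each value must be a single line, so one JSON entry can never expand
    into more than one .env assignment.
    """
    lines = []
    for key, value in values.items():
        if not isinstance(value, str):
            raise ValueError(f"{key}: value must be a string")
        # splitlines() returns [value] only when the value has no line boundary.
        if value and value.splitlines() != [value]:
            raise ValueError(f"{key}: value must be a single line")
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"
```

With this shape the server builds every line itself, so the line-splitting ambiguity of the raw-text blob does not arise at the API layer.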

yasinBursali force-pushed the fix/env-mount-ro-host-agent-update branch from f93ebab to 9bb03bc on April 12, 2026 11:58
yasinBursali marked this pull request as ready for review on April 16, 2026 14:05
Lightheartdevs previously approved these changes Apr 18, 2026
Lightheartdevs left a comment

Solid security improvement. The architecture is right: container .env mount goes :ro, dashboard-api delegates writes to the host agent via authenticated /v1/env/update, host agent performs allowlist validation + atomic write + backup.

Reviewed the host agent endpoint:

  • MAX_ENV_BODY=65536 — correctly bypasses the default 16 KB read_json_body cap, which would truncate a real .env (example alone is ~11 KB).
  • Key-name validation via ^[A-Za-z_][A-Za-z0-9_]*$ regex — prevents injection via weird key characters.
  • Schema allowlist — warns but accepts unknown keys with a logged note. Pragmatic: extensions and GPU pinning write keys not in the core schema. I'd consider making this configurable (strict mode for prod, lenient for dev), but the current compromise is defensible.
  • Control-char rejection in values — good defense in depth; catches what splitlines() doesn't.
  • _model_activate_lock coordination — correct. Model activation also writes .env; without this lock, concurrent writes would clobber each other.
  • Atomic write via os.replace — correct.
  • Backup path under data/config-backups/ — good, stays on the data volume.

Dashboard-api side:

  • Correctly removes now-unused _write_text_atomic and _resolve_env_backup_root helpers.
  • Handles URLError/HTTPError with 503 (agent unreachable) — user-friendly.
  • Async wrapping of the agent call keeps the event loop responsive.
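
A rough sketch of the dashboard-api behaviour noted above: the blocking agent call runs in a worker thread, and connection failures become a 503. The function name `save_via_agent`, the response shapes, and the mapping of other `OSError`s to 500 are assumptions; the PR only states that HTTPError/URLError/OSError are mapped to 503/500.

```python
import asyncio
import urllib.error
from typing import Callable


async def save_via_agent(
    call_agent: Callable[[str], dict], raw_text: str
) -> tuple[int, dict]:
    """Run a blocking agent call off the event loop and map failures to statuses."""
    try:
        # asyncio.to_thread (Python 3.9+) keeps the event loop free while
        # the synchronous HTTP call is in flight.
        result = await asyncio.to_thread(call_agent, raw_text)
    except urllib.error.URLError as exc:  # also catches HTTPError, its subclass
        return 503, {"detail": f"host agent unavailable: {exc.reason}"}
    except OSError as exc:
        # Assumption: remaining OS-level failures surface as 500.
        return 500, {"detail": f"env update failed: {exc}"}
    return 200, result
```

Passing the agent call in as a parameter mirrors how the test fixture swaps in a fake for `_call_agent_env_update`.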

Nitpick: the \x00 test in _handle_env_update tests looks reachable via Body-escape — may want to add a len(value) < 65536 per-key cap as belt-and-suspenders.

Cross-PR note: overlaps with draft #975 which also touches .env handling and the host agent binding. This PR's scope is tighter and should land first. Ship.

yasinBursali and others added 2 commits April 18, 2026 14:03
Commit c7ffea3 (settings environment editor) changed the dashboard-api
.env bind mount from :ro to writable so the new PUT /api/settings/env
endpoint could call _write_text_atomic() from inside the container.
This introduces a container-escape risk: any RCE in the dashboard-api
container (dependency CVE, SSRF chain, malicious extension manifest)
can now overwrite .env at the filesystem level, bypassing the API key
check. An attacker can plant a known DREAM_AGENT_KEY and reach the
host agent for full container lifecycle control.

Restore the mount to :ro and route env writes through a new host-agent
endpoint POST /v1/env/update, mirroring the pattern _do_model_activate
already uses for model-switch .env updates. The dashboard-api keeps
its validation (_prepare_env_save) for UX and passes the prepared
raw_text to the host agent via _call_agent_env_update. The host agent
re-validates every key against .env.schema.json (defense in depth),
rejects values with embedded control characters, acquires
_model_activate_lock non-blocking to avoid racing model switches,
backs up the previous .env under DATA_DIR/config-backups/, and writes
atomically via tempfile + os.replace.

Includes 9 unit tests covering happy path, 413 oversize body, 400
unknown key / malformed line / control-char value, 409 lock contention,
and 500 missing schema. Deletes three helpers (_write_text_atomic,
_resolve_env_backup_root, _display_backup_path) that are no longer
referenced after the refactor.

Known residual: an attacker with an already-valid DREAM_AGENT_KEY can
still set any schema-allowed key. This is the intended threat model
for the endpoint; the raw_text API is a tradeoff vs. structured
key-value JSON and is worth revisiting in future hardening.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…trip

Two fixes for the .env write path:

1. Host agent: change strict reject (400) to warn-and-accept for keys
   not in .env.schema.json.  Extension install hooks and GPU pinning
   write keys that are absent from the core schema (e.g. JWT_SECRET
   from LibreChat, COMFYUI_GPU_UUID from the installer).  Rejecting
   them made the dashboard Settings save unusable after any extension
   install.

2. Dashboard API: _render_env_from_values no longer drops extras with
   empty values.  The filter `value != ""` silently discarded keys
   like LLAMA_ARG_TENSOR_SPLIT="" on round-trip, even though empty
   values are semantically meaningful (disabling a setting).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
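
The two fixes in this commit message can be illustrated with a small sketch. The function names are hypothetical and the bodies are simplified; they only mirror the behaviour the message describes, not the actual `_render_env_from_values` or host-agent code.

```python
import logging

logger = logging.getLogger("env-update-sketch")


def accept_key(key: str, schema_keys: set[str]) -> bool:
    """Fix 1: warn about keys missing from the schema instead of rejecting them."""
    if key not in schema_keys:
        logger.warning("accepting key not in .env.schema.json: %s", key)
    return True


def render_extras_lossy(extras: dict[str, str]) -> list[str]:
    """Old behaviour as described: entries with empty values are dropped."""
    return [f"{key}={value}" for key, value in extras.items() if value != ""]


def render_extras(extras: dict[str, str]) -> list[str]:
    """Fix 2: keep every entry, including those with empty values."""
    return [f"{key}={value}" for key, value in extras.items()]


extras = {"LLAMA_ARG_TENSOR_SPLIT": "", "JWT_SECRET": "abc"}
before = render_extras_lossy(extras)
after = render_extras(extras)
```

In this example `before` loses `LLAMA_ARG_TENSOR_SPLIT` on round-trip, while `after` keeps it as an explicit empty assignment.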
Lightheartdevs force-pushed the fix/env-mount-ro-host-agent-update branch from 9bb03bc to 36f9412 on April 18, 2026 18:05
Lightheartdevs merged commit 7dec44b into Light-Heart-Labs:main on Apr 18, 2026
28 of 29 checks passed

2 participants