Commit 746c88b

security: fix critical/high review findings from production readiness audit
CRITICAL:
- #1: Command injection via SUDO_USER: switched execSync to execFileSync in getUserHome() to bypass shell interpolation
- #2: Symlink attack in /tmp: selfUpdate() now uses mkdtempSync with 0700 perms instead of hardcoded /tmp paths

HIGH:
- #3: SHA256 verification failure: download binary with artifact name so sha256sum --check finds the correct file
- #4: Broken rollback: verify new binary via exitCode check, not dead catch block (throwOnError: false skips catch)
- #5: Data loss in uninstall: docker compose down -v now conditional on !keepData
- #6: rm -rf path safety: refuse system directories (/, /home, /root, /usr, etc.) with structural depth check

MEDIUM:
- #7: Model download failure halts install (throw instead of silent return); prevents llama-server crash-loop
- #8: Tier change now applies CTX_SIZE even when model name unchanged (Tier 1->2 both use qwen3-8b but differ in context)

Tests: updated model.test.ts to expect throw on download failure
All 138 tests passing
1 parent 541b711 commit 746c88b

10 files changed

Lines changed: 6418 additions & 31 deletions

File tree

dream-server/cli-installer/docs/context_source.md

Lines changed: 3682 additions & 0 deletions
Large diffs are not rendered by default.

dream-server/cli-installer/docs/context_tests.md

Lines changed: 2394 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
# Production Readiness Review — Dream Server CLI Installer

## Project Context

Dream Server is a local AI management platform that orchestrates a multi-container Docker stack (LLM inference via llama-server, Open WebUI chat, SearXNG search, ComfyUI image gen, n8n workflows, Qdrant vector DB, etc.). This CLI installer is the primary user-facing tool for installing, configuring, updating, and diagnosing the entire stack.

**Runtime**: Bun (TypeScript), distributed as a single static binary for Linux x64/arm64.
**Privilege model**: Runs as a regular user or via `sudo`. Detects `SUDO_USER` to resolve the real user's home directory. Docker socket access is required.
**Target users**: Non-technical users running bare-metal GPU servers (NVIDIA/AMD). The installer must be robust to partial failures, network interruptions, and misconfigurations.
**Threat model**: The binary self-updates from GitHub Releases with SHA256 verification. Secrets are generated with `crypto.getRandomValues()`. The installer runs `docker compose up` on the user's system, so command injection via .env values is a risk surface.

## Review Scope

This is a **production readiness** review before merging into `main`. The codebase is feature-complete (Phases 1-3 done, 138 tests passing). We need to identify:

1. Bugs or crash vectors that would break the installer for end users
2. Security issues in secret handling, self-update, or command execution
3. Error handling gaps: any situation where the installer would hang, crash silently, or leave the system in a broken state
4. Robustness issues: race conditions, timeouts, unhandled edge cases
5. Test coverage gaps: critical paths that are not tested

## Attached Context Packs

| File | Contents | Token Estimate |
|------|----------|---------------|
| `context_source.md` | All source code: commands, lib, phases, entry point | ~28K |
| `context_tests.md` | All 21 test files covering 138 tests | ~20K |

## Focus Areas

### 1. Command Injection & Input Sanitization
- Can malicious `.env` values inject shell commands? The env parser reads user-edited files and values end up in `exec()` calls via Docker compose.
- Are user-supplied paths (`--dir`) sanitized before use in `exec()`, `Bun.write()`, `join()`?
- Is the `SUDO_USER` environment variable trusted safely in `getUserHome()` (used in `execSync(getent passwd $SUDO_USER)`)?

### 2. Self-Update Security
- Is the SHA256 verification in `update.ts` correct? Check for TOCTOU between download and verification, partial download corruption, and checksum file format parsing.
- Can the rollback mechanism leave the binary in a broken state?
- Is the binary URL construction safe from path traversal?

### 3. Error Handling & Graceful Degradation
- Are there any `await` calls without timeout or error handling that could hang forever?
- Do all `process.exit()` calls have appropriate cleanup (dangling Docker containers, partial file writes)?
- What happens if Docker daemon crashes mid-install? Is the state recoverable on re-run?
- Are all `catch {}` blocks handling errors appropriately, or are some silently swallowing important failures?

### 4. Concurrency & Race Conditions
- Are there any TOCTOU issues (checking file existence then reading/writing)?
- Can concurrent installer runs corrupt the `.env` file or data directories?
- Is the health check retry loop safe against resource exhaustion?

### 5. Secret Generation & Handling
- Are secrets generated with sufficient entropy? Check `crypto.getRandomValues()` usage.
- Are secrets accidentally logged or printed to stdout?
- Could `.env` merge logic accidentally expose or duplicate secrets?

### 6. Docker & System Interaction
- Is the compose command detection (`docker compose` vs `docker-compose` vs `sudo docker compose`) robust?
- Are there timeout values that are too short for slow systems (e.g., model download, large image pulls)?
- Does `nvidia-smi` parsing handle all real-world output formats?
- Is port checking via `ss`/`netstat` reliable? Does the regex catch IPv6 bindings?

### 7. Test Coverage Gaps
- Are there critical paths in the source code that have NO corresponding test?
- Are mock setups realistic? (e.g., do they mock at the right level, or do they allow real system calls to leak through?)
- Are edge cases tested? (empty .env, corrupt .env, Docker daemon unreachable, no network, disk full)

### 8. Code Quality & Maintainability
- Dead code, unused imports, or unreachable branches
- Functions with too many responsibilities
- Magic strings or numbers that should be constants
- Inconsistent error message formatting

## Output Format

For each finding, provide:

- **Severity**: CRITICAL / HIGH / MEDIUM / LOW
- **File**: path and line numbers
- **Category**: which focus area (1-8 above)
- **Description**: what the issue is
- **Impact**: what happens in production
- **Suggested Fix**: concrete code-level recommendation

Group findings by severity (CRITICAL first). End with:
- Total counts per severity
- Top 3 highest-priority fixes
- Overall production readiness assessment (READY / READY WITH CAVEATS / NOT READY)
Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
# Production Readiness Review — Dream Server CLI Installer

Here is the production readiness review for the Dream Server CLI Installer based on the provided source code and test suite.

## Findings

### CRITICAL

**1. Root Command Injection via `SUDO_USER` Variable**

- **Severity**: CRITICAL
- **File**: `src/lib/config.ts`, line 21
- **Category**: 1. Command Injection & Input Sanitization
- **Description**: The `getUserHome()` function resolves the original user's home directory by passing the `SUDO_USER` environment variable directly into a template literal: ``execSync(`getent passwd ${sudoUser}`)``. Because `execSync` with a string argument spawns a shell (`/bin/sh`) to evaluate the command, any shell metacharacters will be executed.
- **Impact**: A user or compromised script executing the installer via `sudo` can pass a maliciously crafted `SUDO_USER` variable (e.g., `SUDO_USER="root; rm -rf /"`), leading to arbitrary command execution as `root` before the CLI even processes application logic. This is a trivial Local Privilege Escalation (LPE).
- **Suggested Fix**: Use `execFileSync` to pass arguments as a strict array, bypassing the shell entirely:

```typescript
import { execFileSync } from "node:child_process";
// ...
const result = execFileSync("getent", ["passwd", sudoUser], {
  encoding: "utf-8",
  timeout: 2000,
});
```

**2. Arbitrary File Overwrite / Privilege Escalation via `/tmp` (Symlink Attack)**

- **Severity**: CRITICAL
- **File**: `src/commands/update.ts`, lines 101-102
- **Category**: 2. Self-Update Security & 4. Concurrency & Race Conditions
- **Description**: The self-updater downloads the new binary and checksum to hardcoded paths in the world-writable `/tmp` directory (`/tmp/dream-installer-update`). When `curl -o` executes, it will follow any existing symlinks at that location.
- **Impact**: An unprivileged local attacker can pre-create `/tmp/dream-installer-update` as a symlink pointing to a critical system file (e.g., `/etc/shadow` or `/usr/bin/sudo`). When an admin runs `sudo dream-installer update`, `curl` will overwrite the target file with the binary, resulting in complete system compromise or destruction.
- **Suggested Fix**: Use `node:fs` `mkdtempSync` to create a secure, randomly named directory with strict `0700` permissions.

```typescript
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
const tmpDir = mkdtempSync(join(tmpdir(), "dream-update-"));
const tmpPath = join(tmpDir, getBinaryName());
```

### HIGH

**3. Guaranteed SHA256 Verification Failure**

- **Severity**: HIGH
- **File**: `src/commands/update.ts`, lines 118-121
- **Category**: 2. Self-Update Security
- **Description**: The downloaded `.sha256` checksum file generated by standard release tooling contains the original artifact filename (e.g., `dream-installer-linux-x64`). However, the installer saves the downloaded binary as `dream-installer-update`. When `sha256sum --check` reads the file, it searches for the original filename in `/tmp`, fails to find it, and unconditionally aborts the update.
- **Impact**: The self-update integrity check is fundamentally broken and will permanently fail in production.
- **Suggested Fix**: Combine this with the fix for Finding #2 by downloading the binary into the secure temporary directory using its actual artifact name (`getBinaryName()`) so `sha256sum` can successfully locate it.
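
An alternative to shelling out to `sha256sum --check` is to compare digests in-process, so the on-disk filename no longer matters. The following is a minimal sketch, assuming the `.sha256` file uses the common `<hex digest>  <filename>` layout; the helper names `parseChecksumFile`, `sha256Hex`, and `verifySha256` are hypothetical and are not taken from `update.ts`.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Extract the hex digest from a sha256sum-style line: "<64 hex chars>  <filename>".
// Returns null when the first token is not a well-formed SHA-256 digest.
function parseChecksumFile(contents: string): string | null {
  const firstToken = contents.trim().split(/\s+/)[0] ?? "";
  return /^[0-9a-fA-F]{64}$/.test(firstToken) ? firstToken.toLowerCase() : null;
}

// Compute the lowercase hex SHA-256 digest of the given bytes or string.
function sha256Hex(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hash the downloaded file and compare it with the digest from the checksum file.
function verifySha256(binaryPath: string, checksumContents: string): boolean {
  const expected = parseChecksumFile(checksumContents);
  if (expected === null) return false; // malformed checksum file: fail closed
  return sha256Hex(readFileSync(binaryPath)) === expected;
}
```

Because only the digest token is used, this variant would tolerate a renamed download; whether that trade-off is preferable to keeping `sha256sum --check` is a design choice for the maintainers.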

**4. Broken Rollback Mechanism Deletes Working Backups**

- **Severity**: HIGH
- **File**: `src/commands/update.ts`, lines 142-144
- **Category**: 3. Error Handling & Graceful Degradation
- **Description**: After replacing the binary, the updater validates it using `await exec([currentBinary, '--version'], { throwOnError: false })`. Because `throwOnError` is explicitly false, `exec` returns the `exitCode` instead of throwing an exception if the new binary segfaults or fails.
- **Impact**: The `catch` block containing the rollback logic is completely bypassed. The script incorrectly assumes success, deletes the working `.bak` backup file, and leaves the user with a permanently bricked CLI.
- **Suggested Fix**: Explicitly check the `exitCode` and manually throw an error to trigger the rollback logic:

```typescript
const { exitCode } = await exec([currentBinary, "--version"], {
  throwOnError: false,
  timeout: 5000,
});
if (exitCode !== 0) throw new Error("New binary failed execution");
```

**5. Unconditional Deletion of Named Volumes (Data Loss)**

- **Severity**: HIGH
- **File**: `src/commands/uninstall.ts`, lines 94-98
- **Category**: 3. Error Handling & Graceful Degradation
- **Description**: During the uninstall process, the CLI executes `docker compose down -v` unconditionally. The check for the user's `--keep-data` flag doesn't occur until a later step involving directory removal.
- **Impact**: Any containers relying on Docker named volumes (e.g., Qdrant, Postgres) will have their data irrevocably wiped, even if the user explicitly provided the `--keep-data` argument.
- **Suggested Fix**: Conditionally append the `-v` flag to the arguments array.

```typescript
const downArgs = ["down"];
if (!opts.keepData) downArgs.push("-v");
await exec([...composeCmd, ...downArgs], {
  cwd: installDir,
  throwOnError: false,
  timeout: 30_000,
});
```

**6. Catastrophic Root Directory Wipe Vector**

- **Severity**: HIGH
- **File**: `src/commands/uninstall.ts`, line 114
- **Category**: 1. Command Injection & Input Sanitization
- **Description**: The uninstaller runs `exec(['rm', '-rf', installDir])` with no structural path validation. It relies solely on the prior detection of a `.env` file.
- **Impact**: If a user runs `dream-installer uninstall --dir / --force` (or `--dir /home`) and an errant `.env` file happens to exist in that location, the script will wipe the host's root filesystem or user home directory.
- **Suggested Fix**: Implement strict path safety guards against top-level system directories.

```typescript
import { resolve } from "node:path";
const target = resolve(installDir);
if (["/", "/home", "/root", "/usr", "/etc"].includes(target)) {
  throw new Error(
    "Safety check failed: Refusing to delete critical system directory.",
  );
}
```
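
A deny-list alone misses any top-level directory that is not listed, which is why the commit message for this change also mentions a structural depth check. The following is a minimal sketch of combining the two, assuming POSIX-style paths; the function name `isSafeToDelete` and the contents of `PROTECTED_DIRS` are illustrative assumptions rather than a copy of `uninstall.ts`.

```typescript
import { resolve } from "node:path";

// Hypothetical deny-list of directories that must never be removed recursively.
const PROTECTED_DIRS = new Set(["/", "/home", "/root", "/usr", "/etc", "/var", "/opt", "/tmp"]);

// Returns true only when the resolved path is not protected and sits at least
// two levels below the filesystem root (e.g. /home/alice, but not /srv).
function isSafeToDelete(dir: string): boolean {
  const target = resolve(dir); // normalizes "..", ".", and trailing slashes
  const depth = target.split("/").filter(Boolean).length;
  return !PROTECTED_DIRS.has(target) && depth >= 2;
}
```

Note that `resolve` only normalizes the path string and does not follow symlinks, so a stricter variant might canonicalize with `fs.realpathSync` before checking.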

### MEDIUM

**7. Installer Silently Proceeds When Model Download Fails**

- **Severity**: MEDIUM
- **File**: `src/phases/model.ts`, lines 91-98
- **Category**: 3. Error Handling & Graceful Degradation
- **Description**: If the LLM model download fails after 3 attempts, `downloadModel` simply prints a console message and returns `void` normally instead of halting execution.
- **Impact**: The `install.ts` orchestrator proceeds to start the Docker Compose stack. `llama-server` immediately enters a crash-loop because the `.gguf` file is missing. The user receives a misleading "Installation Complete" message for a broken system.
- **Suggested Fix**: Throw an error to halt the installation cleanly.

```typescript
if (!success)
  throw new Error("Model download failed. Cannot proceed with installation.");
```

**8. Missing Context Size Updates on Tier Changes**

- **Severity**: MEDIUM
- **File**: `src/commands/config.ts`, lines 94-96
- **Category**: 3. Error Handling & Graceful Degradation
- **Description**: When applying a new tier, the code checks `if (tierConfig.model !== currentModel)` before updating `.env`. If a user moves between tiers that share the same base model (e.g., Tier 1 and Tier 2 both use `qwen3-8b`), the condition evaluates to false and nothing is written.
- **Impact**: Upgrading tiers fails to increase `CTX_SIZE` and `MAX_CONTEXT` from 16384 to 32768, depriving the user of the expected performance upgrade.
- **Suggested Fix**: Always apply configuration changes if a new tier is selected, or compare the explicit tier ID instead of the model name.

**9. Fallback Logic Dead Code via `throwOnError: false`**

- **Severity**: MEDIUM
- **File**: `src/commands/status.ts` (lines 75, 126) & `src/commands/doctor.ts` (line 195)
- **Category**: 3. Error Handling & Graceful Degradation
- **Description**: Multiple scripts wrap `exec(..., { throwOnError: false })` in `try...catch` blocks. Because `exec` returns an exit code object rather than throwing, the `catch` blocks are completely dead code.
- **Impact**: Fallbacks (like parsing older non-JSON `docker compose ps` formats) are never triggered. In `doctor.ts`, if `nvidia-smi` crashes, it parses an empty string into `NaN < 535` (false) and logs a blank successful driver string instead of failing.
- **Suggested Fix**: Remove `{ throwOnError: false }` from `exec` calls that are expected to trigger a `catch` block on failure.
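
The `NaN < 535` pitfall can also be guarded at the parsing step, independently of how `exec` reports failure. The following is a minimal sketch, assuming the driver version arrives as a string such as `550.54.14`; `parseDriverMajor`, `checkDriver`, and `MIN_DRIVER_MAJOR` are hypothetical names, and 535 is taken from the comparison quoted in the finding.

```typescript
// Minimum supported driver major version (535 is the threshold quoted in the finding).
const MIN_DRIVER_MAJOR = 535;

// Parse the major component of a version string; null means "could not parse".
function parseDriverMajor(raw: string): number | null {
  const major = Number.parseInt(raw.trim().split(".")[0] ?? "", 10);
  return Number.isNaN(major) ? null : major;
}

type DriverCheck = "ok" | "too-old" | "unknown";

// Treat unparseable output as its own state rather than as a passing check.
function checkDriver(raw: string): DriverCheck {
  const major = parseDriverMajor(raw);
  if (major === null) return "unknown";
  return major < MIN_DRIVER_MAJOR ? "too-old" : "ok";
}
```

Returning an explicit `"unknown"` state lets the caller report a failed probe instead of printing an empty driver string as a success.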

**10. Zero Test Coverage for Critical Execution Paths**

- **Severity**: MEDIUM
- **File**: `tests/update.test.ts` & `tests/doctor.test.ts`
- **Category**: 7. Test Coverage Gaps
- **Description**: `update.test.ts` explicitly bypasses `selfUpdate()` using `skipSelfUpdate: true`. `doctor.test.ts` never imports or executes `doctor()`, but instead manually reproduces and tests its internal regex logic.
- **Impact**: The highest-risk operations (binary replacement, SHA validation, system diagnostics) have 0% effective test coverage, which allowed the critical TOCTOU and checksum bugs to remain undetected.
- **Suggested Fix**: Write integration tests that invoke the actual functions, mocking out `globalThis.fetch` and `fs.mkdtempSync` for updates, and `shell.exec` for the doctor command.

### LOW

**11. macOS Incompatibility for Disk Check**

- **Severity**: LOW
- **File**: `src/commands/doctor.ts` (line 179) & `src/phases/detection.ts` (line 58)
- **Category**: 6. Docker & System Interaction
- **Description**: The script executes `df -BG`. The `-B` flag is GNU-specific. On macOS environments, this throws an "illegal option" error, bypassing the disk space check.
- **Suggested Fix**: Use a portable format such as `df -Pk` (specified by POSIX) or `df -m` (accepted by both GNU and BSD `df`) and perform the gigabyte division mathematically.
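
The arithmetic involved is small. The following is a minimal sketch, assuming the POSIX output format requested with `df -Pk <path>` (1024-byte blocks, one line per filesystem); `parseAvailableGiB` is a hypothetical helper and not a function from `doctor.ts`.

```typescript
// Parse the "Available" column from `df -Pk <path>` output and convert it to GiB.
// POSIX -P header: Filesystem 1024-blocks Used Available Capacity Mounted on
function parseAvailableGiB(dfOutput: string): number | null {
  const lines = dfOutput.trim().split("\n");
  if (lines.length < 2) return null; // header only, or empty output
  const fields = (lines[lines.length - 1] ?? "").trim().split(/\s+/);
  const availableKiB = Number.parseInt(fields[3] ?? "", 10);
  if (Number.isNaN(availableKiB)) return null;
  return availableKiB / (1024 * 1024); // KiB -> GiB
}
```

This sketch splits on whitespace, so it assumes the filesystem name in the first column contains no spaces.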

**12. Doctor Command Hides Valid Port Conflicts**

- **Severity**: LOW
- **File**: `src/commands/doctor.ts`, lines 152-156
- **Category**: 8. Code Quality & Maintainability
- **Description**: `doctor.ts` silently hides port conflicts if Docker is running (`&& !composeCmd`). If an external service has hijacked a port while Docker is running, the user is never warned.
- **Suggested Fix**: Surface the occupied ports via `ui.info`, appending "(expected if Dream Server is running)" rather than suppressing the output entirely.

---

## Summary

- **Total Counts per Severity**
  - **CRITICAL**: 2
  - **HIGH**: 4
  - **MEDIUM**: 4
  - **LOW**: 2

- **Top 3 Highest-Priority Fixes**

  1. **Fix Command Injection**: Immediately swap `execSync` for `execFileSync` in `getUserHome()` to close the `SUDO_USER` root escalation vector.
  2. **Rewrite Self-Update Logic**: Use `fs.mkdtempSync` to prevent symlink exploits, ensure the downloaded file matches the `.sha256` payload name, and fix the `throwOnError: false` bug so rollbacks actually work.
  3. **Implement Uninstall Safeguards**: Add strict path validation to prevent catastrophic system directory deletion, and conditionally append `-v` to protect Docker volumes when `--keep-data` is requested.

- **Overall Production Readiness Assessment**: **NOT READY**
  The CLI installer has excellent orchestration logic and UX, but it ships with a root command injection vulnerability, catastrophic uninstallation data loss vectors, and a fundamentally broken self-update mechanism. Once the Critical and High severity findings are remediated, the codebase will be ready for production distribution.

dream-server/cli-installer/src/commands/config.ts

Lines changed: 12 additions & 2 deletions
@@ -112,13 +112,23 @@ export async function config(opts: ConfigOptions): Promise<void> {
 
     const [tierId, tierConfig] = tierEntries[tierChoice];
 
-    if (tierConfig.model !== currentModel) {
+    const currentTier = getEnv('TIER');
+    const tierChanged = tierId !== currentTier;
+    const modelChanged = tierConfig.model !== currentModel;
+
+    if (tierChanged || modelChanged) {
       setEnv('LLM_MODEL', tierConfig.model);
       setEnv('GGUF_FILE', tierConfig.ggufFile);
       setEnv('CTX_SIZE', String(tierConfig.context));
       setEnv('MAX_CONTEXT', String(tierConfig.context));
+      setEnv('TIER', tierId);
       changed = true;
-      ui.ok(`Model: ${currentModel}${tierConfig.model}`);
+
+      if (modelChanged) {
+        ui.ok(`Model: ${currentModel}${tierConfig.model}`);
+      } else {
+        ui.ok(`Tier ${currentTier}${tierId} (context: ${tierConfig.context})`);
+      }
 
       // Check if new model needs downloading
       const modelsDir = join(installDir, 'data', 'models');

dream-server/cli-installer/src/commands/uninstall.ts

Lines changed: 19 additions & 9 deletions
@@ -6,7 +6,7 @@ import { DEFAULT_INSTALL_DIR } from '../lib/config.ts';
 import * as ui from '../lib/ui.ts';
 import * as prompts from '../lib/prompts.ts';
 import { existsSync } from 'node:fs';
-import { join } from 'node:path';
+import { join, resolve } from 'node:path';
 
 export interface UninstallOptions {
   dir?: string;
@@ -94,14 +94,16 @@ export async function uninstall(opts: UninstallOptions): Promise<void> {
     }
   }
 
-  // ── Step 3: Remove Docker volumes ──
-  try {
-    await exec(
-      [...composeCmd, 'down', '-v'],
-      { cwd: installDir, throwOnError: false, timeout: 30_000 },
-    );
-    ui.ok('Docker volumes removed');
-  } catch { /* volumes may not exist */ }
+  // ── Step 3: Remove Docker volumes (only if NOT keeping data) ──
+  if (!opts.keepData) {
+    try {
+      await exec(
+        [...composeCmd, 'down', '-v'],
+        { cwd: installDir, throwOnError: false, timeout: 30_000 },
+      );
+      ui.ok('Docker volumes removed');
+    } catch { /* volumes may not exist */ }
+  }
 
   // ── Step 4: Remove network ──
   try {
@@ -118,6 +120,14 @@ export async function uninstall(opts: UninstallOptions): Promise<void> {
   } else {
     const deleteData = opts.force || await prompts.confirm(`Delete installation directory ${installDir}?`);
     if (deleteData) {
+      // Safety guard: refuse to rm -rf critical system directories
+      const target = resolve(installDir);
+      const DANGEROUS_PATHS = ['/', '/home', '/root', '/usr', '/etc', '/var', '/boot', '/bin', '/sbin', '/lib', '/opt', '/tmp'];
+      if (DANGEROUS_PATHS.includes(target) || target.split('/').filter(Boolean).length < 2) {
+        ui.fail(`Safety check: refusing to delete system directory: ${target}`);
+        return;
+      }
+
       const delSpinner = new ui.Spinner(`Removing ${installDir}...`);
       delSpinner.start();
       try {
