# Dream Server macOS Quickstart

> **Status: Supported**
>
> The macOS installer runs end-to-end on Apple Silicon. One command gives you a full local AI stack with Metal-accelerated inference.

---
## Prerequisites

- **Apple Silicon** Mac (M1, M2, M3, M4, or later)
- **Docker Desktop** 4.20+ installed and running
- **16 GB+ unified memory** recommended (8 GB minimum)
- **20 GB+ free disk space** (model + Docker images)

---
## Install

```bash
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer/dream-server
./install.sh
```

The installer will:

1. **Detect your chip** — identifies your Apple Silicon variant and unified memory
2. **Pick the right model** — selects the optimal model size for your RAM
3. **Download llama-server** — native macOS arm64 binary with Metal support
4. **Download your model** — a GGUF file sized for your hardware
5. **Start Docker services** — chat UI, search, workflows, voice, and more
6. **Install OpenCode** — browser-based AI coding IDE on port 3003

**Estimated time:** 5–15 minutes, depending on download speed.
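
The prerequisites above can be sanity-checked before running the installer with a few lines of shell. This is a sketch; `arch_ok` and `docker_ok` are hypothetical helper names, not part of the installer:

```bash
# Hypothetical pre-flight check for the prerequisites above.
# arch_ok takes an optional override argument so it can be tested off-device.
arch_ok()   { [ "${1:-$(uname -m)}" = "arm64" ]; }
docker_ok() { docker info > /dev/null 2>&1; }

arch_ok   || echo "warning: this Mac is not Apple Silicon (arm64)"
docker_ok || echo "warning: Docker Desktop does not appear to be running"
```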

---

## Open the UI

- **Chat UI:** http://localhost:3000
- **Dashboard:** http://localhost:3001
- **OpenCode (IDE):** http://localhost:3003

The first user to sign up on the Chat UI becomes the admin. Start chatting immediately.
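
To confirm the three UIs are actually responding, a small reachability probe works. This is a sketch; `check_url` is a hypothetical helper, not part of the stack:

```bash
# Probe each UI port; -f fails on HTTP errors, --max-time keeps it snappy.
check_url() { curl -fsS -o /dev/null --max-time 3 "$1"; }

for url in http://localhost:3000 http://localhost:3001 http://localhost:3003; do
  if check_url "$url"; then echo "OK   $url"; else echo "DOWN $url"; fi
done
```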

---

## Architecture

```
macOS Host
 ├── llama-server (native, Metal GPU acceleration)
 ├── OpenCode web IDE (native, LaunchAgent)
 └── Docker Desktop
      ├── Open WebUI (port 3000)
      ├── Dashboard (port 3001)
      ├── LiteLLM API Gateway (port 4000)
      ├── n8n Workflows (port 5678)
      ├── Qdrant Vector DB (port 6333)
      ├── SearXNG Search (port 8888)
      ├── Perplexica Deep Research (port 3004)
      ├── OpenClaw Agents (port 7860)
      ├── TEI Embeddings (port 8090)
      ├── Whisper STT (port 9000)
      ├── Kokoro TTS (port 8880)
      └── Privacy Shield (port 8085)
```

llama-server runs natively for full Metal GPU utilization. Docker containers reach it via `host.docker.internal:8080`.
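
Since the host reaches llama-server at `localhost:8080` (containers use `host.docker.internal:8080`), you can exercise its OpenAI-compatible endpoint directly. A sketch, assuming the usual llama-server API surface; the `model` value is a placeholder, since llama-server serves whichever GGUF it loaded:

```bash
# Minimal chat request against llama-server's OpenAI-compatible API.
# From inside a container, swap localhost for host.docker.internal.
REQ='{"model":"local","messages":[{"role":"user","content":"Say hi"}],"max_tokens":32}'

curl -fsS http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d "$REQ" || echo "llama-server is not reachable on :8080"
```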

---

## Managing Your Stack

```bash
./dream-macos.sh status              # Health checks for all services
./dream-macos.sh stop                # Stop everything
./dream-macos.sh start               # Start everything
./dream-macos.sh restart             # Restart everything
./dream-macos.sh logs llama-server   # Tail llama-server logs
```

---

## Hardware Tiers

The installer auto-selects the best model for your unified memory:

| Unified RAM | Tier | Model | Context |
|-------------|------|-------|---------|
| 8 GB | 1 | Qwen3 4B (Q4_K_M) | 8192 |
| 16 GB | 2 | Qwen3 8B (Q4_K_M) | 32768 |
| 32–48 GB | 3 | Qwen3 14B (Q4_K_M) | 32768 |
| 64+ GB | 4 | Qwen3 30B-A3B (MoE) | 32768 |

To override the detected tier: `./install.sh --tier 3`
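
The selection amounts to thresholding unified memory against the table above, roughly like this. A sketch: `tier_for_gb` is a hypothetical name, not the installer's actual function, and on macOS the input would come from `sysctl -n hw.memsize`:

```bash
# Map unified memory (GB) to a hardware tier, per the table above.
tier_for_gb() {
  if   [ "$1" -ge 64 ]; then echo 4
  elif [ "$1" -ge 32 ]; then echo 3
  elif [ "$1" -ge 16 ]; then echo 2
  else                       echo 1
  fi
}

# On macOS, hw.memsize reports bytes, so divide down to GB first:
# tier_for_gb "$(( $(sysctl -n hw.memsize) / 1024 / 1024 / 1024 ))"
```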

---

## Troubleshooting

| Issue | Fix |
|-------|-----|
| "Docker not running" | Start Docker Desktop and wait for the whale icon in the menu bar |
| "Not Apple Silicon" | Intel Macs are not supported; Apple Silicon (arm64) is required |
| "Port in use" | Find the conflicting process: `lsof -i :8080` |
| llama-server crashes | Check memory — your model may be too large for available RAM |
| Docker services slow to start | The first launch pulls ~10 GB of images; subsequent starts are fast |
| TEI embeddings container restarts | Normal on arm64 — it runs under Rosetta 2 emulation and may need a minute |

---

## Files & Locations

| What | Where |
|------|-------|
| Install directory | `~/dream-server/` |
| Config | `~/dream-server/.env` |
| Models | `~/dream-server/data/models/` |
| llama-server binary | `~/dream-server/llama-server/` |
| OpenCode | `~/.opencode/bin/opencode` |
| OpenCode config | `~/.config/opencode/opencode.json` |
| LaunchAgent (OpenCode) | `~/Library/LaunchAgents/com.dreamserver.opencode-web.plist` |
| CLI tool | `~/dream-server/dream-macos.sh` |

---

## Known Limitations

- **ComfyUI (image generation)** is not available on macOS — it requires an NVIDIA GPU backend
- **Dashboard GPU info** shows "Unknown" — the Linux-based dashboard container cannot detect macOS Metal
- **TEI embeddings** runs under Rosetta 2 emulation (linux/amd64) — functional, but slower than native

---

## Need Help?

- Support matrix: [SUPPORT-MATRIX.md](SUPPORT-MATRIX.md)
- General FAQ: [../FAQ.md](../FAQ.md)
- General troubleshooting: [TROUBLESHOOTING.md](TROUBLESHOOTING.md)

---

*Last updated: 2026-03-05*