# Dream Server macOS Quickstart

> **Status: Supported**
>
> The macOS installer runs end-to-end on Apple Silicon. One command gives you a full local AI stack with Metal-accelerated inference.

---
## Prerequisites

- **Apple Silicon** Mac (M1, M2, M3, M4, or later)
- **Docker Desktop** 4.20+ installed and running
- **16 GB+ unified memory** recommended (8 GB minimum)
- **20 GB+ free disk space** (model + Docker images)

---
## Install

```bash
git clone https://github.com/Light-Heart-Labs/DreamServer.git
cd DreamServer/dream-server
./install.sh
```

The installer will:

1. **Detect your chip** — identifies your Apple Silicon variant and unified memory
2. **Pick the right model** — selects the optimal model size for your RAM
3. **Download llama-server** — native macOS arm64 binary with Metal support
4. **Download your model** — a GGUF file sized for your hardware
5. **Start Docker services** — chat UI, search, workflows, voice, and more
6. **Install OpenCode** — browser-based AI coding IDE on port 3003

**Estimated time:** 5–15 minutes, depending on download speed.
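
The prerequisites above can be sanity-checked before running the installer with a few lines of shell. This is a sketch; `arch_ok` and `docker_ok` are hypothetical helper names, not part of the installer:

```bash
# Hypothetical pre-flight check for the prerequisites above.
# arch_ok takes an optional override argument so it can be tested off-device.
arch_ok()   { [ "${1:-$(uname -m)}" = "arm64" ]; }
docker_ok() { docker info > /dev/null 2>&1; }

arch_ok   || echo "warning: this Mac is not Apple Silicon (arm64)"
docker_ok || echo "warning: Docker Desktop does not appear to be running"
```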

---

## Open the UI

- **Chat UI:** http://localhost:3000
- **Dashboard:** http://localhost:3001
- **OpenCode (IDE):** http://localhost:3003

The first user to sign up on the Chat UI becomes the admin. Start chatting immediately.
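
To confirm the three UIs are actually responding, a small reachability probe works. This is a sketch; `check_url` is a hypothetical helper, not part of the stack:

```bash
# Probe each UI port; -f fails on HTTP errors, --max-time keeps it snappy.
check_url() { curl -fsS -o /dev/null --max-time 3 "$1"; }

for url in http://localhost:3000 http://localhost:3001 http://localhost:3003; do
  if check_url "$url"; then echo "OK   $url"; else echo "DOWN $url"; fi
done
```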

---

## Architecture

```
macOS Host
 ├── llama-server (native, Metal GPU acceleration)
 ├── OpenCode web IDE (native, LaunchAgent)
 └── Docker Desktop
      ├── Open WebUI (port 3000)
      ├── Dashboard (port 3001)
      ├── LiteLLM API Gateway (port 4000)
      ├── n8n Workflows (port 5678)
      ├── Qdrant Vector DB (port 6333)
      ├── SearXNG Search (port 8888)
      ├── Perplexica Deep Research (port 3004)
      ├── OpenClaw Agents (port 7860)
      ├── TEI Embeddings (port 8090)
      ├── Whisper STT (port 9000)
      ├── Kokoro TTS (port 8880)
      └── Privacy Shield (port 8085)
```

llama-server runs natively for full Metal GPU utilization. Docker containers reach it via `host.docker.internal:8080`.
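
Since the host reaches llama-server at `localhost:8080` (containers use `host.docker.internal:8080`), you can exercise its OpenAI-compatible endpoint directly. A sketch, assuming the usual llama-server API surface; the `model` value is a placeholder, since llama-server serves whichever GGUF it loaded:

```bash
# Minimal chat request against llama-server's OpenAI-compatible API.
# From inside a container, swap localhost for host.docker.internal.
REQ='{"model":"local","messages":[{"role":"user","content":"Say hi"}],"max_tokens":32}'

curl -fsS http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d "$REQ" || echo "llama-server is not reachable on :8080"
```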

---

## Managing Your Stack

```bash
./dream-macos.sh status              # Health checks for all services
./dream-macos.sh stop                # Stop everything
./dream-macos.sh start               # Start everything
./dream-macos.sh restart             # Restart everything
./dream-macos.sh logs llama-server   # Tail llama-server logs
```

---

## Hardware Tiers

The installer auto-selects the best model for your unified memory:

| Unified RAM | Tier | Model | Context |
|-------------|------|-------|---------|
| 8 GB | 1 | Qwen3 4B (Q4_K_M) | 8192 |
| 16 GB | 2 | Qwen3 8B (Q4_K_M) | 32768 |
| 32–48 GB | 3 | Qwen3 14B (Q4_K_M) | 32768 |
| 64+ GB | 4 | Qwen3 30B-A3B (MoE) | 32768 |

To override the detected tier: `./install.sh --tier 3`
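
The selection amounts to thresholding unified memory against the table above, roughly like this. A sketch: `tier_for_gb` is a hypothetical name, not the installer's actual function, and on macOS the input would come from `sysctl -n hw.memsize`:

```bash
# Map unified memory (GB) to a hardware tier, per the table above.
tier_for_gb() {
  if   [ "$1" -ge 64 ]; then echo 4
  elif [ "$1" -ge 32 ]; then echo 3
  elif [ "$1" -ge 16 ]; then echo 2
  else                       echo 1
  fi
}

# On macOS, hw.memsize reports bytes, so divide down to GB first:
# tier_for_gb "$(( $(sysctl -n hw.memsize) / 1024 / 1024 / 1024 ))"
```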

---

## Troubleshooting

| Issue | Fix |
|-------|-----|
| "Docker not running" | Start Docker Desktop and wait for the whale icon in the menu bar |
| "Not Apple Silicon" | Intel Macs are not supported; Apple Silicon (arm64) is required |
| "Port in use" | Find the conflicting process: `lsof -i :8080` |
| llama-server crashes | Check memory — your model may be too large for available RAM |
| Docker services slow to start | The first launch pulls ~10 GB of images; subsequent starts are fast |
| TEI embeddings container restarts | Normal on arm64 — it runs under Rosetta 2 emulation and may need a minute |

---

## Files & Locations

| What | Where |
|------|-------|
| Install directory | `~/dream-server/` |
| Config | `~/dream-server/.env` |
| Models | `~/dream-server/data/models/` |
| llama-server binary | `~/dream-server/llama-server/` |
| OpenCode | `~/.opencode/bin/opencode` |
| OpenCode config | `~/.config/opencode/opencode.json` |
| LaunchAgent (OpenCode) | `~/Library/LaunchAgents/com.dreamserver.opencode-web.plist` |
| CLI tool | `~/dream-server/dream-macos.sh` |

---

## Known Limitations

- **ComfyUI (image generation)** is not available on macOS — it requires an NVIDIA GPU backend
- **Dashboard GPU info** shows "Unknown" — the Linux-based dashboard container cannot detect macOS Metal
- **TEI embeddings** runs under Rosetta 2 emulation (linux/amd64) — functional, but slower than native

---

## Need Help?

- Support matrix: [SUPPORT-MATRIX.md](SUPPORT-MATRIX.md)
- General FAQ: [../FAQ.md](../FAQ.md)
- General troubleshooting: [TROUBLESHOOTING.md](TROUBLESHOOTING.md)

---

*Last updated: 2026-03-05*