Releases: Light-Heart-Labs/DreamServer
Dream Server v2.0.0 — Strix Halo
One command to a full local AI stack. Buy hardware. Run installer. AI running.
Highlights
- AMD Strix Halo support — ROCm 7.2 with unified memory tiers (SH_LARGE, SH_COMPACT), running Qwen3 Coder Next 80B MoE on a single APU
- NVIDIA ultra tier — NV_ULTRA for 90GB+ multi-GPU setups (A100/H100)
- Modular installer — 2591-line monolith rewritten as 6 libraries + 13 phases, each in its own file
- Bootstrap mode — chat in under 2 minutes with a tiny model while the full model downloads in the background
- Extension system — every service is a manifest + compose fragment, hot-pluggable via `dream enable` / `dream disable`
- 13 integrated services — LLM inference, chat UI, voice (STT + TTS), AI agents, workflow automation, RAG, web search, deep research, image generation, and privacy tools
- Management dashboard — real-time GPU metrics, service health, model info, all in one view
- dream-cli — full stack management with mode switching (local/cloud/hybrid), model swapping, presets
What's Included
| Service | Purpose |
|---|---|
| llama-server | LLM inference (CUDA + ROCm) |
| Open WebUI | Chat interface |
| LiteLLM | API gateway (local/cloud/hybrid) |
| OpenClaw | Autonomous AI agents |
| n8n | Workflow automation (400+ integrations) |
| Whisper | Speech-to-text |
| Kokoro | Text-to-speech |
| Qdrant | Vector database for RAG |
| SearXNG | Self-hosted web search |
| Perplexica | Deep research engine |
| ComfyUI | Image generation |
| Privacy Shield | PII scrubbing proxy |
| Dashboard | GPU metrics + service health |
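Privacy Shield sits in front of outbound requests and scrubs PII before anything leaves the machine. Its actual rules are internal to the service; as a rough illustration of the idea only, a minimal email-redaction filter might look like this (function name and pattern are hypothetical, not the shipped implementation):

```shell
# Hypothetical sketch of PII scrubbing, NOT the actual Privacy Shield rules:
# replace anything shaped like an email address before text leaves the box.
redact_email() {
  sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[REDACTED_EMAIL]/g'
}

echo "Reach me at jane.doe@example.com for access" | redact_email
# prints: Reach me at [REDACTED_EMAIL] for access
```

A real proxy layers many such rules (names, phone numbers, keys) and applies them to request bodies rather than a plain text stream.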
Hardware Auto-Detection
The installer detects your GPU and picks the optimal model:
| VRAM | Example GPU | Model |
|---|---|---|
| NVIDIA 8-11 GB | RTX 4060 Ti | Qwen 2.5 7B |
| NVIDIA 12-20 GB | RTX 3090 | Qwen 2.5 14B |
| NVIDIA 20-40 GB | RTX 4090 | Qwen 2.5 32B |
| NVIDIA 40+ GB | A100 | Qwen 2.5 72B |
| NVIDIA 90+ GB | Multi-GPU | Qwen3 Coder Next 80B |
| AMD Strix Halo 64 GB | Ryzen AI MAX+ 395 | Qwen3 30B-A3B |
| AMD Strix Halo 96 GB | Ryzen AI MAX+ 395 | Qwen3 Coder Next 80B |
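The mapping above reduces to a per-vendor VRAM threshold check. A simplified sketch of that selection logic, mirroring the table (the function name and structure are illustrative, not the installer's actual code):

```shell
# Illustrative model selection mirroring the auto-detection table above;
# not the installer's actual implementation.
pick_model() {
  local vendor="$1" vram_gb="$2"
  if [ "$vendor" = "amd" ]; then
    # Strix Halo unified-memory tiers
    if [ "$vram_gb" -ge 96 ]; then echo "Qwen3 Coder Next 80B"
    else                           echo "Qwen3 30B-A3B"
    fi
    return
  fi
  # NVIDIA tiers, largest first
  if   [ "$vram_gb" -ge 90 ]; then echo "Qwen3 Coder Next 80B"
  elif [ "$vram_gb" -ge 40 ]; then echo "Qwen 2.5 72B"
  elif [ "$vram_gb" -ge 20 ]; then echo "Qwen 2.5 32B"
  elif [ "$vram_gb" -ge 12 ]; then echo "Qwen 2.5 14B"
  else                             echo "Qwen 2.5 7B"
  fi
}

pick_model nvidia 24   # prints: Qwen 2.5 32B
pick_model amd 96      # prints: Qwen3 Coder Next 80B
```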
Install
curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/DreamServer/main/dream-server/get-dream-server.sh | bash
Full Changelog
See CHANGELOG.md for the complete list of changes.
Dream Server v1.0.0
First public release of Dream Server -- your turnkey local AI stack.
Your hardware. Your data. Your rules.
One installer. Bare metal to a fully running local AI stack -- LLM inference, chat UI, voice agents, workflow automation, RAG, and privacy tools. No manual config. No dependency hell. Run one command and everything works.
Highlights
- Full-stack local AI in one command -- vLLM inference, chat UI, voice agents, workflow automation, RAG, privacy shield, and a real-time dashboard, all wired together and running on your GPU.
- Automatic hardware detection -- the installer probes your GPU, selects the optimal model (7B to 72B parameters), and configures VRAM allocation, context windows, and resource limits without manual tuning.
- Bootstrap mode for instant start -- a lightweight 1.5B model boots in under a minute so you can start chatting immediately while the full model downloads in the background. Hot-swap with zero downtime when ready.
- End-to-end voice pipeline -- Whisper speech-to-text, Kokoro text-to-speech, and LiveKit WebRTC voice agents let you have real-time spoken conversations with your local LLM entirely on-premises.
- OpenClaw multi-agent support -- built-in integration with the OpenClaw agent framework, including a vLLM Tool Call Proxy, pre-configured workspace templates, and battle-tested configs for autonomous AI coordination on local hardware.
What's Included
| Component | Image | Version | Profile |
|---|---|---|---|
| vLLM (LLM Inference) | vllm/vllm-openai | v0.15.1 | core |
| Open WebUI (Chat Interface) | ghcr.io/open-webui/open-webui | v0.7.2 | core |
| Dashboard UI (Control Center) | dream-dashboard | local build | core |
| Dashboard API (Status Backend) | dream-dashboard-api | local build | core |
| Whisper (Speech-to-Text) | onerahmet/openai-whisper-asr-webservice | v1.4.1 | voice |
| Kokoro (Text-to-Speech) | ghcr.io/remsky/kokoro-fastapi-cpu | v0.2.4 | voice |
| LiveKit (WebRTC Voice) | dream-livekit | local build | voice |
| LiveKit Voice Agent | dream-voice-agent | local build | voice |
| n8n (Workflow Automation) | n8nio/n8n | 2.6.4 | workflows |
| Qdrant (Vector Database) | qdrant/qdrant | v1.16.3 | rag |
| Text Embeddings | ghcr.io/huggingface/text-embeddings-inference | cpu-1.9.1 | rag |
| LiteLLM (API Gateway) | ghcr.io/berriai/litellm | v1.81.3-stable | monitoring |
| Token Spy (Usage Monitoring) | lightheartlabs/token-spy | latest | monitoring |
| TimescaleDB (Token Spy DB) | timescale/timescaledb | latest-pg15 | monitoring |
| Redis (Rate Limiting) | redis | 7-alpine | monitoring |
| Privacy Shield (PII Redaction) | dream-privacy-shield | local build | privacy |
| OpenClaw (Agent Framework) | ghcr.io/openclaw/openclaw | latest | openclaw |
| vLLM Tool Proxy | dream-vllm-tool-proxy | local build | openclaw |
Hardware Support
The installer automatically detects your GPU and selects the optimal model:
| Tier | VRAM | Model | Context | Example GPUs |
|---|---|---|---|---|
| Entry | < 12 GB | Qwen2.5-7B | 8K | RTX 3080, RTX 4070 |
| Prosumer | 12 -- 20 GB | Qwen2.5-14B-AWQ | 16K | RTX 3090, RTX 4080 |
| Pro | 20 -- 40 GB | Qwen2.5-32B-AWQ | 32K | RTX 4090, A6000 |
| Enterprise | 40 GB+ | Qwen2.5-72B-AWQ | 32K | A100, H100, multi-GPU |
Override with `./install.sh --tier 3` if you know what you want.
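Tier detection only needs total VRAM, which `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` reports as a bare MiB count. A sketch of turning that reading into a tier index (the function name and tier numbering are illustrative):

```shell
# Sketch: derive a tier number from nvidia-smi's reported memory in MiB.
# e.g. `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`
# prints "24576" for a 24 GB card.
vram_tier() {
  local mib="$1"
  local gb=$(( mib / 1024 ))
  if   [ "$gb" -ge 40 ]; then echo 4   # Enterprise: Qwen2.5-72B-AWQ
  elif [ "$gb" -ge 20 ]; then echo 3   # Pro:        Qwen2.5-32B-AWQ
  elif [ "$gb" -ge 12 ]; then echo 2   # Prosumer:   Qwen2.5-14B-AWQ
  else                        echo 1   # Entry:      Qwen2.5-7B
  fi
}

vram_tier 24576   # RTX 4090 → prints: 3
```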
Install
One-liner (Linux / WSL):
curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/get-dream-server.sh | bash
Manual clone:
git clone https://github.com/Light-Heart-Labs/Lighthouse-AI.git
cd Lighthouse-AI/dream-server
./install.sh
Windows (PowerShell):
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/install.ps1" -OutFile install.ps1
.\install.ps1
The Windows installer handles WSL2 setup, Docker Desktop, and NVIDIA driver configuration automatically.
Requirements: Docker with Compose v2+, NVIDIA GPU with 8 GB+ VRAM (16 GB+ recommended), NVIDIA Container Toolkit, 40 GB+ disk space.
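The Compose v2+ requirement is easy to preflight by checking the major version that `docker compose version --short` reports. A minimal sketch, assuming the usual `vX.Y.Z` short format:

```shell
# Sketch: check a Compose version string (e.g. the output of
# `docker compose version --short`) against the v2 minimum.
compose_ok() {
  local ver="${1#v}"          # strip an optional leading "v"
  local major="${ver%%.*}"    # keep everything before the first dot
  [ "$major" -ge 2 ]
}

compose_ok "v2.24.5" && echo "Compose OK"        # prints: Compose OK
compose_ok "1.29.2"  || echo "upgrade Compose"   # v1 fails the check
```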
Operations Toolkit
Standalone tools for running persistent AI agents in production, included in the repo:
- Guardian -- Self-healing process watchdog that monitors services, restores from backup, and runs as root so agents cannot kill it.
- Memory Shepherd -- Periodic memory reset to prevent identity drift in long-running agents.
- Token Spy -- API cost monitoring with real-time dashboard and auto-kill for runaway sessions.
- vLLM Tool Proxy -- Makes local model tool calling work with OpenClaw via SSE re-wrapping and loop protection.
- LLM Cold Storage -- Archives idle HuggingFace models to free disk while keeping them resolvable via symlink.
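The cold-storage trick is just a move plus a symlink: the model directory relocates to cheap storage, and a link at the original path keeps the HuggingFace cache resolvable. A minimal sketch of that pattern (paths and function name are illustrative, not the tool's actual code):

```shell
# Sketch of the archive-and-symlink pattern behind LLM Cold Storage:
# move a model directory to slow/cheap storage, then leave a symlink
# behind so the original cache path still resolves.
archive_model() {
  local src="$1" archive_dir="$2"
  local dest="$archive_dir/$(basename "$src")"
  mkdir -p "$archive_dir"
  mv "$src" "$dest"
  ln -s "$dest" "$src"
}
```

Restoring is the reverse: remove the symlink and move the directory back into place.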
Known Limitations
- First release -- expect rough edges.
- LiveKit voice requires manual profile activation (`--profile voice`).
- OpenClaw integration is experimental.
- No ARM / Apple Silicon support yet (planned).
- Models download on first run (20 GB+ for full-size models).
- Token Spy and TimescaleDB images are pinned to `latest` -- consider pinning exact versions in production.
What's Next
- ARM / Apple Silicon support
- One-click model switching in dashboard
- Automated backup and restore
- Community workflow templates
- Pinned versions for all remaining `latest` tags
Contributors
Built by Lightheart Labs and the OpenClaw Collective.
License: Apache 2.0 -- use it, modify it, ship it.