diff --git a/README.md b/README.md index 2edf6efe2..8c596034c 100644 --- a/README.md +++ b/README.md @@ -1,199 +1,160 @@
-# Lighthouse AI +# Dream Server -**Local AI infrastructure. Your hardware. Your data. Your rules.** +**Your turnkey local AI stack. Buy hardware. Run installer. AI running.** [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) -[![GitHub Stars](https://img.shields.io/github/stars/Light-Heart-Labs/Lighthouse-AI)](https://github.com/Light-Heart-Labs/Lighthouse-AI/stargazers) -[![Release](https://img.shields.io/github/v/release/Light-Heart-Labs/Lighthouse-AI)](https://github.com/Light-Heart-Labs/Lighthouse-AI/releases) -[![CI](https://img.shields.io/github/actions/workflow/status/Light-Heart-Labs/Lighthouse-AI/lint-python.yml?label=CI)](https://github.com/Light-Heart-Labs/Lighthouse-AI/actions) +[![GitHub Stars](https://img.shields.io/github/stars/Light-Heart-Labs/DreamServer)](https://github.com/Light-Heart-Labs/DreamServer/stargazers) +[![Release](https://img.shields.io/github/v/release/Light-Heart-Labs/DreamServer)](https://github.com/Light-Heart-Labs/DreamServer/releases) +[![Docker](https://img.shields.io/badge/Docker-Required-2496ED?logo=docker)](https://docs.docker.com/get-docker/)
--- -## Dream Server: One Command, Full AI Stack +## 5-Minute Quickstart -One installer gets you from bare metal to a fully running local AI stack: LLM inference, chat UI, voice agents, workflow automation, RAG, and privacy tools. No manual config. No dependency hell. No six months of piecing it together. Run one command, answer a few questions, everything works. +```bash +# One-line install (Linux/WSL) +curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/DreamServer/main/dream-server/get-dream-server.sh | bash +``` + +Or manually: ```bash -curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/get-dream-server.sh | bash +git clone https://github.com/Light-Heart-Labs/DreamServer.git +cd DreamServer/dream-server +./install.sh ``` -

- Dream Server installer: auto-detects GPU, recommends model tier, and lets you choose your stack
- The installer detects your hardware, picks the optimal model, and asks how deep you want to go. -

+The installer auto-detects your GPU, picks the right model, generates secure passwords, and starts everything. Open **http://localhost:3000** and start chatting. ---- +### 🚀 Instant Start (Bootstrap Mode) -## Dashboard +By default, Dream Server uses **bootstrap mode** for instant gratification: -Everything running, at a glance. GPU metrics, service health, one-click access to Chat, Voice, Workflows, Agents, and Documents. +1. Starts immediately with a tiny 1.5B model (downloads in <1 minute) +2. You can start chatting within **2 minutes** of running the installer +3. The full model downloads in the background +4. When ready, hot-swap to the full model with zero downtime -

- Dream Server dashboard: GPU metrics, service status, feature cards

+No more staring at download bars. Start playing immediately. ---- +### Windows -## Architecture - -```mermaid -graph TB - subgraph User["  You  "] - Browser(["Browser"]) - Mic(["Microphone"]) - API(["API Client"]) - end - - subgraph DreamServer["Dream Server (Docker Compose)"] - subgraph Core["Core"] - VLLM["vLLM ยท :8000
LLM Inference"] - WebUI["Open WebUI · :3000
Chat Interface"] - Dashboard["Dashboard · :3001
GPU Metrics"] - end - - subgraph Voice["Voice"] - Whisper["Whisper · :9000
Speech → Text"] - Kokoro["Kokoro · :8880
Text → Speech"] - LiveKit["LiveKit · :7880
WebRTC"] - VoiceAgent["Voice Agent"] - end - - subgraph RAGp["RAG"] - Qdrant["Qdrant · :6333
Vector DB"] - Embeddings["Embeddings · :8090"] - end - - subgraph Workflows["Workflows"] - N8N["n8n · :5678
400+ Integrations"] - end - - subgraph Agents["Agents"] - OpenClaw["OpenClaw · :7860
Multi-Agent"] - ToolProxy["Tool Proxy
vLLM Bridge"] - end - - subgraph Privacy["Privacy"] - Shield["Privacy Shield · :8085
PII Redaction"] - end - end - - Browser --> WebUI - Browser --> Dashboard - Browser --> N8N - Mic --> LiveKit - API --> VLLM - - WebUI --> VLLM - VoiceAgent --> Whisper - VoiceAgent --> Kokoro - VoiceAgent --> VLLM - LiveKit --> VoiceAgent - OpenClaw --> ToolProxy - ToolProxy --> VLLM - Shield -.->|PII scrubbed| VLLM - - style Core fill:#e8f0fe,stroke:#1a73e8,color:#1a1a1a - style Voice fill:#fce8e6,stroke:#d93025,color:#1a1a1a - style RAGp fill:#e6f4ea,stroke:#1e8e3e,color:#1a1a1a - style Workflows fill:#fef7e0,stroke:#f9ab00,color:#1a1a1a - style Agents fill:#f3e8fd,stroke:#9334e6,color:#1a1a1a - style Privacy fill:#e8eaed,stroke:#5f6368,color:#1a1a1a +```powershell +# Download and run +Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Light-Heart-Labs/DreamServer/main/install.ps1" -OutFile install.ps1 +.\install.ps1 ``` -The installer auto-detects your GPU and activates the right profiles. Core services start immediately; voice, RAG, workflows, and agents activate based on your hardware and preferences. +Windows installer checks prerequisites (WSL2, Docker, NVIDIA), then delegates to the Linux install path. --- -## Who Is This For? - -**Hobbyists** โ€” Want local ChatGPT without subscriptions? Install Dream Server, open `localhost:3000`, start chatting. Voice mode, document Q&A, and workflow automation are one toggle away. - -**Developers** โ€” Building AI agents? Dream Server gives you a local OpenAI-compatible API (vLLM), multi-agent coordination (OpenClaw), and a workflow engine (n8n) โ€” all on your GPU. No API keys, no rate limits, no cost per token. +## What You Get -**Teams** โ€” Need private AI infrastructure? Everything runs on your hardware. The Privacy Shield scrubs PII before anything leaves your network. Deploy once, use from any device on your LAN. +One installer. Full AI stack. Zero config. 
+ +| Component | Purpose | Port | +|-----------|---------|------| +| **llama-server** | LLM inference engine with continuous batching | 8080 | +| **Open WebUI** | Beautiful chat interface with history & web search | 3000 | +| **Dashboard** | Real-time GPU metrics, service health, model management | 3001 | +| **LiteLLM** | Multi-model API gateway | 4000 | +| **OpenClaw** | Autonomous AI agent framework | 7860 | +| **SearXNG** | Self-hosted web search | 8888 | +| **Perplexica** | Deep research engine | 3004 | +| **n8n** | Workflow automation (400+ integrations) | 5678 | +| **Qdrant** | Vector database for RAG | 6333 | +| **Whisper** | Speech-to-text | 9000 | +| **Kokoro** | Text-to-speech | 8880 | +| **ComfyUI** | Image generation | 8188 | +| **Privacy Shield** | PII scrubbing proxy | 8085 | --- -## What You Get +## Hardware Support -| Component | What It Does | -|-----------|-------------| -| **vLLM** | GPU-accelerated LLM inference with continuous batching โ€” auto-selects 7B to 72B models for your hardware | -| **Open WebUI** | Full-featured chat interface with conversation history, model switching, web search | -| **Dashboard** | Real-time GPU metrics (VRAM, temp, utilization), service health, model management | -| **Whisper** | Speech-to-text โ€” local, fast, private | -| **Kokoro** | Text-to-speech โ€” natural-sounding voices, no cloud | -| **LiveKit** | Real-time WebRTC voice conversations โ€” talk to your AI like a phone call | -| **n8n** | Visual workflow automation with 400+ integrations (GitHub, Slack, email, webhooks) | -| **Qdrant** | Vector database for document Q&A (RAG) | -| **OpenClaw** | Multi-agent AI framework โ€” agents coordinating autonomously on your GPU | -| **Privacy Shield** | PII redaction proxy โ€” scrubs personal data before any external API call | - -### Hardware Tiers (Auto-Detected) +The installer **automatically detects your GPU** and selects the optimal configuration: + +### NVIDIA GPUs | Tier | VRAM | Model | Example GPUs | 
|------|------|-------|--------------| -| Entry | <12GB | Qwen2.5-7B | RTX 3080, RTX 4070 | -| Prosumer | 12โ€“20GB | Qwen2.5-14B-AWQ | RTX 3090, RTX 4080 | -| Pro | 20โ€“40GB | Qwen2.5-32B-AWQ | RTX 4090, A6000 | -| Enterprise | 40GB+ | Qwen2.5-72B-AWQ | A100, H100, multi-GPU | - -**Bootstrap mode:** Chat in 2 minutes. A tiny model starts instantly while the full model downloads in the background. Hot-swap with zero downtime when ready. +| Tier 1 | 8-11GB | qwen2.5-7b-instruct (Q4_K_M) | RTX 4060 Ti, RTX 3060 12GB | +| Tier 2 | 12-15GB | qwen2.5-14b-instruct (Q4_K_M) | RTX 3080 12GB, RTX 4070 Ti | +| Tier 3 | 16-23GB | qwen2.5-32b-instruct (Q4_K_M) | RTX 4090, RTX 3090, A5000 | +| Tier 4 | 24GB+ | qwen2.5-72b-instruct (Q4_K_M) | 2x RTX 4090, A100 | -### How It Compares +### AMD APUs (Strix Halo) -| | Dream Server | Ollama + Open WebUI | LocalAI | -|---|:---:|:---:|:---:| -| Full-stack install (LLM + voice + workflows + RAG + privacy) | **One command** | Manual assembly | Manual assembly | -| Hardware auto-detection + model selection | **Yes** | No | No | -| Voice agents (STT + TTS + WebRTC) | **Built in** | No | Partial | -| Inference engine | **vLLM** (continuous batching) | llama.cpp | llama.cpp | -| Workflow automation | **n8n (400+ integrations)** | No | No | -| PII redaction | **Built in** | No | No | -| Multi-agent framework | **OpenClaw** | No | No | +| Tier | Unified Memory | Model | Hardware | +|------|---------------|-------|----------| +| SH_LARGE | 90GB+ | qwen3-coder-next (80B MoE) | Ryzen AI MAX+ 395 (96GB) | +| SH_COMPACT | 64-89GB | qwen3-30b-a3b (30B MoE) | Ryzen AI MAX+ 395 (64GB) | -Ollama is great for running models locally. Dream Server is a complete AI platform โ€” inference, voice, workflows, RAG, agents, privacy, and monitoring in one installer. +All models auto-selected based on available VRAM. No manual configuration. 
--- -## Operations Toolkit +## Documentation + +| | | +|---|---| +| [**Quickstart**](dream-server/QUICKSTART.md) | Step-by-step install guide with troubleshooting | +| [**FAQ**](dream-server/FAQ.md) | Common questions, hardware advice, configuration | +| [**Changelog**](dream-server/CHANGELOG.md) | Version history and release notes | +| [**Contributing**](dream-server/CONTRIBUTING.md) | How to contribute to Dream Server | +| [**Architecture**](dream-server/docs/INSTALLER-ARCHITECTURE.md) | Modular installer design deep dive | +| [**Extensions**](dream-server/docs/EXTENSIONS.md) | How to add custom services | + +--- -Standalone tools for running persistent AI agents in production. Each works independently โ€” grab what you need. +## Repository Structure -| Tool | Purpose | -|------|---------| -| [**Guardian**](guardian/) | Self-healing process watchdog โ€” monitors services, auto-restores from backup, runs as root so agents can't kill it | -| [**Memory Shepherd**](memory-shepherd/) | Periodic memory reset to prevent identity drift in long-running agents | -| [**Token Spy**](token-spy/) | API cost monitoring with real-time dashboard and auto-kill for runaway sessions | -| [**vLLM Tool Proxy**](dream-server/vllm-tool-proxy/) | Makes local vLLM tool calling work with OpenClaw โ€” SSE re-wrapping, extraction, loop protection | -| [**LLM Cold Storage**](scripts/llm-cold-storage.sh) | Archives idle HuggingFace models to free disk, keeps them resolvable via symlink | +``` +DreamServer/ +โ”œโ”€โ”€ dream-server/ # v2.0.0 - Production-ready local AI stack +โ”‚ โ”œโ”€โ”€ install.sh # Linux/WSL installer +โ”‚ โ”œโ”€โ”€ docker-compose.*.yml +โ”‚ โ”œโ”€โ”€ installers/ # Modular installer (13 phases) +โ”‚ โ”œโ”€โ”€ extensions/ # Drop-in service integrations +โ”‚ โ””โ”€โ”€ docs/ # 30+ documentation files +โ”‚ +โ”œโ”€โ”€ install.sh # Root installer (delegates to dream-server/) +โ”œโ”€โ”€ install.ps1 # Windows installer +โ”‚ +โ””โ”€โ”€ archive/ # Legacy projects (reference only) + 
├── guardian/ # Process watchdog + ├── memory-shepherd/ # Agent memory lifecycle + ├── token-spy/ # API cost monitoring + └── docs/ # Historical documentation +``` -These tools were born from the [OpenClaw Collective](COLLECTIVE.md): 3 AI agents running autonomously on local GPUs, producing 3,464 commits in 8 days. Dream Server packages the infrastructure they built into something anyone can use. +**Shipping:** `dream-server/` is the v2.0.0 release. +**Archive:** Legacy tools from the [OpenClaw Collective](archive/COLLECTIVE.md) development period. --- -## Documentation +## What's New in v2.0.0 -| | | -|---|---| -| [**Quickstart**](dream-server/QUICKSTART.md) | Step-by-step install guide with troubleshooting | -| [**FAQ**](dream-server/FAQ.md) | Common questions, hardware advice, configuration | -| [**Hardware Guide**](dream-server/docs/HARDWARE-GUIDE.md) | GPU recommendations with real prices | -| [**Cookbook**](docs/cookbook/) | Recipes: voice agents, RAG pipelines, code assistant, privacy proxy | -| [**Architecture**](docs/ARCHITECTURE.md) | Deep dive into the system design | -| [**Contributing**](CONTRIBUTING.md) | How to contribute to Lighthouse AI | +- **Modular installer**: 2591-line monolith → 6 libraries + 13 phases +- **Zero-config service discovery**: Extensions auto-register via manifests +- **AMD Strix Halo support**: ROCm 6.3 with unified memory models +- **Bootstrap mode**: Chat in 2 minutes, upgrade later +- **Comprehensive testing**: `make gate` runs lint + test + smoke + simulate +- **30+ docs**: Installation, troubleshooting, Windows guides, extensions -Windows: [`install.ps1`](dream-server/README.md#windows) handles WSL2 + Docker + NVIDIA drivers automatically. +See [`dream-server/CHANGELOG.md`](dream-server/CHANGELOG.md) for full release notes. --- ## License -Apache 2.0: see [LICENSE](LICENSE). Use it, modify it, ship it. +Apache 2.0: Use it, modify it, ship it. See [LICENSE](LICENSE).
+ +--- -Built by [Lightheart Labs](https://github.com/Light-Heart-Labs) and the [OpenClaw Collective](COLLECTIVE.md). +*Built by [The Collective](https://github.com/Light-Heart-Labs/DreamServer) โ€” Android-17, Todd, and friends* diff --git a/COLLECTIVE.md b/archive/COLLECTIVE.md similarity index 100% rename from COLLECTIVE.md rename to archive/COLLECTIVE.md diff --git a/RELEASE-v1.0.0.md b/archive/RELEASE-v1.0.0.md similarity index 100% rename from RELEASE-v1.0.0.md rename to archive/RELEASE-v1.0.0.md diff --git a/compose/.env.example b/archive/compose/.env.example similarity index 100% rename from compose/.env.example rename to archive/compose/.env.example diff --git a/compose/docker-compose.nano.yml b/archive/compose/docker-compose.nano.yml similarity index 100% rename from compose/docker-compose.nano.yml rename to archive/compose/docker-compose.nano.yml diff --git a/compose/docker-compose.pro.yml b/archive/compose/docker-compose.pro.yml similarity index 100% rename from compose/docker-compose.pro.yml rename to archive/compose/docker-compose.pro.yml diff --git a/config.yaml b/archive/config.yaml similarity index 100% rename from config.yaml rename to archive/config.yaml diff --git a/configs/models.json b/archive/configs/models.json similarity index 100% rename from configs/models.json rename to archive/configs/models.json diff --git a/configs/openclaw-gateway.service b/archive/configs/openclaw-gateway.service similarity index 100% rename from configs/openclaw-gateway.service rename to archive/configs/openclaw-gateway.service diff --git a/configs/openclaw.json b/archive/configs/openclaw.json similarity index 100% rename from configs/openclaw.json rename to archive/configs/openclaw.json diff --git a/docs/ARCHITECTURE.md b/archive/docs/ARCHITECTURE.md similarity index 100% rename from docs/ARCHITECTURE.md rename to archive/docs/ARCHITECTURE.md diff --git a/docs/DESIGN-DECISIONS.md b/archive/docs/DESIGN-DECISIONS.md similarity index 100% rename from 
docs/DESIGN-DECISIONS.md rename to archive/docs/DESIGN-DECISIONS.md diff --git a/docs/GUARDIAN.md b/archive/docs/GUARDIAN.md similarity index 100% rename from docs/GUARDIAN.md rename to archive/docs/GUARDIAN.md diff --git a/docs/MULTI-AGENT-PATTERNS.md b/archive/docs/MULTI-AGENT-PATTERNS.md similarity index 100% rename from docs/MULTI-AGENT-PATTERNS.md rename to archive/docs/MULTI-AGENT-PATTERNS.md diff --git a/docs/OPERATIONAL-LESSONS.md b/archive/docs/OPERATIONAL-LESSONS.md similarity index 100% rename from docs/OPERATIONAL-LESSONS.md rename to archive/docs/OPERATIONAL-LESSONS.md diff --git a/docs/PATTERNS.md b/archive/docs/PATTERNS.md similarity index 100% rename from docs/PATTERNS.md rename to archive/docs/PATTERNS.md diff --git a/docs/PHILOSOPHY.md b/archive/docs/PHILOSOPHY.md similarity index 100% rename from docs/PHILOSOPHY.md rename to archive/docs/PHILOSOPHY.md diff --git a/docs/SETUP.md b/archive/docs/SETUP.md similarity index 100% rename from docs/SETUP.md rename to archive/docs/SETUP.md diff --git a/docs/TOKEN-MONITOR-PRODUCT-SCOPE.md b/archive/docs/TOKEN-MONITOR-PRODUCT-SCOPE.md similarity index 100% rename from docs/TOKEN-MONITOR-PRODUCT-SCOPE.md rename to archive/docs/TOKEN-MONITOR-PRODUCT-SCOPE.md diff --git a/docs/TOKEN-SPY.md b/archive/docs/TOKEN-SPY.md similarity index 100% rename from docs/TOKEN-SPY.md rename to archive/docs/TOKEN-SPY.md diff --git a/docs/cookbook/01-voice-agent-setup.md b/archive/docs/cookbook/01-voice-agent-setup.md similarity index 100% rename from docs/cookbook/01-voice-agent-setup.md rename to archive/docs/cookbook/01-voice-agent-setup.md diff --git a/docs/cookbook/02-document-qa-setup.md b/archive/docs/cookbook/02-document-qa-setup.md similarity index 100% rename from docs/cookbook/02-document-qa-setup.md rename to archive/docs/cookbook/02-document-qa-setup.md diff --git a/docs/cookbook/03-code-assistant-setup.md b/archive/docs/cookbook/03-code-assistant-setup.md similarity index 100% rename from 
docs/cookbook/03-code-assistant-setup.md rename to archive/docs/cookbook/03-code-assistant-setup.md diff --git a/docs/cookbook/04-privacy-proxy-setup.md b/archive/docs/cookbook/04-privacy-proxy-setup.md similarity index 100% rename from docs/cookbook/04-privacy-proxy-setup.md rename to archive/docs/cookbook/04-privacy-proxy-setup.md diff --git a/docs/cookbook/05-multi-gpu-cluster.md b/archive/docs/cookbook/05-multi-gpu-cluster.md similarity index 100% rename from docs/cookbook/05-multi-gpu-cluster.md rename to archive/docs/cookbook/05-multi-gpu-cluster.md diff --git a/docs/cookbook/06-swarm-patterns.md b/archive/docs/cookbook/06-swarm-patterns.md similarity index 100% rename from docs/cookbook/06-swarm-patterns.md rename to archive/docs/cookbook/06-swarm-patterns.md diff --git a/docs/cookbook/08-n8n-local-llm.md b/archive/docs/cookbook/08-n8n-local-llm.md similarity index 100% rename from docs/cookbook/08-n8n-local-llm.md rename to archive/docs/cookbook/08-n8n-local-llm.md diff --git a/docs/cookbook/README.md b/archive/docs/cookbook/README.md similarity index 100% rename from docs/cookbook/README.md rename to archive/docs/cookbook/README.md diff --git a/docs/cookbook/agent-template-code.md b/archive/docs/cookbook/agent-template-code.md similarity index 100% rename from docs/cookbook/agent-template-code.md rename to archive/docs/cookbook/agent-template-code.md diff --git a/docs/images/dream-server-dashboard.png b/archive/docs/images/dream-server-dashboard.png similarity index 100% rename from docs/images/dream-server-dashboard.png rename to archive/docs/images/dream-server-dashboard.png diff --git a/docs/images/dream-server-install.png b/archive/docs/images/dream-server-install.png similarity index 100% rename from docs/images/dream-server-install.png rename to archive/docs/images/dream-server-install.png diff --git a/docs/research/GPU-TTS-BENCHMARK.md b/archive/docs/research/GPU-TTS-BENCHMARK.md similarity index 100% rename from docs/research/GPU-TTS-BENCHMARK.md 
rename to archive/docs/research/GPU-TTS-BENCHMARK.md diff --git a/docs/research/HARDWARE-GUIDE.md b/archive/docs/research/HARDWARE-GUIDE.md similarity index 100% rename from docs/research/HARDWARE-GUIDE.md rename to archive/docs/research/HARDWARE-GUIDE.md diff --git a/docs/research/OSS-MODEL-LANDSCAPE-2026-02.md b/archive/docs/research/OSS-MODEL-LANDSCAPE-2026-02.md similarity index 100% rename from docs/research/OSS-MODEL-LANDSCAPE-2026-02.md rename to archive/docs/research/OSS-MODEL-LANDSCAPE-2026-02.md diff --git a/docs/research/README.md b/archive/docs/research/README.md similarity index 100% rename from docs/research/README.md rename to archive/docs/research/README.md diff --git a/guardian/README.md b/archive/guardian/README.md similarity index 100% rename from guardian/README.md rename to archive/guardian/README.md diff --git a/guardian/docs/HEALTH-CHECKS.md b/archive/guardian/docs/HEALTH-CHECKS.md similarity index 100% rename from guardian/docs/HEALTH-CHECKS.md rename to archive/guardian/docs/HEALTH-CHECKS.md diff --git a/guardian/guardian.conf.example b/archive/guardian/guardian.conf.example similarity index 100% rename from guardian/guardian.conf.example rename to archive/guardian/guardian.conf.example diff --git a/guardian/guardian.service b/archive/guardian/guardian.service similarity index 100% rename from guardian/guardian.service rename to archive/guardian/guardian.service diff --git a/guardian/guardian.sh b/archive/guardian/guardian.sh similarity index 100% rename from guardian/guardian.sh rename to archive/guardian/guardian.sh diff --git a/guardian/install.sh b/archive/guardian/install.sh similarity index 100% rename from guardian/install.sh rename to archive/guardian/install.sh diff --git a/guardian/uninstall.sh b/archive/guardian/uninstall.sh similarity index 100% rename from guardian/uninstall.sh rename to archive/guardian/uninstall.sh diff --git a/memory-shepherd/README.md b/archive/memory-shepherd/README.md similarity index 100% rename from 
memory-shepherd/README.md rename to archive/memory-shepherd/README.md diff --git a/memory-shepherd/baselines/example-agent-MEMORY.md b/archive/memory-shepherd/baselines/example-agent-MEMORY.md similarity index 100% rename from memory-shepherd/baselines/example-agent-MEMORY.md rename to archive/memory-shepherd/baselines/example-agent-MEMORY.md diff --git a/memory-shepherd/docs/WRITING-BASELINES.md b/archive/memory-shepherd/docs/WRITING-BASELINES.md similarity index 100% rename from memory-shepherd/docs/WRITING-BASELINES.md rename to archive/memory-shepherd/docs/WRITING-BASELINES.md diff --git a/memory-shepherd/install.sh b/archive/memory-shepherd/install.sh similarity index 100% rename from memory-shepherd/install.sh rename to archive/memory-shepherd/install.sh diff --git a/memory-shepherd/memory-shepherd.conf.example b/archive/memory-shepherd/memory-shepherd.conf.example similarity index 100% rename from memory-shepherd/memory-shepherd.conf.example rename to archive/memory-shepherd/memory-shepherd.conf.example diff --git a/memory-shepherd/memory-shepherd.sh b/archive/memory-shepherd/memory-shepherd.sh similarity index 100% rename from memory-shepherd/memory-shepherd.sh rename to archive/memory-shepherd/memory-shepherd.sh diff --git a/memory-shepherd/uninstall.sh b/archive/memory-shepherd/uninstall.sh similarity index 100% rename from memory-shepherd/uninstall.sh rename to archive/memory-shepherd/uninstall.sh diff --git a/scripts/llm-cold-storage.sh b/archive/scripts/llm-cold-storage.sh similarity index 100% rename from scripts/llm-cold-storage.sh rename to archive/scripts/llm-cold-storage.sh diff --git a/scripts/session-cleanup.sh b/archive/scripts/session-cleanup.sh similarity index 100% rename from scripts/session-cleanup.sh rename to archive/scripts/session-cleanup.sh diff --git a/scripts/start-proxy.sh b/archive/scripts/start-proxy.sh similarity index 100% rename from scripts/start-proxy.sh rename to archive/scripts/start-proxy.sh diff --git 
a/scripts/start-vllm.sh b/archive/scripts/start-vllm.sh similarity index 100% rename from scripts/start-vllm.sh rename to archive/scripts/start-vllm.sh diff --git a/scripts/vllm-tool-proxy.py b/archive/scripts/vllm-tool-proxy.py similarity index 100% rename from scripts/vllm-tool-proxy.py rename to archive/scripts/vllm-tool-proxy.py diff --git a/systemd/llm-cold-storage.service b/archive/systemd/llm-cold-storage.service similarity index 100% rename from systemd/llm-cold-storage.service rename to archive/systemd/llm-cold-storage.service diff --git a/systemd/llm-cold-storage.timer b/archive/systemd/llm-cold-storage.timer similarity index 100% rename from systemd/llm-cold-storage.timer rename to archive/systemd/llm-cold-storage.timer diff --git a/systemd/openclaw-session-cleanup.service b/archive/systemd/openclaw-session-cleanup.service similarity index 100% rename from systemd/openclaw-session-cleanup.service rename to archive/systemd/openclaw-session-cleanup.service diff --git a/systemd/openclaw-session-cleanup.timer b/archive/systemd/openclaw-session-cleanup.timer similarity index 100% rename from systemd/openclaw-session-cleanup.timer rename to archive/systemd/openclaw-session-cleanup.timer diff --git a/systemd/token-spy@.service b/archive/systemd/token-spy@.service similarity index 100% rename from systemd/token-spy@.service rename to archive/systemd/token-spy@.service diff --git a/systemd/vllm-tool-proxy.service b/archive/systemd/vllm-tool-proxy.service similarity index 100% rename from systemd/vllm-tool-proxy.service rename to archive/systemd/vllm-tool-proxy.service diff --git a/token-spy/.env.example b/archive/token-spy/.env.example similarity index 100% rename from token-spy/.env.example rename to archive/token-spy/.env.example diff --git a/token-spy/README.md b/archive/token-spy/README.md similarity index 100% rename from token-spy/README.md rename to archive/token-spy/README.md diff --git a/token-spy/TOKEN-SPY-GUIDE.md b/archive/token-spy/TOKEN-SPY-GUIDE.md 
similarity index 100% rename from token-spy/TOKEN-SPY-GUIDE.md rename to archive/token-spy/TOKEN-SPY-GUIDE.md diff --git a/token-spy/db.py b/archive/token-spy/db.py similarity index 100% rename from token-spy/db.py rename to archive/token-spy/db.py diff --git a/token-spy/db_postgres.py b/archive/token-spy/db_postgres.py similarity index 100% rename from token-spy/db_postgres.py rename to archive/token-spy/db_postgres.py diff --git a/token-spy/main.py b/archive/token-spy/main.py similarity index 100% rename from token-spy/main.py rename to archive/token-spy/main.py diff --git a/token-spy/providers/__init__.py b/archive/token-spy/providers/__init__.py similarity index 100% rename from token-spy/providers/__init__.py rename to archive/token-spy/providers/__init__.py diff --git a/token-spy/providers/anthropic.py b/archive/token-spy/providers/anthropic.py similarity index 100% rename from token-spy/providers/anthropic.py rename to archive/token-spy/providers/anthropic.py diff --git a/token-spy/providers/base.py b/archive/token-spy/providers/base.py similarity index 100% rename from token-spy/providers/base.py rename to archive/token-spy/providers/base.py diff --git a/token-spy/providers/openai.py b/archive/token-spy/providers/openai.py similarity index 100% rename from token-spy/providers/openai.py rename to archive/token-spy/providers/openai.py diff --git a/token-spy/providers/registry.py b/archive/token-spy/providers/registry.py similarity index 100% rename from token-spy/providers/registry.py rename to archive/token-spy/providers/registry.py diff --git a/token-spy/requirements.txt b/archive/token-spy/requirements.txt similarity index 100% rename from token-spy/requirements.txt rename to archive/token-spy/requirements.txt diff --git a/token-spy/session-manager.sh b/archive/token-spy/session-manager.sh similarity index 100% rename from token-spy/session-manager.sh rename to archive/token-spy/session-manager.sh diff --git a/token-spy/start.sh 
b/archive/token-spy/start.sh similarity index 100% rename from token-spy/start.sh rename to archive/token-spy/start.sh diff --git a/workspace/IDENTITY.md b/archive/workspace/IDENTITY.md similarity index 100% rename from workspace/IDENTITY.md rename to archive/workspace/IDENTITY.md diff --git a/workspace/MEMORY.md b/archive/workspace/MEMORY.md similarity index 100% rename from workspace/MEMORY.md rename to archive/workspace/MEMORY.md diff --git a/workspace/SOUL.md b/archive/workspace/SOUL.md similarity index 100% rename from workspace/SOUL.md rename to archive/workspace/SOUL.md diff --git a/workspace/TOOLS.md b/archive/workspace/TOOLS.md similarity index 100% rename from workspace/TOOLS.md rename to archive/workspace/TOOLS.md diff --git a/dream-server/.env.example b/dream-server/.env.example new file mode 100644 index 000000000..d1db9a334 --- /dev/null +++ b/dream-server/.env.example @@ -0,0 +1,137 @@ +# Dream Server Configuration +# Copy this file to .env and edit values before starting: +# cp .env.example .env +# +# The installer (install-core.sh) generates .env automatically with +# secure random secrets. This file documents all available variables. 
+ +# ═════════════════════════════════════════════════════════════════ +# REQUIRED: these must be set or docker compose will refuse to start +# ═════════════════════════════════════════════════════════════════ + +# Session signing for Open WebUI (generate: openssl rand -hex 32) +WEBUI_SECRET=CHANGEME + +# n8n workflow automation credentials +N8N_USER=admin +N8N_PASS=CHANGEME + +# LiteLLM API gateway key (generate: echo "sk-dream-$(openssl rand -hex 16)") +LITELLM_KEY=CHANGEME + +# OpenClaw agent framework token (generate: openssl rand -hex 24) +OPENCLAW_TOKEN=CHANGEME + +# ═════════════════════════════════════════════════════════════════ +# LLM Backend Mode +# ═════════════════════════════════════════════════════════════════ + +# local = llama-server (default, requires GPU or CPU inference) +# cloud = LiteLLM -> cloud APIs (no local GPU needed) +# hybrid = local primary, cloud fallback +DREAM_MODE=local +LLM_API_URL=http://llama-server:8080 + +# ═════════════════════════════════════════════════════════════════ +# Cloud API Keys (only needed for cloud/hybrid modes) +# ═════════════════════════════════════════════════════════════════ + +ANTHROPIC_API_KEY=
+OPENAI_API_KEY= +TOGETHER_API_KEY= + +# ═════════════════════════════════════════════════════════════════ +# LLM Settings (llama-server) +# ═════════════════════════════════════════════════════════════════ + +# Model GGUF filename (must exist in data/models/) +GGUF_FILE=Qwen3-8B-Q4_K_M.gguf + +# Context window size (tokens) +CTX_SIZE=16384 + +# GPU backend: nvidia or amd +GPU_BACKEND=nvidia + +# Model name (used by OpenClaw and dashboard) +LLM_MODEL=qwen3-8b + +# ═════════════════════════════════════════════════════════════════ +# Ports: all overridable, defaults shown +# ═════════════════════════════════════════════════════════════════ + +# OLLAMA_PORT=11434 # llama-server API (external → internal 8080) +# WEBUI_PORT=3000 # Open WebUI (external → internal 8080) +# SEARXNG_PORT=8888 # SearXNG metasearch (external → internal 8080) +# PERPLEXICA_PORT=3004 # Perplexica deep research (external → internal 3000) +# WHISPER_PORT=9000 # Whisper STT (external → internal 8000) +# TTS_PORT=8880 # Kokoro TTS (external → internal 8880) +# N8N_PORT=5678 # n8n workflows (external → internal 5678) +# QDRANT_PORT=6333 # Qdrant vector DB (external → internal 6333) +# QDRANT_GRPC_PORT=6334 # Qdrant gRPC (external → internal 6334) +# EMBEDDINGS_PORT=8090 # Text embeddings (external → internal 80) +# LITELLM_PORT=4000 # LiteLLM gateway (external → internal 4000) +# OPENCLAW_PORT=7860 # OpenClaw agent (external → internal 
18789) +# SHIELD_PORT=8085 # Privacy Shield (external → internal 8085) +# DASHBOARD_API_PORT=3002 # Dashboard API (external → internal 3002) +# DASHBOARD_PORT=3001 # Dashboard UI (external → internal 3001) +# COMFYUI_PORT=8188 # ComfyUI image gen (external → internal 8188) + +# ═══════════════════════════════════════════════════════════════════ +# Optional Security +# ═══════════════════════════════════════════════════════════════════ + +# Dashboard API key (generate: openssl rand -hex 32) +# DASHBOARD_API_KEY= + +# Open WebUI authentication (true/false) +# WEBUI_AUTH=true + +# ═══════════════════════════════════════════════════════════════════ +# Optional — Voice, Web UI, n8n +# ═══════════════════════════════════════════════════════════════════ + +# Whisper model (tiny, base, small, medium, large-v3-turbo) +# WHISPER_MODEL=base + +# System timezone (used by Open WebUI and n8n) +# TIMEZONE=UTC + +# n8n settings +# N8N_AUTH=true # Enable n8n basic auth +# N8N_HOST=localhost # n8n hostname +# N8N_WEBHOOK_URL=http://localhost:5678 # n8n webhook URL (for external access) + +# Embedding model for RAG +# EMBEDDING_MODEL=BAAI/bge-base-en-v1.5 + +# ═══════════════════════════════════════════════════════════════════ +# AMD-specific (only needed with GPU_BACKEND=amd) +# 
═══════════════════════════════════════════════════════════════════ + +# VIDEO_GID=44 # `getent group video | cut -d: -f3` +# RENDER_GID=992 # `getent group render | cut -d: -f3` + +# ═══════════════════════════════════════════════════════════════════ +# Advanced +# ═══════════════════════════════════════════════════════════════════ + +# Container user/group IDs +# UID=1000 +# GID=1000 + +# Privacy Shield settings +# PII_CACHE_ENABLED=true +# PII_CACHE_SIZE=1000 +# PII_CACHE_TTL=300 +# LOG_LEVEL=info + +# OpenClaw bootstrap model (small model for instant startup) +# BOOTSTRAP_MODEL=qwen3:8b-q4_K_M + +# Dashboard API internal URLs (usually Docker-internal, not user-facing) +# KOKORO_URL=http://tts:8880 +# N8N_URL=http://n8n:5678 + +# llama-server memory limit (Docker) +# LLAMA_SERVER_MEMORY_LIMIT=64G diff --git a/dream-server/.env.schema.json b/dream-server/.env.schema.json new file mode 100644 index 000000000..199f71229 --- /dev/null +++ b/dream-server/.env.schema.json @@ -0,0 +1,313 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "title": "Dream Server Environment Configuration", + "description": "Schema for Dream Server .env file validation", + "type": "object", + "required": [ + "WEBUI_SECRET", + "N8N_USER", + "N8N_PASS", + "LITELLM_KEY", + "OPENCLAW_TOKEN" + ], + "properties": { + "DREAM_MODE": { + "type": "string", + "description": "LLM backend mode: local, cloud, or hybrid", + "enum": ["local", "cloud", "hybrid"], + "default": "local" + }, + "LLM_API_URL": { + "type": "string", + "description": "URL where all services send LLM requests", + 
"default": "http://llama-server:8080" + }, + "ANTHROPIC_API_KEY": { + "type": "string", + "description": "Anthropic API key (cloud/hybrid modes)" + }, + "OPENAI_API_KEY": { + "type": "string", + "description": "OpenAI API key (cloud/hybrid modes)" + }, + "TOGETHER_API_KEY": { + "type": "string", + "description": "Together AI API key (optional)" + }, + "WEBUI_SECRET": { + "type": "string", + "description": "Session signing secret for Open WebUI", + "secret": true + }, + "N8N_USER": { + "type": "string", + "description": "n8n admin username" + }, + "N8N_PASS": { + "type": "string", + "description": "n8n admin password", + "secret": true + }, + "LITELLM_KEY": { + "type": "string", + "description": "LiteLLM API gateway master key", + "secret": true + }, + "OPENCLAW_TOKEN": { + "type": "string", + "description": "OpenClaw agent framework token", + "secret": true + }, + "GGUF_FILE": { + "type": "string", + "description": "Model GGUF filename in data/models/" + }, + "CTX_SIZE": { + "type": "integer", + "description": "Context window size in tokens", + "default": 16384 + }, + "MAX_CONTEXT": { + "type": "integer", + "description": "Context window (installer variable, maps to CTX_SIZE)" + }, + "GPU_BACKEND": { + "type": "string", + "description": "GPU backend: nvidia, amd, apple, or cpu", + "default": "nvidia" + }, + "LLM_MODEL": { + "type": "string", + "description": "Model name used by OpenClaw and dashboard" + }, + "TIER": { + "type": "string", + "description": "Hardware tier (1, 2, 3, 4, CLOUD, SH_COMPACT, SH_LARGE, NV_ULTRA)" + }, + "OLLAMA_PORT": { + "type": "integer", + "description": "llama-server external port", + "default": 11434 + }, + "WEBUI_PORT": { + "type": "integer", + "description": "Open WebUI external port", + "default": 3000 + }, + "SEARXNG_PORT": { + "type": "integer", + "description": "SearXNG external port", + "default": 8888 + }, + "PERPLEXICA_PORT": { + "type": "integer", + "description": "Perplexica external port", + "default": 3004 + }, + 
"WHISPER_PORT": { + "type": "integer", + "description": "Whisper STT external port", + "default": 9000 + }, + "TTS_PORT": { + "type": "integer", + "description": "Kokoro TTS external port", + "default": 8880 + }, + "N8N_PORT": { + "type": "integer", + "description": "n8n external port", + "default": 5678 + }, + "QDRANT_PORT": { + "type": "integer", + "description": "Qdrant vector DB external port", + "default": 6333 + }, + "QDRANT_GRPC_PORT": { + "type": "integer", + "description": "Qdrant gRPC external port", + "default": 6334 + }, + "EMBEDDINGS_PORT": { + "type": "integer", + "description": "Text embeddings external port", + "default": 8090 + }, + "LITELLM_PORT": { + "type": "integer", + "description": "LiteLLM gateway external port", + "default": 4000 + }, + "OPENCLAW_PORT": { + "type": "integer", + "description": "OpenClaw agent external port", + "default": 7860 + }, + "SHIELD_PORT": { + "type": "integer", + "description": "Privacy Shield external port", + "default": 8085 + }, + "DASHBOARD_API_PORT": { + "type": "integer", + "description": "Dashboard API external port", + "default": 3002 + }, + "DASHBOARD_PORT": { + "type": "integer", + "description": "Dashboard UI external port", + "default": 3001 + }, + "COMFYUI_PORT": { + "type": "integer", + "description": "ComfyUI external port", + "default": 8188 + }, + "TOKEN_SPY_PORT": { + "type": "integer", + "description": "Token Spy external port", + "default": 3003 + }, + "LLAMA_SERVER_PORT": { + "type": "integer", + "description": "llama-server internal port", + "default": 8080 + }, + "DASHBOARD_API_KEY": { + "type": "string", + "description": "Dashboard API authentication key", + "secret": true + }, + "OPENCODE_SERVER_PASSWORD": { + "type": "string", + "description": "OpenCode web UI authentication password", + "secret": true + }, + "OPENCODE_PORT": { + "type": "integer", + "description": "OpenCode web UI external port", + "default": 3003 + }, + "WEBUI_AUTH": { + "type": "boolean", + "description": "Enable Open 
WebUI authentication", + "default": true + }, + "WHISPER_MODEL": { + "type": "string", + "description": "Whisper STT model size", + "default": "base" + }, + "TIMEZONE": { + "type": "string", + "description": "System timezone", + "default": "UTC" + }, + "N8N_AUTH": { + "type": "boolean", + "description": "Enable n8n basic auth", + "default": true + }, + "N8N_HOST": { + "type": "string", + "description": "n8n hostname", + "default": "localhost" + }, + "N8N_WEBHOOK_URL": { + "type": "string", + "description": "n8n webhook URL for external access" + }, + "EMBEDDING_MODEL": { + "type": "string", + "description": "Embedding model for RAG", + "default": "BAAI/bge-base-en-v1.5" + }, + "VIDEO_GID": { + "type": "integer", + "description": "Video group ID (AMD only)" + }, + "RENDER_GID": { + "type": "integer", + "description": "Render group ID (AMD only)" + }, + "HSA_OVERRIDE_GFX_VERSION": { + "type": "string", + "description": "AMD ROCm GFX version override" + }, + "ROCBLAS_USE_HIPBLASLT": { + "type": "integer", + "description": "AMD ROCm BLAS setting" + }, + "UID": { + "type": "integer", + "description": "Container user ID", + "default": 1000 + }, + "GID": { + "type": "integer", + "description": "Container group ID", + "default": 1000 + }, + "PII_CACHE_ENABLED": { + "type": "boolean", + "description": "Privacy Shield PII cache", + "default": true + }, + "PII_CACHE_SIZE": { + "type": "integer", + "description": "Privacy Shield PII cache size", + "default": 1000 + }, + "PII_CACHE_TTL": { + "type": "integer", + "description": "Privacy Shield PII cache TTL (seconds)", + "default": 300 + }, + "LOG_LEVEL": { + "type": "string", + "description": "Logging level", + "default": "info" + }, + "BOOTSTRAP_MODEL": { + "type": "string", + "description": "OpenClaw bootstrap model (small, fast startup)" + }, + "KOKORO_URL": { + "type": "string", + "description": "Kokoro TTS internal URL", + "default": "http://tts:8880" + }, + "N8N_URL": { + "type": "string", + "description": "n8n internal 
URL", + "default": "http://n8n:5678" + }, + "LLAMA_SERVER_MEMORY_LIMIT": { + "type": "string", + "description": "Docker memory limit for llama-server", + "default": "64G" + }, + "LIVEKIT_API_KEY": { + "type": "string", + "description": "LiveKit API key" + }, + "LIVEKIT_API_SECRET": { + "type": "string", + "description": "LiveKit API secret", + "secret": true + }, + "ENABLE_WEB_SEARCH": { + "type": "boolean", + "description": "Enable web search in Open WebUI" + }, + "WEB_SEARCH_ENGINE": { + "type": "string", + "description": "Web search engine backend" + }, + "TTS_VOICE": { + "type": "string", + "description": "Text-to-speech voice" + } + } +} diff --git a/dream-server/.github/ISSUE_TEMPLATE/bug_report.md b/dream-server/.github/ISSUE_TEMPLATE/bug_report.md new file mode 100644 index 000000000..88a41d6fa --- /dev/null +++ b/dream-server/.github/ISSUE_TEMPLATE/bug_report.md @@ -0,0 +1,34 @@ +--- +name: Bug Report +about: Something isn't working as expected +labels: bug +--- + +**Hardware** +- GPU: (e.g., RTX 4090 24GB, Strix Halo 96GB, none) +- RAM: +- OS: (e.g., Ubuntu 24.04, Windows 11 + WSL2, macOS 15) +- Tier: (e.g., 2, SH_LARGE) + +**What happened?** +A clear description of the bug. + +**What did you expect?** +What should have happened instead. + +**Steps to reproduce** +1. +2. +3. + +**Logs** +``` +Paste relevant output from: + docker compose logs | tail -50 + cat /tmp/dream-server-install.log | tail -50 +``` + +**Installer version** +``` +grep VERSION installers/lib/constants.sh +``` diff --git a/dream-server/.github/ISSUE_TEMPLATE/feature_request.md b/dream-server/.github/ISSUE_TEMPLATE/feature_request.md new file mode 100644 index 000000000..e5d5b0b94 --- /dev/null +++ b/dream-server/.github/ISSUE_TEMPLATE/feature_request.md @@ -0,0 +1,21 @@ +--- +name: Feature Request +about: Suggest an improvement or new capability +labels: enhancement +--- + +**What problem does this solve?** +A clear description of the use case. 
+ +**Proposed solution** +How you'd like it to work. + +**Alternatives considered** +Other approaches you've thought about. + +**Which area does this affect?** +- [ ] Installer (tiers, phases, detection) +- [ ] Docker services (compose, health checks) +- [ ] Dashboard (UI, API, plugins) +- [ ] Documentation +- [ ] Other: ___ diff --git a/dream-server/.github/pull_request_template.md b/dream-server/.github/pull_request_template.md new file mode 100644 index 000000000..c02d0bd28 --- /dev/null +++ b/dream-server/.github/pull_request_template.md @@ -0,0 +1,19 @@ +## Summary + +What does this PR do? (1-3 sentences) + +## Changes + +- + +## Testing + +- [ ] `bash -n` passes on all changed `.sh` files +- [ ] `bash tests/test-tier-map.sh` passes (if tier/model changes) +- [ ] `bash tests/integration-test.sh` passes +- [ ] Relevant smoke tests pass (`tests/smoke/`) +- [ ] Dashboard builds (if frontend changed): `cd dashboard && npm run build` + +## Related Issues + +Closes # diff --git a/dream-server/.github/workflows/dashboard.yml b/dream-server/.github/workflows/dashboard.yml new file mode 100644 index 000000000..71b80d9b5 --- /dev/null +++ b/dream-server/.github/workflows/dashboard.yml @@ -0,0 +1,47 @@ +name: Dashboard + +on: + pull_request: + push: + branches: + - main + - master + +jobs: + frontend: + runs-on: ubuntu-latest + defaults: + run: + working-directory: dashboard + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: Setup Node + uses: actions/setup-node@v4 + with: + node-version: "20" + + - name: Install Dependencies + run: npm install + + - name: Lint + run: npm run lint + + - name: Build + run: npm run build + + api: + runs-on: ubuntu-latest + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: Setup Python + uses: actions/setup-python@v5 + with: + python-version: "3.11" + + - name: API Syntax Check + run: python -m py_compile dashboard-api/main.py dashboard-api/agent_monitor.py + diff --git 
a/dream-server/.github/workflows/lint-powershell.yml b/dream-server/.github/workflows/lint-powershell.yml new file mode 100644 index 000000000..ed063ad25 --- /dev/null +++ b/dream-server/.github/workflows/lint-powershell.yml @@ -0,0 +1,40 @@ +name: Lint PowerShell + +on: + pull_request: + push: + branches: + - main + - master + +jobs: + powershell-lint: + runs-on: ubuntu-latest + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: Install PSScriptAnalyzer + shell: pwsh + run: | + Set-PSRepository PSGallery -InstallationPolicy Trusted + Install-Module PSScriptAnalyzer -Force -Scope CurrentUser + + - name: Run PowerShell Script Analyzer + shell: pwsh + run: | + $scripts = Get-ChildItem -Path installers -Filter *.ps1 -Recurse + if (-not $scripts) { + Write-Host "No PowerShell scripts found." + exit 0 + } + $failed = $false + foreach ($script in $scripts) { + Write-Host "Analyzing $($script.FullName)" + $results = Invoke-ScriptAnalyzer -Path $script.FullName -Settings ./PSScriptAnalyzerSettings.psd1 -Severity Error,Warning + if ($results) { + $results | Format-Table RuleName, Severity, Message, ScriptName, Line -AutoSize + $failed = $true + } + } + if ($failed) { exit 1 } diff --git a/dream-server/.github/workflows/lint-shell.yml b/dream-server/.github/workflows/lint-shell.yml new file mode 100644 index 000000000..41153fc44 --- /dev/null +++ b/dream-server/.github/workflows/lint-shell.yml @@ -0,0 +1,39 @@ +name: Lint Shell + +on: + pull_request: + push: + branches: + - main + - master + +jobs: + shell-syntax: + runs-on: ubuntu-latest + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: Bash Syntax Check + run: | + set -euo pipefail + mapfile -t files < <(git ls-files '*.sh') + if [ "${#files[@]}" -eq 0 ]; then + echo "No shell scripts found" + exit 0 + fi + for f in "${files[@]}"; do + bash -n "$f" + done + + - name: ShellCheck + run: | + set -euo pipefail + sudo apt-get -qq install -y shellcheck + mapfile -t files < <(git ls-files '*.sh') 
+ if [ "${#files[@]}" -eq 0 ]; then + echo "No shell scripts found" + exit 0 + fi + shellcheck -x -S warning "${files[@]}" + diff --git a/dream-server/.github/workflows/matrix-smoke.yml b/dream-server/.github/workflows/matrix-smoke.yml new file mode 100644 index 000000000..8e444b420 --- /dev/null +++ b/dream-server/.github/workflows/matrix-smoke.yml @@ -0,0 +1,34 @@ +name: Matrix Smoke + +on: + pull_request: + push: + branches: + - main + - master + +jobs: + linux-smoke: + runs-on: ubuntu-latest + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: AMD Path Smoke + run: bash tests/smoke/linux-amd.sh + + - name: NVIDIA Path Smoke + run: bash tests/smoke/linux-nvidia.sh + + - name: WSL Logic Smoke + run: bash tests/smoke/wsl-logic.sh + + macos-smoke: + runs-on: macos-latest + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: macOS Dispatch Smoke + run: bash tests/smoke/macos-dispatch.sh + diff --git a/dream-server/.github/workflows/test-linux.yml b/dream-server/.github/workflows/test-linux.yml new file mode 100644 index 000000000..b3bcb176f --- /dev/null +++ b/dream-server/.github/workflows/test-linux.yml @@ -0,0 +1,55 @@ +name: Test Linux + +on: + pull_request: + push: + branches: + - main + - master + +jobs: + integration-smoke: + runs-on: ubuntu-latest + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: Integration Smoke + run: bash tests/integration-test.sh + + - name: Phase C P1 Static Checks + run: bash tests/test-phase-c-p1.sh + + - name: Manifest Compatibility Checks + run: | + bash scripts/check-compatibility.sh + bash scripts/check-release-claims.sh + + - name: Tier Map Unit Tests + run: bash tests/test-tier-map.sh + + - name: Installer Contract Checks + run: | + bash tests/contracts/test-installer-contracts.sh + bash tests/contracts/test-preflight-fixtures.sh + + - name: Installer Simulation Harness + run: | + bash scripts/simulate-installers.sh + test -f artifacts/installer-sim/summary.json + test -f 
artifacts/installer-sim/SUMMARY.md + python3 scripts/validate-sim-summary.py artifacts/installer-sim/summary.json + + - name: Upload Installer Simulation Artifacts + uses: actions/upload-artifact@v4 + with: + name: installer-sim + path: | + artifacts/installer-sim/summary.json + artifacts/installer-sim/SUMMARY.md + artifacts/installer-sim/linux-dryrun.log + artifacts/installer-sim/macos-installer.log + artifacts/installer-sim/windows-preflight-sim.json + artifacts/installer-sim/macos-preflight.json + artifacts/installer-sim/macos-doctor.json + artifacts/installer-sim/doctor.json diff --git a/dream-server/.gitignore b/dream-server/.gitignore index 072bdb1cd..c0a18762b 100644 --- a/dream-server/.gitignore +++ b/dream-server/.gitignore @@ -1,10 +1,25 @@ # Runtime / secrets .env .env.* +!.env.example +!.env.schema.json +.current-mode +.profiles +.target-model +.target-quantization # Install-time data directories data/ models/ +artifacts/ +logs/ + +# User presets (dream preset save/load) +presets/ + +# Python cache +**/__pycache__/ +*.pyc # OpenClaw workspace (runtime state) config/openclaw/workspace/ diff --git a/dream-server/.shellcheckrc b/dream-server/.shellcheckrc new file mode 100644 index 000000000..67f46f812 --- /dev/null +++ b/dream-server/.shellcheckrc @@ -0,0 +1,10 @@ +# ShellCheck configuration for Dream Server +# https://www.shellcheck.net/wiki/ + +# Allow sourcing files that can't be resolved statically +# (libs are sourced by install-core.sh at runtime) +disable=SC1090 +disable=SC1091 + +# Allow using $'...' in older bash (we target bash 4+) +disable=SC3003 diff --git a/dream-server/CHANGELOG.md b/dream-server/CHANGELOG.md new file mode 100644 index 000000000..b591ee550 --- /dev/null +++ b/dream-server/CHANGELOG.md @@ -0,0 +1,46 @@ +# Changelog + +All notable changes to Dream Server will be documented in this file. + +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). 
+ +## [Unreleased] + +## [2.0.0] - 2026-03-03 + +### Added +- Documentation index (`docs/README.md`) for navigating 30+ doc files +- `.env.example` with all required and optional variables documented +- `docker-compose.override.yml` auto-include for custom service extensions +- Real shell function tests for `resolve_tier_config()` (replaces tautological Python tests) +- Dry-run reporting for phases 06, 07, 09, 10, 12 +- `Makefile` with `lint`, `test`, `smoke`, `gate` targets +- ShellCheck integration in CI +- `CHANGELOG.md`, `CODE_OF_CONDUCT.md`, issue/PR templates + +### Changed +- Modular installer: 2591-line monolith split into 6 libraries + 13 phases +- All services now core in `docker-compose.base.yml` (profiles removed) +- Models switched from AWQ to GGUF Q4_K_M quantization + +### Fixed +- Tier error message now auto-updates when new tiers are added +- Phase 12 (health) no longer crashes in dry-run mode +- n8n timezone default changed from `America/New_York` to `UTC` +- Stale variable names in INTEGRATION-GUIDE.md +- Embeddings port in INTEGRATION-GUIDE.md (9103 → 8090) +- Purged all stale `--profile` references across codebase (12+ files) +- Purged all stale `docker-compose.yml` references in docs +- AWQ references in QUICKSTART.md updated to GGUF Q4_K_M +- `make lint` no longer silently swallows errors +- Makefile now uses `find` to discover all .sh files instead of hardcoded globs + +### Removed +- Token Spy (service, docs, installer refs, systemd units, dashboard-api integration) +- `docker-compose.strix-halo.yml` (deprecated, merged into base + amd overlay) +- Tautological Python test suite (`test_installer.py`) +- `asyncpg` dependency from dashboard-api (was only used by Token Spy) + +## [0.3.0-dev] - 2025-05-01 + +Initial development release with modular installer architecture. 
diff --git a/dream-server/CODE_OF_CONDUCT.md b/dream-server/CODE_OF_CONDUCT.md new file mode 100644 index 000000000..0f4c07035 --- /dev/null +++ b/dream-server/CODE_OF_CONDUCT.md @@ -0,0 +1,40 @@ +# Contributor Covenant Code of Conduct + +## Our Pledge + +We as members, contributors, and leaders pledge to make participation in our +community a harassment-free experience for everyone, regardless of age, body +size, visible or invisible disability, ethnicity, sex characteristics, gender +identity and expression, level of experience, education, socio-economic status, +nationality, personal appearance, race, caste, color, religion, or sexual +identity and orientation. + +## Our Standards + +Examples of behavior that contributes to a positive environment: + +* Using welcoming and inclusive language +* Being respectful of differing viewpoints and experiences +* Gracefully accepting constructive criticism +* Focusing on what is best for the community +* Showing empathy towards other community members + +Examples of unacceptable behavior: + +* The use of sexualized language or imagery, and sexual attention or advances of any kind +* Trolling, insulting or derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information without explicit permission +* Other conduct which could reasonably be considered inappropriate in a professional setting + +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be +reported to the project team at **conduct@lightheartlabs.com**. + +All complaints will be reviewed and investigated promptly and fairly. The project +team is obligated to maintain confidentiality with regard to the reporter. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org/), version 2.1. 
diff --git a/dream-server/CONTRIBUTING.md b/dream-server/CONTRIBUTING.md index 235a58d42..ab55f969c 100644 --- a/dream-server/CONTRIBUTING.md +++ b/dream-server/CONTRIBUTING.md @@ -1,63 +1,84 @@ # Contributing to Dream Server -Thanks for wanting to help! Here's how to get involved. +Thanks for building with us. -## Reporting Issues +## Fast Path -Found a bug? Please open an issue with: -- Your hardware (GPU, RAM, OS) -- What you expected to happen -- What actually happened -- Logs if relevant (`docker compose logs`) +If you want to add or extend services, start here: +- [docs/EXTENSIONS.md](docs/EXTENSIONS.md) — extending services (Docker containers, dashboards) +- [docs/INSTALLER-ARCHITECTURE.md](docs/INSTALLER-ARCHITECTURE.md) — modding the installer itself -## Pull Requests +That guide includes a practical "add a service in 30 minutes" path with templates and checks. -1. Fork the repo -2. Create a feature branch (`git checkout -b feature/cool-thing`) -3. Make your changes -4. Test on your hardware -5. Submit PR with clear description +## Reporting Issues -## What We're Looking For +Open an issue with: +- hardware details (GPU, RAM, OS) +- expected behavior +- actual behavior +- relevant logs (`docker compose logs`) -**High Value:** -- New workflow templates (n8n JSON exports) -- Hardware-specific optimizations -- Better error messages -- Documentation improvements +## Pull Requests -**Good First Issues:** -- Fix typos in docs -- Add more troubleshooting cases -- Improve comments in install.sh +1. Fork and create a branch (`git checkout -b feature/my-change`) +2. Keep PR scope focused (one milestone-sized change) +3. Run validation locally +4. 
Submit PR with clear description, impact, and test evidence -**Harder But Appreciated:** -- Multi-GPU support improvements -- New model presets -- Alternative TTS/STT engines +## Contributor Validation Checklist -## Testing Your Changes +The fastest way to validate everything: +```bash +make gate # lint + test + smoke + simulate +``` +Or run individual steps: ```bash -# Fresh install test -rm -rf ~/dream-server -./install.sh --dry-run # Check what would happen -./install.sh # Actually install +make lint # Shell syntax + Python compile checks +make test # Tier map unit tests + installer contracts +make smoke # Platform smoke tests +``` -# Run the status check -./status.sh +Full manual checklist: +```bash +# Shell/API checks +bash -n install.sh install-core.sh installers/lib/*.sh installers/phases/*.sh scripts/*.sh tests/*.sh 2>/dev/null || true +python3 -m py_compile dashboard-api/main.py dashboard-api/agent_monitor.py + +# Unit tests +bash tests/test-tier-map.sh + +# Integration/smoke checks +bash tests/integration-test.sh +bash tests/smoke/linux-amd.sh +bash tests/smoke/linux-nvidia.sh +bash tests/smoke/wsl-logic.sh +bash tests/smoke/macos-dispatch.sh +``` + +If your change touches dashboard frontend and Node is available: +```bash +cd dashboard +npm install +npm run lint +npm run build ``` -## Code Style +## High-Value Contributions -- Bash: Use ShellCheck. We're not religious about style, just be consistent. -- YAML: 2-space indent, no tabs. -- Markdown: Keep it readable. No 80-char wrapping. +- extension manifests and service integrations +- dashboard plugin/registry improvements +- installer mods: new tiers, themes, phases (see [docs/INSTALLER-ARCHITECTURE.md](docs/INSTALLER-ARCHITECTURE.md)) +- installer portability and platform support +- workflow catalog quality and docs +- CI coverage and deterministic tests -## Questions? +## Style -Open an issue or find us in Discord. 
+- Bash: predictable, defensive, and syntax-clean +- YAML/JSON: stable keys, minimal noise, no tabs +- Docs: concrete commands and compatibility notes ---- +## Questions -*Your contributions help bring local AI to everyone.* +Open an issue and include enough context to reproduce the problem quickly. diff --git a/dream-server/EDGE-QUICKSTART.md b/dream-server/EDGE-QUICKSTART.md index 8f27566dc..dd5b09a39 100644 --- a/dream-server/EDGE-QUICKSTART.md +++ b/dream-server/EDGE-QUICKSTART.md @@ -1,5 +1,14 @@ # Dream Server — Edge Quickstart +> **Status: Planned — Not Yet Available.** +> +> This guide describes a future edge deployment mode. The referenced `docker-compose.edge.yml` does not exist yet. **Do not follow these instructions** — they will not work. +> +> For CPU-only machines without a GPU, use `--cloud` mode instead: +> ```bash +> ./install-core.sh --cloud +> ``` + *For Raspberry Pi 5, Mac Mini, or any 8GB+ system without a dedicated GPU.* --- @@ -26,8 +35,8 @@ ```bash # 1. Clone and enter -git clone https://github.com/Light-Heart-Labs/Lighthouse-AI.git -cd Lighthouse-AI/dream-server +git clone https://github.com/Light-Heart-Labs/DreamServer.git +cd DreamServer # 2. Start core services docker compose -f docker-compose.edge.yml up -d @@ -174,9 +183,9 @@ docker compose -f docker-compose.edge.yml up -d ## Next Steps -- Configure voice assistant: See `docs/VOICE-SETUP.md` - Add OpenClaw agent: See `docs/OPENCLAW-INTEGRATION.md` - Create automations: Use n8n at http://localhost:5678 +- Full documentation index: See `docs/README.md` --- diff --git a/dream-server/FAQ.md b/dream-server/FAQ.md index f309fd2cc..fa66c1fbd 100644 --- a/dream-server/FAQ.md +++ b/dream-server/FAQ.md @@ -10,7 +10,7 @@ Frequently asked questions about installing, running, and troubleshooting Dream ### What is Dream Server? Dream Server is a turnkey local AI stack that runs entirely on your own hardware. 
It includes: -- LLM inference via vLLM (Qwen2.5-32B-Instruct-AWQ) +- LLM inference via llama-server (qwen2.5-32b-instruct) - Web dashboard for chat and model management - Voice capabilities (STT via Whisper, TTS via Kokoro) - Workflow automation via n8n @@ -115,9 +115,9 @@ sudo systemctl restart docker ### "CUDA out of memory" errors Your GPU doesn't have enough VRAM. Options: -1. Use a smaller model (Qwen2.5-7B instead of 32B) -2. Enable quantization (AWQ format uses ~60% less VRAM) -3. Reduce `max_model_len` in docker-compose.yml +1. Use a smaller model (qwen2.5-7b-instruct instead of 32b) +2. All models use GGUF Q4_K_M quantization by default +3. Reduce `CTX_SIZE` in `.env` (try 4096) 4. Run on CPU only (slower but works) ### Windows: WSL2 installation fails @@ -138,7 +138,7 @@ docker compose ps **Check logs:** ```bash docker compose logs dashboard-api -docker compose logs vllm +docker compose logs llama-server ``` **Common fixes:** @@ -250,7 +250,7 @@ docker compose logs -f **Specific service:** ```bash -docker compose logs -f vllm +docker compose logs -f llama-server docker compose logs -f dashboard-api docker compose logs -f voice-agent ``` @@ -268,7 +268,7 @@ docker compose up -d Or restart specific services: ```bash -docker compose restart vllm +docker compose restart llama-server ``` ### "Connection refused" to API @@ -287,7 +287,7 @@ Models need ~20GB per model. Free up space if needed. **Check model download:** ```bash -ls -la models/ +ls -la data/models/ ``` If empty or incomplete, re-download: @@ -356,9 +356,9 @@ docker compose down -v ## Advanced ### How do I add a custom model? -1. Download model to `models/` directory -2. Edit `docker-compose.yml` — change `LLM_MODEL` environment variable -3. Restart: `docker compose up -d vllm` +1. Download model to `data/models/` directory +2. Edit `.env` — change `LLM_MODEL` and `GGUF_FILE` variables +3. 
Restart: `docker compose up -d llama-server` Supported formats: AWQ, GPTQ, EXL2, GGUF (via llama.cpp adapter) @@ -373,11 +373,8 @@ caddy reverse-proxy --from your-domain.com --to localhost:3000 For local development, browsers accept self-signed certs at `https://localhost`. ### Can I run on multiple GPUs? -Yes! Edit `docker-compose.yml`: +Yes! Edit `docker-compose.nvidia.yml` to expose multiple GPUs: ```yaml -environment: - - TENSOR_PARALLEL_SIZE=2 # Use 2 GPUs - - GPU_MEMORY_UTILIZATION=0.95 deploy: resources: reservations: @@ -388,9 +385,9 @@ deploy: ``` ### How do I backup my data? -**Configs and workflows:** +**Configs and data:** ```bash -tar -czf dream-server-backup.tar.gz .env workflows/ n8n-data/ +tar -czf dream-server-backup.tar.gz .env data/ ``` **Models (large):** @@ -445,7 +442,7 @@ curl http://localhost:3001/api/metrics | 3000 | Open WebUI (chat interface) | | 3001 | Dashboard | | 3002 | Dashboard API | -| 8000 | vLLM API | +| 8080 | llama-server API | | 8085 | Privacy Shield | | 5678 | n8n workflow editor | | 7880 | LiveKit voice server | @@ -468,11 +465,11 @@ Then restart: `docker compose up -d` ### Documentation - Main README: `dream-server/README.md` -- Architecture: `docs/ARCHITECTURE.md` +- Installer Architecture: `docs/INSTALLER-ARCHITECTURE.md` - Security: `SECURITY.md` ### Community -- GitHub Issues: https://github.com/Light-Heart-Labs/Lighthouse-AI/issues +- GitHub Issues: https://github.com/Light-Heart-Labs/DreamServer/issues - Discord: #general channel ### Debug info for bug reports diff --git a/dream-server/LICENSE b/dream-server/LICENSE new file mode 100644 index 000000000..261eeb9e9 --- /dev/null +++ b/dream-server/LICENSE @@ -0,0 +1,201 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. 
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
diff --git a/dream-server/Makefile b/dream-server/Makefile
new file mode 100644
index 000000000..3400e193c
--- /dev/null
+++ b/dream-server/Makefile
@@ -0,0 +1,43 @@
+# Dream Server — Developer Targets
+# Run `make help` to see available commands.
+
+SHELL_FILES := $(shell find . -name '*.sh' -not -path './node_modules/*' -not -path './.git/*' -not -path './data/*' -not -path './token-spy/*')
+
+.PHONY: help lint test smoke simulate gate doctor
+
+help: ## Show this help
+	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | \
+		awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-15s\033[0m %s\n", $$1, $$2}'
+
+lint: ## Syntax check all shell scripts + Python compile check
+	@echo "=== Shell syntax ==="
+	@fail=0; for f in $(SHELL_FILES); do bash -n "$$f" || fail=1; done; [ $$fail -eq 0 ]
+	@echo "=== Python compile ==="
+	@python3 -m py_compile dashboard-api/main.py dashboard-api/agent_monitor.py
+	@echo "All lint checks passed."
+
+test: ## Run unit and contract tests
+	@echo "=== Tier map tests ==="
+	@bash tests/test-tier-map.sh
+	@echo ""
+	@echo "=== Installer contracts ==="
+	@bash tests/contracts/test-installer-contracts.sh
+	@bash tests/contracts/test-preflight-fixtures.sh
+
+smoke: ## Run platform smoke tests
+	@echo "=== Smoke tests ==="
+	@bash tests/smoke/linux-amd.sh
+	@bash tests/smoke/linux-nvidia.sh
+	@bash tests/smoke/wsl-logic.sh
+	@bash tests/smoke/macos-dispatch.sh
+	@echo "All smoke tests passed."
+
+simulate: ## Run installer simulation harness
+	@bash scripts/simulate-installers.sh
+
+doctor: ## Run diagnostic report
+	@bash scripts/dream-doctor.sh
+
+gate: lint test smoke simulate ## Full pre-release validation (lint + test + smoke + simulate)
+	@echo ""
+	@echo "Release gate passed."
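The `lint` target in the Makefile diff above parses every shell script with `bash -n` and fails if any script does not parse. As a rough illustration only, the same check can be sketched as a standalone shell function. The name `lint_shell_files` and its directory argument are assumptions made for this example and do not appear in the repository; only the `bash -n` check and the four excluded directories come from the Makefile.

```shell
# Hypothetical helper (not part of the repo): mirrors the Makefile `lint` loop.
# Parses every *.sh file under a directory with `bash -n` (syntax check only,
# nothing is executed) and returns non-zero if any file fails to parse.
lint_shell_files() {
  root="${1:-.}"
  # Same exclusions as SHELL_FILES in the Makefile, relative to $root.
  # With the `+` form, find exits non-zero when any invocation fails.
  find "$root" -name '*.sh' \
    -not -path "$root/node_modules/*" \
    -not -path "$root/.git/*" \
    -not -path "$root/data/*" \
    -not -path "$root/token-spy/*" \
    -exec sh -c '
      fail=0
      for f in "$@"; do
        bash -n "$f" || fail=1   # keep going so every broken script is reported
      done
      exit "$fail"
    ' sh {} +
}
```

Calling `lint_shell_files .` from the `dream-server/` directory should behave like the shell half of `make lint`; the Python compile step is left out of this sketch.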
diff --git a/dream-server/PSScriptAnalyzerSettings.psd1 b/dream-server/PSScriptAnalyzerSettings.psd1
new file mode 100644
index 000000000..85d6107e5
--- /dev/null
+++ b/dream-server/PSScriptAnalyzerSettings.psd1
@@ -0,0 +1,16 @@
+@{
+    Rules = @{
+        PSAvoidUsingWriteHost = @{
+            Enable = $false
+        }
+        PSAvoidUsingConvertToSecureStringWithPlainText = @{
+            Enable = $true
+        }
+        PSUseApprovedVerbs = @{
+            Enable = $true
+        }
+        PSUseDeclaredVarsMoreThanAssignments = @{
+            Enable = $true
+        }
+    }
+}
diff --git a/dream-server/QUICKSTART.md b/dream-server/QUICKSTART.md
index a8a10291e..7bf5eb448 100644
--- a/dream-server/QUICKSTART.md
+++ b/dream-server/QUICKSTART.md
@@ -2,21 +2,30 @@
 One command to a fully running local AI stack. No manual config, no dependency hell.
+See [`docs/SUPPORT-MATRIX.md`](docs/SUPPORT-MATRIX.md) before installing to confirm current platform support.
+
 ## Prerequisites
-**Linux:**
+**Linux (NVIDIA GPU):**
 - Docker with Compose v2+ ([Install](https://docs.docker.com/get-docker/))
 - NVIDIA GPU with 8GB+ VRAM (16GB+ recommended)
 - NVIDIA Container Toolkit ([Install](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html))
 - 40GB+ disk space (for models)
+**Linux (AMD Strix Halo):**
+- Docker with Compose v2+ ([Install](https://docs.docker.com/get-docker/))
+- AMD Ryzen AI MAX+ APU with 64GB+ unified memory
+- ROCm-compatible kernel (6.17+ recommended, 6.18.4+ ideal)
+- `/dev/kfd` and `/dev/dri` accessible (user in `video` + `render` groups)
+- 60GB+ disk space (for GGUF model files)
+
 **Windows:**
 - Windows 10 21H2+ or Windows 11
 - NVIDIA GPU with drivers
 - Docker Desktop (installer will prompt if missing)
 - WSL2 (installer will enable if needed)
-For Windows, use `install.ps1` instead — see [README.md](README.md#windows).
+For Windows and macOS status, see [README.md](README.md#platform-support) and [`docs/SUPPORT-MATRIX.md`](docs/SUPPORT-MATRIX.md).
 ## Step 1: Run the Installer
@@ -26,14 +35,19 @@ For Windows, use `install.ps1` instead — see [README.md](README.md#windows).
 The installer will:
 1. **Detect your GPU** and auto-select the right tier:
-   - Tier 1 (Entry): <12GB VRAM → Qwen2.5-7B, 8K context
-   - Tier 2 (Prosumer): 12-20GB VRAM → Qwen2.5-14B-AWQ, 16K context
-   - Tier 3 (Pro): 20-40GB VRAM → Qwen2.5-32B-AWQ, 32K context
-   - Tier 4 (Enterprise): 40GB+ VRAM → Qwen2.5-72B-AWQ, 32K context
-2. Check Docker and NVIDIA toolkit
+   - **AMD Strix Halo (unified memory)**:
+     - SH_LARGE (90GB+): qwen3-coder-next (80B MoE), 128K context
+     - SH_COMPACT (64-89GB): qwen3-30b-a3b (30B MoE), 128K context
+   - **NVIDIA (discrete GPU)**:
+     - Tier 1 (Entry): <12GB VRAM → qwen2.5-7b-instruct (GGUF Q4_K_M), 16K context
+     - Tier 2 (Prosumer): 12-20GB VRAM → qwen2.5-14b-instruct (GGUF Q4_K_M), 16K context
+     - Tier 3 (Pro): 20-40GB VRAM → qwen2.5-32b-instruct (GGUF Q4_K_M), 32K context
+     - Tier 4 (Enterprise): 40GB+ VRAM → qwen2.5-72b-instruct (GGUF Q4_K_M), 32K context
+2. Check Docker and GPU toolkit (NVIDIA Container Toolkit or ROCm devices)
 3. Ask which optional components to enable (voice, workflows, RAG)
 4. Generate secure passwords and configuration
-5. Start all services
+5. Apply system tuning (AMD: sysctl, amdgpu modprobe, etc.)
+6. Start all services
 **Override tier manually:** `./install.sh --tier 3`
@@ -41,13 +55,24 @@ The installer will:
 ## Step 2: Wait for Model Download
-First run downloads the LLM (~20GB for 32B AWQ). Watch progress:
+**NVIDIA:** First run downloads the LLM (~20GB for 32B GGUF). Watch progress:
+
+```bash
+docker compose logs -f llama-server
+```
+
+When you see `server is listening on`, you're ready!
+
+**AMD Strix Halo:** The GGUF model downloads in the background (~25-52GB). Watch progress:
 ```bash
-docker compose logs -f vllm
+tail -f ~/dream-server/logs/model-download.log
+
+# Or check llama-server readiness:
+docker compose -f docker-compose.base.yml -f docker-compose.amd.yml logs -f llama-server
 ```
-When you see `Application startup complete`, you're ready!
+When you see `server is listening on`, the model is loaded and ready.
 ## Step 3: Validate Installation
@@ -76,11 +101,22 @@ Visit: **http://localhost:3000**
 ## Step 5: Test the API
+**NVIDIA:**
+```bash
+curl http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "qwen2.5-32b-instruct",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```
+
+**AMD Strix Halo:**
 ```bash
-curl http://localhost:8000/v1/chat/completions \
+curl http://localhost:8080/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
-    "model": "Qwen/Qwen2.5-32B-Instruct-AWQ",
+    "model": "qwen3-coder-next",
     "messages": [{"role": "user", "content": "Hello!"}]
   }'
 ```
@@ -91,12 +127,21 @@ curl http://localhost:8000/v1/chat/completions \
 The installer auto-detects your GPU and selects the optimal configuration:
+**AMD Strix Halo:**
+
+| Tier | Unified VRAM | Model | Hardware |
+|------|-------------|-------|----------|
+| SH_LARGE | 90GB+ | qwen3-coder-next (80B MoE) | Ryzen AI MAX+ (96GB config) |
+| SH_COMPACT | 64-89GB | qwen3:30b-a3b (30B MoE) | Ryzen AI MAX+ (64GB config) |
+
+**NVIDIA:**
+
 | Tier | VRAM | Model | Example GPUs |
 |------|------|-------|--------------|
 | 1 (Entry) | <12GB | Qwen2.5-7B | RTX 3080, RTX 4070 |
-| 2 (Prosumer) | 12-20GB | Qwen2.5-14B-AWQ | RTX 3090, RTX 4080 |
-| 3 (Pro) | 20-40GB | Qwen2.5-32B-AWQ | RTX 4090, A6000 |
-| 4 (Enterprise) | 40GB+ | Qwen2.5-72B-AWQ | A100, H100 |
+| 2 (Prosumer) | 12-20GB | Qwen2.5-14B (GGUF Q4_K_M) | RTX 3090, RTX 4080 |
+| 3 (Pro) | 20-40GB | Qwen2.5-32B (GGUF Q4_K_M) | RTX 4090, A6000 |
+| 4 (Enterprise) | 40GB+ | Qwen2.5-72B (GGUF Q4_K_M) | A100, H100 |
 To check what tier you'd get without installing:
@@ -108,61 +153,79 @@ To check what tier you'd get without installing:
 ## Common Issues
-### "OOM" or "CUDA out of memory"
+### "OOM" or "CUDA out of memory" (NVIDIA)
 Reduce context window in `.env`:
 ```
-MAX_CONTEXT=4096  # or even 2048
+CTX_SIZE=4096  # or even 2048
 ```
 Or switch to a smaller model:
 ```
-LLM_MODEL=Qwen/Qwen2.5-7B-Instruct
+LLM_MODEL=qwen2.5-7b-instruct
 ```
+### AMD: llama-server crash loop
+
+Check logs: `docker compose -f docker-compose.base.yml -f docker-compose.amd.yml logs llama-server`
+
+Common causes:
+- GGUF file not found: ensure `data/models/*.gguf` exists
+- Wrong GGUF format: use upstream llama.cpp GGUFs (NOT Ollama blobs)
+- Missing ROCm env vars: `HSA_OVERRIDE_GFX_VERSION=11.5.1` must be set
+
 ### Model download fails
 1. Check disk space: `df -h`
-2. Try again: `docker compose restart vllm`
-3. Or pre-download with Hugging Face CLI
+2. **NVIDIA:** Try again: `docker compose restart llama-server`
+3. **AMD:** Resume download: `wget -c -O data/models/.gguf `
 ### WebUI shows "No models available"
-vLLM is still loading. Check: `docker compose logs vllm`
+The inference engine is still loading.
+- **NVIDIA:** Check: `docker compose logs llama-server`
+- **AMD:** Check: `docker compose -f docker-compose.base.yml -f docker-compose.amd.yml logs llama-server`
 ### Port conflicts
 Edit `.env` to change ports:
 ```
 WEBUI_PORT=3001
-VLLM_PORT=8001
+LLAMA_SERVER_PORT=8081  # LLM inference port
 ```
 ---
 ## Next Steps
-- **Enable voice**: `docker compose --profile voice up -d`
-- **Try voice-to-voice**: Import `workflows/05-voice-to-voice.json` into n8n — speak, get spoken answers back
-- **Add workflows**: `docker compose --profile workflows up -d` (see `workflows/README.md`)
-- **Set up RAG**: `docker compose --profile rag up -d`
-- **Connect OpenClaw**: Use this as your local inference backend
+- **Add workflows**: Open n8n at http://localhost:5678 to create custom automation workflows
+- **Connect OpenClaw**: Use this as your local inference backend at http://localhost:7860
+- **Dashboard**: Monitor services, GPU, and health at http://localhost:3001
 ---
 ## Stopping
 ```bash
+# NVIDIA
 docker compose down
+
+# AMD Strix Halo
+docker compose -f docker-compose.base.yml -f docker-compose.amd.yml down
 ```
 ## Updating
 ```bash
+# NVIDIA
 docker compose pull
 docker compose up -d
+
+# AMD Strix Halo
+docker compose -f docker-compose.base.yml -f docker-compose.amd.yml pull
+docker compose -f docker-compose.base.yml -f docker-compose.amd.yml up -d --build
 ```
 ---
-Built by The Collective • [Lighthouse AI](https://github.com/Light-Heart-Labs/Lighthouse-AI)
+Built by The Collective • [DreamServer](https://github.com/Light-Heart-Labs/DreamServer)
diff --git a/dream-server/README.md b/dream-server/README.md
index 82392edd5..4cd33363e 100644
--- a/dream-server/README.md
+++ b/dream-server/README.md
@@ -3,24 +3,42 @@
 [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](../LICENSE)
 [![Docker](https://img.shields.io/badge/Docker-Required-2496ED?logo=docker)](https://docs.docker.com/get-docker/)
 [![NVIDIA](https://img.shields.io/badge/NVIDIA-GPU%20Accelerated-76B900?logo=nvidia)](https://developer.nvidia.com/cuda-toolkit)
+[![AMD](https://img.shields.io/badge/AMD-Strix%20Halo%20ROCm-ED1C24?logo=amd)](https://rocm.docs.amd.com/)
 [![n8n](https://img.shields.io/badge/n8n-Workflows-FF6D5A?logo=n8n)](https://n8n.io)
 **Your turnkey local AI stack.** Buy hardware. Run installer. AI running.
 ---
+## Platform Support
+
+See [`docs/SUPPORT-MATRIX.md`](docs/SUPPORT-MATRIX.md) for current support tiers and platform status.
+Launch-claim guardrails: [`docs/PLATFORM-TRUTH-TABLE.md`](docs/PLATFORM-TRUTH-TABLE.md)
+Known-good version baselines: [`docs/KNOWN-GOOD-VERSIONS.md`](docs/KNOWN-GOOD-VERSIONS.md)
+
+## Installer Evidence
+
+- Run simulation suite: `bash scripts/simulate-installers.sh`
+- Output artifacts:
+  - `artifacts/installer-sim/summary.json`
+  - `artifacts/installer-sim/SUMMARY.md`
+- CI uploads these artifacts on each PR via `.github/workflows/test-linux.yml`
+- One-command maintainer gate: `bash scripts/release-gate.sh`
+
+---
+
 ## 5-Minute Quickstart
 ```bash
 # One-line install (Linux/WSL)
-curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/get-dream-server.sh | bash
+curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/DreamServer/main/get-dream-server.sh | bash
 ```
 Or manually:
 ```bash
-git clone https://github.com/Light-Heart-Labs/Lighthouse-AI.git
-cd Lighthouse-AI/dream-server
+git clone https://github.com/Light-Heart-Labs/DreamServer.git
+cd DreamServer
 ./install.sh
 ```
@@ -42,41 +60,58 @@ To skip bootstrap and wait for the full model: `./install.sh --no-bootstrap`
 ### Windows
 ```powershell
-Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/install.ps1" -OutFile install.ps1
-.\install.ps1
+.\installers\windows.ps1
 ```
-The Windows installer handles WSL2 setup, Docker Desktop, and NVIDIA drivers automatically.
-
-**Requirements:** Windows 10 21H2+ or Windows 11, NVIDIA GPU, Docker Desktop
+Windows installer performs prerequisite checks, emits a preflight report, and delegates to WSL2 install path. See [`docs/SUPPORT-MATRIX.md`](docs/SUPPORT-MATRIX.md) for exact support level.
 ---
 ## What's Included
-| Component | Purpose | Port |
-|-----------|---------|------|
-| **vLLM** | High-performance LLM inference | 8000 |
-| **Open WebUI** | Beautiful chat interface | 3000 |
-| **Dashboard** | System status, GPU metrics, service health | 3001 |
-| **Privacy Shield** | PII redaction for external API calls | 8085 |
-| **Whisper** | Speech-to-text (optional) | 9000 |
-| **Kokoro** | Text-to-speech (optional) | 8880 |
-| **LiveKit** | Real-time WebRTC voice chat (optional) | 7880 |
-| **n8n** | Workflow automation (optional) | 5678 |
-| **Qdrant** | Vector database for RAG (optional) | 6333 |
-| **LiteLLM** | Multi-model API gateway (optional) | 4000 |
+| Component | Purpose | Port | Backend |
+|-----------|---------|------|---------|
+| **llama-server** | LLM inference engine | 8080 | Both |
+| **Open WebUI** | Beautiful chat interface | 3000 | Both |
+| **Dashboard** | System status, GPU metrics, service health | 3001 | Both |
+| **Dashboard API** | Backend API for dashboard | 3002 | Both |
+| **LiteLLM** | Multi-model API gateway | 4000 | Both |
+| **OpenClaw** | Autonomous AI agent framework | 7860 | Both |
+| **SearXNG** | Self-hosted web search | 8888 | Both |
+| **Perplexica** | Deep research engine | 3004 | Both |
+| **n8n** | Workflow automation | 5678 | Both |
+| **Qdrant** | Vector database for RAG | 6333 | Both |
+| **Embeddings** | Text embeddings for RAG | 8090 | Both |
+| **Whisper** | Speech-to-text | 9000 | Both |
+| **Kokoro** | Text-to-speech | 8880 | Both |
+| **Privacy Shield** | PII protection for API calls | 8085 | Both |
+| **Memory Shepherd** | Agent memory lifecycle management | — | AMD |
+| **ComfyUI** | Image generation | 8188 | Both |
 ## Hardware Tiers
 The installer **automatically detects your GPU** and selects the right configuration:
-| Tier | VRAM | Model | Context | Example GPUs |
-|------|------|-------|---------|--------------|
-| 1 (Entry) | <12GB | Qwen2.5-7B | 8K | RTX 3080, RTX 4070 |
-| 2 (Prosumer) | 12-20GB | Qwen2.5-14B-AWQ | 16K | RTX 3090, RTX 4080 |
-| 3 (Pro) | 20-40GB | Qwen2.5-32B-AWQ | 32K | RTX 4090, A6000 |
-| 4 (Enterprise) | 40GB+ | Qwen2.5-72B-AWQ | 32K | A100, H100, multi-GPU |
+### AMD Strix Halo (Unified Memory)
+
+| Tier | Unified VRAM | Model | Context | Example Hardware |
+|------|-------------|-------|---------|-----------------|
+| SH_LARGE | 90GB+ | qwen3-coder-next (80B MoE, 3B active) | 128K | Ryzen AI MAX+ 395 (96GB VRAM config) |
+| SH_COMPACT | 64-89GB | qwen3-30b-a3b (30B MoE, 3B active) | 128K | Ryzen AI MAX+ 395 (64GB VRAM config) |
+
+Both tiers use `qwen2.5:7b` as a bootstrap model for instant startup. The full model downloads in the background via GGUF from HuggingFace.
+
+**Inference backend:** llama-server via ROCm 7.2 (Docker image: `kyuz0/amd-strix-halo-toolboxes:rocm-7.2`)
+
+### NVIDIA (Discrete GPU)
+
+| Tier | VRAM | Model | Quant | Context | Example GPUs |
+|------|------|-------|-------|---------|--------------|
+| NV_ULTRA | 90GB+ | qwen3-coder-next | GGUF Q4_K_M | 128K | Multi-GPU A100/H100 |
+| 1 (Entry) | <12GB | qwen2.5-7b-instruct | GGUF Q4_K_M | 16K | RTX 3080, RTX 4070 |
+| 2 (Prosumer) | 12-20GB | qwen2.5-14b-instruct | GGUF Q4_K_M | 16K | RTX 3090, RTX 4080 |
+| 3 (Pro) | 20-40GB | qwen2.5-32b-instruct | GGUF Q4_K_M | 32K | RTX 4090, A6000 |
+| 4 (Enterprise) | 40GB+ | qwen2.5-72b-instruct | GGUF Q4_K_M | 32K | A100, H100, multi-GPU |
 Override with: `./install.sh --tier 3`
@@ -86,6 +121,33 @@ See [docs/HARDWARE-GUIDE.md](docs/HARDWARE-GUIDE.md) for buying recommendations.
 ## Architecture
+### AMD Strix Halo (llama-server + ROCm)
+
+```
+┌─────────────────────────────────────────────────┐
+│                   Open WebUI                    │
+│                (localhost:3000)                 │
+└─────────────────────┬───────────────────────────┘
+                      │
+┌─────────────────────▼───────────────────────────┐
+│             llama-server (ROCm 7.2)             │
+│             (localhost:8080/v1/...)             │
+│        qwen3-coder-next / qwen3-30b-a3b         │
+└─────────────────────────────────────────────────┘
+         │                               │
+┌────────▼────────┐              ┌───────▼────────┐
+│    OpenClaw     │              │   Dashboard    │
+│  (Agent :7860)  │              │ (Status :3001) │
+└─────────────────┘              └────────────────┘
+
+┌─────────────┐ ┌─────────────┐ ┌─────────────┐
+│ n8n (:5678) │ │Qdrant(:6333)│ │LiteLLM(:4000)│
+│  Workflows  │ │  Vector DB  │ │ API Gateway │
+└─────────────┘ └─────────────┘ └─────────────┘
+```
+
+### NVIDIA (llama-server + CUDA)
+
 ```
 ┌─────────────────────────────────────────────────┐
 │                   Open WebUI                    │
@@ -93,9 +155,9 @@ See [docs/HARDWARE-GUIDE.md](docs/HARDWARE-GUIDE.md) for buying recommendations.
 └─────────────────────┬───────────────────────────┘
                       │
 ┌─────────────────────▼───────────────────────────┐
-│                      vLLM                       │
-│             (localhost:8000/v1/...)             │
-│            Qwen2.5-32B-Instruct-AWQ             │
+│               llama-server (CUDA)               │
+│             (localhost:8080/v1/...)             │
+│              qwen2.5-32b-instruct               │
 └─────────────────────────────────────────────────┘
          │                               │
 ┌────────▼────────┐              ┌───────▼────────┐
@@ -104,57 +166,98 @@ See [docs/HARDWARE-GUIDE.md](docs/HARDWARE-GUIDE.md) for buying recommendations.
 └─────────────────┘              └────────────────┘
 ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
-│ n8n (:5678) │ │Qdrant(:6333)│ │LiteLLM(:4K) │
+│ n8n (:5678) │ │Qdrant(:6333)│ │LiteLLM(:4000)│
 │  Workflows  │ │  Vector DB  │ │ API Gateway │
 └─────────────┘ └─────────────┘ └─────────────┘
 ```
-## Optional Profiles
+## Modding & Customization
+
+### Extension Services
+
+Each service under `extensions/services/` IS the mod. Drop in a directory, run `dream enable `, and it appears in compose, CLI, dashboard, and health checks.
-Enable components with Docker Compose profiles:
+```
+extensions/services/
+  my-service/
+    manifest.yaml  # Service metadata, aliases, category
+    compose.yaml  # Docker Compose fragment (auto-merged)
+```
 ```bash
-# Voice (STT + TTS)
-docker compose --profile voice up -d
+dream enable my-service  # Enable an extension
+dream disable my-service  # Disable it
+dream list  # See all services and status
+```
-# Workflows (n8n)
-docker compose --profile workflows up -d
+Full guide: [docs/EXTENSIONS.md](docs/EXTENSIONS.md)
-# RAG (Qdrant + embeddings)
-docker compose --profile rag up -d
+### Installer Architecture
-# LiveKit Voice Chat (real-time WebRTC voice)
-docker compose --profile livekit --profile voice up -d
+The installer is modular — 6 libraries and 13 phases, each in its own file.
+Want to add a hardware tier, swap the theme, or skip a phase? Edit one file.
-# Everything
-docker compose --profile voice --profile workflows --profile rag --profile livekit up -d
+```
+installers/lib/  # Pure function libraries (colors, GPU detection, tier mapping)
+installers/phases/  # Sequential install steps (01-preflight through 13-summary)
+install-core.sh  # Thin orchestrator (~150 lines)
 ```
-### LiveKit Voice Chat
+Every file has a standardized header: Purpose, Expects, Provides, Modder notes.
-Real-time voice conversation with your local AI:
+Full guide with copy-paste recipes: [docs/INSTALLER-ARCHITECTURE.md](docs/INSTALLER-ARCHITECTURE.md)
-1. Enable the profile: `docker compose --profile livekit --profile voice up -d`
-2. Open http://localhost:7880 for LiveKit playground
-3. Or integrate with any LiveKit-compatible client
+## Configuration
-**What it does:**
-- WebRTC voice streaming (low latency)
-- Whisper STT → Local LLM → Kokoro TTS pipeline
-- Works with browser, mobile apps, or custom clients
+The installer generates `.env` automatically. Key settings:
-See `agents/voice/` for the agent implementation.
+```bash
+# NVIDIA
+LLM_MODEL=qwen2.5-32b-instruct # Model (auto-set by installer)
+CTX_SIZE=32768 # Context window
+
+# AMD Strix Halo
+LLM_MODEL=qwen3-coder-next # or qwen3-30b-a3b for compact tier
+CTX_SIZE=131072 # Context window
+GPU_BACKEND=amd # Set automatically by installer
+```

-## Configuration
+## dream-cli

-Copy `.env.example` to `.env` and customize:
+The `dream` CLI is the primary management tool. It's installed automatically at `~/dream-server/dream-cli` and can be symlinked to your PATH.

 ```bash
-LLM_MODEL=Qwen/Qwen2.5-32B-Instruct-AWQ # Model (auto-set by installer)
-MAX_CONTEXT=8192 # Context window
-GPU_UTIL=0.9 # VRAM allocation (0.0-1.0)
+# Service management
+dream status # Health checks + GPU status
+dream list # Show all services and their state
+dream logs # Tail logs (accepts aliases: llm, stt, tts)
+dream restart [service] # Restart one or all services
+dream start / stop # Start or stop the stack
+
+# LLM mode switching
+dream mode # Show current mode (local/cloud/hybrid)
+dream mode cloud # Switch to cloud APIs via LiteLLM
+dream mode local # Switch to local llama-server
+dream mode hybrid # Local primary, cloud fallback
+
+# Model management (local mode)
+dream model current # Show active model
+dream model list # List available tiers
+dream model swap T3 # Switch to a different tier
+
+# Extensions
+dream enable n8n # Enable an extension
+dream disable whisper # Disable an extension
+
+# Configuration
+dream config show # View .env (secrets masked)
+dream config edit # Open .env in editor
+dream preset save # Snapshot current config
+dream preset load # Restore a saved preset
 ```

+Full mode-switching documentation: [docs/MODE-SWITCH.md](docs/MODE-SWITCH.md)
+
 ## Showcase & Demos

 ```bash
@@ -171,41 +274,50 @@ GPU_UTIL=0.9 # VRAM allocation (0.0-1.0)
 ## Useful Commands

 ```bash
-cd ~/dream-server
-docker compose ps # Check status
-docker compose logs -f vllm # Watch vLLM logs
-docker compose restart # Restart services
-docker compose
down # Stop everything
-./status.sh # Health check all services
+# dream-cli handles compose flags automatically (works on AMD and NVIDIA)
+dream status # Check all services
+dream list # See available services and status
+dream logs llm # Watch llama-server logs (alias: llm)
+dream logs stt # Watch Whisper logs (alias: stt)
+dream restart whisper # Restart a service
+dream enable n8n # Enable an extension
+dream disable comfyui # Disable an extension
+dream stop # Stop everything
+dream start # Start everything
+
+# Management scripts
+./scripts/session-cleanup.sh # Clean up bloated agent sessions
+./scripts/llm-cold-storage.sh --status # Check model hot/cold storage
+dream mode status # Show current mode
 ```

 ## Comparison

 | Feature | Dream Server | Ollama + WebUI | LocalAI |
 |---------|:---:|:---:|:---:|
-| Full-stack one-command install | **LLM + voice + workflows + RAG + privacy** | LLM + chat only | LLM only |
-| Hardware auto-detect + model selection | **Yes** | No | No |
-| Voice agents (STT + TTS + WebRTC) | **Built in** | No | Limited |
-| Inference engine | **vLLM** (continuous batching) | llama.cpp | llama.cpp |
+| Full-stack one-command install | **LLM + agent + workflows + RAG** | LLM + chat only | LLM only |
+| Hardware auto-detect + model selection | **NVIDIA + AMD Strix Halo** | No | No |
+| AMD APU / unified memory support | **ROCm + llama-server** | Partial (Vulkan) | No |
+| Inference engine | **llama-server** (all GPUs) | llama.cpp | llama.cpp |
+| Autonomous AI agent | **OpenClaw** | No | No |
 | Workflow automation | **n8n (400+ integrations)** | No | No |
-| PII redaction / privacy tools | **Built in** | No | No |
-| Multi-GPU | **Yes** | Partial | Partial |
+| LLM usage monitoring | **Open WebUI built-in** | No | No |
+| Multi-GPU | **Yes** (NVIDIA) | Partial | Partial |

 ---

 ## Troubleshooting FAQ

-**vLLM won't start / OOM errors**
-- Reduce `MAX_CONTEXT` in `.env` (try 4096)
-- Lower `GPU_UTIL` to 0.85
+**llama-server won't start / OOM
errors**
+- Reduce `CTX_SIZE` in `.env` (try 4096)
 - Use a smaller model: `./install.sh --tier 1`

 **"Model not found" on first boot**
 - First launch downloads the model (10-30 min depending on size)
-- Watch progress: `docker compose logs -f vllm`
+- Watch progress: `dream logs llm`

 **Open WebUI shows "Connection error"**
-- vLLM is still loading. Wait for health check to pass: `curl localhost:8000/health`
+- llama-server is still loading. Wait for health check to pass: `curl localhost:8080/health`

 **Port already in use**
 - Change ports in `.env` (e.g., `WEBUI_PORT=3001`)
@@ -220,16 +332,29 @@ docker compose down # Stop everything
 - Verify with `nvidia-smi` inside WSL
 - Ensure Docker Desktop has WSL integration enabled

+**AMD Strix Halo: llama-server won't start**
+- Check GGUF model exists: `ls -lh data/models/*.gguf`
+- Watch logs: `docker compose -f docker-compose.base.yml -f docker-compose.amd.yml logs -f llama-server`
+- Verify GPU devices: `ls /dev/kfd /dev/dri/renderD128`
+- Ensure ROCm env: `HSA_OVERRIDE_GFX_VERSION=11.5.1` must be set
+
+**AMD: "missing tensor" errors**
+- Use upstream llama.cpp GGUF files (from `unsloth/` on HuggingFace)
+- Ollama's GGUF format has incompatible tensor naming for qwen3next architecture
+- Do NOT use Ollama blob files with llama-server
+
 ---

 ## Documentation

+- [docs/README.md](docs/README.md) — **Full documentation index** (start here)
 - [QUICKSTART.md](QUICKSTART.md) — Detailed setup guide
 - [HARDWARE-GUIDE.md](docs/HARDWARE-GUIDE.md) — What to buy
-- [TROUBLESHOOTING.md](docs/TROUBLESHOOTING.md) — Extended troubleshooting
+- [EXTENSIONS.md](docs/EXTENSIONS.md) — Add services, manifests, dashboard plugins
+- [INSTALLER-ARCHITECTURE.md](docs/INSTALLER-ARCHITECTURE.md) — Modding the installer
+- [INTEGRATION-GUIDE.md](docs/INTEGRATION-GUIDE.md) — Connect your apps
 - [SECURITY.md](SECURITY.md) — Security best practices
-- [OPENCLAW-INTEGRATION.md](docs/OPENCLAW-INTEGRATION.md) — Connect OpenClaw agents
-- 
[Workflows README](workflows/README.md) — Pre-built n8n workflows
+- [CHANGELOG.md](CHANGELOG.md) — Version history

 ## License

@@ -237,4 +362,4 @@ Apache 2.0 — Use it, modify it, sell it. Just don't blame us.

 ---

-*Built by [The Collective](https://github.com/Light-Heart-Labs/Lighthouse-AI) — Android-17, Todd, and friends*
+*Built by [The Collective](https://github.com/Light-Heart-Labs/DreamServer) — Android-17, Todd, and friends*
diff --git a/dream-server/SECURITY.md b/dream-server/SECURITY.md
index fbfedda3b..823df56da 100644
--- a/dream-server/SECURITY.md
+++ b/dream-server/SECURITY.md
@@ -61,7 +61,7 @@ For access from other devices on your network:
 ```bash
 # Allow specific ports from local network
 sudo ufw allow from 192.168.0.0/24 to any port 3000 # WebUI
-sudo ufw allow from 192.168.0.0/24 to any port 8000 # LLM API
+sudo ufw allow from 192.168.0.0/24 to any port 8080 # LLM API
 ```

 ### Exposing to Internet (Not Recommended)
@@ -92,7 +92,7 @@ server {
     location / {
         limit_req zone=ai burst=5;

-        proxy_pass http://127.0.0.1:8000;
+        proxy_pass http://127.0.0.1:8080;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
     }
@@ -111,7 +111,7 @@ Prevent runaway containers:

 ```yaml
 services:
-  vllm:
+  llama-server:
     deploy:
       resources:
         limits:
@@ -122,7 +122,7 @@ services:

 ### Principle of Least Privilege

-The docker-compose.yml uses:
+The docker-compose files use:
 - Non-root users where possible
 - Read-only volumes where appropriate
 - GPU access only for services that need it
@@ -166,10 +166,10 @@ gpg -d dream-backup-YYYYMMDD.tar.gz.gpg | tar -xz
 ### Recommended Architecture

 ```
-Client → LiteLLM (with API key) → vLLM (localhost only)
+Client → LiteLLM (with API key) → llama-server (localhost only)
 ```

-vLLM has no authentication by default. Use LiteLLM as your authenticated gateway for remote access.
+llama-server has no authentication by default. Use LiteLLM as your authenticated gateway for remote access.

 ### Service-Specific

@@ -177,7 +177,7 @@ vLLM has no authentication by default. Use LiteLLM as your authenticated gateway
 |---------|------|-------|
 | Open WebUI | Built-in | Change admin password, disable signups |
 | n8n | Basic auth | Use strong password, enable 2FA |
-| vLLM | None | Keep localhost-only, use LiteLLM for remote |
+| llama-server | None | Keep localhost-only, use LiteLLM for remote |
 | LiteLLM | API key | Set `LITELLM_KEY` in .env |

 ---
@@ -186,7 +186,7 @@ vLLM has no authentication by default. Use LiteLLM as your authenticated gateway

 ```bash
 # Watch for errors
-docker compose logs -f vllm | grep -i error
+docker compose logs -f llama-server | grep -i error

 # Monitor resource usage
 watch -n 5 'nvidia-smi; docker stats --no-stream'
@@ -209,7 +209,7 @@ docker compose pull
 docker compose up -d
 ```

-Watch for security updates to: vLLM, Open WebUI, n8n, base images.
+Watch for security updates to: llama-server, Open WebUI, n8n, base images.

 ---

diff --git a/dream-server/agents/templates/README.md b/dream-server/agents/templates/README.md
index 3fc5a7e2d..ae57842ee 100644
--- a/dream-server/agents/templates/README.md
+++ b/dream-server/agents/templates/README.md
@@ -3,7 +3,7 @@
 **Mission:** M7 (OpenClaw Frontier Pushing)
 **Status:** 5 templates created, awaiting validation

-Validated agent templates that work reliably on local Qwen2.5-32B-Instruct-AWQ.
+Validated agent templates that work reliably on local Qwen3-14B.

 ## Templates

@@ -29,12 +29,12 @@ Validated agent templates that work reliably on local Qwen2.5-32B-Instruct-AWQ.
 agent:
   template: code-assistant
   override:
-    model: local-vllm/Qwen/Qwen2.5-32B-Instruct-AWQ
+    model: local-llama/qwen3-14b
 ```

 ## Validation Results (2026-02-11)

-Tested on: Qwen2.5-32B-Instruct-AWQ-Instruct-AWQ (local)
+Tested on: Qwen3-14B-Instruct-AWQ (local)
 Test command: `python3 tests/validate-agent-templates.py`

 | Template | Tests | Passed | Status |
@@ -55,7 +55,7 @@ Test command: `python3 tests/validate-agent-templates.py`

 ## Design Principles

-1. **Local-first:** Templates optimized for Qwen2.5-32B-Instruct-AWQ (free, fast, private)
+1. **Local-first:** Templates optimized for Qwen3-14B (free, fast, private)
 2. **Fallback-aware:** Creative tasks route to Kimi; technical tasks stay local
 3. **Tool-appropriate:** Each template gets only the tools it needs
 4. **Safety-conscious:** Dangerous operations flagged (system-admin)
diff --git a/dream-server/agents/templates/code-assistant.yaml b/dream-server/agents/templates/code-assistant.yaml
index c05046702..336d3e048 100644
--- a/dream-server/agents/templates/code-assistant.yaml
+++ b/dream-server/agents/templates/code-assistant.yaml
@@ -1,13 +1,13 @@
 # Code Assistant Agent Template
 # Mission: M7 (OpenClaw Frontier Pushing)
-# Validated on: Qwen2.5-32B-Instruct-AWQ
+# Validated on: Qwen3-14B
 # Purpose: Programming help, debugging, code review

 agent:
   name: code-assistant
   description: "Programming assistant for code generation, debugging, and review"

-  model: local-vllm/Qwen/Qwen2.5-32B-Instruct-AWQ
+  model: local-llama/qwen3-14b
   # Qwen Coder excels at programming tasks - no fallback needed

   system_prompt: |
@@ -59,7 +59,7 @@ agent:
   # /agent load code-assistant

   notes:
-    - Optimized for Qwen2.5-Coder - works reliably on local hardware
+    - Optimized for Qwen3 - works reliably on local hardware
     - Handles Python, JavaScript, Go, Rust, and most common languages
     - For very large codebases, consider splitting into smaller chunks
     - Tested on RTX 3090 (24GB) with ~500ms response time
diff --git
a/dream-server/agents/templates/data-analyst.yaml b/dream-server/agents/templates/data-analyst.yaml
index 9a9ffcb6c..962390ec1 100644
--- a/dream-server/agents/templates/data-analyst.yaml
+++ b/dream-server/agents/templates/data-analyst.yaml
@@ -1,13 +1,13 @@
 # Data Analyst Agent Template
 # Mission: M7 (OpenClaw Frontier Pushing)
-# Validated on: Qwen2.5-32B-Instruct-AWQ
+# Validated on: Qwen3-14B
 # Purpose: CSV/JSON analysis, data processing, visualization guidance

 agent:
   name: data-analyst
   description: "Data analysis assistant for processing CSV, JSON, and structured data"

-  model: local-vllm/Qwen/Qwen2.5-32B-Instruct-AWQ
+  model: local-llama/qwen3-14b
   # Coder model excels at data manipulation tasks

   system_prompt: |
diff --git a/dream-server/agents/templates/research-assistant.yaml b/dream-server/agents/templates/research-assistant.yaml
index 3c98251aa..641307738 100644
--- a/dream-server/agents/templates/research-assistant.yaml
+++ b/dream-server/agents/templates/research-assistant.yaml
@@ -1,13 +1,13 @@
 # Research Assistant Agent Template
 # Mission: M7 (OpenClaw Frontier Pushing)
-# Validated on: Qwen2.5-32B-Instruct-AWQ
+# Validated on: Qwen3-14B
 # Purpose: Web research, summarization, fact-checking

 agent:
   name: research-assistant
   description: "Research assistant for web search, summarization, and analysis"

-  model: local-vllm/Qwen/Qwen2.5-32B-Instruct-AWQ
+  model: local-llama/qwen3-14b
   # Falls back to Kimi for complex synthesis if needed
   fallback_model: moonshot/kimi-k2-0711-preview

diff --git a/dream-server/agents/templates/system-admin.yaml b/dream-server/agents/templates/system-admin.yaml
index 265ce50d9..e0c81a025 100644
--- a/dream-server/agents/templates/system-admin.yaml
+++ b/dream-server/agents/templates/system-admin.yaml
@@ -1,13 +1,13 @@
 # System Admin Assistant Agent Template
 # Mission: M7 (OpenClaw Frontier Pushing)
-# Validated on: Qwen2.5-32B-Instruct-AWQ
+# Validated on: Qwen3-14B
 # Purpose: Docker management, server administration,
troubleshooting

 agent:
   name: system-admin
   description: "System administration assistant for Docker, Linux, and server management"

-  model: local-vllm/Qwen/Qwen2.5-32B-Instruct-AWQ
+  model: local-llama/qwen3-14b
   # Coder model excels at system commands and scripting

   system_prompt: |
diff --git a/dream-server/agents/templates/writing-assistant.yaml b/dream-server/agents/templates/writing-assistant.yaml
index a5af4089d..6e54e4044 100644
--- a/dream-server/agents/templates/writing-assistant.yaml
+++ b/dream-server/agents/templates/writing-assistant.yaml
@@ -1,6 +1,6 @@
 # Writing Assistant Agent Template
 # Mission: M7 (OpenClaw Frontier Pushing)
-# Validated on: Qwen2.5-32B-Instruct-AWQ
+# Validated on: Qwen3-14B
 # Purpose: Creative writing, editing, style improvement
 # NOTE: Local Qwen has limitations on creative tasks - use with fallback

@@ -8,8 +8,8 @@ agent:
   name: writing-assistant
   description: "Writing assistant for drafting, editing, and improving text"

-  model: local-vllm/Qwen/Qwen2.5-32B-Instruct-AWQ
-  # IMPORTANT: Qwen Coder is NOT optimized for creative writing
+  model: local-llama/qwen3-14b
+  # IMPORTANT: Qwen3 is NOT optimized for creative writing
   # This template uses fallback for creative generation tasks
   fallback_model: moonshot/kimi-k2-0711-preview

@@ -79,7 +79,7 @@ agent:
   import: "agents/templates/writing-assistant.yaml"

   notes:
-    - CRITICAL: Local Qwen Coder struggles with creative generation
+    - CRITICAL: Local Qwen3 struggles with creative generation
     - Use this template for EDITING tasks (grammar, clarity, structure)
     - Creative generation automatically routes to fallback model
     - For pure creative work, consider using Kimi/Claude directly
diff --git a/dream-server/agents/voice-offline/Dockerfile b/dream-server/agents/voice-offline/Dockerfile
deleted file mode 100644
index 3a4b442a9..000000000
--- a/dream-server/agents/voice-offline/Dockerfile
+++ /dev/null
@@ -1,47 +0,0 @@
-# Dream Server Voice Agent - OFFLINE MODE
-# Local-only voice chat using LiveKit
+ local LLM -# M1 Phase 2 - Zero cloud dependencies -# -# Build: docker build -t dream-voice-agent-offline . -# Run: docker run --network dream-network-offline dream-voice-agent-offline - -FROM python:3.11-slim - -WORKDIR /app - -# Install system deps (portaudio for audio, ffmpeg for transcoding) -RUN apt-get update && apt-get install -y --no-install-recommends \ - gcc \ - libffi-dev \ - libportaudio2 \ - libportaudiocpp0 \ - portaudio19-dev \ - ffmpeg \ - curl \ - wget \ - && rm -rf /var/lib/apt/lists/* - -# Install Python deps -COPY requirements.txt . -RUN pip install --no-cache-dir -r requirements.txt - -# Copy agent code -COPY agent.py . -COPY entrypoint.sh . -RUN chmod +x entrypoint.sh - -# Copy deterministic module -COPY deterministic/ ./deterministic/ - -# Copy offline-specific flows -COPY flows/ ./flows/ - -# Create health check endpoint -COPY health_check.py . - -# Non-root user for security -RUN useradd -m -u 1000 agent && chown -R agent:agent /app -USER agent - -# Run the agent -CMD ["./entrypoint.sh"] \ No newline at end of file diff --git a/dream-server/agents/voice-offline/agent.py b/dream-server/agents/voice-offline/agent.py deleted file mode 100644 index e93c25a5d..000000000 --- a/dream-server/agents/voice-offline/agent.py +++ /dev/null @@ -1,316 +0,0 @@ -#!/usr/bin/env python3 -""" -Dream Server Voice Agent - Offline Mode -Main agent implementation for local-only voice chat -M1 Phase 2 - Zero cloud dependencies - -Uses LiveKit Agents SDK v1.4+ with local model backends: -- LLM: vLLM (OpenAI-compatible) -- STT: Whisper (OpenAI-compatible API) -- TTS: Kokoro (OpenAI-compatible API) -- VAD: Silero (built-in) -""" - -import os -import asyncio -import logging -import signal -from typing import Optional - -from livekit.agents import ( - JobContext, - JobProcess, - WorkerOptions, - cli, -) -from livekit.agents import Agent, AgentSession -from livekit.plugins import silero, openai as openai_plugin - -# Configure logging -logging.basicConfig( - 
level=logging.INFO, - format='%(asctime)s | %(name)s | %(levelname)s | %(message)s' -) -logger = logging.getLogger("dream-voice-offline") - -# Environment config -LIVEKIT_URL = os.getenv("LIVEKIT_URL", "ws://localhost:7880") -LLM_URL = os.getenv("LLM_URL", "http://vllm:8000/v1") -LLM_MODEL = os.getenv("LLM_MODEL", "Qwen/Qwen2.5-32B-Instruct-AWQ") -STT_URL = os.getenv("STT_URL", "http://whisper:9000/v1") -TTS_URL = os.getenv("TTS_URL", "http://tts:8880/v1") -TTS_VOICE = os.getenv("TTS_VOICE", "af_heart") - -# Offline mode settings -OFFLINE_MODE = os.getenv("OFFLINE_MODE", "true").lower() == "true" - -# System prompt for offline mode -OFFLINE_SYSTEM_PROMPT = """You are Dream Agent running in offline mode on local hardware. -You have access to local tools and services only. Be helpful, accurate, and maintain privacy. -Keep responses conversational and concise - this is voice, not text. - -Key capabilities: -- Answer questions using local knowledge -- Help with file operations and system tasks -- Provide technical assistance for local services -- Maintain conversation context - -Limitations: -- Cannot access external websites or APIs -- Cannot provide real-time information -- Cannot perform web searches -- All processing happens locally on this machine - -Always acknowledge when asked about external information that you operate in offline mode.""" - - -async def check_service_health(url: str, name: str, timeout: int = 5) -> bool: - """Check if a service is healthy before starting.""" - import aiohttp - try: - async with aiohttp.ClientSession() as session: - async with session.get(url, timeout=aiohttp.ClientTimeout(total=timeout)) as resp: - healthy = resp.status == 200 - if healthy: - logger.info(f" {name} is healthy") - else: - logger.warning(f" {name} returned status {resp.status}") - return healthy - except Exception as e: - logger.warning(f" {name} unreachable: {e}") - return False - - -class OfflineVoiceAgent(Agent): - """ - Voice agent for offline/local-only 
operation. - - Features: - - Greets user on entry - - Handles interruptions (user can stop bot speech) - - Uses only local services (no cloud dependencies) - - Falls back gracefully if services fail - """ - - def __init__(self) -> None: - super().__init__( - instructions=OFFLINE_SYSTEM_PROMPT, - allow_interruptions=True, - ) - self.error_count = 0 - self.max_errors = 3 - - async def on_enter(self): - """Called when agent becomes active. Send greeting.""" - logger.info("Agent entered - sending greeting") - try: - self.session.generate_reply( - instructions="Greet the user warmly and briefly introduce yourself as their local offline voice assistant." - ) - except Exception as e: - logger.error(f"Failed to send greeting: {e}") - self.error_count += 1 - - async def on_exit(self): - """Called when agent is shutting down.""" - logger.info("Agent exiting - cleanup") - - async def on_error(self, error: Exception): - """Handle errors gracefully.""" - self.error_count += 1 - logger.error(f"Agent error ({self.error_count}/{self.max_errors}): {error}") - - if self.error_count >= self.max_errors: - logger.critical("Max errors reached, agent will restart") - raise error - - -async def create_llm() -> Optional[openai_plugin.LLM]: - """Create local LLM instance.""" - try: - llm = openai_plugin.LLM( - model=LLM_MODEL, - base_url=LLM_URL, - api_key="not-needed", # Local vLLM doesn't require API key - ) - logger.info(f" LLM configured: {LLM_MODEL}") - return llm - except Exception as e: - logger.error(f" Failed to create LLM: {e}") - return None - - -async def create_stt() -> Optional[openai_plugin.STT]: - """Create local STT instance.""" - try: - stt_base_url = STT_URL.removesuffix('/v1').removesuffix('/') - healthy = await check_service_health(f"{stt_base_url}/health", "STT (Whisper)") - if not healthy: - logger.warning("STT service not healthy, continuing without speech recognition") - return None - - stt = openai_plugin.STT( - model="whisper-1", - base_url=STT_URL, - 
api_key="not-needed", - ) - logger.info(" STT configured") - return stt - except Exception as e: - logger.error(f" Failed to create STT: {e}") - logger.warning("Continuing without speech recognition") - return None - - -async def create_tts() -> Optional[openai_plugin.TTS]: - """Create local TTS instance.""" - try: - tts_base_url = TTS_URL.removesuffix('/v1').removesuffix('/') - healthy = await check_service_health(f"{tts_base_url}/health", "TTS (Kokoro)") - if not healthy: - logger.warning("TTS service not healthy, continuing without speech synthesis") - return None - - tts = openai_plugin.TTS( - model="kokoro", - voice=TTS_VOICE, - base_url=TTS_URL, - api_key="not-needed", - ) - logger.info(f" TTS configured with voice: {TTS_VOICE}") - return tts - except Exception as e: - logger.error(f" Failed to create TTS: {e}") - logger.warning("Continuing without speech synthesis") - return None - - -async def entrypoint(ctx: JobContext): - """ - Main entry point for the offline voice agent job. 
- - Includes: - - Service health checks - - Graceful degradation if services fail - - Reconnection logic - """ - logger.info(f"Voice agent connecting to room: {ctx.room.name}") - - # Health check phase - logger.info("Performing service health checks...") - llm_healthy = await check_service_health(f"{LLM_URL}/models", "LLM (vLLM)") - - if not llm_healthy: - logger.error("LLM service not healthy - cannot start agent") - raise RuntimeError("LLM service required but not available") - - # Create components with error handling - llm = await create_llm() - if not llm: - raise RuntimeError("Failed to create LLM - agent cannot start") - - stt = await create_stt() - tts = await create_tts() - - # Create VAD from prewarmed cache or load fresh - try: - vad = ctx.proc.userdata.get("vad") or silero.VAD.load() - logger.info(" VAD loaded") - except Exception as e: - logger.error(f" Failed to load VAD: {e}") - logger.warning("Starting without voice activity detection") - vad = None - - # Create session - only include working components - session_kwargs = {"llm": llm} - if stt: - session_kwargs["stt"] = stt - if tts: - session_kwargs["tts"] = tts - if vad: - session_kwargs["vad"] = vad - - session = AgentSession(**session_kwargs) - - # Create agent - agent = OfflineVoiceAgent() - - # Setup graceful shutdown - shutdown_event = asyncio.Event() - - def signal_handler(sig, frame): - logger.info("Shutdown signal received") - shutdown_event.set() - - signal.signal(signal.SIGTERM, signal_handler) - signal.signal(signal.SIGINT, signal_handler) - - # Connect to room first (required by LiveKit SDK) - max_retries = 3 - for attempt in range(max_retries): - try: - await ctx.connect() - logger.info("Connected to room") - break - except Exception as e: - logger.error(f"Room connection failed (attempt {attempt + 1}/{max_retries}): {e}") - if attempt == max_retries - 1: - raise - await asyncio.sleep(1) - - # Start session after room connection - for attempt in range(max_retries): - try: - await 
session.start(agent=agent, room=ctx.room) - logger.info("Offline voice agent session started") - break - except Exception as e: - logger.error(f"Session start failed (attempt {attempt + 1}/{max_retries}): {e}") - if attempt == max_retries - 1: - raise - await asyncio.sleep(1) - - # Wait for shutdown signal - try: - await shutdown_event.wait() - except asyncio.CancelledError: - logger.info("Agent task cancelled") - finally: - logger.info("Shutting down offline voice agent...") - try: - await session.close() - except Exception as e: - logger.error(f"Error during shutdown: {e}") - - -def prewarm(proc: JobProcess): - """Prewarm function - load models before first job.""" - logger.info("Prewarming offline voice agent...") - try: - proc.userdata["vad"] = silero.VAD.load() - logger.info(" VAD model loaded") - except Exception as e: - logger.error(f" Failed to load VAD: {e}") - proc.userdata["vad"] = None - - -if __name__ == "__main__": - agent_port = int(os.getenv("AGENT_PORT", "8181")) - - # Log startup info - logger.info("=" * 60) - logger.info("Dream Server Voice Agent - OFFLINE MODE") - logger.info(f"Port: {agent_port}") - logger.info(f"LLM: {LLM_URL}") - logger.info(f"STT: {STT_URL}") - logger.info(f"TTS: {TTS_URL}") - logger.info(f"Offline Mode: {OFFLINE_MODE}") - logger.info("=" * 60) - - cli.run_app( - WorkerOptions( - entrypoint_fnc=entrypoint, - prewarm_fnc=prewarm, - port=agent_port, - ) - ) diff --git a/dream-server/agents/voice-offline/deterministic/__init__.py b/dream-server/agents/voice-offline/deterministic/__init__.py deleted file mode 100644 index 07c997bff..000000000 --- a/dream-server/agents/voice-offline/deterministic/__init__.py +++ /dev/null @@ -1,216 +0,0 @@ -#!/usr/bin/env python3 -""" -Deterministic classifier for offline voice agent -Handles intent classification using local models -""" - -import os -import json -import logging -from typing import Dict, List, Optional, Tuple -import numpy as np - -from .router import DeterministicRouter - 
-logger = logging.getLogger(__name__) - - -class KeywordClassifier: - """Simple keyword-based intent classifier for offline mode""" - - def __init__(self, keywords: Dict[str, List[str]]): - """ - Args: - keywords: Dict mapping intent names to keyword lists - """ - self.keywords = keywords or {} - - def classify(self, text: str) -> tuple[str, float]: - """Classify text by keyword matching""" - text_lower = text.lower() - best_intent = "fallback" - best_score = 0.0 - - for intent, kw_list in self.keywords.items(): - matches = sum(1 for kw in kw_list if kw.lower() in text_lower) - if matches > 0: - score = matches / len(kw_list) - if score > best_score: - best_score = score - best_intent = intent - - return best_intent, best_score - - -class FSMExecutor: - """Finite State Machine executor for deterministic flows""" - - def __init__(self, flows_dir: str): - self.flows_dir = flows_dir - self.flows: Dict[str, dict] = {} - self.current_flow: Optional[str] = None - self.current_state: Optional[str] = None - self._load_flows() - - def _load_flows(self): - """Load flow definitions from JSON files""" - if not os.path.exists(self.flows_dir): - logger.warning(f"Flows directory not found: {self.flows_dir}") - return - - for filename in os.listdir(self.flows_dir): - if filename.endswith('.json'): - filepath = os.path.join(self.flows_dir, filename) - try: - with open(filepath, 'r') as f: - flow = json.load(f) - flow_name = flow.get('name', filename.replace('.json', '')) - self.flows[flow_name] = flow - logger.info(f"Loaded flow: {flow_name}") - except Exception as e: - logger.error(f"Failed to load flow {filename}: {e}") - - def start_flow(self, flow_name: str) -> Optional[str]: - """Start a flow and return initial response""" - if flow_name not in self.flows: - return None - - self.current_flow = flow_name - flow = self.flows[flow_name] - self.current_state = flow.get('initial_state', 'start') - - # Return initial greeting if defined - states = flow.get('states', {}) - if 
self.current_state in states: - return states[self.current_state].get('say') - return None - - def process(self, text: str) -> Optional[str]: - """Process user input and return response""" - if not self.current_flow or not self.current_state: - return None - - flow = self.flows[self.current_flow] - states = flow.get('states', {}) - current = states.get(self.current_state, {}) - - # Simple transition logic - look for next state - transitions = current.get('transitions', {}) - for trigger, next_state in transitions.items(): - if trigger.lower() in text.lower() or trigger == '*': - self.current_state = next_state - if next_state in states: - return states[next_state].get('say') - - # No matching transition - return default or None - return current.get('fallback_say') - -class DeterministicClassifier: - """Simple rule-based classifier for offline mode""" - - def __init__(self, flows_dir: str): - self.flows_dir = flows_dir - self.intents = {} - self.patterns = {} - - async def initialize(self): - """Load deterministic flows""" - try: - await self._load_flows() - logger.info(f"Loaded {len(self.intents)} deterministic intents") - except Exception as e: - logger.warning(f"Failed to load deterministic flows: {e}") - - async def _load_flows(self): - """Load flow definitions from JSON files""" - if not os.path.exists(self.flows_dir): - logger.warning(f"Flows directory not found: {self.flows_dir}") - return - - for filename in os.listdir(self.flows_dir): - if filename.endswith('.json'): - filepath = os.path.join(self.flows_dir, filename) - try: - with open(filepath, 'r') as f: - flow = json.load(f) - intent_name = flow.get('intent', filename.replace('.json', '')) - self.intents[intent_name] = flow - - # Extract patterns - if 'patterns' in flow: - self.patterns[intent_name] = flow['patterns'] - except Exception as e: - logger.error(f"Failed to load flow {filename}: {e}") - - async def classify(self, text: str, confidence_threshold: float = 0.85) -> Tuple[str, float]: - """ - 
Classify intent using rule-based matching - Returns (intent, confidence) - """ - text_lower = text.lower().strip() - - best_intent = "general" - best_confidence = 0.0 - - for intent_name, patterns in self.patterns.items(): - for pattern in patterns: - if isinstance(pattern, str): - # Simple substring matching - if pattern.lower() in text_lower: - confidence = min(1.0, len(pattern) / len(text_lower)) - if confidence > best_confidence: - best_confidence = confidence - best_intent = intent_name - elif isinstance(pattern, dict): - # More complex pattern matching - keywords = pattern.get('keywords', []) - required_all = pattern.get('required_all', False) - - matches = 0 - total_keywords = len(keywords) - - for keyword in keywords: - if keyword.lower() in text_lower: - matches += 1 - - if required_all and matches == total_keywords: - confidence = 1.0 - elif not required_all and matches > 0: - confidence = matches / total_keywords - else: - confidence = 0.0 - - if confidence > best_confidence: - best_confidence = confidence - best_intent = intent_name - - # Apply threshold - if best_confidence < confidence_threshold: - return "general", 0.0 - - return best_intent, best_confidence - - async def get_intent_info(self, intent: str) -> Optional[Dict]: - """Get intent configuration""" - return self.intents.get(intent) - -# Example usage -if __name__ == "__main__": - import asyncio - - async def test(): - classifier = DeterministicClassifier("./flows") - await classifier.initialize() - - test_texts = [ - "I need to book a restaurant reservation", - "What's the weather like", - "Can you help me with my order", - "Hello, how are you" - ] - - for text in test_texts: - intent, confidence = await classifier.classify(text) - print(f"Text: '{text}' -> Intent: {intent} (confidence: {confidence})") - - asyncio.run(test()) \ No newline at end of file diff --git a/dream-server/agents/voice-offline/deterministic/router.py b/dream-server/agents/voice-offline/deterministic/router.py deleted 
file mode 100644 index 2b3c7af72..000000000 --- a/dream-server/agents/voice-offline/deterministic/router.py +++ /dev/null @@ -1,145 +0,0 @@ -#!/usr/bin/env python3 -""" -Deterministic router for offline voice agent -Routes conversations based on classified intents -""" - -import json -import logging -from typing import Dict, Any, List -from datetime import datetime, timezone - -logger = logging.getLogger(__name__) - -class DeterministicRouter: - """Routes conversations based on deterministic flows""" - - def __init__(self, flows_dir: str = None, classifier=None, fsm=None, fallback_threshold: float = 0.85): - self.flows_dir = flows_dir - self.classifier = classifier - self.fsm = fsm - self.fallback_threshold = fallback_threshold - self.flows = {} - self.current_flows = {} # Track active flows per session - - async def initialize(self): - """Load flow definitions""" - import os - if not os.path.exists(self.flows_dir): - logger.warning(f"Flows directory not found: {self.flows_dir}") - return - - for filename in os.listdir(self.flows_dir): - if filename.endswith('.json'): - filepath = os.path.join(self.flows_dir, filename) - try: - with open(filepath, 'r') as f: - flow = json.load(f) - flow_name = filename.replace('.json', '') - self.flows[flow_name] = flow - except Exception as e: - logger.error(f"Failed to load flow {filename}: {e}") - - async def get_response(self, session_id: str, intent: str, user_input: str, context: Dict[str, Any] = None) -> str: - """Get response based on flow and current state""" - if intent not in self.flows: - return self.get_fallback_response(user_input) - - flow = self.flows[intent] - - # Initialize session if new - if session_id not in self.current_flows: - self.current_flows[session_id] = { - "intent": intent, - "current_step": 0, - "data": {}, - "started": datetime.now(timezone.utc).isoformat() - } - - session = self.current_flows[session_id] - - # Get current step - steps = flow.get("steps", []) - current_step = session["current_step"] 
- - if current_step >= len(steps): - # Flow completed - response = flow.get("completion_message", "Thank you! Is there anything else I can help you with?") - del self.current_flows[session_id] # Clean up - return response - - step = steps[current_step] - - # Validate required fields - if "validation" in step: - validation = step["validation"] - if validation.get("type") == "regex": - import re - pattern = validation.get("pattern", ".*") - if not re.match(pattern, user_input, re.IGNORECASE): - return validation.get("error_message", "I didn't understand that. Please try again.") - - # Store user response - if "field" in step: - session["data"][step["field"]] = user_input - - # Get next response - response = step.get("response", "Thank you for your input.") - - # Advance to next step - session["current_step"] += 1 - - return response - - def get_fallback_response(self, user_input: str) -> str: - """Get fallback response for unmatched intents""" - return "I understand you're asking about that, but I'm running in offline mode and can only help with tasks I have specific flows for. Would you like me to help with something else, or can you try rephrasing your request?" - - def reset_session(self, session_id: str): - """Reset session state""" - if session_id in self.current_flows: - del self.current_flows[session_id] - - def get_session_info(self, session_id: str) -> Dict[str, Any]: - """Get current session info""" - return self.current_flows.get(session_id, {}) - - def list_available_flows(self) -> List[str]: - """List available flow names""" - return list(self.flows.keys()) - -# Example flows -EXAMPLE_FLOWS = { - "restaurant_reservation": { - "steps": [ - { - "response": "I'd be happy to help you book a restaurant reservation. 
What date would you like?", - "field": "date" - }, - { - "response": "What time would you prefer?", - "field": "time" - }, - { - "response": "How many people will be dining?", - "field": "party_size" - }, - { - "response": "Do you have any dietary restrictions or special requests?", - "field": "special_requests" - } - ], - "completion_message": "Perfect! I've collected all the details for your reservation. In a real system, I would now process this booking." - } -} - -if __name__ == "__main__": - import asyncio - - async def test(): - router = DeterministicRouter("./flows") - await router.initialize() - - print("Available flows:", router.list_available_flows()) - - asyncio.run(test()) \ No newline at end of file diff --git a/dream-server/agents/voice-offline/entrypoint.sh b/dream-server/agents/voice-offline/entrypoint.sh deleted file mode 100644 index 088ae5bee..000000000 --- a/dream-server/agents/voice-offline/entrypoint.sh +++ /dev/null @@ -1,70 +0,0 @@ -#!/bin/bash -# Entrypoint script for Dream Server Voice Agent - Offline Mode -# M1 Phase 2 - Zero cloud dependencies - -set -e - -echo "=== Dream Server Voice Agent (Offline Mode) ===" -echo "Starting at $(date)" - -# Environment validation -if [[ -z "${LIVEKIT_URL}" ]]; then - echo "ERROR: LIVEKIT_URL not set" - exit 1 -fi - -if [[ -z "${LIVEKIT_API_KEY}" ]]; then - echo "ERROR: LIVEKIT_API_KEY not set" - exit 1 -fi - -if [[ -z "${LIVEKIT_API_SECRET}" ]]; then - echo "ERROR: LIVEKIT_API_SECRET not set" - exit 1 -fi - -# Health check dependencies -echo "=== Health Check Dependencies ===" -for service in vllm whisper tts; do - # Map service names to environment variable names - case "$service" in - vllm) url_var="LLM_URL" ;; - whisper) url_var="STT_URL" ;; - tts) url_var="TTS_URL" ;; - esac - url="${!url_var}" - if [[ -n "$url" ]]; then - echo "Checking $service at $url..." 
- if [[ "$service" == "vllm" ]]; then - curl -f "${url}/health" || echo "WARNING: vLLM health check failed" - elif [[ "$service" == "whisper" ]]; then - curl -f "${url}/" || echo "WARNING: Whisper health check failed" - elif [[ "$service" == "tts" ]]; then - curl -f "${url}/health" || echo "WARNING: TTS health check failed" - fi - fi -done - -# Set default values -export LLM_MODEL=${LLM_MODEL:-"Qwen/Qwen2.5-32B-Instruct-AWQ"} -export STT_MODEL=${STT_MODEL:-"base"} -export TTS_VOICE=${TTS_VOICE:-"af_heart"} -export DETERMINISTIC_ENABLED=${DETERMINISTIC_ENABLED:-"true"} -export DETERMINISTIC_THRESHOLD=${DETERMINISTIC_THRESHOLD:-"0.85"} -export OFFLINE_MODE=${OFFLINE_MODE:-"true"} - -echo "=== Configuration ===" -echo "LLM Model: ${LLM_MODEL}" -echo "STT Model: ${STT_MODEL}" -echo "TTS Voice: ${TTS_VOICE}" -echo "Deterministic Flows: ${DETERMINISTIC_ENABLED}" -echo "Offline Mode: ${OFFLINE_MODE}" - -# Start health check server in background -echo "Starting health check server..." -python health_check.py & -HEALTH_PID=$! - -# Start the main agent -echo "Starting voice agent..." -exec python agent.py \ No newline at end of file diff --git a/dream-server/agents/voice-offline/flows/restaurant_reservation.json b/dream-server/agents/voice-offline/flows/restaurant_reservation.json deleted file mode 100644 index a43815574..000000000 --- a/dream-server/agents/voice-offline/flows/restaurant_reservation.json +++ /dev/null @@ -1,52 +0,0 @@ -{ - "intent": "restaurant_reservation", - "patterns": [ - "book a table", - "make a reservation", - "restaurant booking", - "reserve a table", - "dinner reservation", - "lunch reservation", - "want to eat out", - "book restaurant" - ], - "steps": [ - { - "response": "I'd be happy to help you make a restaurant reservation! 
What date would you like to dine?", - "field": "date", - "validation": { - "type": "regex", - "pattern": "\\d{1,2}[/\\-]\\d{1,2}[/\\-]\\d{4}|today|tomorrow|next\\s+\\w+", - "error_message": "Please provide a valid date (e.g., 'today', 'tomorrow', '12/15/2024', or 'next Friday')." - } - }, - { - "response": "What time would you prefer for your reservation?", - "field": "time", - "validation": { - "type": "regex", - "pattern": "\\d{1,2}:\\d{2}|\\d{1,2}\\s*(am|pm)", - "error_message": "Please provide a valid time (e.g., '7:30 PM' or '19:30')." - } - }, - { - "response": "How many people will be in your party?", - "field": "party_size", - "validation": { - "type": "regex", - "pattern": "\\d+", - "error_message": "Please tell me the number of people (e.g., '2', 'party of 4')." - } - }, - { - "response": "Do you have any dietary restrictions, allergies, or special requests for your reservation?", - "field": "special_requests", - "validation": { - "type": "any", - "error_message": "Please let me know about any special requirements." - } - } - ], - "completion_message": "Excellent! I've collected all the details for your restaurant reservation:\n\n📅 Date: {date}\n🕐 Time: {time}\n👥 Party Size: {party_size} people\n📝 Special Requests: {special_requests}\n\nIn a real system, I would now process this booking and provide you with a confirmation number. Thank you for choosing our service!", - "fallback_response": "I'm having trouble understanding that. Would you like me to help you make a restaurant reservation? I can assist with booking a table for you." 
-} \ No newline at end of file diff --git a/dream-server/agents/voice-offline/health_check.py b/dream-server/agents/voice-offline/health_check.py deleted file mode 100644 index 5fa43240b..000000000 --- a/dream-server/agents/voice-offline/health_check.py +++ /dev/null @@ -1,102 +0,0 @@ -#!/usr/bin/env python3 -""" -Health check server for Dream Server Voice Agent - Offline Mode -Simple HTTP server for container health checks -""" - -import http.server -import socketserver -import json -import os -import requests -import threading -from datetime import datetime, timezone - -class HealthHandler(http.server.BaseHTTPRequestHandler): - """Health check handler - only serves /health endpoint, no file serving""" - - def log_message(self, format, *args): - """Suppress default request logging""" - pass - - def do_GET(self): - if self.path == '/health': - self.send_health_check() - else: - self.send_error(404, "Not Found") - - def send_health_check(self): - """Perform health check on all dependencies""" - checks = { - "status": "healthy", - "timestamp": datetime.now(timezone.utc).isoformat(), - "version": "1.0.0-offline", - "checks": {} - } - - # Check local services - services = { - "vllm": { - "url": os.getenv("LLM_URL", "http://vllm:8000/v1").removesuffix("/v1").removesuffix("/") + "/health", - "timeout": 5 - }, - "whisper": { - "url": os.getenv("STT_URL", "http://whisper:9000/v1").removesuffix("/v1").removesuffix("/") + "/health", - "timeout": 5 - }, - "tts": { - "url": os.getenv("TTS_URL", "http://tts:8880/v1").removesuffix("/v1").removesuffix("/") + "/health", - "timeout": 5 - } - } - - all_healthy = True - - for service, config in services.items(): - try: - response = requests.get(config["url"], timeout=config["timeout"]) - if response.status_code == 200: - checks["checks"][service] = { - "status": "healthy", - "response_time": response.elapsed.total_seconds() - } - else: - checks["checks"][service] = { - "status": "unhealthy", - "status_code": response.status_code - } 
- all_healthy = False - except Exception as e: - checks["checks"][service] = { - "status": "unhealthy", - "error": str(e) - } - all_healthy = False - - if not all_healthy: - checks["status"] = "unhealthy" - - # Check LiveKit credentials - if not os.getenv("LIVEKIT_API_SECRET"): - checks["checks"]["livekit"] = { - "status": "unhealthy", - "error": "LIVEKIT_API_SECRET not set" - } - checks["status"] = "unhealthy" - else: - checks["checks"]["livekit"] = {"status": "healthy"} - - self.send_response(200 if all_healthy else 503) - self.send_header('Content-type', 'application/json') - self.end_headers() - self.wfile.write(json.dumps(checks, indent=2).encode()) - -def start_health_server(): - """Start health check server""" - port = 8080 - with socketserver.TCPServer(("", port), HealthHandler) as httpd: - print(f"Health check server started on port {port}") - httpd.serve_forever() - -if __name__ == "__main__": - start_health_server() \ No newline at end of file diff --git a/dream-server/agents/voice-offline/requirements.txt b/dream-server/agents/voice-offline/requirements.txt deleted file mode 100644 index 1c9bfde83..000000000 --- a/dream-server/agents/voice-offline/requirements.txt +++ /dev/null @@ -1,42 +0,0 @@ -# Dream Server Voice Agent - Offline Mode Dependencies -# Pinned for reproducibility - verified 2026-02-12 -# -# Versions optimized for offline usage - -# LiveKit core -livekit>=0.17.0 -livekit-agents>=1.0.0 -livekit-plugins-silero>=0.8.0 - -# OFFLINE MODE: Use local OpenAI-compatible endpoints instead of cloud -livekit-plugins-openai>=0.10.0 # Required for local vLLM/Whisper/Kokoro compatibility - -# HTTP clients -httpx>=0.27.0 -aiohttp>=3.9.0 - -# OpenAI SDK for local vLLM compatibility -openai>=1.60.0 - -# Audio processing -numpy>=1.26.0 -sounddevice>=0.5.0 -pydub>=0.25.0 - -# Environment and configuration -python-dotenv>=1.0.0 -pydantic>=2.0.0 - -# Health checks -requests>=2.31.0 - -# Local model integration -# transformers>=4.39.0 # Not needed - using vLLM 
endpoints -# torch>=2.2.0 # Not needed - using vLLM endpoints - -# Logging -structlog>=24.1.0 - -# API server for health checks -fastapi>=0.109.0 -uvicorn>=0.27.0 \ No newline at end of file diff --git a/dream-server/agents/voice/Dockerfile b/dream-server/agents/voice/Dockerfile deleted file mode 100644 index a2cd82c18..000000000 --- a/dream-server/agents/voice/Dockerfile +++ /dev/null @@ -1,36 +0,0 @@ -# Dream Server Voice Agent -# Real-time voice chat using LiveKit + local LLM -# -# Build: docker build -t dream-voice-agent . -# Run: docker run -e LLM_URL=... -e STT_URL=... dream-voice-agent - -FROM python:3.11-slim - -WORKDIR /app - -# Install system deps (portaudio for audio, ffmpeg for transcoding) -RUN apt-get update && apt-get install -y --no-install-recommends \ - gcc \ - libffi-dev \ - libportaudio2 \ - libportaudiocpp0 \ - portaudio19-dev \ - ffmpeg \ - curl \ - && rm -rf /var/lib/apt/lists/* - -# Install Python deps -COPY requirements.txt . -RUN pip install --no-cache-dir -r requirements.txt - -# Copy agent code -COPY agent.py . -COPY entrypoint.sh . -RUN chmod +x entrypoint.sh - -# Non-root user for security -RUN useradd -m -u 1000 agent && chown -R agent:agent /app -USER agent - -# Run the agent -CMD ["./entrypoint.sh"] diff --git a/dream-server/agents/voice/README.md b/dream-server/agents/voice/README.md deleted file mode 100644 index f42cb9594..000000000 --- a/dream-server/agents/voice/README.md +++ /dev/null @@ -1,84 +0,0 @@ -# Dream Server Voice Agent - -Real-time voice AI assistant running entirely on local hardware. - -## Architecture - -``` -User (WebRTC) → LiveKit Server → Voice Agent - ↓ - ┌───────────────┼───────────────┐ - ↓ ↓ ↓ - Whisper STT vLLM (LLM) OpenTTS/Piper - (port 9000) (port 8000) (port 8880) -``` - -## Status - -**Current:** ⚠️ Plugin interface WIP - -The LiveKit Agents SDK uses a plugin architecture. 
Our local backends need to implement the correct interfaces: - -| Component | Local Service | Status | -|-----------|---------------|--------| -| LLM | vLLM (OpenAI-compatible) | ✅ Works via `livekit-plugins-openai` | -| STT | Whisper | 🟡 Needs OpenAI-compatible endpoint or custom plugin | -| TTS | OpenTTS/Piper | 🟡 Needs custom plugin | -| VAD | Silero | ✅ Works | - -## Requirements - -- LiveKit Server running (port 7880) -- vLLM with OpenAI-compatible API (port 8000) -- Whisper STT server (port 9000) -- TTS server (port 8880) - -## Environment Variables - -```bash -LIVEKIT_URL=ws://localhost:7880 -LIVEKIT_API_KEY= # Generated by install.sh -LIVEKIT_API_SECRET= # Generated by install.sh -LLM_URL=http://vllm:8000/v1 -LLM_MODEL=Qwen/Qwen2.5-32B-Instruct-AWQ -STT_URL=http://whisper:9000 -TTS_URL=http://tts:8880 -``` - -## Running - -```bash -# Development mode -python agent.py dev - -# Console mode (terminal only) -python agent.py console - -# Production mode -python agent.py start -``` - -## TODO for Full Integration - -1. **STT Plugin**: Either: - - Use `faster-whisper-server` which has OpenAI-compatible API - - Create custom LiveKit plugin for Whisper HTTP API - -2. **TTS Plugin**: Create custom plugin for OpenTTS/Piper HTTP API - -3. **Testing**: Integration test with all local services - -## Bootstrap Mode - -When using a small model (1.5B, 3B), the agent automatically: -- Uses shorter system prompt -- Limits response length -- Faster but less capable - -This allows immediate voice interaction while the full model downloads. 
- -## References - -- [LiveKit Agents Docs](https://docs.livekit.io/agents/) -- [LiveKit Plugins](https://docs.livekit.io/agents/models/#plugins) -- [Dream Server Roadmap](../docs/TECHNICAL-ROADMAP.md) diff --git a/dream-server/agents/voice/agent.py b/dream-server/agents/voice/agent.py deleted file mode 100644 index 7c1d22df9..000000000 --- a/dream-server/agents/voice/agent.py +++ /dev/null @@ -1,324 +0,0 @@ -""" -Dream Server Voice Agent (v3.1) -Real-time voice conversation using local LLM + STT + TTS - -Uses LiveKit Agents SDK v1.4+ with local model backends: -- LLM: vLLM (OpenAI-compatible) -- STT: Whisper (OpenAI-compatible API) -- TTS: Kokoro (OpenAI-compatible API) -- VAD: Silero (built-in) - -Features: -- Error handling with graceful degradation -- Service health checks before startup -- Reconnection logic for LiveKit -- Interrupt handling (user can stop bot speech) -""" - -import logging -import os -import asyncio -import signal -from typing import Optional - -from dotenv import load_dotenv -from livekit.agents import ( - JobContext, - JobProcess, - WorkerOptions, - cli, -) -from livekit.agents import Agent, AgentSession -from livekit.plugins import silero, openai as openai_plugin - -# Load environment -load_dotenv() - -# Configure logging -logging.basicConfig( - level=logging.INFO, - format='%(asctime)s | %(name)s | %(levelname)s | %(message)s' -) -logger = logging.getLogger("dream-voice") - -# Environment config -LIVEKIT_URL = os.getenv("LIVEKIT_URL", "ws://localhost:7880") -LLM_URL = os.getenv("LLM_URL", "http://localhost:8000/v1") -LLM_MODEL = os.getenv("LLM_MODEL", "Qwen/Qwen2.5-32B-Instruct-AWQ") -STT_URL = os.getenv("STT_URL", "http://localhost:9000") -TTS_URL = os.getenv("TTS_URL", "http://localhost:8880/v1") -TTS_VOICE = os.getenv("TTS_VOICE", "af_heart") - -# Feature flags for graceful degradation -ENABLE_STT = os.getenv("ENABLE_STT", "true").lower() == "true" -ENABLE_TTS = os.getenv("ENABLE_TTS", "true").lower() == "true" -ENABLE_INTERRUPTIONS = 
os.getenv("ENABLE_INTERRUPTIONS", "true").lower() == "true" - - -async def check_service_health(url: str, name: str, timeout: int = 5) -> bool: - """Check if a service is healthy before starting.""" - import aiohttp - try: - async with aiohttp.ClientSession() as session: - async with session.get(url, timeout=aiohttp.ClientTimeout(total=timeout)) as resp: - healthy = resp.status == 200 - if healthy: - logger.info(f"✓ {name} is healthy") - else: - logger.warning(f"⚠ {name} returned status {resp.status}") - return healthy - except Exception as e: - logger.warning(f"✗ {name} unreachable: {e}") - return False - - -class DreamVoiceAgent(Agent): - """ - Voice agent with robust error handling and graceful degradation. - - Features: - - Greets user on entry - - Handles interruptions (user can stop bot speech) - - Falls back gracefully if services fail - """ - - def __init__(self) -> None: - super().__init__( - instructions="""You are a helpful voice assistant running on local hardware. -You have access to a powerful GPU cluster running Qwen2.5 32B for language understanding. -Keep responses conversational and concise - this is voice, not text. -Be friendly, direct, and helpful.""", - # Enable interruption handling - allow_interruptions=ENABLE_INTERRUPTIONS, - ) - self.error_count = 0 - self.max_errors = 3 - - async def on_enter(self): - """Called when agent becomes active. Send greeting.""" - logger.info("Agent entered - sending greeting") - try: - self.session.generate_reply( - instructions="Greet the user warmly and briefly introduce yourself as their local voice assistant." 
- ) - except Exception as e: - logger.error(f"Failed to send greeting: {e}") - self.error_count += 1 - - async def on_exit(self): - """Called when agent is shutting down.""" - logger.info("Agent exiting - cleanup") - - async def on_error(self, error: Exception): - """Handle errors gracefully.""" - self.error_count += 1 - logger.error(f"Agent error ({self.error_count}/{self.max_errors}): {error}") - - if self.error_count >= self.max_errors: - logger.critical("Max errors reached, agent will restart") - # Signal for restart - raise error - - -async def create_llm() -> Optional[openai_plugin.LLM]: - """Create LLM with error handling.""" - try: - llm = openai_plugin.LLM( - model=LLM_MODEL, - base_url=LLM_URL, - api_key=os.environ.get("VLLM_API_KEY", ""), - ) - logger.info(f"✓ LLM configured: {LLM_MODEL}") - return llm - except Exception as e: - logger.error(f"✗ Failed to create LLM: {e}") - return None - - -async def create_stt() -> Optional[openai_plugin.STT]: - """Create STT with error handling.""" - if not ENABLE_STT: - logger.info("STT disabled by configuration") - return None - - try: - # Strip /v1 suffix if present before appending /health - stt_base_url = STT_URL.removesuffix('/v1').removesuffix('/') - # Check service health first - healthy = await check_service_health(f"{stt_base_url}/", "STT (Whisper)") - if not healthy: - logger.warning("STT service not healthy, continuing without speech recognition") - return None - - stt = openai_plugin.STT( - model="whisper-1", - base_url=STT_URL, - api_key=os.environ.get("WHISPER_API_KEY", ""), - ) - logger.info("✓ STT configured") - return stt - except Exception as e: - logger.error(f"✗ Failed to create STT: {e}") - logger.warning("Continuing without speech recognition") - return None - - -async def create_tts() -> Optional[openai_plugin.TTS]: - """Create TTS with error handling.""" - if not ENABLE_TTS: - logger.info("TTS disabled by configuration") - return None - - try: - # Check service health first (TTS_URL 
already includes /v1) - tts_base_url = TTS_URL.removesuffix('/v1').removesuffix('/') - healthy = await check_service_health(f"{tts_base_url}/health", "TTS (Kokoro)") - if not healthy: - logger.warning("TTS service not healthy, continuing without speech synthesis") - return None - - tts = openai_plugin.TTS( - model="kokoro", - voice=TTS_VOICE, - base_url=TTS_URL, - api_key=os.environ.get("KOKORO_API_KEY", ""), - ) - logger.info(f"✓ TTS configured with voice: {TTS_VOICE}") - return tts - except Exception as e: - logger.error(f"✗ Failed to create TTS: {e}") - logger.warning("Continuing without speech synthesis") - return None - - -async def entrypoint(ctx: JobContext): - """ - Main entry point for the voice agent job. - - Includes: - - Service health checks - - Graceful degradation if services fail - - Reconnection logic - """ - logger.info(f"Voice agent connecting to room: {ctx.room.name}") - - # Health check phase - logger.info("Performing service health checks...") - # vLLM uses /v1/models for health check, not /health - # LLM_URL already ends with /v1, so just add /models - llm_healthy = await check_service_health(f"{LLM_URL}/models", "LLM (vLLM)") - - if not llm_healthy: - logger.error("LLM service not healthy - cannot start agent") - raise RuntimeError("LLM service required but not available") - - # Create components with error handling - llm = await create_llm() - if not llm: - raise RuntimeError("Failed to create LLM - agent cannot start") - - stt = await create_stt() - tts = await create_tts() - - # Create VAD from prewarmed cache or load fresh - try: - vad = ctx.proc.userdata.get("vad") or silero.VAD.load() - logger.info("✓ VAD loaded") - except Exception as e: - logger.error(f"✗ Failed to load VAD: {e}") - logger.warning("Starting without voice activity detection") - vad = None - - # Create session - only include working components - session_kwargs = {"llm": llm} - if stt: - session_kwargs["stt"] = stt - if tts: - session_kwargs["tts"] = tts - if 
vad: - session_kwargs["vad"] = vad - - session = AgentSession(**session_kwargs) - - # Create agent - agent = DreamVoiceAgent() - - # Setup graceful shutdown - shutdown_event = asyncio.Event() - - def signal_handler(sig, frame): - logger.info("Shutdown signal received") - shutdown_event.set() - - signal.signal(signal.SIGTERM, signal_handler) - signal.signal(signal.SIGINT, signal_handler) - - # Connect to room first (required by LiveKit SDK) - max_retries = 3 - for attempt in range(max_retries): - try: - await ctx.connect() - logger.info("Connected to room") - break - except Exception as e: - logger.error(f"Room connection failed (attempt {attempt + 1}/{max_retries}): {e}") - if attempt == max_retries - 1: - raise - await asyncio.sleep(1) - - # Start session after room connection - for attempt in range(max_retries): - try: - await session.start(agent=agent, room=ctx.room) - logger.info("Voice agent session started") - break - except Exception as e: - logger.error(f"Session start failed (attempt {attempt + 1}/{max_retries}): {e}") - if attempt == max_retries - 1: - raise - await asyncio.sleep(1) - - # Wait for shutdown signal - try: - await shutdown_event.wait() - except asyncio.CancelledError: - logger.info("Agent task cancelled") - finally: - logger.info("Shutting down voice agent...") - try: - await session.close() - except Exception as e: - logger.error(f"Error during shutdown: {e}") - - -def prewarm(proc: JobProcess): - """Prewarm function - load models before first job.""" - logger.info("Prewarming voice agent...") - try: - proc.userdata["vad"] = silero.VAD.load() - logger.info("✓ VAD model loaded") - except Exception as e: - logger.error(f"✗ Failed to load VAD: {e}") - proc.userdata["vad"] = None - - -if __name__ == "__main__": - agent_port = int(os.getenv("AGENT_PORT", "8181")) - - # Log startup info - logger.info("=" * 60) - logger.info("Dream Server Voice Agent Starting") - logger.info(f"Port: {agent_port}") - logger.info(f"LLM: {LLM_URL}") - 
logger.info(f"STT: {STT_URL} (enabled: {ENABLE_STT})") - logger.info(f"TTS: {TTS_URL} (enabled: {ENABLE_TTS})") - logger.info(f"Interruptions: {ENABLE_INTERRUPTIONS}") - logger.info("=" * 60) - - cli.run_app( - WorkerOptions( - entrypoint_fnc=entrypoint, - prewarm_fnc=prewarm, - port=agent_port, - ) - ) diff --git a/dream-server/agents/voice/entrypoint.sh b/dream-server/agents/voice/entrypoint.sh deleted file mode 100755 index ce10b151e..000000000 --- a/dream-server/agents/voice/entrypoint.sh +++ /dev/null @@ -1,61 +0,0 @@ -#!/bin/bash -# Voice Agent Entrypoint -set -euo pipefail - -echo "========================================" -echo " Dream Server Voice Agent" -echo "========================================" -echo "" -echo "Configuration:" -echo " LLM URL: ${LLM_URL:-http://vllm:8000/v1}" -echo " STT URL: ${STT_URL:-http://localhost:9000}" -echo " TTS URL: ${TTS_URL:-http://localhost:8880}" -echo "" - -# Health check function -wait_for_service() { - local name=$1 - local url=$2 - local max_attempts=${3:-30} - local attempt=1 - - echo "Waiting for $name at $url..." - while [ $attempt -le $max_attempts ]; do - if curl -sf --connect-timeout 10 --max-time 30 "$url" > /dev/null 2>&1; then - echo "โœ“ $name is ready" - return 0 - fi - echo " Attempt $attempt/$max_attempts - $name not ready yet..." - sleep 2 - attempt=$((attempt + 1)) - done - - echo "โœ— $name failed to respond after $max_attempts attempts" - return 1 -} - -# Wait for required services -echo "Checking service dependencies..." -# Extract base URL for health check (remove /v1 suffix) -LLM_BASE_URL="${LLM_URL:-http://vllm:8000/v1}" -LLM_BASE_URL="${LLM_BASE_URL%/v1}" -# vLLM uses /v1/models as health indicator, not /health -wait_for_service "LLM (vLLM)" "${LLM_BASE_URL}/v1/models" 60 || echo "Warning: LLM health check failed, continuing anyway..." 
-STT_BASE_URL="${STT_URL:-http://whisper:9000/v1}" -STT_BASE_URL="${STT_BASE_URL%/v1}" -# Whisper health check - try /health or just check if port is open -wait_for_service "STT (Whisper)" "${STT_BASE_URL}/" 10 || echo "Warning: STT health check failed, continuing anyway..." - -# TTS is optional for some configs -if [ -n "${TTS_URL:-}" ]; then - # Extract base URL for health check (remove /v1 suffix if present) - TTS_BASE_URL="${TTS_URL%/v1}" - wait_for_service "TTS" "${TTS_BASE_URL}/health" 5 || echo "Warning: TTS not available, continuing anyway..." -fi - -echo "" -echo "All services ready. Starting voice agent..." -echo "" - -# Start the voice agent -exec python agent.py start diff --git a/dream-server/agents/voice/requirements.txt b/dream-server/agents/voice/requirements.txt deleted file mode 100644 index 6710b33fb..000000000 --- a/dream-server/agents/voice/requirements.txt +++ /dev/null @@ -1,30 +0,0 @@ -# Dream Server Voice Agent Dependencies -# Pinned for reproducibility โ€” update periodically -# -# Versions verified 2026-02-10 - -# LiveKit core -livekit>=0.17.0 -livekit-agents>=1.0.0 -livekit-plugins-silero>=0.8.0 -livekit-plugins-openai>=0.10.0 - -# HTTP clients -httpx>=0.27.0 -aiohttp>=3.9.0 - -# OpenAI SDK (for vLLM compatibility) -openai>=1.60.0 - -# Audio processing -numpy>=1.26.0 -sounddevice>=0.5.0 -pydub>=0.25.0 - -# Environment -python-dotenv>=1.0.0 - -# API server (for test endpoints) -fastapi>=0.109.0 -uvicorn>=0.27.0 -pydantic>=2.0.0 diff --git a/dream-server/agents/voice/test_server.py b/dream-server/agents/voice/test_server.py deleted file mode 100644 index f9f3f35d1..000000000 --- a/dream-server/agents/voice/test_server.py +++ /dev/null @@ -1,175 +0,0 @@ -""" -M4 Voice Agent Test Server - -Provides HTTP endpoints for testing the deterministic layer -without requiring browser/voice interaction. 
- -Usage: - python test_server.py - -Endpoints: - POST /test/utterance - Test intent classification + FSM routing - GET /metrics - Get deterministic routing metrics - GET /health - Health check -""" - -import os -import sys -import json -import time -from typing import Dict, Any -from fastapi import FastAPI -from pydantic import BaseModel -import uvicorn - -# Add deterministic module to path -sys.path.insert(0, os.path.dirname(__file__)) -from deterministic import ( - QwenClassifier, - LiveKitFSMAdapter, - FSMExecutor, -) -from deterministic.extractors import DEFAULT_EXTRACTORS - -app = FastAPI(title="M4 Voice Agent Test Server") - -# Global state -clf = None -adapter = None -fsm = None - -class UtteranceRequest(BaseModel): - utterance: str - session_id: str = None - flow_name: str = "hvac_service" - -class TestResponse(BaseModel): - intent: str - confidence: float - deterministic: bool - response: str - latency_ms: float - flow_active: bool - -@app.on_event("startup") -async def startup(): - """Initialize M4 components.""" - global clf, adapter, fsm - - print("Initializing M4 Deterministic Layer...") - - # Initialize classifier - clf = QwenClassifier( - base_url=os.getenv("LLM_URL", "http://localhost:8000/v1"), - model=os.getenv("LLM_MODEL", "Qwen/Qwen2.5-32B-Instruct-AWQ"), - threshold=float(os.getenv("DETERMINISTIC_THRESHOLD", "0.85")) - ) - - # Initialize FSM with flows - fsm = FSMExecutor(extractors=DEFAULT_EXTRACTORS) - flows_dir = os.getenv("FLOWS_DIR", "./flows") - if os.path.exists(flows_dir): - # Load flows manually to handle "domain" vs "name" field - import glob - for flow_file in glob.glob(os.path.join(flows_dir, "*.json")): - with open(flow_file) as f: - flow = json.load(f) - # Normalize: use "domain" as "name" if present - flow_name = flow.get("name") or flow.get("domain") - if flow_name: - flow["name"] = flow_name - fsm.flows[flow_name] = flow - print(f"Loaded {len(fsm.flows)} flows from {flows_dir}") - else: - print(f"Warning: Flows directory not 
found: {flows_dir}") - - # Initialize adapter - adapter = LiveKitFSMAdapter( - fsm=fsm, - classifier=clf, - confidence_threshold=0.85, - entity_extractors=DEFAULT_EXTRACTORS - ) - - print("M4 Test Server ready!") - -@app.get("/health") -def health(): - return { - "status": "healthy", - "m4_enabled": clf is not None, - "flows_loaded": len(fsm.flows) if fsm else 0 - } - -@app.post("/test/utterance", response_model=TestResponse) -async def test_utterance(req: UtteranceRequest): - """Test a single utterance through M4 pipeline.""" - session_id = req.session_id or f"test-{int(time.time())}" - - # Start session if new - if session_id not in adapter.active_sessions: - await adapter.start_session(session_id, req.flow_name) - - # Process utterance - start = time.time() - result = await adapter.handle_utterance(session_id, req.utterance) - latency = (time.time() - start) * 1000 - - return TestResponse( - intent=result.intent, - confidence=result.confidence, - deterministic=result.used_deterministic, - response=result.text, - latency_ms=result.latency_ms or latency, - flow_active=result.flow_status == "in_progress" if result.flow_status else False - ) - -@app.post("/test/flow") -async def test_flow(req: UtteranceRequest): - """Test a complete flow with multiple utterances.""" - session_id = req.session_id or f"test-{int(time.time())}" - - # Define test sequence - test_utterances = [ - "schedule a service", - "my name is Todd", - "tomorrow at 2pm", - "yes confirm" - ] - - results = [] - await adapter.start_session(session_id, req.flow_name) - - for utterance in test_utterances: - start = time.time() - result = await adapter.handle_utterance(session_id, utterance) - latency = (time.time() - start) * 1000 - - results.append({ - "utterance": utterance, - "intent": result.intent, - "confidence": result.confidence, - "deterministic": result.used_deterministic, - "response": result.text, - "latency_ms": result.latency_ms or latency - }) - - # Get metrics - metrics = 
adapter.get_metrics() - - return { - "session_id": session_id, - "flow_name": req.flow_name, - "results": results, - "metrics": metrics - } - -@app.get("/metrics") -def get_metrics(): - """Get M4 routing metrics.""" - if adapter: - return adapter.get_metrics() - return {"error": "Adapter not initialized"} - -if __name__ == "__main__": - uvicorn.run(app, host="0.0.0.0", port=8290) diff --git a/dream-server/compose/docker-compose.cluster.yml b/dream-server/compose/docker-compose.cluster.yml deleted file mode 100644 index ba31cbe54..000000000 --- a/dream-server/compose/docker-compose.cluster.yml +++ /dev/null @@ -1,270 +0,0 @@ -# Dream Server — Cluster Tier -# Multi-GPU (48GB+ total VRAM) — 70B+ models with tensor parallelism -# Usage: docker compose -f docker-compose.cluster.yml up -d -# -# Requirements: -# - 2+ NVIDIA GPUs with 24GB+ each, or 4+ GPUs with 16GB+ each -# - NVLink/NVSwitch recommended for optimal tensor parallelism -# - 64GB+ system RAM recommended -# -# Capacity estimate (2x A100 80GB): -# - 100+ concurrent LLM requests at <100ms latency -# - 20+ concurrent voice conversations -# - 72B model with 32K context - -services: - # ═══════════════════════════════════════════════════════════════ - # LLM — Qwen2.5-72B with Tensor Parallelism - # ═══════════════════════════════════════════════════════════════ - vllm: - image: vllm/vllm-openai:v0.15.1 - runtime: nvidia - container_name: dream-vllm-cluster - environment: - - NVIDIA_VISIBLE_DEVICES=all - - VLLM_ATTENTION_BACKEND=FLASHINFER - - NCCL_DEBUG=WARN - volumes: - - ${HF_HOME:-~/.cache/huggingface}:/root/.cache/huggingface - ports: - - "8000:8000" - command: > - --model Qwen/Qwen2.5-72B-Instruct-AWQ - ${VLLM_QUANTIZATION:+--quantization
$VLLM_QUANTIZATION} - --tensor-parallel-size ${VLLM_TP_SIZE:-2} - --max-model-len 32768 - --gpu-memory-utilization 0.92 - --enable-auto-tool-choice - --tool-call-parser hermes - --served-model-name gpt-4o - --trust-remote-code - --disable-log-requests - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: all - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8000/health"] - interval: 30s - timeout: 10s - retries: 5 - start_period: 600s # 72B takes longer to load - restart: unless-stopped - ulimits: - memlock: -1 - stack: 67108864 - - # ═══════════════════════════════════════════════════════════════ - # STT — Whisper Large v3 Turbo (GPU) - # Dedicated GPU for STT to avoid contention with LLM - # ═══════════════════════════════════════════════════════════════ - whisper: - image: fedirz/faster-whisper-server:latest-cuda - runtime: nvidia - container_name: dream-whisper-cluster - environment: - - WHISPER__MODEL=Systran/faster-whisper-large-v3-turbo - - WHISPER__DEVICE=cuda - - WHISPER__COMPUTE_TYPE=float16 - - WHISPER__NUM_WORKERS=4 - - CUDA_VISIBLE_DEVICES=${WHISPER_GPU:-0} - ports: - - "8001:8000" - deploy: - resources: - reservations: - devices: - - driver: nvidia - device_ids: ["${WHISPER_GPU:-0}"] - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8000/health"] - interval: 30s - timeout: 10s - retries: 3 - restart: unless-stopped - - # ═══════════════════════════════════════════════════════════════ - # TTS — Kokoro GPU (batch synthesis for high throughput) - #
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - kokoro: - image: ghcr.io/remsky/kokoro-fastapi:v0.6.2-gpu - runtime: nvidia - container_name: dream-kokoro-cluster - environment: - - CUDA_VISIBLE_DEVICES=${KOKORO_GPU:-0} - - KOKORO_BATCH_SIZE=8 - ports: - - "8880:8880" - volumes: - - kokoro-cache:/app/cache - deploy: - resources: - reservations: - devices: - - driver: nvidia - device_ids: ["${KOKORO_GPU:-0}"] - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8880/health"] - interval: 30s - timeout: 10s - retries: 3 - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # LiveKit โ€” WebRTC Server (production config) - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - livekit: - image: livekit/livekit-server:latest - container_name: dream-livekit-cluster - ports: - - "7880:7880" # HTTP API - - "7881:7881" # WebRTC TCP - - "7882:7882/udp" # WebRTC UDP - - "50000-50100:50000-50100/udp" # RTP ports for high concurrency - command: > - --config /livekit.yaml - volumes: - - ./livekit-cluster.yaml:/livekit.yaml:ro - healthcheck: - test: ["CMD", "wget", "--spider", "-q", "http://localhost:7880"] - interval: 10s - timeout: 5s - retries: 3 - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # Voice Agent โ€” High-concurrency configuration - # 
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - voice-agent: - build: - context: ./agents/voice - dockerfile: Dockerfile - container_name: dream-voice-agent-cluster - environment: - - LIVEKIT_URL=ws://livekit:7880 - - LIVEKIT_API_KEY=${LIVEKIT_API_KEY:?LIVEKIT_API_KEY must be set} - - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET:?LIVEKIT_API_SECRET must be set} - - LLM_BASE_URL=http://vllm:8000/v1 - - STT_BASE_URL=http://whisper:8000 - - TTS_BASE_URL=http://kokoro:8880 - - AGENT_CONCURRENCY=20 - depends_on: - vllm: - condition: service_healthy - whisper: - condition: service_healthy - kokoro: - condition: service_healthy - livekit: - condition: service_healthy - deploy: - replicas: 2 # Multiple agent instances for high concurrency - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # Dashboard โ€” Web UI - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - dashboard: - build: - context: ./dashboard - dockerfile: Dockerfile - container_name: dream-dashboard-cluster - ports: - - "3001:3001" - environment: - - VITE_API_URL=http://localhost:3002 - - VITE_LIVEKIT_URL=ws://localhost:7880 - depends_on: - - api - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # API โ€” Backend for Dashboard - # 
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - api: - build: - context: ./dashboard-api - dockerfile: Dockerfile - container_name: dream-api-cluster - ports: - - "3002:3002" - environment: - - VLLM_URL=http://vllm:8000 - - WHISPER_URL=http://whisper:8000 - - KOKORO_URL=http://kokoro:8880 - - LIVEKIT_URL=ws://livekit:7880 - - LIVEKIT_API_KEY=${LIVEKIT_API_KEY:?LIVEKIT_API_KEY must be set} - - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET:?LIVEKIT_API_SECRET must be set} - depends_on: - - vllm - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # Metrics โ€” Prometheus + Grafana for cluster monitoring - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - prometheus: - image: prom/prometheus:latest - container_name: dream-prometheus - ports: - - "9090:9090" - extra_hosts: - - "host.docker.internal:host-gateway" - volumes: - - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro - - prometheus-data:/prometheus - command: - - '--config.file=/etc/prometheus/prometheus.yml' - - '--storage.tsdb.retention.time=7d' - restart: unless-stopped - - grafana: - image: grafana/grafana:latest - container_name: dream-grafana - ports: - - "${GRAFANA_PORT:-3003}:3000" - environment: - - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD:?GRAFANA_PASSWORD must be set in .env} - - GF_USERS_ALLOW_SIGN_UP=false - volumes: - - grafana-data:/var/lib/grafana - - ./grafana/dashboards:/etc/grafana/provisioning/dashboards:ro - - ./grafana/datasources:/etc/grafana/provisioning/datasources:ro - depends_on: - - prometheus - restart: unless-stopped - -volumes: - 
kokoro-cache: - prometheus-data: - grafana-data: - -# ═══════════════════════════════════════════════════════════════ -# Configuration Notes: -# ═══════════════════════════════════════════════════════════════ -# -# Environment Variables: -# VLLM_TP_SIZE - Tensor parallel size (default: 2, set to GPU count) -# WHISPER_GPU - GPU device ID for Whisper (default: 0) -# KOKORO_GPU - GPU device ID for Kokoro (default: 0) -# LIVEKIT_API_KEY - LiveKit API key (default: devkey) -# LIVEKIT_API_SECRET - LiveKit API secret (default: secret) -# GRAFANA_PASSWORD - Grafana admin password (default: admin) -# -# Recommended GPU Allocation (4x GPU setup): -# GPU 0-1: vLLM (tensor parallel) -# GPU 2: Whisper STT -# GPU 3: Kokoro TTS -# -# For 2x GPU setup: -# GPU 0-1: vLLM (tensor parallel) -# GPU 0: Whisper + Kokoro (shared, time-sliced) -# -# Scaling: -# - Adjust VLLM_TP_SIZE to match available GPUs -# - For more concurrent voice, add voice-agent replicas -# - Monitor with Grafana at :3000 diff --git a/dream-server/compose/docker-compose.edge.yml b/dream-server/compose/docker-compose.edge.yml deleted file mode 100644 index e3d2ea998..000000000 --- a/dream-server/compose/docker-compose.edge.yml +++ /dev/null @@ -1,170 +0,0 @@ -# Dream Server — Edge Tier -# 16GB RAM or 8GB+ VRAM — 7-8B models, full voice stack -# Usage: docker compose -f docker-compose.edge.yml up -d - -services: - # ═══════════════════════════════════════════════════════════════ - # LLM — Qwen2.5-7B (fits in 8GB VRAM with AWQ) - #
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - vllm: - image: vllm/vllm-openai:v0.15.1 - runtime: nvidia - container_name: dream-vllm - environment: - - NVIDIA_VISIBLE_DEVICES=all - volumes: - - ${HF_HOME:-~/.cache/huggingface}:/root/.cache/huggingface - ports: - - "8000:8000" - command: > - --model Qwen/Qwen2.5-7B-Instruct-AWQ - ${VLLM_QUANTIZATION:+--quantization $VLLM_QUANTIZATION} - --max-model-len 16384 - --gpu-memory-utilization 0.85 - --served-model-name gpt-4o - --trust-remote-code - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: 1 - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8000/health"] - interval: 30s - timeout: 10s - retries: 5 - start_period: 180s - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # STT โ€” Whisper Medium (balances quality vs VRAM) - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - whisper: - image: fedirz/faster-whisper-server:latest-cuda - runtime: nvidia - container_name: dream-whisper - environment: - - WHISPER__MODEL=Systran/faster-whisper-medium - - WHISPER__DEVICE=cuda - - NVIDIA_VISIBLE_DEVICES=all - ports: - - "8001:8000" - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: 1 - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8000/health"] - interval: 30s - timeout: 10s - retries: 3 - restart: unless-stopped - - # 
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # TTS โ€” Kokoro CPU (saves VRAM for LLM) - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - kokoro: - image: ghcr.io/remsky/kokoro-fastapi:v0.6.2-cpu - container_name: dream-kokoro - ports: - - "8880:8880" - volumes: - - kokoro-cache:/app/cache - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8880/health"] - interval: 30s - timeout: 10s - retries: 3 - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # LiveKit โ€” WebRTC Server - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - livekit: - image: livekit/livekit-server:latest - container_name: dream-livekit - ports: - - "7880:7880" - - "7881:7881" - - "7882:7882/udp" - command: --config /livekit.yaml - environment: - # Keys passed via env var (safer than config file) - - LIVEKIT_KEYS=${LIVEKIT_API_KEY}:${LIVEKIT_API_SECRET} - volumes: - - ./livekit.yaml:/livekit.yaml:ro - healthcheck: - test: ["CMD", "wget", "--spider", "-q", "http://localhost:7880"] - interval: 10s - timeout: 5s - retries: 3 - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # Voice Agent - # 
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - voice-agent: - build: - context: ./agents/voice - dockerfile: Dockerfile - container_name: dream-voice-agent - environment: - - LIVEKIT_URL=ws://livekit:7880 - - LIVEKIT_API_KEY=${LIVEKIT_API_KEY} - - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET} - - LLM_BASE_URL=http://vllm:8000/v1 - - STT_BASE_URL=http://whisper:8000 - - TTS_BASE_URL=http://kokoro:8880 - depends_on: - vllm: - condition: service_healthy - whisper: - condition: service_healthy - kokoro: - condition: service_healthy - livekit: - condition: service_healthy - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # Dashboard + API - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - dashboard: - build: - context: ./dashboard - dockerfile: Dockerfile - container_name: dream-dashboard - ports: - - "3001:3001" - environment: - - VITE_API_URL=http://localhost:3002 - - VITE_LIVEKIT_URL=ws://localhost:7880 - depends_on: - - api - restart: unless-stopped - - api: - build: - context: ./dashboard-api - dockerfile: Dockerfile - container_name: dream-api - ports: - - "3002:3002" - environment: - - VLLM_URL=http://vllm:8000 - - WHISPER_URL=http://whisper:8000 - - KOKORO_URL=http://kokoro:8880 - - LIVEKIT_URL=ws://livekit:7880 - - LIVEKIT_API_KEY=${LIVEKIT_API_KEY} - - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET} - depends_on: - - vllm - restart: unless-stopped - -volumes: - kokoro-cache: diff --git a/dream-server/compose/docker-compose.nano.yml b/dream-server/compose/docker-compose.nano.yml deleted file mode 100644 index 
0310b1d23..000000000 --- a/dream-server/compose/docker-compose.nano.yml +++ /dev/null @@ -1,63 +0,0 @@ -# Dream Server — Nano Tier -# 8GB+ RAM, no GPU required — 1-3B models, text-only -# Usage: docker compose -f docker-compose.nano.yml up -d -# -# Note: Voice features disabled (no GPU for real-time STT/TTS) -# Use text chat via API or dashboard - -services: - # ═══════════════════════════════════════════════════════════════ - # LLM — Qwen2.5-1.5B via llama.cpp (CPU) - # ═══════════════════════════════════════════════════════════════ - llama: - image: ghcr.io/ggerganov/llama.cpp:server - container_name: dream-llama - ports: - - "8000:8080" - volumes: - - ${MODELS_DIR:-~/.cache/models}:/models - command: > - --model /models/qwen2.5-1.5b-instruct-q4_k_m.gguf - --ctx-size 8192 - --n-gpu-layers 0 - --threads 4 - --host 0.0.0.0 - --port 8080 - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8080/health"] - interval: 30s - timeout: 10s - retries: 3 - start_period: 60s - restart: unless-stopped - - # ═══════════════════════════════════════════════════════════════ - # Dashboard + API (no voice features) - # ═══════════════════════════════════════════════════════════════ - dashboard: - build: - context: ./dashboard - dockerfile: Dockerfile - container_name: dream-dashboard - ports: - - "3001:3001" - environment: - - VITE_API_URL=http://localhost:3002 - - VITE_VOICE_ENABLED=false - depends_on: - - api - restart: unless-stopped - - api: - build: - context:
./dashboard-api - dockerfile: Dockerfile - container_name: dream-api - ports: - - "3002:3002" - environment: - - LLM_URL=http://llama:8080 - - VOICE_ENABLED=false - depends_on: - - llama - restart: unless-stopped diff --git a/dream-server/compose/docker-compose.pro.yml b/dream-server/compose/docker-compose.pro.yml deleted file mode 100644 index 80650e345..000000000 --- a/dream-server/compose/docker-compose.pro.yml +++ /dev/null @@ -1,184 +0,0 @@ -# Dream Server — Pro Tier -# 24GB+ VRAM — 32B models, full voice stack -# Usage: docker compose -f docker-compose.pro.yml up -d - -services: - # ═══════════════════════════════════════════════════════════════ - # LLM — Qwen2.5-32B-Instruct-AWQ - # ═══════════════════════════════════════════════════════════════ - vllm: - image: vllm/vllm-openai:v0.15.1 - runtime: nvidia - container_name: dream-vllm - environment: - - NVIDIA_VISIBLE_DEVICES=all - - VLLM_ATTENTION_BACKEND=FLASHINFER - volumes: - - ${HF_HOME:-~/.cache/huggingface}:/root/.cache/huggingface - ports: - - "8000:8000" - command: > - --model Qwen/Qwen2.5-32B-Instruct-AWQ - ${VLLM_QUANTIZATION:+--quantization $VLLM_QUANTIZATION} - --max-model-len 32768 - --gpu-memory-utilization 0.90 - --enable-auto-tool-choice - --tool-call-parser hermes - --served-model-name gpt-4o - --trust-remote-code - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: 1 - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8000/health"] - interval: 30s - timeout: 10s - retries: 5 - start_period: 300s - restart: unless-stopped - - #
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # STT โ€” Whisper Large v3 - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - whisper: - image: fedirz/faster-whisper-server:latest-cuda - runtime: nvidia - container_name: dream-whisper - environment: - - WHISPER__MODEL=Systran/faster-whisper-large-v3 - - WHISPER__DEVICE=cuda - - NVIDIA_VISIBLE_DEVICES=all - ports: - - "8001:8000" - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: 1 - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8000/health"] - interval: 30s - timeout: 10s - retries: 3 - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # TTS โ€” Kokoro (GPU-accelerated) - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - kokoro: - image: ghcr.io/remsky/kokoro-fastapi:v0.6.2-gpu - runtime: nvidia - container_name: dream-kokoro - environment: - - NVIDIA_VISIBLE_DEVICES=all - ports: - - "8880:8880" - volumes: - - kokoro-cache:/app/cache - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: 1 - capabilities: [gpu] - healthcheck: - test: ["CMD", "curl", "-f", "http://localhost:8880/health"] - interval: 30s - timeout: 10s - retries: 3 - restart: unless-stopped - - # 
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # LiveKit โ€” WebRTC Server - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - livekit: - image: livekit/livekit-server:latest - container_name: dream-livekit - ports: - - "7880:7880" # HTTP - - "7881:7881" # WebRTC TCP - - "7882:7882/udp" # WebRTC UDP - command: > - --config /livekit.yaml - volumes: - - ./livekit.yaml:/livekit.yaml:ro - healthcheck: - test: ["CMD", "wget", "--spider", "-q", "http://localhost:7880"] - interval: 10s - timeout: 5s - retries: 3 - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # Voice Agent โ€” Connects LLM + STT + TTS via LiveKit - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - voice-agent: - build: - context: ./agents/voice - dockerfile: Dockerfile - container_name: dream-voice-agent - environment: - - LIVEKIT_URL=ws://livekit:7880 - - LIVEKIT_API_KEY=${LIVEKIT_API_KEY:?LIVEKIT_API_KEY must be set} - - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET:?LIVEKIT_API_SECRET must be set} - - LLM_BASE_URL=http://vllm:8000/v1 - - STT_BASE_URL=http://whisper:8000 - - TTS_BASE_URL=http://kokoro:8880 - depends_on: - vllm: - condition: service_healthy - whisper: - condition: service_healthy - kokoro: - condition: service_healthy - livekit: - condition: service_healthy - restart: unless-stopped - - # 
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # Dashboard โ€” Web UI - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - dashboard: - build: - context: ./dashboard - dockerfile: Dockerfile - container_name: dream-dashboard - ports: - - "3001:3001" - environment: - - VITE_API_URL=http://localhost:3002 - - VITE_LIVEKIT_URL=ws://localhost:7880 - depends_on: - - api - restart: unless-stopped - - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - # API โ€” Backend for Dashboard - # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - api: - build: - context: ./dashboard-api - dockerfile: Dockerfile - container_name: dream-api - ports: - - "3002:3002" - environment: - - VLLM_URL=http://vllm:8000 - - WHISPER_URL=http://whisper:8000 - - KOKORO_URL=http://kokoro:8880 - - LIVEKIT_URL=ws://livekit:7880 - - LIVEKIT_API_KEY=${LIVEKIT_API_KEY:?LIVEKIT_API_KEY must be set} - - LIVEKIT_API_SECRET=${LIVEKIT_API_SECRET:?LIVEKIT_API_SECRET must be set} - depends_on: - - vllm - restart: unless-stopped - -volumes: - kokoro-cache: diff --git a/dream-server/compose/grafana/dashboards/dashboard.yml b/dream-server/compose/grafana/dashboards/dashboard.yml deleted file mode 100644 index 9a4e56eee..000000000 --- a/dream-server/compose/grafana/dashboards/dashboard.yml +++ /dev/null @@ -1,11 +0,0 @@ -apiVersion: 1 - -providers: - - name: 'Dream Server' - orgId: 1 - folder: '' - type: file - disableDeletion: false - editable: true - 
options: - path: /etc/grafana/provisioning/dashboards diff --git a/dream-server/compose/grafana/dashboards/dream-server.json b/dream-server/compose/grafana/dashboards/dream-server.json deleted file mode 100644 index 4ad72df2e..000000000 --- a/dream-server/compose/grafana/dashboards/dream-server.json +++ /dev/null @@ -1,580 +0,0 @@ -{ - "annotations": { - "list": [] - }, - "editable": true, - "fiscalYearStartMonth": 0, - "graphTooltip": 0, - "id": null, - "links": [], - "panels": [ - { - "collapsed": false, - "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 }, - "id": 100, - "panels": [], - "title": "vLLM Inference", - "type": "row" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "fieldConfig": { - "defaults": { - "color": { "mode": "palette-classic" }, - "custom": { - "axisBorderShow": false, - "axisCenteredZero": false, - "axisColorMode": "text", - "axisLabel": "", - "axisPlacement": "auto", - "barAlignment": 0, - "drawStyle": "line", - "fillOpacity": 10, - "gradientMode": "none", - "hideFrom": { "legend": false, "tooltip": false, "viz": false }, - "insertNulls": false, - "lineInterpolation": "linear", - "lineWidth": 1, - "pointSize": 5, - "scaleDistribution": { "type": "linear" }, - "showPoints": "never", - "spanNulls": false, - "stacking": { "group": "A", "mode": "none" }, - "thresholdsStyle": { "mode": "off" } - }, - "mappings": [], - "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }] }, - "unit": "reqps" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 12, "x": 0, "y": 1 }, - "id": 1, - "options": { - "legend": { "calcs": ["mean", "max"], "displayMode": "list", "placement": "bottom", "showLegend": true }, - "tooltip": { "mode": "multi", "sort": "none" } - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "rate(vllm:num_requests_total[1m])", - "legendFormat": "Requests/sec", - "refId": "A" - } - ], - "title": "Request Rate", - "type": "timeseries" - }, - { 
- "datasource": { "type": "prometheus", "uid": "prometheus" }, - "fieldConfig": { - "defaults": { - "color": { "mode": "palette-classic" }, - "custom": { - "axisBorderShow": false, - "axisCenteredZero": false, - "axisColorMode": "text", - "axisLabel": "", - "axisPlacement": "auto", - "barAlignment": 0, - "drawStyle": "line", - "fillOpacity": 10, - "gradientMode": "none", - "hideFrom": { "legend": false, "tooltip": false, "viz": false }, - "insertNulls": false, - "lineInterpolation": "linear", - "lineWidth": 1, - "pointSize": 5, - "scaleDistribution": { "type": "linear" }, - "showPoints": "never", - "spanNulls": false, - "stacking": { "group": "A", "mode": "none" }, - "thresholdsStyle": { "mode": "off" } - }, - "mappings": [], - "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }] }, - "unit": "s" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 12, "x": 12, "y": 1 }, - "id": 2, - "options": { - "legend": { "calcs": ["mean", "p95"], "displayMode": "list", "placement": "bottom", "showLegend": true }, - "tooltip": { "mode": "multi", "sort": "none" } - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "histogram_quantile(0.5, rate(vllm:time_to_first_token_seconds_bucket[5m]))", - "legendFormat": "TTFT p50", - "refId": "A" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "histogram_quantile(0.95, rate(vllm:time_to_first_token_seconds_bucket[5m]))", - "legendFormat": "TTFT p95", - "refId": "B" - } - ], - "title": "Time to First Token", - "type": "timeseries" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "fieldConfig": { - "defaults": { - "color": { "mode": "thresholds" }, - "mappings": [], - "max": 100, - "min": 0, - "thresholds": { - "mode": "absolute", - "steps": [ - { "color": "green", "value": null }, - { "color": "yellow", "value": 70 }, - { "color": "red", "value": 90 } - ] - }, - "unit": "percent" - }, - "overrides": [] - 
}, - "gridPos": { "h": 8, "w": 6, "x": 0, "y": 9 }, - "id": 3, - "options": { - "minVizHeight": 75, - "minVizWidth": 75, - "orientation": "auto", - "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, - "showThresholdLabels": false, - "showThresholdMarkers": true - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "vllm:gpu_cache_usage_perc * 100", - "legendFormat": "GPU Cache", - "refId": "A" - } - ], - "title": "GPU KV Cache Usage", - "type": "gauge" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "fieldConfig": { - "defaults": { - "color": { "mode": "thresholds" }, - "mappings": [], - "thresholds": { - "mode": "absolute", - "steps": [ - { "color": "green", "value": null }, - { "color": "yellow", "value": 5 }, - { "color": "red", "value": 10 } - ] - }, - "unit": "none" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 6, "x": 6, "y": 9 }, - "id": 4, - "options": { - "colorMode": "value", - "graphMode": "area", - "justifyMode": "auto", - "orientation": "auto", - "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, - "textMode": "auto" - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "vllm:num_requests_waiting", - "legendFormat": "Waiting", - "refId": "A" - } - ], - "title": "Requests Waiting", - "type": "stat" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "fieldConfig": { - "defaults": { - "color": { "mode": "thresholds" }, - "mappings": [], - "thresholds": { - "mode": "absolute", - "steps": [{ "color": "blue", "value": null }] - }, - "unit": "none" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 6, "x": 12, "y": 9 }, - "id": 5, - "options": { - "colorMode": "value", - "graphMode": "area", - "justifyMode": "auto", - "orientation": "auto", - "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, - "textMode": "auto" - }, - "targets": [ 
- { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "vllm:num_requests_running", - "legendFormat": "Running", - "refId": "A" - } - ], - "title": "Requests Running", - "type": "stat" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "fieldConfig": { - "defaults": { - "color": { "mode": "palette-classic" }, - "custom": { - "axisBorderShow": false, - "axisCenteredZero": false, - "axisColorMode": "text", - "axisLabel": "", - "axisPlacement": "auto", - "barAlignment": 0, - "drawStyle": "line", - "fillOpacity": 10, - "gradientMode": "none", - "hideFrom": { "legend": false, "tooltip": false, "viz": false }, - "insertNulls": false, - "lineInterpolation": "linear", - "lineWidth": 1, - "pointSize": 5, - "scaleDistribution": { "type": "linear" }, - "showPoints": "never", - "spanNulls": false, - "stacking": { "group": "A", "mode": "none" }, - "thresholdsStyle": { "mode": "off" } - }, - "mappings": [], - "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }] }, - "unit": "short" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 6, "x": 18, "y": 9 }, - "id": 6, - "options": { - "legend": { "calcs": ["mean"], "displayMode": "list", "placement": "bottom", "showLegend": true }, - "tooltip": { "mode": "multi", "sort": "none" } - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "rate(vllm:generation_tokens_total[1m])", - "legendFormat": "Tokens/sec", - "refId": "A" - } - ], - "title": "Token Generation Rate", - "type": "timeseries" - }, - { - "collapsed": false, - "gridPos": { "h": 1, "w": 24, "x": 0, "y": 17 }, - "id": 101, - "panels": [], - "title": "System Resources", - "type": "row" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "description": "Requires node_exporter on host", - "fieldConfig": { - "defaults": { - "color": { "mode": "palette-classic" }, - "custom": { - "axisBorderShow": false, - "axisCenteredZero": false, - 
"axisColorMode": "text", - "axisLabel": "", - "axisPlacement": "auto", - "barAlignment": 0, - "drawStyle": "line", - "fillOpacity": 10, - "gradientMode": "none", - "hideFrom": { "legend": false, "tooltip": false, "viz": false }, - "insertNulls": false, - "lineInterpolation": "linear", - "lineWidth": 1, - "pointSize": 5, - "scaleDistribution": { "type": "linear" }, - "showPoints": "never", - "spanNulls": false, - "stacking": { "group": "A", "mode": "none" }, - "thresholdsStyle": { "mode": "off" } - }, - "mappings": [], - "max": 100, - "min": 0, - "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }] }, - "unit": "percent" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 12, "x": 0, "y": 18 }, - "id": 7, - "options": { - "legend": { "calcs": ["mean", "max"], "displayMode": "list", "placement": "bottom", "showLegend": true }, - "tooltip": { "mode": "multi", "sort": "none" } - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "100 - (avg by(instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)", - "legendFormat": "CPU Usage", - "refId": "A" - } - ], - "title": "CPU Usage", - "type": "timeseries" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "description": "Requires node_exporter on host", - "fieldConfig": { - "defaults": { - "color": { "mode": "palette-classic" }, - "custom": { - "axisBorderShow": false, - "axisCenteredZero": false, - "axisColorMode": "text", - "axisLabel": "", - "axisPlacement": "auto", - "barAlignment": 0, - "drawStyle": "line", - "fillOpacity": 10, - "gradientMode": "none", - "hideFrom": { "legend": false, "tooltip": false, "viz": false }, - "insertNulls": false, - "lineInterpolation": "linear", - "lineWidth": 1, - "pointSize": 5, - "scaleDistribution": { "type": "linear" }, - "showPoints": "never", - "spanNulls": false, - "stacking": { "group": "A", "mode": "none" }, - "thresholdsStyle": { "mode": "off" } - }, - "mappings": [], 
- "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }] }, - "unit": "bytes" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 12, "x": 12, "y": 18 }, - "id": 8, - "options": { - "legend": { "calcs": ["mean", "max"], "displayMode": "list", "placement": "bottom", "showLegend": true }, - "tooltip": { "mode": "multi", "sort": "none" } - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes", - "legendFormat": "Used Memory", - "refId": "A" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "node_memory_MemTotal_bytes", - "legendFormat": "Total Memory", - "refId": "B" - } - ], - "title": "Memory Usage", - "type": "timeseries" - }, - { - "collapsed": false, - "gridPos": { "h": 1, "w": 24, "x": 0, "y": 26 }, - "id": 102, - "panels": [], - "title": "GPU (requires dcgm-exporter)", - "type": "row" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "description": "Requires dcgm-exporter on host", - "fieldConfig": { - "defaults": { - "color": { "mode": "palette-classic" }, - "custom": { - "axisBorderShow": false, - "axisCenteredZero": false, - "axisColorMode": "text", - "axisLabel": "", - "axisPlacement": "auto", - "barAlignment": 0, - "drawStyle": "line", - "fillOpacity": 10, - "gradientMode": "none", - "hideFrom": { "legend": false, "tooltip": false, "viz": false }, - "insertNulls": false, - "lineInterpolation": "linear", - "lineWidth": 1, - "pointSize": 5, - "scaleDistribution": { "type": "linear" }, - "showPoints": "never", - "spanNulls": false, - "stacking": { "group": "A", "mode": "none" }, - "thresholdsStyle": { "mode": "off" } - }, - "mappings": [], - "max": 100, - "min": 0, - "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }] }, - "unit": "percent" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 8, "x": 0, "y": 27 }, - "id": 9, - "options": 
{ - "legend": { "calcs": ["mean", "max"], "displayMode": "list", "placement": "bottom", "showLegend": true }, - "tooltip": { "mode": "multi", "sort": "none" } - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "DCGM_FI_DEV_GPU_UTIL", - "legendFormat": "GPU {{gpu}}", - "refId": "A" - } - ], - "title": "GPU Utilization", - "type": "timeseries" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "description": "Requires dcgm-exporter on host", - "fieldConfig": { - "defaults": { - "color": { "mode": "palette-classic" }, - "custom": { - "axisBorderShow": false, - "axisCenteredZero": false, - "axisColorMode": "text", - "axisLabel": "", - "axisPlacement": "auto", - "barAlignment": 0, - "drawStyle": "line", - "fillOpacity": 10, - "gradientMode": "none", - "hideFrom": { "legend": false, "tooltip": false, "viz": false }, - "insertNulls": false, - "lineInterpolation": "linear", - "lineWidth": 1, - "pointSize": 5, - "scaleDistribution": { "type": "linear" }, - "showPoints": "never", - "spanNulls": false, - "stacking": { "group": "A", "mode": "none" }, - "thresholdsStyle": { "mode": "off" } - }, - "mappings": [], - "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }] }, - "unit": "bytes" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 8, "x": 8, "y": 27 }, - "id": 10, - "options": { - "legend": { "calcs": ["mean", "max"], "displayMode": "list", "placement": "bottom", "showLegend": true }, - "tooltip": { "mode": "multi", "sort": "none" } - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "DCGM_FI_DEV_FB_USED * 1024 * 1024", - "legendFormat": "GPU {{gpu}} Used", - "refId": "A" - }, - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "DCGM_FI_DEV_FB_FREE * 1024 * 1024", - "legendFormat": "GPU {{gpu}} Free", - "refId": "B" - } - ], - "title": "GPU Memory", - "type": "timeseries" - }, - { - "datasource": { 
"type": "prometheus", "uid": "prometheus" }, - "description": "Requires dcgm-exporter on host", - "fieldConfig": { - "defaults": { - "color": { "mode": "palette-classic" }, - "custom": { - "axisBorderShow": false, - "axisCenteredZero": false, - "axisColorMode": "text", - "axisLabel": "", - "axisPlacement": "auto", - "barAlignment": 0, - "drawStyle": "line", - "fillOpacity": 10, - "gradientMode": "none", - "hideFrom": { "legend": false, "tooltip": false, "viz": false }, - "insertNulls": false, - "lineInterpolation": "linear", - "lineWidth": 1, - "pointSize": 5, - "scaleDistribution": { "type": "linear" }, - "showPoints": "never", - "spanNulls": false, - "stacking": { "group": "A", "mode": "none" }, - "thresholdsStyle": { "mode": "off" } - }, - "mappings": [], - "thresholds": { "mode": "absolute", "steps": [{ "color": "green", "value": null }] }, - "unit": "celsius" - }, - "overrides": [] - }, - "gridPos": { "h": 8, "w": 8, "x": 16, "y": 27 }, - "id": 11, - "options": { - "legend": { "calcs": ["mean", "max"], "displayMode": "list", "placement": "bottom", "showLegend": true }, - "tooltip": { "mode": "multi", "sort": "none" } - }, - "targets": [ - { - "datasource": { "type": "prometheus", "uid": "prometheus" }, - "expr": "DCGM_FI_DEV_GPU_TEMP", - "legendFormat": "GPU {{gpu}} Temp", - "refId": "A" - } - ], - "title": "GPU Temperature", - "type": "timeseries" - } - ], - "refresh": "10s", - "schemaVersion": 39, - "tags": ["dream-server", "vllm", "inference"], - "templating": { "list": [] }, - "time": { "from": "now-1h", "to": "now" }, - "timepicker": {}, - "timezone": "browser", - "title": "Dream Server Overview", - "uid": "dream-server-overview", - "version": 1 -} diff --git a/dream-server/compose/grafana/datasources/prometheus.yml b/dream-server/compose/grafana/datasources/prometheus.yml deleted file mode 100644 index bb009bb21..000000000 --- a/dream-server/compose/grafana/datasources/prometheus.yml +++ /dev/null @@ -1,9 +0,0 @@ -apiVersion: 1 - -datasources: - - name: 
Prometheus - type: prometheus - access: proxy - url: http://prometheus:9090 - isDefault: true - editable: false diff --git a/dream-server/compose/livekit-cluster.yaml.template b/dream-server/compose/livekit-cluster.yaml.template deleted file mode 100644 index 4fc0bcda3..000000000 --- a/dream-server/compose/livekit-cluster.yaml.template +++ /dev/null @@ -1,39 +0,0 @@ -port: 7880 -rtc: - port_range_start: 50000 - port_range_end: 50100 - use_external_ip: true - tcp_port: 7881 - udp_port: 7882 - -# Production keys — set via LIVEKIT_API_KEY and LIVEKIT_API_SECRET environment variables -keys: - ${LIVEKIT_API_KEY}: ${LIVEKIT_API_SECRET} - -logging: - level: info - json: true - -# Limits for cluster tier -limit: - num_tracks: 100 - bytes_per_sec: 100000000 # 100 MB/s total - subscription_limit_video: 50 - subscription_limit_audio: 100 - -# Room settings -room: - auto_create: true - empty_timeout: 300 # 5 min - max_participants: 50 # per room - -# Turn server (use external TURN for production) -# turn: -# enabled: true -# domain: turn.example.com -# tls_port: 443 - -# Webhook for analytics (optional) -# webhook: -# urls: -# - https://your-webhook-endpoint.com/livekit diff --git a/dream-server/compose/livekit-entrypoint.sh b/dream-server/compose/livekit-entrypoint.sh deleted file mode 100644 index 4268e47db..000000000 --- a/dream-server/compose/livekit-entrypoint.sh +++ /dev/null @@ -1,22 +0,0 @@ -#!/bin/bash -# LiveKit Server Entrypoint with Template Substitution -# Replaces environment variables in livekit.yaml.template → livekit.yaml - -set -e - -# Required environment variables -if [[ -z "${LIVEKIT_API_KEY}" ]]; then - echo "ERROR: LIVEKIT_API_KEY must be set" >&2 - exit 1 -fi - -if [[ -z "${LIVEKIT_API_SECRET}" ]]; then - echo "ERROR: LIVEKIT_API_SECRET must be set" >&2 - exit 1 -fi - -# Substitute environment variables in template -envsubst < /etc/livekit.yaml.template > /etc/livekit.yaml - -# Run LiveKit with the generated config -exec livekit-server --config 
/etc/livekit.yaml "$@" diff --git a/dream-server/compose/livekit.yaml b/dream-server/compose/livekit.yaml deleted file mode 100644 index b5781e170..000000000 --- a/dream-server/compose/livekit.yaml +++ /dev/null @@ -1,30 +0,0 @@ -# LiveKit Server Configuration for Dream Server -# https://docs.livekit.io/home/self-hosting/vm/#config -# -# SECURITY: API keys are set via LIVEKIT_KEYS environment variable -# in docker-compose, NOT in this file. Never commit secrets here. - -port: 7880 -rtc: - port_range_start: 50000 - port_range_end: 60000 - tcp_port: 7881 - use_external_ip: false - -# Keys are injected via LIVEKIT_KEYS environment variable -# Do not add a 'keys:' section here - it will conflict with env var - -logging: - level: info - pion_level: warn - -room: - enabled_codecs: - - mime: audio/opus - - mime: audio/red - max_participants: 10 - empty_timeout: 300 - departure_timeout: 20 - -turn: - enabled: false diff --git a/dream-server/compose/prometheus.yml b/dream-server/compose/prometheus.yml deleted file mode 100644 index 61de110e9..000000000 --- a/dream-server/compose/prometheus.yml +++ /dev/null @@ -1,28 +0,0 @@ -# Prometheus Configuration — Dream Server Cluster -# Scrapes metrics from vLLM, Whisper, and system - -global: - scrape_interval: 15s - evaluation_interval: 15s - -scrape_configs: - # vLLM metrics - - job_name: 'vllm' - static_configs: - - targets: ['vllm:8000'] - metrics_path: /metrics - - # Node exporter (if installed on host) - - job_name: 'node' - static_configs: - - targets: ['host.docker.internal:9100'] - - # NVIDIA GPU metrics (dcgm-exporter) - - job_name: 'gpu' - static_configs: - - targets: ['host.docker.internal:9400'] - - # Prometheus self-monitoring - - job_name: 'prometheus' - static_configs: - - targets: ['localhost:9090'] diff --git a/dream-server/config/backends/amd.json b/dream-server/config/backends/amd.json new file mode 100644 index 000000000..f444da7bf --- /dev/null +++ b/dream-server/config/backends/amd.json @@ -0,0 +1,9 @@ +{ + 
"id": "amd", + "llm_engine": "llama-server", + "service_name": "llama-server", + "public_api_port": 8080, + "public_health_url": "http://localhost:8080/health", + "provider_name": "local-ollama", + "provider_url": "http://llama-server:8080/v1" +} diff --git a/dream-server/config/backends/apple.json b/dream-server/config/backends/apple.json new file mode 100644 index 000000000..2a4cfd3f8 --- /dev/null +++ b/dream-server/config/backends/apple.json @@ -0,0 +1,9 @@ +{ + "id": "apple", + "llm_engine": "llama-server", + "service_name": "llama-server", + "public_api_port": 8080, + "public_health_url": "http://localhost:8080/health", + "provider_name": "local-mlx", + "provider_url": "http://llama-server:8080/v1" +} diff --git a/dream-server/config/backends/cpu.json b/dream-server/config/backends/cpu.json new file mode 100644 index 000000000..c4e2ca5ff --- /dev/null +++ b/dream-server/config/backends/cpu.json @@ -0,0 +1,9 @@ +{ + "id": "cpu", + "llm_engine": "llama-server", + "service_name": "llama-server", + "public_api_port": 8080, + "public_health_url": "http://localhost:8080/health", + "provider_name": "local-llama", + "provider_url": "http://llama-server:8080/v1" +} diff --git a/dream-server/config/backends/nvidia.json b/dream-server/config/backends/nvidia.json new file mode 100644 index 000000000..446ed6a74 --- /dev/null +++ b/dream-server/config/backends/nvidia.json @@ -0,0 +1,9 @@ +{ + "id": "nvidia", + "llm_engine": "llama-server", + "service_name": "llama-server", + "public_api_port": 8080, + "public_health_url": "http://localhost:8080/health", + "provider_name": "local-llama", + "provider_url": "http://llama-server:8080/v1" +} diff --git a/dream-server/config/capability-profile.schema.json b/dream-server/config/capability-profile.schema.json new file mode 100644 index 000000000..f452f8f35 --- /dev/null +++ b/dream-server/config/capability-profile.schema.json @@ -0,0 +1,117 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": 
"https://dream-server.dev/schema/capability-profile.v1.json", + "title": "Dream Server Capability Profile v1", + "type": "object", + "required": [ + "version", + "platform", + "gpu", + "runtime", + "compose", + "tier", + "hardware_class" + ], + "properties": { + "version": { + "const": "1" + }, + "platform": { + "type": "object", + "required": ["id", "family"], + "properties": { + "id": { + "type": "string", + "enum": ["linux", "wsl", "macos", "windows", "unknown"] + }, + "family": { + "type": "string", + "enum": ["linux", "windows", "darwin", "unknown"] + } + }, + "additionalProperties": false + }, + "gpu": { + "type": "object", + "required": ["vendor", "name", "memory_type", "count", "vram_mb"], + "properties": { + "vendor": { + "type": "string", + "enum": ["nvidia", "amd", "apple", "none", "unknown"] + }, + "name": { + "type": "string" + }, + "memory_type": { + "type": "string", + "enum": ["discrete", "unified", "none", "unknown"] + }, + "count": { + "type": "integer", + "minimum": 0 + }, + "vram_mb": { + "type": "integer", + "minimum": 0 + } + }, + "additionalProperties": false + }, + "runtime": { + "type": "object", + "required": ["llm_backend", "llm_health_url", "llm_api_port"], + "properties": { + "llm_backend": { + "type": "string", + "enum": ["nvidia", "amd", "apple", "cpu"] + }, + "llm_health_url": { + "type": "string" + }, + "llm_api_port": { + "type": "integer", + "minimum": 1 + } + }, + "additionalProperties": false + }, + "compose": { + "type": "object", + "required": ["overlays"], + "properties": { + "overlays": { + "type": "array", + "items": { + "type": "string" + } + } + }, + "additionalProperties": false + }, + "tier": { + "type": "object", + "required": ["recommended"], + "properties": { + "recommended": { + "type": "string", + "enum": ["T1", "T2", "T3", "T4", "SH_COMPACT", "SH_LARGE"] + } + }, + "additionalProperties": false + }, + "hardware_class": { + "type": "object", + "required": ["id", "label"], + "properties": { + "id": { + "type": 
"string" + }, + "label": { + "type": "string" + } + }, + "additionalProperties": false + } + }, + "additionalProperties": false +} diff --git a/dream-server/config/gpu-database.json b/dream-server/config/gpu-database.json new file mode 100644 index 000000000..6240101ac --- /dev/null +++ b/dream-server/config/gpu-database.json @@ -0,0 +1,275 @@ +{ + "schema_version": "dream.hardware.v1", + "_attribution": { + "gpu_bandwidth_data": "llmfit by Alex Jones (MIT) — github.com/AlexsJones/llmfit", + "note": "GPU bandwidth numbers sourced from the llmfit project's hardware database. Thank you to Alex Jones and the llmfit contributors for maintaining this excellent open-source resource." + }, + "known_gpus": [ + { + "id": "rtx_pro_6000_blackwell", + "match": { + "device_ids": [], + "name_patterns": ["RTX PRO 6000", "Blackwell"] + }, + "specs": { + "label": "NVIDIA RTX PRO 6000 Blackwell Workstation Edition", + "vendor": "nvidia", + "architecture": "blackwell", + "memory_type": "discrete", + "memory_mb": 96000, + "memory_source": "vram", + "bandwidth_gbps": 1792 + }, + "recommended": { + "backend": "nvidia", + "tier": "NV_ULTRA" + } + }, + { + "id": "strix_halo_395", + "match": { + "device_ids": ["0x1586"], + "name_patterns": ["Radeon 8060S", "RYZEN AI MAX+ 395", "Strix Halo"] + }, + "specs": { + "label": "AMD Ryzen AI MAX+ 395 (Strix Halo)", + "vendor": "amd", + "architecture": "rdna-3.5", + "memory_type": "unified", + "memory_mb": 98304, + "memory_source": "ram", + "bandwidth_gbps": 256, + "compute_units": 40 + }, + "recommended": { + "backend": "amd", + "tier": "SH_LARGE" + } + }, + { + "id": "strix_halo_390", + "match": { + "device_ids": ["0x1586"], + "name_patterns": ["RYZEN AI MAX 390", "Radeon 8050S"] + }, + "specs": { + "label": "AMD Ryzen AI MAX 390 (Strix Halo)", + "vendor": "amd", + "architecture": "rdna-3.5", + "memory_type": "unified", + "memory_mb": 65536, + "memory_source": "ram", + "bandwidth_gbps": 256, + "compute_units": 32 + }, + "recommended": { + 
"backend": "amd", + "tier": "SH_COMPACT" + } + }, + { + "id": "strix_halo_385", + "match": { + "device_ids": ["0x1586"], + "name_patterns": ["RYZEN AI MAX+ 385"] + }, + "specs": { + "label": "AMD Ryzen AI MAX+ 385 (Strix Halo)", + "vendor": "amd", + "architecture": "rdna-3.5", + "memory_type": "unified", + "memory_mb": 98304, + "memory_source": "ram", + "bandwidth_gbps": 256, + "compute_units": 32 + }, + "recommended": { + "backend": "amd", + "tier": "SH_LARGE" + } + } + ], + "known_gpu_bandwidth": { + "nvidia": { + "RTX PRO 6000": 1792, + "RTX 5090": 1792, + "RTX 5080": 960, + "RTX 5070 Ti": 896, + "RTX 5070": 672, + "RTX 5060 Ti": 448, + "RTX 5060": 256, + "RTX 4090": 1008, + "RTX 4080 Super": 736, + "RTX 4080": 717, + "RTX 4070 Ti Super": 672, + "RTX 4070 Ti": 504, + "RTX 4070 Super": 504, + "RTX 4070": 504, + "RTX 4060 Ti": 288, + "RTX 4060": 272, + "RTX 3090 Ti": 1008, + "RTX 3090": 936, + "RTX 3080 Ti": 912, + "RTX 3080": 760, + "RTX 3070 Ti": 608, + "RTX 3070": 448, + "RTX 3060 Ti": 448, + "RTX 3060": 360, + "RTX 2080 Ti": 616, + "RTX 2080 Super": 496, + "RTX 2080": 448, + "RTX 2070 Super": 448, + "RTX 2070": 448, + "RTX 2060 Super": 448, + "RTX 2060": 336, + "GTX 1660 Ti": 288, + "GTX 1660 Super": 336, + "GTX 1660": 192, + "GTX 1650 Super": 192, + "GTX 1650": 128, + "H200": 4800, + "H100 SXM": 3350, + "H100 PCIe": 2039, + "A100 SXM": 2039, + "A100 PCIe": 1555, + "V100 SXM": 900, + "V100": 897, + "L40S": 864, + "L40": 864, + "A6000": 768, + "A5000": 768, + "A10G": 600, + "A10": 600, + "A4000": 448, + "T4": 320, + "L4": 300 + }, + "amd": { + "RX 9070 XT": 624, + "RX 9070": 488, + "RX 7900 XTX": 960, + "RX 7900 XT": 800, + "RX 7900 GRE": 576, + "RX 7800 XT": 624, + "RX 7700 XT": 432, + "RX 7600": 288, + "RX 6950 XT": 576, + "RX 6900 XT": 512, + "RX 6800 XT": 512, + "RX 6800": 512, + "RX 6700 XT": 384, + "RX 6600 XT": 256, + "RX 6600": 224, + "MI300X": 5300, + "MI300": 5300, + "MI250X": 3277, + "MI250": 3277, + "MI210": 1638, + "MI100": 1229 + }, + "apple": { + 
"M4 Ultra": 819, + "M4 Max": 546, + "M4 Pro": 273, + "M4": 120, + "M3 Ultra": 800, + "M3 Max": 400, + "M3 Pro": 150, + "M3": 100, + "M2 Ultra": 800, + "M2 Max": 400, + "M2 Pro": 200, + "M2": 100, + "M1 Ultra": 800, + "M1 Max": 400, + "M1 Pro": 200, + "M1": 68 + } + }, + "heuristic_classes": [ + { + "id": "nvidia_ultra", + "match": { "vendor": "nvidia", "memory_type": "discrete", "min_vram_mb": 92160 }, + "recommended": { "backend": "nvidia", "tier": "NV_ULTRA" } + }, + { + "id": "nvidia_enterprise", + "match": { "vendor": "nvidia", "memory_type": "discrete", "min_vram_mb": 40960 }, + "recommended": { "backend": "nvidia", "tier": "T4" } + }, + { + "id": "nvidia_pro", + "match": { "vendor": "nvidia", "memory_type": "discrete", "min_vram_mb": 20480 }, + "recommended": { "backend": "nvidia", "tier": "T3" } + }, + { + "id": "nvidia_prosumer", + "match": { "vendor": "nvidia", "memory_type": "discrete", "min_vram_mb": 12288 }, + "recommended": { "backend": "nvidia", "tier": "T2" } + }, + { + "id": "nvidia_entry", + "match": { "vendor": "nvidia", "memory_type": "discrete", "min_vram_mb": 0 }, + "recommended": { "backend": "nvidia", "tier": "T1" } + }, + { + "id": "amd_unified_large", + "match": { "vendor": "amd", "memory_type": "unified", "min_ram_mb": 92160 }, + "recommended": { "backend": "amd", "tier": "SH_LARGE" } + }, + { + "id": "amd_unified_compact", + "match": { "vendor": "amd", "memory_type": "unified", "min_ram_mb": 0 }, + "recommended": { "backend": "amd", "tier": "SH_COMPACT" } + }, + { + "id": "amd_discrete_large", + "match": { "vendor": "amd", "memory_type": "discrete", "min_vram_mb": 20480 }, + "recommended": { "backend": "amd", "tier": "T3" } + }, + { + "id": "amd_discrete_medium", + "match": { "vendor": "amd", "memory_type": "discrete", "min_vram_mb": 12288 }, + "recommended": { "backend": "amd", "tier": "T2" } + }, + { + "id": "amd_discrete_entry", + "match": { "vendor": "amd", "memory_type": "discrete", "min_vram_mb": 0 }, + "recommended": { "backend": 
"amd", "tier": "T1" } + }, + { + "id": "apple_ultra", + "match": { "vendor": "apple", "memory_type": "unified", "min_ram_mb": 131072 }, + "recommended": { "backend": "apple", "tier": "T4" } + }, + { + "id": "apple_max", + "match": { "vendor": "apple", "memory_type": "unified", "min_ram_mb": 65536 }, + "recommended": { "backend": "apple", "tier": "T3" } + }, + { + "id": "apple_pro", + "match": { "vendor": "apple", "memory_type": "unified", "min_ram_mb": 32768 }, + "recommended": { "backend": "apple", "tier": "T2" } + }, + { + "id": "apple_base", + "match": { "vendor": "apple", "memory_type": "unified", "min_ram_mb": 0 }, + "recommended": { "backend": "apple", "tier": "T1" } + }, + { + "id": "cpu_only", + "match": { "vendor": "none", "memory_type": "none", "min_ram_mb": 0 }, + "recommended": { "backend": "cpu", "tier": "T1" } + } + ], + "defaults": { + "bandwidth_gbps": { + "cuda": 220, + "rocm": 180, + "metal": 160, + "cpu_x86": 70, + "cpu_arm": 50 + } + } +} diff --git a/dream-server/config/gpu-database.schema.json b/dream-server/config/gpu-database.schema.json new file mode 100644 index 000000000..87d38a8bd --- /dev/null +++ b/dream-server/config/gpu-database.schema.json @@ -0,0 +1,138 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "dream.hardware.v1", + "title": "Dream Server GPU Database", + "description": "GPU knowledge base for hardware classification. 
Known GPUs with specs, bandwidth lookup table, and heuristic fallback classes.", + "type": "object", + "required": ["schema_version", "known_gpus", "known_gpu_bandwidth", "heuristic_classes", "defaults"], + "properties": { + "schema_version": { + "type": "string", + "const": "dream.hardware.v1" + }, + "_attribution": { + "type": "object", + "properties": { + "gpu_bandwidth_data": { "type": "string" }, + "note": { "type": "string" } + } + }, + "known_gpus": { + "type": "array", + "items": { "$ref": "#/$defs/known_gpu" } + }, + "known_gpu_bandwidth": { + "type": "object", + "properties": { + "nvidia": { "$ref": "#/$defs/bandwidth_map" }, + "amd": { "$ref": "#/$defs/bandwidth_map" }, + "apple": { "$ref": "#/$defs/bandwidth_map" } + }, + "additionalProperties": { "$ref": "#/$defs/bandwidth_map" } + }, + "heuristic_classes": { + "type": "array", + "items": { "$ref": "#/$defs/heuristic_class" } + }, + "defaults": { + "type": "object", + "required": ["bandwidth_gbps"], + "properties": { + "bandwidth_gbps": { + "type": "object", + "additionalProperties": { "type": "number", "minimum": 0 } + } + } + } + }, + "$defs": { + "known_gpu": { + "type": "object", + "required": ["id", "match", "specs", "recommended"], + "properties": { + "id": { + "type": "string", + "pattern": "^[a-z0-9_]+$", + "description": "Unique identifier for this known GPU entry" + }, + "match": { + "type": "object", + "properties": { + "device_ids": { + "type": "array", + "items": { "type": "string", "pattern": "^0x[0-9a-fA-F]{4}$" }, + "description": "PCI device IDs to match (exact)" + }, + "name_patterns": { + "type": "array", + "items": { "type": "string" }, + "description": "Substring patterns to match against GPU name (case-insensitive)" + } + }, + "anyOf": [ + { "required": ["device_ids"] }, + { "required": ["name_patterns"] } + ] + }, + "specs": { + "type": "object", + "required": ["label", "vendor", "architecture", "memory_type", "memory_mb", "bandwidth_gbps"], + "properties": { + "label": { "type": 
"string" }, + "vendor": { "enum": ["nvidia", "amd", "apple", "intel"] }, + "architecture": { "type": "string" }, + "memory_type": { "enum": ["discrete", "unified"] }, + "memory_mb": { "type": "integer", "minimum": 0 }, + "memory_source": { + "enum": ["vram", "ram"], + "description": "Where to read actual memory from. 'ram' = use system RAM (for unified memory GPUs where reported VRAM is unreliable)" + }, + "bandwidth_gbps": { "type": "number", "minimum": 0 }, + "compute_units": { "type": "integer", "minimum": 0 } + } + }, + "recommended": { + "$ref": "#/$defs/recommendation" + } + } + }, + "heuristic_class": { + "type": "object", + "required": ["id", "match", "recommended"], + "properties": { + "id": { + "type": "string", + "pattern": "^[a-z0-9_]+$" + }, + "match": { + "type": "object", + "properties": { + "vendor": { "enum": ["nvidia", "amd", "apple", "intel", "none"] }, + "memory_type": { "enum": ["discrete", "unified", "none"] }, + "min_vram_mb": { "type": "integer", "minimum": 0 }, + "min_ram_mb": { "type": "integer", "minimum": 0 } + } + }, + "recommended": { + "$ref": "#/$defs/recommendation" + } + } + }, + "recommendation": { + "type": "object", + "required": ["backend", "tier"], + "properties": { + "backend": { "enum": ["nvidia", "amd", "apple", "cpu"] }, + "tier": { + "type": "string", + "pattern": "^(T[1-4]|SH_LARGE|SH_COMPACT|NV_ULTRA)$" + } + } + }, + "bandwidth_map": { + "type": "object", + "additionalProperties": { "type": "number", "minimum": 0 }, + "description": "Map of GPU model name to bandwidth in GB/s" + } + } +} diff --git a/dream-server/config/hardware-classes.json b/dream-server/config/hardware-classes.json new file mode 100644 index 000000000..6fc3d4b83 --- /dev/null +++ b/dream-server/config/hardware-classes.json @@ -0,0 +1,155 @@ +{ + "version": "1", + "classes": [ + { + "id": "strix_unified_large", + "label": "Strix Halo (90GB+)", + "match": { + "platform_id": ["linux", "wsl"], + "gpu_vendor": ["amd"], + "memory_type": ["unified"], + 
"min_vram_mb": 92160 + }, + "recommended": { + "backend": "amd", + "tier": "SH_LARGE", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.amd.yml"] + } + }, + { + "id": "strix_unified", + "label": "Strix Unified", + "match": { + "platform_id": ["linux", "wsl"], + "gpu_vendor": ["amd"], + "memory_type": ["unified"], + "min_vram_mb": 65536 + }, + "recommended": { + "backend": "amd", + "tier": "SH_COMPACT", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.amd.yml"] + } + }, + { + "id": "nvidia_ultra", + "label": "NVIDIA Ultra (90GB+)", + "match": { + "platform_id": ["linux", "wsl"], + "gpu_vendor": ["nvidia"], + "memory_type": ["discrete"], + "min_vram_mb": 92160 + }, + "recommended": { + "backend": "nvidia", + "tier": "NV_ULTRA", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.nvidia.yml"] + } + }, + { + "id": "nvidia_enterprise", + "label": "NVIDIA Enterprise (40GB+)", + "match": { + "platform_id": ["linux", "wsl"], + "gpu_vendor": ["nvidia"], + "memory_type": ["discrete"], + "min_vram_mb": 40960 + }, + "recommended": { + "backend": "nvidia", + "tier": "T4", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.nvidia.yml"] + } + }, + { + "id": "nvidia_pro", + "label": "NVIDIA Pro (20GB+)", + "match": { + "platform_id": ["linux", "wsl"], + "gpu_vendor": ["nvidia"], + "memory_type": ["discrete"], + "min_vram_mb": 20480 + }, + "recommended": { + "backend": "nvidia", + "tier": "T3", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.nvidia.yml"] + } + }, + { + "id": "nvidia_prosumer", + "label": "NVIDIA Prosumer (12GB+)", + "match": { + "platform_id": ["linux", "wsl"], + "gpu_vendor": ["nvidia"], + "memory_type": ["discrete"], + "min_vram_mb": 12288 + }, + "recommended": { + "backend": "nvidia", + "tier": "T2", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.nvidia.yml"] + } + }, + { + "id": "nvidia_entry", + "label": "NVIDIA Entry", + "match": { + "platform_id": ["linux", 
"wsl"], + "gpu_vendor": ["nvidia"], + "memory_type": ["discrete"], + "min_vram_mb": 0 + }, + "recommended": { + "backend": "nvidia", + "tier": "T1", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.nvidia.yml"] + } + }, + { + "id": "apple_silicon_pro", + "label": "Apple Silicon Pro (36GB+)", + "match": { + "platform_id": ["macos"], + "gpu_vendor": ["apple"], + "memory_type": ["unified"], + "min_vram_mb": 36864 + }, + "recommended": { + "backend": "apple", + "tier": "T3", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.apple.yml"] + } + }, + { + "id": "apple_silicon", + "label": "Apple Silicon", + "match": { + "platform_id": ["macos"], + "gpu_vendor": ["apple"], + "memory_type": ["unified"], + "min_vram_mb": 8192 + }, + "recommended": { + "backend": "apple", + "tier": "T2", + "compose_overlays": ["docker-compose.base.yml", "docker-compose.apple.yml"] + } + }, + { + "id": "cpu_fallback", + "label": "CPU Fallback", + "match": { + "platform_id": ["linux", "wsl", "macos", "windows", "unknown"], + "gpu_vendor": ["none", "unknown"], + "memory_type": ["discrete", "unified", "none", "unknown"], + "min_vram_mb": 0 + }, + "recommended": { + "backend": "cpu", + "tier": "T1", + "compose_overlays": ["docker-compose.base.yml"] + } + } + ] +} diff --git a/dream-server/config/installer-sim-summary.schema.json b/dream-server/config/installer-sim-summary.schema.json new file mode 100644 index 000000000..c4b1c4dbc --- /dev/null +++ b/dream-server/config/installer-sim-summary.schema.json @@ -0,0 +1,57 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://dream-server.dev/schema/installer-sim-summary.v1.json", + "title": "Installer Simulation Summary v1", + "type": "object", + "required": ["version", "generated_at", "runs"], + "properties": { + "version": { "const": "1" }, + "generated_at": { "type": "string" }, + "runs": { + "type": "object", + "required": ["linux_dryrun", "macos_installer_mvp", 
"windows_scenario_preflight", "doctor_snapshot"], + "properties": { + "linux_dryrun": { + "type": "object", + "required": ["exit_code", "signals", "log"], + "properties": { + "exit_code": { "type": "integer" }, + "signals": { "type": "object" }, + "log": { "type": "string" } + }, + "additionalProperties": true + }, + "macos_installer_mvp": { + "type": "object", + "required": ["exit_code", "log"], + "properties": { + "exit_code": { "type": "integer" }, + "log": { "type": "string" }, + "preflight": { "type": ["object", "null"] }, + "doctor": { "type": ["object", "null"] } + }, + "additionalProperties": true + }, + "windows_scenario_preflight": { + "type": "object", + "required": ["report"], + "properties": { + "report": { "type": ["object", "null"] } + }, + "additionalProperties": true + }, + "doctor_snapshot": { + "type": "object", + "required": ["exit_code", "report"], + "properties": { + "exit_code": { "type": "integer" }, + "report": { "type": ["object", "null"] } + }, + "additionalProperties": true + } + }, + "additionalProperties": true + } + }, + "additionalProperties": true +} diff --git a/dream-server/config/litellm/cloud-config.yaml b/dream-server/config/litellm/cloud-config.yaml deleted file mode 100644 index eeefacd0e..000000000 --- a/dream-server/config/litellm/cloud-config.yaml +++ /dev/null @@ -1,55 +0,0 @@ -# LiteLLM Cloud Mode Configuration -# Full cloud model access - -model_list: - # Claude (Anthropic) - - model_name: claude-sonnet - litellm_params: - model: claude-sonnet-4-5 - api_key: os.environ/ANTHROPIC_API_KEY - model_info: - description: "Claude Sonnet 4.5 - Best for coding and analysis" - - - model_name: claude-opus - litellm_params: - model: claude-opus-4 - api_key: os.environ/ANTHROPIC_API_KEY - model_info: - description: "Claude Opus 4 - Most capable, best reasoning" - - # OpenAI - - model_name: gpt-4o - litellm_params: - model: gpt-4o - api_key: os.environ/OPENAI_API_KEY - model_info: - description: "GPT-4o - Fast and capable" - - - 
model_name: gpt-4-turbo - litellm_params: - model: gpt-4-turbo-preview - api_key: os.environ/OPENAI_API_KEY - model_info: - description: "GPT-4 Turbo - Latest GPT-4" - - # Together AI (open source models) - - model_name: llama-3.1-70b - litellm_params: - model: together_ai/meta-llama/Llama-3.1-70B-Instruct-Turbo - api_key: os.environ/TOGETHER_API_KEY - model_info: - description: "Llama 3.1 70B - Open source powerhouse" - - - model_name: qwen-72b - litellm_params: - model: together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo - api_key: os.environ/TOGETHER_API_KEY - model_info: - description: "Qwen 2.5 72B - Excellent for coding" - -litellm_settings: - drop_params: true - set_verbose: false - -general_settings: - master_key: os.environ/LITELLM_MASTER_KEY diff --git a/dream-server/config/litellm/cloud.yaml b/dream-server/config/litellm/cloud.yaml new file mode 100644 index 000000000..053386011 --- /dev/null +++ b/dream-server/config/litellm/cloud.yaml @@ -0,0 +1,25 @@ +model_list: + - model_name: default + litellm_params: + model: anthropic/claude-sonnet-4-5-20250514 + api_key: os.environ/ANTHROPIC_API_KEY + + - model_name: gpt4o + litellm_params: + model: openai/gpt-4o + api_key: os.environ/OPENAI_API_KEY + + - model_name: fast + litellm_params: + model: anthropic/claude-haiku-4-5-20251001 + api_key: os.environ/ANTHROPIC_API_KEY + +router_settings: + routing_strategy: simple-shuffle + +general_settings: + master_key: os.environ/LITELLM_MASTER_KEY + +litellm_settings: + drop_params: true + set_verbose: false diff --git a/dream-server/config/litellm/config.yaml b/dream-server/config/litellm/config.yaml deleted file mode 100644 index 54f535277..000000000 --- a/dream-server/config/litellm/config.yaml +++ /dev/null @@ -1,36 +0,0 @@ -# LiteLLM Configuration -# Use this when running multiple models or providers - -model_list: - # Local vLLM model - - model_name: local-qwen - litellm_params: - model: openai/Qwen/Qwen2.5-32B-Instruct-AWQ - api_base: http://vllm:8000/v1 - api_key: 
${VLLM_API_KEY:-} - model_info: - max_tokens: 8192 - - # Example: Add OpenAI for comparison - # - model_name: gpt-4o - # litellm_params: - # model: gpt-4o - # api_key: ${OPENAI_API_KEY} - - # Example: Add Claude - # - model_name: claude-sonnet - # litellm_params: - # model: claude-3-5-sonnet-20241022 - # api_key: ${ANTHROPIC_API_KEY} - -# General settings -litellm_settings: - drop_params: true - set_verbose: false - num_retries: 3 - -# Router settings (for load balancing multiple backends) -router_settings: - routing_strategy: simple-shuffle - model_group_alias: - default: local-qwen diff --git a/dream-server/config/litellm/hybrid-config.yaml b/dream-server/config/litellm/hybrid-config.yaml deleted file mode 100644 index d14d18a54..000000000 --- a/dream-server/config/litellm/hybrid-config.yaml +++ /dev/null @@ -1,49 +0,0 @@ -# LiteLLM Hybrid Config - Local Primary + Cloud Fallback -# Mission: M1 (Fully Local OpenClaw) → M5 (Clonable Dream Setup Server) - -model_list: - # Local model (primary) - - model_name: qwen2.5-32b-instruct-awq - litellm_params: - model: openai/qwen2.5-32b-instruct-awq - api_base: http://localhost:8000/v1 - api_key: dummy - tpm: 100000 - rpm: 1000 - - # Cloud fallback (when local fails) - - model_name: gpt-4o - litellm_params: - model: gpt-4o - api_key: ${CLOUD_API_KEY} - api_base: ${CLOUD_BASE_URL} - tpm: 1000000 - rpm: 10000 - - - model_name: claude-3-5-sonnet - litellm_params: - model: claude-3-5-sonnet-20241022 - api_key: ${CLOUD_API_KEY} - api_base: ${CLOUD_BASE_URL} - tpm: 1000000 - rpm: 10000 - -litellm_settings: - # Retry on failure (local → cloud fallback) - num_retries: 3 - request_timeout: 300 - - # Fallback configuration - fallback_models: - - gpt-4o - - claude-3-5-sonnet - - # Circuit breaker - circuit_breaker: - errors: 3 - timeout: 60 - -general_settings: - master_key: ${LITELLM_MASTER_KEY:?LITELLM_MASTER_KEY must be set} - logs_dir: ./logs - database_url: ./data/litellm.db diff --git 
a/dream-server/config/litellm/hybrid.yaml b/dream-server/config/litellm/hybrid.yaml new file mode 100644 index 000000000..d26cf91e8 --- /dev/null +++ b/dream-server/config/litellm/hybrid.yaml @@ -0,0 +1,25 @@ +model_list: + - model_name: default + litellm_params: + model: openai/default + api_base: http://llama-server:8080/v1 + api_key: not-needed + + - model_name: default + litellm_params: + model: anthropic/claude-sonnet-4-5-20250514 + api_key: os.environ/ANTHROPIC_API_KEY + +router_settings: + routing_strategy: simple-shuffle + num_retries: 2 + fallbacks: + - default: + - default + +general_settings: + master_key: os.environ/LITELLM_MASTER_KEY + +litellm_settings: + drop_params: true + set_verbose: false diff --git a/dream-server/config/litellm/local.yaml b/dream-server/config/litellm/local.yaml new file mode 100644 index 000000000..27a8c0212 --- /dev/null +++ b/dream-server/config/litellm/local.yaml @@ -0,0 +1,19 @@ +model_list: + - model_name: default + litellm_params: + model: openai/default + api_base: http://llama-server:8080/v1 + api_key: not-needed + + - model_name: "*" + litellm_params: + model: openai/* + api_base: http://llama-server:8080/v1 + api_key: not-needed + +general_settings: + master_key: os.environ/LITELLM_MASTER_KEY + +litellm_settings: + drop_params: true + set_verbose: false diff --git a/dream-server/config/litellm/offline-config.yaml b/dream-server/config/litellm/offline-config.yaml deleted file mode 100644 index aaad53548..000000000 --- a/dream-server/config/litellm/offline-config.yaml +++ /dev/null @@ -1,36 +0,0 @@ -# LiteLLM Offline Mode Configuration -# Local models only - no cloud access - -model_list: - # Local vLLM - - model_name: qwen-32b - litellm_params: - model: openai/Qwen/Qwen2.5-32B-Instruct-AWQ - api_base: http://vllm:8000/v1 - api_key: not-needed - model_info: - description: "Local Qwen 32B via vLLM" - - # Local Ollama (CPU fallback) - - model_name: qwen-cpu - litellm_params: - model: ollama/qwen2.5:32b - api_base: 
http://ollama:11434 - model_info: - description: "Local Qwen 32B via Ollama (CPU)" - - # Default route to vLLM - - model_name: default - litellm_params: - model: openai/Qwen/Qwen2.5-32B-Instruct-AWQ - api_base: http://vllm:8000/v1 - api_key: not-needed - model_info: - description: "Default to local vLLM" - -litellm_settings: - drop_params: true - set_verbose: false - -general_settings: - master_key: os.environ/LITELLM_MASTER_KEY diff --git a/dream-server/config/litellm/strix-halo-config.yaml b/dream-server/config/litellm/strix-halo-config.yaml new file mode 100644 index 000000000..27a8c0212 --- /dev/null +++ b/dream-server/config/litellm/strix-halo-config.yaml @@ -0,0 +1,19 @@ +model_list: + - model_name: default + litellm_params: + model: openai/default + api_base: http://llama-server:8080/v1 + api_key: not-needed + + - model_name: "*" + litellm_params: + model: openai/* + api_base: http://llama-server:8080/v1 + api_key: not-needed + +general_settings: + master_key: os.environ/LITELLM_MASTER_KEY + +litellm_settings: + drop_params: true + set_verbose: false diff --git a/dream-server/config/livekit/Dockerfile b/dream-server/config/livekit/Dockerfile deleted file mode 100644 index 530f762e6..000000000 --- a/dream-server/config/livekit/Dockerfile +++ /dev/null @@ -1,19 +0,0 @@ -# LiveKit Server with Environment Variable Support -# Adds envsubst for runtime config generation - -FROM livekit/livekit-server:v1.9.11 - -# Install envsubst (from gettext) - livekit base image is Alpine -USER root -RUN apk add --no-cache gettext - -# Copy entrypoint script -COPY livekit-entrypoint.sh /usr/local/bin/ -RUN chmod +x /usr/local/bin/livekit-entrypoint.sh - -# Use non-root user -USER 1000:1000 - -# Set entrypoint -ENTRYPOINT ["/usr/local/bin/livekit-entrypoint.sh"] -CMD ["--config", "/tmp/livekit.yaml"] diff --git a/dream-server/config/livekit/livekit-entrypoint.sh b/dream-server/config/livekit/livekit-entrypoint.sh deleted file mode 100755 index 2e10a8cf2..000000000 --- 
a/dream-server/config/livekit/livekit-entrypoint.sh +++ /dev/null @@ -1,34 +0,0 @@ -#!/bin/sh -# livekit-entrypoint.sh -# Substitutes environment variables in LiveKit config and starts server - -set -e - -CONFIG_TEMPLATE="/etc/livekit.yaml.template" -CONFIG_OUTPUT="/tmp/livekit.yaml" - -# Check if template exists -if [ -f "$CONFIG_TEMPLATE" ]; then - echo "Generating LiveKit config from template..." - - # Check required env vars - if [ -z "${LIVEKIT_API_KEY:-}" ]; then - echo "ERROR: LIVEKIT_API_KEY environment variable is required" - exit 1 - fi - - if [ -z "${LIVEKIT_API_SECRET:-}" ]; then - echo "ERROR: LIVEKIT_API_SECRET environment variable is required" - exit 1 - fi - - # Substitute environment variables - envsubst < "$CONFIG_TEMPLATE" > "$CONFIG_OUTPUT" - echo "LiveKit config generated successfully" -else - echo "ERROR: Config template not found at $CONFIG_TEMPLATE" - exit 1 -fi - -# Execute the original LiveKit server command -exec /livekit-server "$@" diff --git a/dream-server/config/livekit/livekit.yaml b/dream-server/config/livekit/livekit.yaml deleted file mode 100644 index 401e8498f..000000000 --- a/dream-server/config/livekit/livekit.yaml +++ /dev/null @@ -1,17 +0,0 @@ -port: 7880 -rtc: - port_range_start: 50000 - port_range_end: 60000 - use_external_ip: true - # node_ip removed - let LiveKit auto-detect - -keys: - ${LIVEKIT_API_KEY}: ${LIVEKIT_API_SECRET} - -logging: - level: info - json: false - -room: - empty_timeout: 300 - max_participants: 10 diff --git a/dream-server/config/livekit/offline-livekit.yaml b/dream-server/config/livekit/offline-livekit.yaml deleted file mode 100644 index ea5e03b93..000000000 --- a/dream-server/config/livekit/offline-livekit.yaml +++ /dev/null @@ -1,112 +0,0 @@ -# LiveKit Offline Configuration -# Local-only WebRTC setup for Dream Server zero-cloud mode -# M1 Phase 2 - No external dependencies - -port: 7880 - -# RTC Configuration - Local network only -rtc: - # Port range for WebRTC (ensure these are open on firewall) - 
port_range_start: 50000 - port_range_end: 60000 - - # OFFLINE MODE: Force local network usage - use_external_ip: false - - # Use container hostname for local networking - node_ip: "0.0.0.0" - - # UDP configuration for local network - udp_port: 7882 - - # STUN/TURN servers - DISABLED for offline mode - # stun_servers: [] - # turn_servers: [] - -# Authentication keys - populated from environment variables -keys: - ${LIVEKIT_API_KEY}: ${LIVEKIT_API_SECRET} - # OFFLINE MODE: No webhook validation needed - # webhooks: [] - -# Logging configuration -logging: - level: info - json: false - # OFFLINE MODE: Log to stdout only - sample: 100 - -# Room configuration -room: - # Timeout for empty rooms (5 minutes) - empty_timeout: 300 - - # Max participants per room - max_participants: 10 - - # OFFLINE MODE: Disable external integrations - # webhooks: [] - - # Enable recording (local storage only) - enabled_codecs: - - mime: audio/opus - - mime: video/VP8 - - mime: video/VP9 - - mime: video/H264 - -# Node configuration -node_selector: - # OFFLINE MODE: Single node setup - kind: any - -# Signal relay configuration -signal_relay: - # OFFLINE MODE: Disabled for local deployment - enabled: false - -# Limits and security -limits: - # Max bitrate per participant (1.5 Mbps) - max_bitrate: 1500000 - - # Max packet size - max_packet_size: 1200 - - # OFFLINE MODE: No rate limiting for local use - # rate_limit: 100 - -# Development settings -debug: - # Enable detailed logging for troubleshooting - pprof: false - -# Prometheus metrics (optional) -prometheus: - # OFFLINE MODE: Disable metrics export - port: 0 - -# Key provider configuration -key_provider: - # Use static keys from environment variables - kind: static - static: - keys: - ${LIVEKIT_API_KEY}: ${LIVEKIT_API_SECRET} - -# Region configuration - single region for offline -region: - # Local deployment - current: "local" - regions: - - local - -# TURN configuration - DISABLED for offline mode -turn: - enabled: false - # No external 
TURN servers - -# Webhooks - DISABLED for offline mode -webhook: - # No external webhooks - urls: [] - api_key: "" \ No newline at end of file diff --git a/dream-server/config/llama-server/models.ini b/dream-server/config/llama-server/models.ini new file mode 100644 index 000000000..1b4879f0b --- /dev/null +++ b/dream-server/config/llama-server/models.ini @@ -0,0 +1,4 @@ +[qwen3-8b] +filename = Qwen3-8B-Q4_K_M.gguf +load-on-startup = true +n-ctx = 32768 diff --git a/dream-server/config/openclaw/entry.json b/dream-server/config/openclaw/entry.json deleted file mode 100644 index 0ad727623..000000000 --- a/dream-server/config/openclaw/entry.json +++ /dev/null @@ -1,44 +0,0 @@ -{ - "$schema": "https://raw.githubusercontent.com/openclaw/openclaw/main/schemas/openclaw.json", - "version": "1.0", - "agent": { - "name": "Dream Agent", - "model": "local-vllm/Qwen/Qwen2.5-1.5B-Instruct", - "systemPrompt": "You are Dream Agent, a local AI assistant running on this machine's GPU. You cost nothing per token โ€” no API keys, no cloud, no data leaving this network. Be helpful, accurate, and respect privacy. You have access to tools for reading files, writing files, and running commands. Use them proactively โ€” don't give the user homework you can do yourself." 
- }, - "providers": { - "local-vllm": { - "type": "openai-compatible", - "baseUrl": "http://vllm-tool-proxy:8003/v1", - "apiKey": "none", - "models": { - "Qwen/Qwen2.5-1.5B-Instruct": { - "contextWindow": 8192, - "supportsTools": true - } - } - } - }, - "subagent": { - "enabled": true, - "model": "local-vllm/Qwen/Qwen2.5-1.5B-Instruct", - "maxConcurrent": 8, - "timeoutSeconds": 240 - }, - "tools": { - "exec": { - "enabled": true, - "allowedCommands": ["ls", "cat", "grep", "find", "head", "tail", "wc"] - }, - "read": { "enabled": true }, - "write": { "enabled": true }, - "web_fetch": { "enabled": true } - }, - "gateway": { - "port": 7860, - "host": "0.0.0.0", - "auth": { - "mode": "none" - } - } -} diff --git a/dream-server/config/openclaw/inject-token.js b/dream-server/config/openclaw/inject-token.js index 62749db40..d8cd8223e 100644 --- a/dream-server/config/openclaw/inject-token.js +++ b/dream-server/config/openclaw/inject-token.js @@ -1,40 +1,135 @@ // Inject gateway auth token into Control UI so it auto-connects // Runs at container startup before the gateway starts // +// Three tasks: +// 1. Patch the runtime config (origins, flags, auth, model names) +// 2. Inject auto-token.js into the Control UI HTML (CSP-compliant) +// 3. Fix model references to match what llama-server actually serves +// // IMPORTANT: The gateway sets Content-Security-Policy: script-src 'self' // which blocks inline scripts. So we must create an EXTERNAL .js file // and reference it via '); - fs.writeFileSync(htmlPath, html); - - console.log('[inject-token] Created auto-token.js and injected '); + fs.writeFileSync(HTML_PATH, html); + + console.log('[inject-token] created auto-token.js and injected + + + + +
+
+

Token Spy - API Monitor

+
Real-time token usage, cost tracking & session control
+
+
+ + + +
+
+ +
+ + + +
+ +
+
+

Cost Per Turn (Session Timeline)

+ +
+
+

History Growth (chars)

+ +
+
+ +
+
+

Token Usage Over Time

+ +
+
+

Cost Breakdown by Type

+ +
+
+ +
+

Cumulative Cost Over Time

+ +
+ +
+

Recent Turns

+ + + + + + + + + + + + + + + + + +
Time | Agent | Model | Input Tok | Output Tok | Cache Read | Cache Write | Sys Prompt | History | Cost | Duration
+
+ + + + +""" + + +@app.get("/dashboard", response_class=HTMLResponse) +def dashboard(): + return DASHBOARD_HTML + + +# ── SSE Token Events Stream ───────────────────────────────────────────────── + +@app.get("/token_events") +async def token_events(request: Request): + """Stream token usage events as Server-Sent Events.""" + async def event_stream(): + last_id = None + while True: + try: + # Query recent events + events = query_recent_events(limit=50, after_id=last_id) + + for event in events: + # Format event as SSE + event_data = { + "type": "token_usage", + "session_id": event.get("session_id", ""), + "model": event.get("model", ""), + "provider": event.get("provider", ""), + "input_tokens": event.get("input_tokens", 0), + "output_tokens": event.get("output_tokens", 0), + "total_tokens": event.get("total_tokens", 0), + "cost_usd": float(event.get("cost_usd", 0) or 0), + "timestamp": event.get("timestamp", ""), + "agent_name": event.get("agent_name", AGENT_NAME) + } + + yield f"data: {json.dumps(event_data)}\n\n" + last_id = event.get("id") + + # Heartbeat to keep connection alive + yield ":heartbeat\n\n" + + # Wait before next poll + await asyncio.sleep(2) + + except Exception as e: + log.error(f"SSE stream error: {e}") + yield f"event: error\ndata: {json.dumps({'error': str(e)})}\n\n" + await asyncio.sleep(5) + + return StreamingResponse( + event_stream(), + media_type="text/event-stream", + headers={ + "Cache-Control": "no-cache", + "Connection": "keep-alive", + "X-Accel-Buffering": "no", + }, + ) + + +# ── Catch-all for other endpoints ─────────────────────────────────────────── + +@app.api_route("/{path:path}", methods=["GET", "POST", "PUT", "DELETE", "PATCH"]) +async def proxy_other(request: Request, path: str): + """Forward any other requests to upstream 
transparently.""" + # Use the correct upstream client based on provider + if API_PROVIDER in ("openai", "moonshot"): + client = get_moonshot_client() + else: + client = get_http_client() + headers = {} + for key in ("x-api-key", "anthropic-version", "content-type", "anthropic-beta", + "authorization", "accept", "user-agent"): + val = request.headers.get(key) + if val: + headers[key] = val + + # Inject environment API key if not provided in request + if UPSTREAM_API_KEY and "x-api-key" not in headers and "authorization" not in headers: + if API_PROVIDER == "anthropic": + headers["x-api-key"] = UPSTREAM_API_KEY + else: + headers["authorization"] = f"Bearer {UPSTREAM_API_KEY}" + + body = await request.body() + try: + resp = await client.request( + method=request.method, + url=f"/{path}", + content=body if body else None, + headers=headers, + ) + return Response( + content=resp.content, + status_code=resp.status_code, + headers=dict(resp.headers), + ) + except Exception as e: + return JSONResponse(status_code=502, content={"error": str(e)}) diff --git a/dream-server/extensions/services/token-spy/manifest.yaml b/dream-server/extensions/services/token-spy/manifest.yaml new file mode 100644 index 000000000..f267d8831 --- /dev/null +++ b/dream-server/extensions/services/token-spy/manifest.yaml @@ -0,0 +1,16 @@ +schema_version: dream.services.v1 + +service: + id: token-spy + name: Token Spy (Usage Monitor) + aliases: [] + container_name: dream-token-spy + default_host: token-spy + port: 8080 + external_port_default: 3005 + health: /health + type: docker + gpu_backends: [amd, nvidia] + compose_file: compose.yaml + category: recommended + depends_on: [] diff --git a/dream-server/extensions/services/token-spy/providers/__init__.py b/dream-server/extensions/services/token-spy/providers/__init__.py new file mode 100644 index 000000000..7119eb4cb --- /dev/null +++ b/dream-server/extensions/services/token-spy/providers/__init__.py @@ -0,0 +1,17 @@ +"""Token Spy Provider Plugin 
System. + +Enables pluggable LLM provider support with unified cost tracking and metrics capture. +""" + +from .base import LLMProvider +from .registry import ProviderRegistry, register_provider +from .anthropic import AnthropicProvider +from .openai import OpenAICompatibleProvider + +__all__ = [ + "LLMProvider", + "ProviderRegistry", + "register_provider", + "AnthropicProvider", + "OpenAICompatibleProvider", +] diff --git a/dream-server/extensions/services/token-spy/providers/anthropic.py b/dream-server/extensions/services/token-spy/providers/anthropic.py new file mode 100644 index 000000000..bb0292c8f --- /dev/null +++ b/dream-server/extensions/services/token-spy/providers/anthropic.py @@ -0,0 +1,272 @@ +"""Anthropic Provider - Claude Messages API support. + +Handles Anthropic-specific request/response formats including: +- System prompt with cache_control blocks +- Workspace file breakdown (AGENTS.md, SOUL.md, etc.) +- SSE streaming with event types (message_start, message_delta, message_stop) +""" + +import json +import re +from typing import Any, Dict, List, Optional + +from .base import LLMProvider +from .registry import register_provider + + +@register_provider("anthropic") +class AnthropicProvider(LLMProvider): + """Anthropic Messages API provider (Claude models).""" + + # Pricing per 1M tokens: {input, output, cache_read, cache_write} + COST_TABLE = { + "claude-opus-4-6": {"input": 5.0, "output": 25.0, "cache_read": 0.50, "cache_write": 6.25}, + "claude-opus-4-5": {"input": 5.0, "output": 25.0, "cache_read": 0.50, "cache_write": 6.25}, + "claude-opus-4-1": {"input": 15.0, "output": 75.0, "cache_read": 1.50, "cache_write": 18.75}, + "claude-opus-4": {"input": 15.0, "output": 75.0, "cache_read": 1.50, "cache_write": 18.75}, + "claude-sonnet-4": {"input": 3.0, "output": 15.0, "cache_read": 0.30, "cache_write": 3.75}, + "claude-haiku-4-5": {"input": 1.0, "output": 5.0, "cache_read": 0.10, "cache_write": 1.25}, + "claude-haiku-3-5": {"input": 0.80, "output": 
4.0, "cache_read": 0.08, "cache_write": 1.0}, + "claude-haiku": {"input": 0.80, "output": 4.0, "cache_read": 0.08, "cache_write": 1.0}, + } + + # Map workspace file markers to metric keys + WORKSPACE_FILE_MAP = { + "AGENTS.md": "workspace_agents_chars", + "SOUL.md": "workspace_soul_chars", + "TOOLS.md": "workspace_tools_chars", + "IDENTITY.md": "workspace_identity_chars", + "USER.md": "workspace_user_chars", + "HEARTBEAT.md": "workspace_heartbeat_chars", + "BOOTSTRAP.md": "workspace_bootstrap_chars", + "MEMORY.md": "workspace_memory_chars", + } + + @property + def name(self) -> str: + return "anthropic" + + @property + def default_base_url(self) -> str: + return "https://api.anthropic.com" + + @property + def api_endpoint(self) -> str: + return "/v1/messages" + + def get_model_pricing(self, model: str) -> Dict[str, float]: + """Match model name to pricing table.""" + model_lower = model.lower() + + # Try exact prefix matches (longer prefixes first for specificity) + for prefix in sorted(self.COST_TABLE.keys(), key=len, reverse=True): + if prefix in model_lower: + return self.COST_TABLE[prefix] + + # Default to zero if unknown model + return {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0} + + def analyze_request(self, body: Dict[str, Any]) -> Dict[str, Any]: + """Analyze Anthropic request for metrics. 
+ + Extracts: + - System prompt breakdown with workspace file detection + - Message counts and conversation history size + - Tool count + """ + result = { + "system_prompt_total_chars": 0, + "base_prompt_chars": 0, + "message_count": 0, + "user_message_count": 0, + "assistant_message_count": 0, + "conversation_history_chars": 0, + "tool_count": 0, + } + + # Initialize workspace file metrics + for key in self.WORKSPACE_FILE_MAP.values(): + result[key] = 0 + + # Analyze system prompt + system = body.get("system", []) + if system: + sys_analysis = self._analyze_system_prompt(system) + result.update(sys_analysis) + + # Analyze messages + messages = body.get("messages", []) + msg_analysis = self._analyze_messages(messages) + result.update(msg_analysis) + + # Tool count + result["tool_count"] = len(body.get("tools", [])) + + return result + + def _analyze_system_prompt(self, system: Any) -> Dict[str, Any]: + """Parse system prompt structure for workspace file breakdown. + + Anthropic system prompt can be: + - A string (simple) + - A list of blocks with text and cache_control + """ + result = { + "system_prompt_total_chars": 0, + "base_prompt_chars": 0, + } + for key in self.WORKSPACE_FILE_MAP.values(): + result[key] = 0 + + # Convert string to block format + if isinstance(system, str): + blocks = [{"type": "text", "text": system}] + elif isinstance(system, list): + blocks = system + else: + return result + + total_chars = 0 + base_chars = 0 + workspace_chars = {k: 0 for k in self.WORKSPACE_FILE_MAP.values()} + + for block in blocks: + if not isinstance(block, dict): + continue + + text = block.get("text", "") + if not isinstance(text, str): + text = str(text) + + block_len = len(text) + total_chars += block_len + + # Check for workspace file markers + matched_workspace = False + for filename, metric_key in self.WORKSPACE_FILE_MAP.items(): + # Look for ## FILENAME patterns + if f"## {filename}" in text or f"# {filename}" in text: + workspace_chars[metric_key] += block_len 
+ matched_workspace = True + break + + if not matched_workspace: + base_chars += block_len + + result["system_prompt_total_chars"] = total_chars + result["base_prompt_chars"] = base_chars + for key, chars in workspace_chars.items(): + result[key] = chars + + return result + + def _analyze_messages(self, messages: List[Dict[str, Any]]) -> Dict[str, Any]: + """Analyze message array for counts and sizes.""" + user_count = 0 + assistant_count = 0 + + for msg in messages: + role = msg.get("role", "") + if role == "user": + user_count += 1 + elif role == "assistant": + assistant_count += 1 + + # Serialize messages for history char count + try: + history_str = json.dumps(messages, separators=(",", ":")) + history_chars = len(history_str) + except (TypeError, ValueError): + history_chars = 0 + + return { + "message_count": len(messages), + "user_message_count": user_count, + "assistant_message_count": assistant_count, + "conversation_history_chars": history_chars, + } + + def rewrite_request(self, body: Dict[str, Any]) -> Dict[str, Any]: + """Anthropic is the reference format - no rewriting needed.""" + return body + + def extract_usage_from_response(self, response: Dict[str, Any]) -> Dict[str, Any]: + """Extract usage from non-streaming response.""" + usage = response.get("usage", {}) + return { + "input_tokens": usage.get("input_tokens", 0), + "output_tokens": usage.get("output_tokens", 0), + "cache_read_tokens": usage.get("cache_read_input_tokens", 0), + "cache_write_tokens": usage.get("cache_creation_input_tokens", 0), + "stop_reason": response.get("stop_reason"), + } + + def extract_usage_from_stream( + self, line: str, event_type: Optional[str] = None + ) -> Optional[Dict[str, Any]]: + """Extract usage from Anthropic SSE stream. 
+ + Anthropic uses event types: + - message_start: Contains input tokens, cache stats + - message_delta: Contains output tokens, stop_reason + - message_stop: End of stream (no usage) + """ + stripped = line.strip() + + # Only process data lines + if not stripped.startswith("data:"): + return None + + data_str = stripped[5:].strip() + if data_str == "[DONE]": + return None + + try: + data = json.loads(data_str) + except json.JSONDecodeError: + return None + + result = {} + + if event_type == "message_start": + # Initial message with input usage + msg_usage = data.get("message", {}).get("usage", {}) + if msg_usage: + result["input_tokens"] = msg_usage.get("input_tokens", 0) + result["cache_read_tokens"] = msg_usage.get("cache_read_input_tokens", 0) + result["cache_write_tokens"] = msg_usage.get("cache_creation_input_tokens", 0) + + elif event_type == "message_delta": + # Delta with output tokens and/or stop reason + delta_usage = data.get("usage", {}) + delta = data.get("delta", {}) + + if delta_usage.get("output_tokens") is not None: + result["output_tokens"] = delta_usage["output_tokens"] + + if delta.get("stop_reason"): + result["stop_reason"] = delta["stop_reason"] + + return result if result else None + + def get_auth_headers(self, request_headers: Dict[str, str]) -> Dict[str, str]: + """Extract Anthropic-specific headers to forward.""" + headers = {} + + # Required auth header + for key in ("x-api-key",): + val = request_headers.get(key.lower()) + if val: + headers[key] = val + + # Optional Anthropic headers + for key in ( + "anthropic-version", + "anthropic-beta", + "anthropic-dangerous-direct-browser-access", + ): + val = request_headers.get(key.lower()) + if val: + headers[key] = val + + return headers diff --git a/dream-server/extensions/services/token-spy/providers/base.py b/dream-server/extensions/services/token-spy/providers/base.py new file mode 100644 index 000000000..07ec712c3 --- /dev/null +++ 
b/dream-server/extensions/services/token-spy/providers/base.py @@ -0,0 +1,171 @@ +"""Abstract Base Class for LLM Providers. + +All provider implementations must inherit from LLMProvider and implement +the required abstract methods for request/response handling and cost calculation. +""" + +from abc import ABC, abstractmethod +from typing import Any, Dict, Optional +import httpx + + +class LLMProvider(ABC): + """Abstract base for LLM API providers. + + Providers handle the provider-specific logic for: + - Request analysis (extracting metrics from incoming requests) + - Request rewriting (transforming requests for provider compatibility) + - Response parsing (extracting usage from responses) + - Stream parsing (extracting usage from SSE streams) + - Cost calculation (pricing per model) + """ + + def __init__(self, config: Optional[Dict[str, Any]] = None): + """Initialize provider with optional configuration. + + Args: + config: Provider-specific configuration (base_url overrides, etc.) + """ + self.config = config or {} + self._client: Optional[httpx.AsyncClient] = None + + @property + @abstractmethod + def name(self) -> str: + """Provider identifier (anthropic, openai, google, etc.)""" + pass + + @property + @abstractmethod + def default_base_url(self) -> str: + """Default API base URL for this provider.""" + pass + + @property + def base_url(self) -> str: + """API base URL, allowing config override.""" + return self.config.get("base_url", self.default_base_url) + + @property + @abstractmethod + def api_endpoint(self) -> str: + """Primary API endpoint path (e.g., /v1/messages or /v1/chat/completions).""" + pass + + @abstractmethod + def get_model_pricing(self, model: str) -> Dict[str, float]: + """Return pricing per 1M tokens for a model. + + Returns: + Dict with keys: input, output, cache_read, cache_write + Values are USD per 1M tokens, 0.0 if unknown. 
+ """ + pass + + @abstractmethod + def analyze_request(self, body: Dict[str, Any]) -> Dict[str, Any]: + """Extract metrics from request body. + + Returns dict with: + - system_prompt_total_chars: Total system prompt size + - base_prompt_chars: Base (static) prompt size + - workspace_*_chars: Optional breakdown by workspace file + - message_count: Total messages + - user_message_count: User messages + - assistant_message_count: Assistant messages + - conversation_history_chars: Total serialized message chars + - tool_count: Number of tools defined + """ + pass + + @abstractmethod + def rewrite_request(self, body: Dict[str, Any]) -> Dict[str, Any]: + """Rewrite request for provider compatibility. + + E.g., convert 'developer' role to 'system' for Moonshot. + Returns the potentially modified body (may modify in place). + """ + pass + + @abstractmethod + def extract_usage_from_response(self, response: Dict[str, Any]) -> Dict[str, Any]: + """Extract token usage from non-streaming response. + + Returns dict with: + - input_tokens: Input/prompt tokens + - output_tokens: Output/completion tokens + - cache_read_tokens: Tokens read from cache (0 if not supported) + - cache_write_tokens: Tokens written to cache (0 if not supported) + - stop_reason: Why generation stopped (optional) + """ + pass + + @abstractmethod + def extract_usage_from_stream( + self, line: str, event_type: Optional[str] = None + ) -> Optional[Dict[str, Any]]: + """Extract usage from a single SSE stream line. + + Args: + line: Raw SSE line (may include "data:" prefix) + event_type: For Anthropic-style SSE, the current event type + + Returns: + Partial usage dict if this line contains usage info, None otherwise. + Can return partial updates that get merged with existing usage. + """ + pass + + def get_auth_headers(self, request_headers: Dict[str, str]) -> Dict[str, str]: + """Extract and return authentication headers to forward. + + Override in subclasses for provider-specific auth handling. 
+ Default implementation returns empty dict (no auth forwarding). + + Args: + request_headers: Incoming request headers (lowercase keys) + + Returns: + Headers to include in upstream request + """ + return {} + + def get_http_client(self) -> httpx.AsyncClient: + """Get or create HTTP client with provider-specific config. + + Creates a new client if none exists or the existing one is closed. + """ + if self._client is None or self._client.is_closed: + self._client = httpx.AsyncClient( + base_url=self.base_url, + timeout=httpx.Timeout(connect=10.0, read=300.0, write=30.0, pool=30.0), + limits=httpx.Limits(max_connections=20, max_keepalive_connections=10), + ) + return self._client + + async def close(self): + """Close the HTTP client if open.""" + if self._client and not self._client.is_closed: + await self._client.aclose() + self._client = None + + def calculate_cost(self, usage: Dict[str, Any], model: str) -> float: + """Calculate cost in USD from usage and model. + + Args: + usage: Dict with *_tokens keys + model: Model name for pricing lookup + + Returns: + Estimated cost in USD + """ + rates = self.get_model_pricing(model) + return ( + usage.get("input_tokens", 0) * rates.get("input", 0) / 1_000_000 + + usage.get("output_tokens", 0) * rates.get("output", 0) / 1_000_000 + + usage.get("cache_read_tokens", 0) * rates.get("cache_read", 0) / 1_000_000 + + usage.get("cache_write_tokens", 0) * rates.get("cache_write", 0) / 1_000_000 + ) + + def __repr__(self) -> str: + return f"<{self.__class__.__name__} name={self.name} base_url={self.base_url}>" diff --git a/dream-server/extensions/services/token-spy/providers/openai.py b/dream-server/extensions/services/token-spy/providers/openai.py new file mode 100644 index 000000000..4ef1fa16f --- /dev/null +++ b/dream-server/extensions/services/token-spy/providers/openai.py @@ -0,0 +1,267 @@ +"""OpenAI-Compatible Provider - OpenAI, Moonshot, vLLM, and other compatible APIs. 
+ +Handles OpenAI-style request/response formats including: +- Chat completions API +- Developer/system role rewriting +- Standard SSE streaming format +""" + +import json +from typing import Any, Dict, List, Optional + +from .base import LLMProvider +from .registry import register_provider + + +@register_provider("openai") +class OpenAICompatibleProvider(LLMProvider): + """OpenAI-compatible API provider. + + Supports: + - OpenAI native API + - Moonshot/Kimi API + - Local vLLM/Ollama with OpenAI-compatible endpoints + - Any other OpenAI-compatible service + """ + + # Pricing per 1M tokens: {input, output, cache_read, cache_write} + # cache_read/write are 0 for providers that don't support caching + COST_TABLE = { + # Moonshot Kimi models + "kimi-k2-0711": {"input": 0.60, "output": 3.0, "cache_read": 0.10, "cache_write": 0.60}, + "kimi-k2-0905": {"input": 0.60, "output": 2.50, "cache_read": 0.15, "cache_write": 0.60}, + "kimi-k2-thinking": {"input": 0.60, "output": 2.50, "cache_read": 0.15, "cache_write": 0.60}, + "kimi-k2.5": {"input": 0.60, "output": 2.50, "cache_read": 0.15, "cache_write": 0.60}, + "kimi-k2": {"input": 0.60, "output": 2.50, "cache_read": 0.15, "cache_write": 0.60}, + # OpenAI models + "gpt-4o": {"input": 2.50, "output": 10.0, "cache_read": 1.25, "cache_write": 0.0}, + "gpt-4o-mini": {"input": 0.15, "output": 0.60, "cache_read": 0.075, "cache_write": 0.0}, + "gpt-4-turbo": {"input": 10.0, "output": 30.0, "cache_read": 0.0, "cache_write": 0.0}, + "gpt-4": {"input": 30.0, "output": 60.0, "cache_read": 0.0, "cache_write": 0.0}, + "gpt-3.5-turbo": {"input": 0.50, "output": 1.50, "cache_read": 0.0, "cache_write": 0.0}, + "o1": {"input": 15.0, "output": 60.0, "cache_read": 7.50, "cache_write": 0.0}, + "o1-mini": {"input": 3.0, "output": 12.0, "cache_read": 1.50, "cache_write": 0.0}, + "o1-pro": {"input": 150.0, "output": 600.0, "cache_read": 0.0, "cache_write": 0.0}, + # DeepSeek models (OpenAI-compatible) + "deepseek-chat": {"input": 0.27, "output": 
1.10, "cache_read": 0.07, "cache_write": 0.27}, + "deepseek-reasoner": {"input": 0.55, "output": 2.19, "cache_read": 0.14, "cache_write": 0.55}, + # Local models (free) - Strix Halo llama-server models + "qwen3-coder-next": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + "qwen3:30b-a3b": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + "qwen3-8b": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + "qwen3-14b": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + "qwen3-30b-a3b": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + "qwen3.5:27b": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + "qwen": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + "llama": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + "mistral": {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0}, + } + + @property + def name(self) -> str: + return "openai" + + @property + def default_base_url(self) -> str: + return "https://api.openai.com" + + @property + def api_endpoint(self) -> str: + return "/v1/chat/completions" + + def get_model_pricing(self, model: str) -> Dict[str, float]: + """Match model name to pricing table.""" + model_lower = model.lower() + + # Substring match against table keys (longest keys first for specificity) + for prefix in sorted(self.COST_TABLE.keys(), key=len, reverse=True): + if prefix in model_lower: + return self.COST_TABLE[prefix] + + # Default to zero for unknown models (likely local) + return {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0} + + def analyze_request(self, body: Dict[str, Any]) -> Dict[str, Any]: + """Analyze OpenAI-format request for metrics.""" + messages = body.get("messages", []) + + user_count = 0 + assistant_count = 0 + system_chars = 0 + + for msg in messages: + role = msg.get("role", "") + content = msg.get("content", "") + + if role == 
"user": + user_count += 1 + elif role == "assistant": + assistant_count += 1 + elif role in ("system", "developer"): + # System prompt - could be string or array + if isinstance(content, str): + system_chars += len(content) + elif isinstance(content, list): + # Array of content blocks + for block in content: + if isinstance(block, dict): + text = block.get("text", "") + if isinstance(text, str): + system_chars += len(text) + elif isinstance(block, str): + system_chars += len(block) + else: + system_chars += len(json.dumps(content, separators=(",", ":"))) + + # Serialize messages for history char count + try: + history_str = json.dumps(messages, separators=(",", ":")) + history_chars = len(history_str) + except (TypeError, ValueError): + history_chars = 0 + + return { + "system_prompt_total_chars": system_chars, + "base_prompt_chars": system_chars, # No workspace breakdown for OpenAI + "message_count": len(messages), + "user_message_count": user_count, + "assistant_message_count": assistant_count, + "conversation_history_chars": history_chars, + "tool_count": len(body.get("tools", body.get("functions", []))), + } + + def rewrite_request(self, body: Dict[str, Any]) -> Dict[str, Any]: + """Rewrite request for OpenAI compatibility. + + Main transformation: convert 'developer' role to 'system' for + providers that don't support the developer role (e.g., Moonshot). 
+ """ + messages = body.get("messages", []) + rewritten = False + + for msg in messages: + if msg.get("role") == "developer": + msg["role"] = "system" + rewritten = True + + if rewritten: + body["messages"] = messages + + return body + + def extract_usage_from_response(self, response: Dict[str, Any]) -> Dict[str, Any]: + """Extract usage from non-streaming response.""" + usage = response.get("usage", {}) + + # Get stop reason from choices + choices = response.get("choices", []) + stop_reason = None + if choices: + stop_reason = choices[0].get("finish_reason") + + return { + "input_tokens": usage.get("prompt_tokens", 0), + "output_tokens": usage.get("completion_tokens", 0), + "cache_read_tokens": usage.get("prompt_tokens_details", {}).get("cached_tokens", 0), + "cache_write_tokens": 0, # OpenAI doesn't expose cache write stats + "stop_reason": stop_reason, + } + + def extract_usage_from_stream( + self, line: str, event_type: Optional[str] = None + ) -> Optional[Dict[str, Any]]: + """Extract usage from OpenAI SSE stream. 
+ + OpenAI streaming: + - Usage comes in the final chunk with empty choices + - Stop reason comes in the last content chunk + """ + stripped = line.strip() + + # Only process data lines + if not stripped.startswith("data:"): + return None + + data_str = stripped[5:].strip() + if data_str == "[DONE]": + return None + + try: + data = json.loads(data_str) + except json.JSONDecodeError: + return None + + result = {} + + # Check for usage in final chunk + usage = data.get("usage", {}) + if usage: + result["input_tokens"] = usage.get("prompt_tokens", 0) + result["output_tokens"] = usage.get("completion_tokens", 0) + + # OpenAI may include cache stats in prompt_tokens_details + details = usage.get("prompt_tokens_details", {}) + if details: + result["cache_read_tokens"] = details.get("cached_tokens", 0) + + # Check for stop reason in choices + choices = data.get("choices", []) + if choices: + finish_reason = choices[0].get("finish_reason") + if finish_reason: + result["stop_reason"] = finish_reason + + return result if result else None + + def get_auth_headers(self, request_headers: Dict[str, str]) -> Dict[str, str]: + """Extract Authorization header for OpenAI-compatible APIs.""" + headers = {} + + auth = request_headers.get("authorization") + if auth: + headers["Authorization"] = auth + + # Some providers use x-api-key instead + api_key = request_headers.get("x-api-key") + if api_key: + headers["x-api-key"] = api_key + + return headers + + +# Convenience alias for Moonshot-specific usage +@register_provider("moonshot") +class MoonshotProvider(OpenAICompatibleProvider): + """Moonshot/Kimi API provider. + + Moonshot is OpenAI-compatible with some quirks handled here. 
+ """ + + @property + def name(self) -> str: + return "moonshot" + + @property + def default_base_url(self) -> str: + return "https://api.moonshot.ai" + + +# Local vLLM provider (no cost tracking) +@register_provider("local") +class LocalProvider(OpenAICompatibleProvider): + """Local inference provider (vLLM, Ollama, etc.). + + Same as OpenAI-compatible but defaults to localhost and zero costs. + """ + + @property + def name(self) -> str: + return "local" + + @property + def default_base_url(self) -> str: + return self.config.get("base_url", "http://localhost:8000") + + def get_model_pricing(self, model: str) -> Dict[str, float]: + """Local models are free (electricity cost not tracked).""" + return {"input": 0.0, "output": 0.0, "cache_read": 0.0, "cache_write": 0.0} diff --git a/dream-server/extensions/services/token-spy/providers/registry.py b/dream-server/extensions/services/token-spy/providers/registry.py new file mode 100644 index 000000000..dac5e420d --- /dev/null +++ b/dream-server/extensions/services/token-spy/providers/registry.py @@ -0,0 +1,111 @@ +"""Provider Registry - Central registration and lookup for LLM providers.""" + +from typing import Any, Dict, List, Optional, Type + +from .base import LLMProvider + + +class ProviderRegistry: + """Registry of available LLM providers. + + Providers register themselves using the @register_provider decorator + or by calling ProviderRegistry.register() directly. + """ + + _providers: Dict[str, Type[LLMProvider]] = {} + _instances: Dict[str, LLMProvider] = {} # Cached instances + + @classmethod + def register(cls, name: str, provider_class: Type[LLMProvider]) -> None: + """Register a provider class by name. + + Args: + name: Provider identifier (lowercase, e.g., "anthropic") + provider_class: The provider class to register + """ + cls._providers[name.lower()] = provider_class + + @classmethod + def get(cls, name: str, config: Optional[Dict[str, Any]] = None) -> LLMProvider: + """Get a provider instance by name. 
+ + Creates a new instance with the given config. Does not cache + instances with custom configs. + + Args: + name: Provider identifier + config: Optional provider configuration + + Returns: + Provider instance + + Raises: + ValueError: If provider name is not registered + """ + name_lower = name.lower() + if name_lower not in cls._providers: + available = ", ".join(cls._providers.keys()) or "none" + raise ValueError(f"Unknown provider: {name}. Available: {available}") + + # If config provided, always create new instance + if config: + return cls._providers[name_lower](config) + + # Check cache for default instance + if name_lower not in cls._instances: + cls._instances[name_lower] = cls._providers[name_lower]() + + return cls._instances[name_lower] + + @classmethod + def get_or_none(cls, name: str, config: Optional[Dict[str, Any]] = None) -> Optional[LLMProvider]: + """Get a provider instance or None if not found. + + Same as get() but returns None instead of raising ValueError. + """ + try: + return cls.get(name, config) + except ValueError: + return None + + @classmethod + def list_providers(cls) -> List[str]: + """List all registered provider names.""" + return list(cls._providers.keys()) + + @classmethod + def is_registered(cls, name: str) -> bool: + """Check if a provider is registered.""" + return name.lower() in cls._providers + + @classmethod + def clear_cache(cls) -> None: + """Clear all cached provider instances.""" + cls._instances.clear() + + @classmethod + def unregister(cls, name: str) -> bool: + """Unregister a provider (mainly for testing). + + Returns True if provider was removed, False if not found. + """ + name_lower = name.lower() + if name_lower in cls._providers: + del cls._providers[name_lower] + cls._instances.pop(name_lower, None) + return True + return False + + +def register_provider(name: str): + """Decorator to register a provider class. + + Usage: + @register_provider("mycloud") + class MyCloudProvider(LLMProvider): + ... 
+ """ + def decorator(cls: Type[LLMProvider]) -> Type[LLMProvider]: + ProviderRegistry.register(name, cls) + return cls + return decorator diff --git a/dream-server/extensions/services/token-spy/requirements.txt b/dream-server/extensions/services/token-spy/requirements.txt new file mode 100644 index 000000000..c84fbf222 --- /dev/null +++ b/dream-server/extensions/services/token-spy/requirements.txt @@ -0,0 +1,4 @@ +fastapi>=0.110.0 +uvicorn[standard]>=0.27.0 +httpx>=0.26.0 +psycopg2-binary>=2.9.9 diff --git a/dream-server/extensions/services/token-spy/session-manager.sh b/dream-server/extensions/services/token-spy/session-manager.sh new file mode 100644 index 000000000..b6257b754 --- /dev/null +++ b/dream-server/extensions/services/token-spy/session-manager.sh @@ -0,0 +1,334 @@ +#!/bin/bash +# Token Spy Session Manager - cost-aware session cleanup +# Queries the token monitor API for real token economics instead of checking file sizes. +# Primary defense is your agent framework's native compaction. This script only +# intervenes as a safety valve when compaction fails and sessions exceed limits. +# +# Runs periodically via systemd timer or cron. +# +# Configure agents in the AGENTS array below. Format: +# "agent-name|monitor-port|/path/to/sessions/dir" + +set -euo pipefail + +# ── Configuration ──────────────────────────────────────────────────────────── + +MONITOR_HOST="${MONITOR_HOST:-127.0.0.1}" + +# Define your agents here. 
+# Format: "agent-name|proxy-port|sessions-directory" +# Example: +# AGENTS=( +# "my-agent|9110|/home/user/.openclaw/agents/main/sessions" +# "my-other-agent|9111|/home/user/other/.openclaw/agents/main/sessions" +# ) +# Note: use ${HOME}, not ~ (tilde is not expanded inside double quotes) +AGENTS=( + "openclaw|9110|${HOME}/dream-server/data/openclaw/home/agents/main/sessions" +) + +# Remote agents: "agent-name|remote-host|remote-sessions-dir" +REMOTE_AGENTS=() + +RECENT_MINUTES=15 # Protect sessions touched in last N minutes + +# Dynamic settings: read from Token Monitor API (dashboard-editable) +# Falls back to defaults if the API is unreachable. +DEFAULT_CHAR_LIMIT=80000 + +# Remote agents use file-size limit (bytes) as proxy for char limit +REMOTE_FILE_SIZE_LIMIT=200000 + +get_agent_char_limit() { + local agent="$1" port="$2" + local limit + limit=$(curl -sf --max-time 3 "http://${MONITOR_HOST}:${port}/api/session-status?agent=${agent}" 2>/dev/null | python3 -c "import json,sys; print(json.load(sys.stdin).get('session_char_limit', $DEFAULT_CHAR_LIMIT))" 2>/dev/null || echo "$DEFAULT_CHAR_LIMIT") + echo "$limit" +} + +# ── Functions ──────────────────────────────────────────────────────────── + +log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"; } + +query_status() { + local agent="$1" port="$2" + curl -sf --max-time 5 "http://${MONITOR_HOST}:${port}/api/session-status?agent=${agent}" 2>/dev/null || echo '{"recommendation":"unavailable"}' +} + +clean_inactive() { + local sessions_dir="$1" + local sessions_json="${sessions_dir}/sessions.json" + + find "$sessions_dir" -name '*.deleted.*' -delete 2>/dev/null || true + find "$sessions_dir" -name '*.bak*' -mmin +60 -delete 2>/dev/null || true + + [ -f "$sessions_json" ] || return 0 + + local active_ids + active_ids=$(grep -oP '"sessionId":\s*"\K[^"]+' "$sessions_json" 2>/dev/null || true) + + for f in "$sessions_dir"/*.jsonl; do + [ -f "$f" ] || continue + 
local basename + basename=$(basename "$f" .jsonl) + + local is_active=false + for id in $active_ids; do + [ "$basename" = "$id" ] && { is_active=true; break; } + done + + if [ "$is_active" = false ]; then + local size_h + size_h=$(du -h "$f" | cut -f1) + log " [CLEANUP] Removing inactive session: $basename ($size_h)" + rm -f "$f" + fi + done +} + +kill_session() { + local sessions_dir="$1" session_id="$2" reason="$3" + local sessions_json="${sessions_dir}/sessions.json" + + local f="${sessions_dir}/${session_id}.jsonl" + if [ -f "$f" ]; then + local size_h + size_h=$(du -h "$f" | cut -f1) + log " [KILL] Removing session $session_id ($size_h) - $reason" + rm -f "$f" + fi + + if [ -f "$sessions_json" ]; then + cp "$sessions_json" "${sessions_json}.bak-manager" + python3 -c " +import json, sys +with open('$sessions_json', 'r') as f: + data = json.load(f) +to_remove = [k for k, v in data.items() if isinstance(v, dict) and v.get('sessionId') == '$session_id'] +for k in to_remove: + del data[k] + print(f' Removed session key: {k}', file=sys.stderr) +with open('$sessions_json', 'w') as f: + json.dump(data, f, indent=2) +" 2>&1 + fi +} + +enforce_count_limit() { + local sessions_dir="$1" + local max_sessions=5 + local now + now=$(date +%s) + + local remaining=() + while IFS= read -r f; do + remaining+=("$f") + done < <(ls -t "$sessions_dir"/*.jsonl 2>/dev/null) + + local count=${#remaining[@]} + if [ "$count" -le "$max_sessions" ]; then + return 0 + fi + + log " [COUNT] $count sessions exceed max of $max_sessions, trimming oldest" + for (( i=max_sessions; i<count; i++ )); do + local f="${remaining[$i]}" + local basename + basename=$(basename "$f" .jsonl) + local mtime + mtime=$(stat -c%Y "$f" 2>/dev/null || echo 0) + local age_mins=$(( (now - mtime) / 60 )) + + if [ "$age_mins" -le "$RECENT_MINUTES" ]; then + log " [COUNT] Skipping $basename - touched ${age_mins}m ago (hot)" + continue + fi + + kill_session "$sessions_dir" "$basename" "excess session (${age_mins}m old)" + done +} + +# ── Remote Agent Management 
────────────────────────────────────────────────── + +manage_remote_agent() { + local agent="$1" host="$2" remote_dir="$3" + local size_limit="$REMOTE_FILE_SIZE_LIMIT" + local max_sessions=5 + + log "Checking $agent (remote: $host, local model, \$0.00/turn)" + + local remote_info + remote_info=$(ssh -o ConnectTimeout=5 -o BatchMode=yes "${host}" bash << REMOTESCRIPT 2>/dev/null) || remote_info="SSH_FAILED" + SESSIONS_DIR="${remote_dir}" + if [ ! -d "\$SESSIONS_DIR" ]; then + echo "NO_DIR" + exit 0 + fi + echo "SESSION_LIST_START" + for f in "\$SESSIONS_DIR"/*.jsonl; do + [ -f "\$f" ] || continue + sid=\$(basename "\$f" .jsonl) + sz=\$(stat -c%s "\$f" 2>/dev/null || echo 0) + mt=\$(stat -c%Y "\$f" 2>/dev/null || echo 0) + echo "\${sid}|\${sz}|\${mt}" + done + echo "SESSION_LIST_END" + if [ -f "\$SESSIONS_DIR/sessions.json" ]; then + echo "ACTIVE_IDS_START" + grep -oP '"sessionId":\s*"\K[^"]+' "\$SESSIONS_DIR/sessions.json" 2>/dev/null || true + echo "ACTIVE_IDS_END" + fi + echo "TOTAL_SIZE=\$(du -sb "\$SESSIONS_DIR" 2>/dev/null | cut -f1)" + find "\$SESSIONS_DIR" -name '*.deleted.*' -delete 2>/dev/null || true + find "\$SESSIONS_DIR" -name '*.bak*' -mmin +60 -delete 2>/dev/null || true +REMOTESCRIPT + + if [ "$remote_info" = "SSH_FAILED" ]; then + log " [WARN] SSH to $host failed - skipping $agent" + return 0 + fi + + if echo "$remote_info" | grep -q "NO_DIR"; then + log " [OK] No sessions directory on $host" + return 0 + fi + + local total_size + total_size=$(echo "$remote_info" | grep "^TOTAL_SIZE=" | cut -d= -f2) + log " Total sessions size: $(( ${total_size:-0} / 1024 ))KB (cost: \$0.00)" + + local active_ids="" + if echo "$remote_info" | grep -q "ACTIVE_IDS_START"; then + active_ids=$(echo "$remote_info" | sed -n '/ACTIVE_IDS_START/,/ACTIVE_IDS_END/p' | grep -v '_START\|_END') + fi + + local now + now=$(date +%s) + local session_count=0 + local 
to_remove=() + + while IFS='|' read -r sid size mtime; do + [ -z "$sid" ] && continue + session_count=$((session_count + 1)) + + local is_active=false + for aid in $active_ids; do + [ "$sid" = "$aid" ] && { is_active=true; break; } + done + + if [ "$is_active" = false ]; then + to_remove+=("$sid") + log " [CLEANUP] Inactive session: $sid ($(( size / 1024 ))KB)" + continue + fi + + if [ "$size" -gt "$size_limit" ]; then + local age_mins=$(( (now - mtime) / 60 )) + if [ "$age_mins" -gt "$RECENT_MINUTES" ]; then + to_remove+=("$sid") + log " [KILL] Oversized session: $sid ($(( size / 1024 ))KB > $(( size_limit / 1024 ))KB)" + else + log " [WARN] Oversized session $sid ($(( size / 1024 ))KB) but hot (${age_mins}m) - skipping" + fi + fi + done < <(echo "$remote_info" | sed -n '/SESSION_LIST_START/,/SESSION_LIST_END/p' | grep -v '_START\|_END' | grep '|') + + log " Sessions: $session_count total, ${#to_remove[@]} to remove" + + if [ "${#to_remove[@]}" -gt 0 ]; then + local rm_args="" + for sid in "${to_remove[@]}"; do + rm_args="${rm_args} ${remote_dir}/${sid}.jsonl" + done + ssh -o ConnectTimeout=5 -o BatchMode=yes "${host}" "rm -f ${rm_args}" 2>/dev/null || true + log " [DONE] Removed ${#to_remove[@]} sessions on $host" + else + log " [OK] No cleanup needed" + fi + + log " Done" +} + +# ── Main Loop ──────────────────────────────────────────────────────────── + +if [ ${#AGENTS[@]} -eq 0 ]; then + log "No agents configured in AGENTS array. Edit session-manager.sh to add your agents." 
+ exit 0 +fi + +log "=== Session Manager Start ===" + +for agent_entry in "${AGENTS[@]}"; do + IFS='|' read -r agent port sessions_dir <<< "$agent_entry" + log "Checking $agent (port $port)" + + status_json=$(query_status "$agent" "$port") + rec=$(echo "$status_json" | python3 -c "import json,sys; print(json.load(sys.stdin).get('recommendation','unknown'))" 2>/dev/null || echo "unknown") + history=$(echo "$status_json" | python3 -c "import json,sys; print(json.load(sys.stdin).get('current_history_chars',0))" 2>/dev/null || echo "0") + turns=$(echo "$status_json" | python3 -c "import json,sys; print(json.load(sys.stdin).get('current_session_turns',0))" 2>/dev/null || echo "0") + session_cost=$(echo "$status_json" | python3 -c "import json,sys; print(json.load(sys.stdin).get('cost_since_last_reset',0))" 2>/dev/null || echo "0") + + char_limit=$(get_agent_char_limit "$agent" "$port") + log " Status: recommendation=$rec history=${history}ch / ${char_limit}ch limit | turns=$turns cost=\$${session_cost}" + + case "$rec" in + healthy|no_data) + log " [OK] Session healthy, no action needed" + ;; + monitor) + log " [WATCH] Session growing, compaction should trigger soon" + ;; + compact_soon) + log " [WARN] Session approaching limit - compaction expected" + ;; + reset_recommended) + log " [CRITICAL] History exceeds ${char_limit}ch limit (at ${history}ch) - compaction may have failed" + if [ -d "$sessions_dir" ]; then + # '|| true' keeps 'set -e' with pipefail from aborting when no .jsonl files exist + largest=$(ls -S "$sessions_dir"/*.jsonl 2>/dev/null | head -1 || true) + if [ -n "$largest" ]; then + basename=$(basename "$largest" .jsonl) + kill_session "$sessions_dir" "$basename" "safety valve: history=${history}ch, compaction failed" + fi + fi + ;; + cache_unstable) + log " [ALERT] Cache write percentage unusually high - possible cache thrashing" + ;; + unavailable) + log " [WARN] Token monitor unavailable on port $port - falling back to file cleanup only" + ;; + *) + log " [WARN] Unknown recommendation: $rec" + ;; + esac + + if [ -d "$sessions_dir" ]; 
then + clean_inactive "$sessions_dir" + enforce_count_limit "$sessions_dir" + fi + + log " Done" +done + +# ── Remote Agents ──────────────────────────────────────────────────────────── + +for agent_entry in "${REMOTE_AGENTS[@]}"; do + IFS='|' read -r agent host remote_dir <<< "$agent_entry" + manage_remote_agent "$agent" "$host" "$remote_dir" +done + +# ── Summary ──────────────────────────────────────────────────────────── + +log "=== Session Manager Complete ===" +for agent_entry in "${AGENTS[@]}"; do + IFS='|' read -r agent port sessions_dir <<< "$agent_entry" + if [ -d "$sessions_dir" ]; then + # '|| true' keeps 'set -e' with pipefail from aborting when no .jsonl files remain + count=$(ls "$sessions_dir"/*.jsonl 2>/dev/null | wc -l || true) + log " $agent: $count sessions remaining" + ls -lht "$sessions_dir"/*.jsonl 2>/dev/null | head -5 || true + fi +done +for agent_entry in "${REMOTE_AGENTS[@]}"; do + IFS='|' read -r agent host remote_dir <<< "$agent_entry" + count=$(ssh -o ConnectTimeout=5 -o BatchMode=yes "${host}" "ls ${remote_dir}/*.jsonl 2>/dev/null | wc -l" 2>/dev/null || echo "?") + log " $agent (remote $host): $count sessions remaining" +done diff --git a/dream-server/extensions/services/token-spy/start.sh b/dream-server/extensions/services/token-spy/start.sh new file mode 100644 index 000000000..5ccca2a4c --- /dev/null +++ b/dream-server/extensions/services/token-spy/start.sh @@ -0,0 +1,77 @@ +#!/bin/bash +# Token Spy - API Monitor - launcher +# Starts proxy instances sharing a single database. +# Pure telemetry - no request modification. 
+# +# Dual upstream routing: +# Anthropic Messages API (/v1/messages) → ANTHROPIC_UPSTREAM +# OpenAI Chat Completions (/v1/chat/completions) → OPENAI_UPSTREAM +# +# Database backend: +# DB_BACKEND=sqlite (default) - uses SQLite in data/usage.db +# DB_BACKEND=postgres - uses PostgreSQL/TimescaleDB on DB_HOST:DB_PORT +# ──────────────────────────────────────────────────────────────────────── + +set -e +cd "$(dirname "$0")" +mkdir -p data + +# Load env file if exists +if [ -f .env ]; then + export $(grep -v '^#' .env | xargs) +fi + +# Database backend (sqlite or postgres) +export DB_BACKEND="${DB_BACKEND:-sqlite}" + +# Upstream API config +# Strix Halo: llama-server on port 11434 (container port 8080 mapped to host 11434) +export ANTHROPIC_UPSTREAM="${ANTHROPIC_UPSTREAM:-https://api.anthropic.com}" +export OPENAI_UPSTREAM="${OPENAI_UPSTREAM:-http://localhost:11434}" +export API_PROVIDER="${API_PROVIDER:-local}" + +# ── Agent Configuration ──────────────────────────────────────────────────── +# Define your agents below. Each agent gets its own proxy port. 
+# Format: AGENT_NAME=<name> python3 -m uvicorn main:app --host 0.0.0.0 --port <port> +# +# Single agent (simplest setup - Strix Halo default): +# AGENT_NAME=openclaw python3 -m uvicorn main:app --host 0.0.0.0 --port 9110 +# +# Multiple agents (one process per agent): +# AGENT_NAME=agent-1 python3 -m uvicorn main:app --host 0.0.0.0 --port 9110 & +# AGENT_NAME=agent-2 python3 -m uvicorn main:app --host 0.0.0.0 --port 9111 & +# +# Local model agent (routes to llama-server): +# AGENT_NAME=openclaw OPENAI_UPSTREAM=http://localhost:11434 API_PROVIDER=local \ +# python3 -m uvicorn main:app --host 0.0.0.0 --port 9110 & +# ──────────────────────────────────────────────────────────────────────── + +AGENT_NAME="${AGENT_NAME:-openclaw}" +PORT="${PORT:-9110}" + +# Session management for OpenClaw (local inference, $0 cost) +export AGENT_SESSION_DIRS="${AGENT_SESSION_DIRS:-'{\"openclaw\":\"~/dream-server/data/openclaw/home/agents/main/sessions\"}'}" +export LOCAL_MODEL_AGENTS="${LOCAL_MODEL_AGENTS:-openclaw}" + +echo "Starting Token Spy - API Monitor..." +echo " Agent → ${AGENT_NAME}" +echo " Port → :${PORT}" +echo " Provider → ${API_PROVIDER}" +echo " DB Backend→ ${DB_BACKEND}" +echo " Anthropic → ${ANTHROPIC_UPSTREAM}" +echo " OpenAI → ${OPENAI_UPSTREAM:-}" +echo " Local → ${LOCAL_MODEL_AGENTS:-}" + +AGENT_NAME="${AGENT_NAME}" python3 -m uvicorn main:app --host 0.0.0.0 --port "${PORT}" --log-level warning + +# ── Multi-Agent Example ──────────────────────────────────────────────────── +# Uncomment and customize for multiple agents: +# +# AGENT_NAME=agent-1 python3 -m uvicorn main:app --host 0.0.0.0 --port 9110 --log-level warning & +# PID1=$! 
+# +# AGENT_NAME=agent-2 python3 -m uvicorn main:app --host 0.0.0.0 --port 9111 --log-level warning & +# PID2=$! +# +# trap "echo 'Stopping...'; kill $PID1 $PID2 2>/dev/null; wait" EXIT INT TERM +# wait diff --git a/dream-server/extensions/services/tts/compose.yaml b/dream-server/extensions/services/tts/compose.yaml new file mode 100644 index 000000000..866482bdb --- /dev/null +++ b/dream-server/extensions/services/tts/compose.yaml @@ -0,0 +1,27 @@ +services: + tts: + image: ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.4 + container_name: dream-tts + restart: unless-stopped + security_opt: + - no-new-privileges:true + environment: + - PYTHONDONTWRITEBYTECODE=1 + - DEFAULT_VOICE=af_heart + - UVICORN_WORKERS=2 + ports: + - "${TTS_PORT:-8880}:8880" + deploy: + resources: + limits: + cpus: '8.0' + memory: 4G + reservations: + cpus: '2.0' + memory: 1G + healthcheck: + test: ["CMD", "python3", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8880/health', timeout=5)"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 30s diff --git a/dream-server/extensions/services/tts/manifest.yaml b/dream-server/extensions/services/tts/manifest.yaml new file mode 100644 index 000000000..cc3beed5b --- /dev/null +++ b/dream-server/extensions/services/tts/manifest.yaml @@ -0,0 +1,18 @@ +schema_version: dream.services.v1 + +service: + id: tts + name: Kokoro (TTS) + aliases: [kokoro] + container_name: dream-tts + host_env: KOKORO_HOST + default_host: tts + port: 8880 + external_port_env: TTS_PORT + external_port_default: 8880 + health: /health + type: docker + gpu_backends: [amd, nvidia] + compose_file: compose.yaml + category: optional + depends_on: [] diff --git a/dream-server/extensions/services/whisper/compose.nvidia.yaml b/dream-server/extensions/services/whisper/compose.nvidia.yaml new file mode 100644 index 000000000..4102e3526 --- /dev/null +++ b/dream-server/extensions/services/whisper/compose.nvidia.yaml @@ -0,0 +1,13 @@ +services: + whisper: + image: 
ghcr.io/speaches-ai/speaches:latest-cuda + deploy: + resources: + reservations: + devices: + - driver: nvidia + count: 1 + capabilities: [gpu] + limits: + cpus: '4.0' + memory: 8G diff --git a/dream-server/extensions/services/whisper/compose.yaml b/dream-server/extensions/services/whisper/compose.yaml new file mode 100644 index 000000000..1f2e7e0dd --- /dev/null +++ b/dream-server/extensions/services/whisper/compose.yaml @@ -0,0 +1,33 @@ +services: + whisper: + image: ghcr.io/speaches-ai/speaches:latest-cpu + container_name: dream-whisper + restart: unless-stopped + security_opt: + - no-new-privileges:true + environment: + - WHISPER__TTL=86400 + entrypoint: + - /bin/sh + - -c + - | + sed -i 's/vad_filter=effective_vad_filter,/vad_filter=effective_vad_filter, vad_parameters={"threshold": 0.3, "min_silence_duration_ms": 400, "min_speech_duration_ms": 50, "speech_pad_ms": 200},/' /home/ubuntu/speaches/src/speaches/routers/stt.py + exec uvicorn --factory speaches.main:create_app + volumes: + - ./data/whisper:/home/ubuntu/.cache/huggingface/hub + ports: + - "${WHISPER_PORT:-9000}:8000" + deploy: + resources: + limits: + cpus: '4.0' + memory: 4G + reservations: + cpus: '1.0' + memory: 1G + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8000/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 60s diff --git a/dream-server/extensions/services/whisper/manifest.yaml b/dream-server/extensions/services/whisper/manifest.yaml new file mode 100644 index 000000000..0bac66c09 --- /dev/null +++ b/dream-server/extensions/services/whisper/manifest.yaml @@ -0,0 +1,32 @@ +schema_version: dream.services.v1 + +service: + id: whisper + name: Whisper (STT) + aliases: [stt, voice] + container_name: dream-whisper + host_env: WHISPER_HOST + default_host: whisper + port: 8000 + external_port_env: WHISPER_PORT + external_port_default: 9000 + health: /health + type: docker + gpu_backends: [amd, nvidia] + compose_file: compose.yaml + category: optional + depends_on: 
[]
+
+features:
+  - id: voice
+    name: Voice Assistant
+    description: Talk to your AI with your voice
+    icon: Mic
+    category: voice
+    requirements:
+      services: [whisper, tts]
+      vram_gb: 6
+    enabled_services_all: [whisper, tts]
+    setup_time: ~5 minutes
+    priority: 2
+    gpu_backends: [amd, nvidia]
diff --git a/dream-server/extensions/templates/compose-gpu-only.yaml b/dream-server/extensions/templates/compose-gpu-only.yaml
new file mode 100644
index 000000000..17aad8961
--- /dev/null
+++ b/dream-server/extensions/templates/compose-gpu-only.yaml
@@ -0,0 +1,160 @@
+# =============================================================================
+# GPU Overlay Template - Pattern 2: Empty Base with Full GPU Overlay
+# =============================================================================
+#
+# USE THIS PATTERN WHEN:
+#   Your service ONLY makes sense on a GPU (e.g., image generation, video
+#   rendering, model training). There is no useful CPU fallback. The base
+#   compose.yaml is an empty stub; the entire service definition lives in
+#   the GPU-specific overlays.
+#
+# WHY AN EMPTY BASE?
+#   The compose resolver and service registry detect a service as "enabled"
+#   by the presence of compose.yaml. An empty stub (`services: {}`) satisfies
+#   that check without defining a runnable container. The GPU overlay then
+#   provides the full definition, which varies significantly between vendors
+#   (different images, device passthrough, environment variables, etc.).
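The "enabled by presence of compose.yaml" rule described above can be illustrated with a small shell sketch. This is not the repository's registry or resolver code; the function name `list_enabled_services` and the idea of scanning a single root directory are assumptions made for illustration only.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the name of every
# service directory under a root that contains a compose.yaml file.
list_enabled_services() {
  root=$1
  for dir in "$root"/*/; do
    # Only the presence of the file matters here, so an empty stub such as
    # `services: {}` counts as enabled just like a full definition would.
    if [ -f "${dir}compose.yaml" ]; then
      basename "$dir"
    fi
  done
}
```

Called as `list_enabled_services extensions/services`, this would print one service ID per line, which is why a GPU-only service still needs its stub file to show up at all.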
+#
+# REAL EXAMPLE:
+#   extensions/services/comfyui/compose.yaml          (empty stub)
+#   extensions/services/comfyui/compose.nvidia.yaml   (full NVIDIA definition)
+#   extensions/services/comfyui/compose.amd.yaml      (full AMD definition)
+#
+# FILE LAYOUT:
+#   extensions/services/my-service/
+#     manifest.yaml
+#     compose.yaml          <-- Empty stub (this file)
+#     compose.nvidia.yaml   <-- Complete service for NVIDIA
+#     compose.amd.yaml      <-- Complete service for AMD
+#
+# The compose resolver (resolve-compose-stack.sh) picks up the correct overlay
+# based on the detected GPU vendor. Only one overlay is active at a time.
+# =============================================================================
+
+
+# -----------------------------------------------------------------------------
+# compose.yaml - Empty base stub
+# -----------------------------------------------------------------------------
+# This file exists so the registry can detect my-service as enabled.
+# The actual service definition comes from the GPU overlay.
+
+# my-service - GPU-Only Service
+# This base stub is merged with a GPU-specific overlay:
+#   compose.amd.yaml    (AMD ROCm)
+#   compose.nvidia.yaml (NVIDIA CUDA)
+# The GPU overlay provides the full service definition.
+# This file exists so the registry can detect my-service as enabled.
+#services: {}
+
+
+# -----------------------------------------------------------------------------
+# compose.nvidia.yaml - Full NVIDIA CUDA definition
+# -----------------------------------------------------------------------------
+# Since the base is empty, this overlay must define EVERYTHING:
+# image, container_name, ports, volumes, healthcheck, deploy, etc.
+
+services:
+  my-service:
+    # Use the NVIDIA/CUDA-compatible image.
+    image: myorg/my-service:latest-cuda
+    container_name: dream-my-service
+    restart: unless-stopped
+
+    ports:
+      - "${MY_SERVICE_PORT:-8080}:8080"
+
+    volumes:
+      - ./data/my-service/models:/models
+      - ./data/my-service/output:/output
+
+    # Shared memory - needed for PyTorch DataLoader workers and large tensors.
+    shm_size: '8g'
+
+    # NVIDIA GPU reservation via the NVIDIA Container Toolkit.
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1  # Use "all" for multi-GPU workloads
+              capabilities: [gpu]
+
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      # GPU services often have long startup (model loading). Be generous.
+      start_period: 120s
+
+
+# -----------------------------------------------------------------------------
+# compose.amd.yaml - Full AMD ROCm definition
+# -----------------------------------------------------------------------------
+# AMD requires device passthrough, group membership, and ROCm-specific env vars.
+# This is a separate file because the configuration diverges significantly
+# from NVIDIA (different image, device nodes, environment, sometimes different
+# command-line flags).
+#
+# Copy the block below into a standalone compose.amd.yaml file.
+# -----------------------------------------------------------------------------
+#
+# services:
+#   my-service:
+#     image: myorg/my-service:latest-rocm
+#     container_name: dream-my-service
+#     restart: unless-stopped
+#
+#     # AMD GPU device passthrough - both DRI (rendering) and KFD (compute).
+#     devices:
+#       - /dev/dri:/dev/dri
+#       - /dev/kfd:/dev/kfd
+#
+#     # The container user must be in the host's video and render groups.
+#     group_add:
+#       - "${VIDEO_GID:-44}"
+#       - "${RENDER_GID:-992}"
+#
+#     # ROCm profiling/debugging may need these relaxed security settings.
+#     cap_add:
+#       - SYS_PTRACE
+#     security_opt:
+#       - seccomp:unconfined
+#
+#     # Shared memory for PyTorch / large tensor operations.
+#     shm_size: 8g
+#
+#     environment:
+#       # Override GFX version to match your AMD GPU architecture.
+#       # Check: rocminfo | grep gfx
+#       - HSA_OVERRIDE_GFX_VERSION=11.5.1
+#       # Optional tuning flags for PyTorch on ROCm.
+#       - PYTORCH_TUNABLEOP_ENABLED=1
+#       - TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
+#
+#     volumes:
+#       - ./data/my-service/models:/models
+#       - ./data/my-service/output:/output
+#
+#     ports:
+#       - "${MY_SERVICE_PORT:-8080}:8080"
+#
+#     # AMD images may need a custom entrypoint or different CLI flags.
+#     command: >-
+#       python3 /app/main.py --listen 0.0.0.0 --gpu-only
+#
+#     deploy:
+#       resources:
+#         limits:
+#           cpus: '16.0'
+#           memory: 55G
+#         reservations:
+#           cpus: '2.0'
+#           memory: 4G
+#
+#     healthcheck:
+#       test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
+#       interval: 30s
+#       timeout: 10s
+#       retries: 3
+#       start_period: 120s
diff --git a/dream-server/extensions/templates/compose-gpu-swap.yaml b/dream-server/extensions/templates/compose-gpu-swap.yaml
new file mode 100644
index 000000000..e63821a6b
--- /dev/null
+++ b/dream-server/extensions/templates/compose-gpu-swap.yaml
@@ -0,0 +1,101 @@
+# =============================================================================
+# GPU Overlay Template - Pattern 1: CPU-Base with GPU Tag Swap
+# =============================================================================
+#
+# USE THIS PATTERN WHEN:
+#   Your service runs on CPU by default and you want to accelerate it on GPU.
+#   The base compose.yaml carries the full service definition with a CPU image.
+#   The GPU overlay only swaps the image tag and adds device reservations.
+#
+# HOW IT WORKS:
+#   Docker Compose merges overlays on top of the base. Keys in the overlay
+#   replace matching keys in the base, so setting `image:` here replaces the
+#   CPU image from compose.yaml. The `deploy.resources` block is also replaced
+#   because the GPU variant typically needs different resource limits.
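The overlay selection described above can be sketched in shell. This is only an illustration of the idea, not the contents of resolve-compose-stack.sh: the function name `compose_file_args` and its two parameters are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the repository's resolver): print the `-f`
# arguments for one service directory and one detected GPU vendor,
# one argument per line.
compose_file_args() {
  svc_dir=$1   # e.g. extensions/services/whisper
  vendor=$2    # e.g. nvidia or amd
  # The base file is always included.
  printf '%s\n' -f "$svc_dir/compose.yaml"
  # The vendor overlay is added only when it exists, so CPU-only hosts and
  # services without an overlay fall back to the base definition.
  if [ -f "$svc_dir/compose.$vendor.yaml" ]; then
    printf '%s\n' -f "$svc_dir/compose.$vendor.yaml"
  fi
}
```

The output can be passed to `docker compose ... config`, which prints the merged result without starting anything. That is a quick way to check which keys the overlay actually overrides: as far as the Compose merge rules go, mappings are merged key by key, so base entries under `deploy.resources` that the overlay does not mention may survive the merge rather than being dropped.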
+#
+# REAL EXAMPLE:
+#   extensions/services/whisper/compose.yaml          (CPU image: latest-cpu)
+#   extensions/services/whisper/compose.nvidia.yaml   (swaps to: latest-cuda)
+#
+# FILE LAYOUT:
+#   extensions/services/my-service/
+#     manifest.yaml
+#     compose.yaml          <-- Full definition with CPU image (see compose-template.yaml)
+#     compose.nvidia.yaml   <-- This file (NVIDIA GPU swap)
+#     compose.amd.yaml      <-- Same idea, targeting AMD ROCm
+#
+# The compose resolver (resolve-compose-stack.sh) picks up the correct overlay
+# based on the detected GPU vendor. Only one overlay is active at a time.
+# =============================================================================
+
+
+# -----------------------------------------------------------------------------
+# compose.nvidia.yaml - NVIDIA CUDA overlay
+# -----------------------------------------------------------------------------
+# Only the keys that DIFFER from the CPU base need to appear here.
+# Everything else (container_name, ports, volumes, healthcheck, etc.)
+# is inherited from compose.yaml unchanged.
+
+services:
+  my-service:
+    # Swap the CPU image tag for the CUDA variant.
+    # The base compose.yaml has: image: myorg/my-service:latest-cpu
+    image: myorg/my-service:latest-cuda
+
+    # Grant access to NVIDIA GPUs via the container toolkit.
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1  # Number of GPUs to reserve (use "all" for every GPU)
+              capabilities: [gpu]
+        # GPU workloads often need more memory than CPU-only.
+        limits:
+          cpus: '4.0'
+          memory: 8G
+
+
+# -----------------------------------------------------------------------------
+# compose.amd.yaml - AMD ROCm overlay (same file, different name)
+# -----------------------------------------------------------------------------
+# For AMD GPUs, there is no deploy.resources.devices driver shorthand.
+# Instead, pass the DRI and KFD device nodes directly.
+#
+# Below is what compose.amd.yaml would look like for the same service.
+# Copy this into a separate compose.amd.yaml file. +# ----------------------------------------------------------------------------- +# +# services: +# my-service: +# # Swap the CPU image tag for the ROCm variant. +# image: myorg/my-service:latest-rocm +# +# # AMD GPU device passthrough โ€” required for ROCm. +# devices: +# - /dev/dri:/dev/dri +# - /dev/kfd:/dev/kfd +# +# # The container user must belong to the video and render groups on the host. +# group_add: +# - "${VIDEO_GID:-44}" +# - "${RENDER_GID:-992}" +# +# # ROCm sometimes needs relaxed seccomp for profiling/debugging. +# cap_add: +# - SYS_PTRACE +# security_opt: +# - seccomp:unconfined +# +# # AMD-specific environment (adjust GFX version to your card). +# environment: +# - HSA_OVERRIDE_GFX_VERSION=11.5.1 +# +# deploy: +# resources: +# limits: +# cpus: '4.0' +# memory: 8G +# reservations: +# cpus: '1.0' +# memory: 2G diff --git a/dream-server/extensions/templates/compose-template.yaml b/dream-server/extensions/templates/compose-template.yaml new file mode 100644 index 000000000..a792a9a65 --- /dev/null +++ b/dream-server/extensions/templates/compose-template.yaml @@ -0,0 +1,91 @@ +# ============================================================================= +# Dream Server โ€” Compose Fragment Template +# ============================================================================= +# +# This file is merged into the compose stack via: +# docker compose -f docker-compose.base.yml -f docker-compose..yml \ +# -f extensions/services/my-service/compose.yaml +# +# RULES: +# - The top-level key MUST be "services:" (standard Compose format) +# - The service name MUST match the "id" in your manifest.yaml +# - Use ${VAR:-default} syntax for user-configurable values +# - Join the shared network: dream-network (defined in base.yml) +# - Mount data under ./data// (relative to project root) +# - Always set restart, security_opt, and a healthcheck +# +# GPU OVERLAYS: +# If your service needs GPU access, create 
compose.amd.yaml / compose.nvidia.yaml +# alongside this file. The compose resolver picks them up automatically. +# There are TWO patterns โ€” pick the one that fits your service: +# +# Pattern 1 โ€” CPU-base with GPU tag swap (compose-gpu-swap.yaml) +# Use when your service works on CPU but runs faster on GPU. +# This file (compose.yaml) carries the full definition with a CPU image. +# The GPU overlay only swaps the image tag and adds device access. +# Example: whisper (speech-to-text runs on CPU, accelerated by CUDA). +# +# Pattern 2 โ€” Empty base with full GPU overlay (compose-gpu-only.yaml) +# Use when your service ONLY makes sense on a GPU (no CPU fallback). +# This file (compose.yaml) is just `services: {}` (empty stub). +# Each GPU overlay contains the complete service definition. +# Example: comfyui (image generation requires a GPU). +# +# See the template files for detailed, commented examples: +# extensions/templates/compose-gpu-swap.yaml (Pattern 1) +# extensions/templates/compose-gpu-only.yaml (Pattern 2) +# +# ============================================================================= + +services: + my-service: + image: myorg/my-service:latest + container_name: dream-my-service + + restart: unless-stopped + + # Drop privileges + user: "${UID:-1000}:${GID:-1000}" + security_opt: + - no-new-privileges:true + + environment: + - MY_SETTING=${MY_SETTING:-default_value} + + volumes: + # Persistent data โ€” survives container rebuilds + - ./data/my-service:/app/data + + ports: + # External port (user-facing) : Internal port (container) + - "${MY_SERVICE_PORT:-1234}:1234" + + networks: + - dream-network + + deploy: + resources: + limits: + cpus: '2.0' + memory: 2G + reservations: + cpus: '0.25' + memory: 256M + + healthcheck: + test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:1234/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 15s + +# If you reference volumes or networks defined in docker-compose.base.yml, 
+# you don't need to redeclare them here. Only declare NEW named volumes: +# +# volumes: +# my-service-data: +# driver: local + +networks: + dream-network: + external: true diff --git a/dream-server/extensions/templates/dashboard-plugin-template.js b/dream-server/extensions/templates/dashboard-plugin-template.js new file mode 100644 index 000000000..ab0cb80cb --- /dev/null +++ b/dream-server/extensions/templates/dashboard-plugin-template.js @@ -0,0 +1,37 @@ +// Dashboard extension template. +// Copy this file and import it from your plugin entrypoint. + +import { Sparkles } from 'lucide-react' +import { registerRoutes, registerExternalLinks } from '../../dashboard/src/plugins/registry' + +function MyExtensionPage() { + return ( +
+    <div>
+      <h1>My Extension</h1>
+      <p>Replace with your extension UI.</p>
+    </div>
+ ) +} + +registerRoutes([ + { + id: 'my-extension', + path: '/my-extension', + label: 'My Extension', + icon: Sparkles, + component: MyExtensionPage, + getProps: () => ({}), + sidebar: true, + order: 100, + }, +]) + +registerExternalLinks([ + { + id: 'my-service-link', + label: 'My Service', + icon: Sparkles, + port: 1234, + healthNeedles: ['my service'], + }, +]) diff --git a/dream-server/extensions/templates/service-template.yaml b/dream-server/extensions/templates/service-template.yaml new file mode 100644 index 000000000..0cd227d47 --- /dev/null +++ b/dream-server/extensions/templates/service-template.yaml @@ -0,0 +1,146 @@ +# ============================================================================= +# Dream Server โ€” Service Extension Manifest Template +# ============================================================================= +# +# HOW TO USE THIS TEMPLATE: +# 1. Copy this entire directory structure into extensions/services// +# 2. Rename and fill in all fields marked REQUIRED +# 3. Add a compose.yaml next to this manifest (see compose-template.yaml) +# 4. Run: dream enable (or just drop in the compose.yaml) +# 5. Run: dream start +# +# DIRECTORY LAYOUT: +# extensions/services/my-service/ +# manifest.yaml <- this file (REQUIRED) +# compose.yaml <- Docker Compose fragment (REQUIRED for non-core) +# compose.amd.yaml <- GPU overlay for AMD (optional) +# compose.nvidia.yaml <- GPU overlay for NVIDIA (optional) +# setup.sh <- Installer hook, runs once during install (optional) +# README.md <- Documentation for contributors (optional) +# +# VALIDATION: +# Schema: extensions/schema/service-manifest.v1.json +# Test: python3 -c "import yaml; yaml.safe_load(open('manifest.yaml'))" +# +# ============================================================================= + +# REQUIRED โ€” must be exactly this string +schema_version: dream.services.v1 + +service: + # โ”€โ”€ Identity (REQUIRED) โ”€โ”€ + + # Unique ID: lowercase alphanumeric + hyphens. 
Used in CLI, compose, registry. + # Must match the directory name under extensions/services/. + id: my-service + + # Human-readable name shown in dashboard sidebar and CLI output. + name: My Service + + # โ”€โ”€ CLI Aliases (optional) โ”€โ”€ + # Shorthand names users can type instead of the full ID. + # Example: "dream logs workflows" resolves to n8n. + aliases: [] + # aliases: [myservice, ms] + + # โ”€โ”€ Docker (REQUIRED for docker services) โ”€โ”€ + + # Container name. Convention: dream-. Used by "dream shell ". + container_name: dream-my-service + + # Compose hostname / env var for inter-container networking. + host_env: MY_SERVICE_HOST # env var name (optional) + default_host: my-service # Docker DNS name (should match compose service name) + + # โ”€โ”€ Ports (REQUIRED) โ”€โ”€ + + # Internal port the service listens on inside the container. + port: 1234 + + # External port exposed to the host. Env var allows user override in .env. + external_port_env: MY_SERVICE_PORT # env var name (optional) + external_port_default: 1234 # default if env var unset + + # โ”€โ”€ Health Check (REQUIRED) โ”€โ”€ + # Path the dashboard hits to determine if the service is up. + # Must return HTTP < 500 to be considered healthy. + health: /health + # Common patterns: /health, /healthz, /api/health, / + + # โ”€โ”€ Service Type โ”€โ”€ + # "docker" (default) โ€” runs in Docker Compose. + # "host-systemd" โ€” runs on the host OS, checked via HOST_GATEWAY. + type: docker + + # โ”€โ”€ GPU Backends โ”€โ”€ + # Which GPU backends this service supports. Omit if no GPU needed. + # Services are only shown/started when the detected backend matches. + gpu_backends: [amd, nvidia] + # Use [amd, nvidia] for most services. GPU-specific services use [amd] or [nvidia]. + + # โ”€โ”€ Compose Fragment (REQUIRED for non-core services) โ”€โ”€ + # Relative path to the Docker Compose fragment in this directory. 
+ # The compose file is merged via: docker compose -f base.yml -f + compose_file: compose.yaml + + # โ”€โ”€ Category โ”€โ”€ + # core: Always on, lives in docker-compose.base.yml (no compose.yaml needed) + # recommended: Enabled by default for most hardware profiles + # optional: User must run "dream enable " to activate + category: optional + + # โ”€โ”€ Dependencies (optional) โ”€โ”€ + # Other service IDs that must be running for this service to work. + # "dream enable" will prompt to enable missing dependencies. + depends_on: [] + # depends_on: [llama-server, qdrant] + + # โ”€โ”€ Environment Variables (optional) โ”€โ”€ + # Documents env vars this service uses. Helps "dream enable" prompt for values. + env_vars: [] + # env_vars: + # - key: MY_API_KEY + # required: true + # secret: true # masked in logs/UI + # description: API key for the service + # - key: MY_WORKERS + # required: false + # default: "4" + # description: Number of worker threads + + # โ”€โ”€ Installer Setup Hook (optional) โ”€โ”€ + # Relative path to a script run ONCE during installation (phase 11). + # Receives two arguments: $1 = INSTALL_DIR, $2 = GPU_BACKEND. + # Use for: creating data dirs, generating config files, downloading assets. + # setup_hook: setup.sh + +# ============================================================================= +# Features โ€” what users see in the dashboard "Features" page +# ============================================================================= +# Each feature maps to one or more services. The dashboard checks whether the +# required services are healthy and shows a status badge accordingly. +# A single service can power multiple features. Omit this section entirely +# if your service doesn't surface a user-visible feature. 
+ +features: + - id: my-feature + name: My Feature # REQUIRED โ€” dashboard display name + description: What this feature does # REQUIRED โ€” one-line summary + icon: Sparkles # REQUIRED โ€” Lucide icon name + category: productivity # REQUIRED โ€” groups features in the UI + # Categories: ai, productivity, media, search, privacy, developer, system + + requirements: + services: [my-service] # ALL must be healthy (AND logic) + # services_any: [svc-a, svc-b] # ANY must be healthy (OR logic) + vram_gb: 0 # Minimum VRAM. 0 = no GPU needed. + # disk_gb: 10 # Minimum free disk (optional) + + # Services that must be enabled (compose.yaml present) for this feature to + # appear at all. Different from "requirements" which checks runtime health. + enabled_services_all: [my-service] + # enabled_services_any: [svc-a, svc-b] + + setup_time: ~2 minutes # Shown to users during onboarding + priority: 100 # Lower = listed first in dashboard + gpu_backends: [amd, nvidia] # Which backends support this feature diff --git a/dream-server/get-dream-server.sh b/dream-server/get-dream-server.sh index 466e506f0..60538c2ab 100644 --- a/dream-server/get-dream-server.sh +++ b/dream-server/get-dream-server.sh @@ -1,6 +1,6 @@ #!/bin/bash # Dream Server Bootstrap Installer -# curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/get-dream-server.sh | bash +# curl -fsSL https://raw.githubusercontent.com/Light-Heart-Labs/DreamServer/main/get-dream-server.sh | bash # # Detects OS, clones repo, runs installer. 
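The bootstrap header says the script detects the OS before cloning, but the detection code itself falls outside the hunks shown here. The following is a minimal sketch of how such a check is commonly written, under stated assumptions: the function name `classify_os` and its return labels are hypothetical, and it relies on WSL kernels typically reporting "microsoft" in their version string.

```bash
#!/usr/bin/env bash
# Hypothetical sketch (not the actual get-dream-server.sh logic):
# classify the platform from a kernel name and a kernel version string.
classify_os() {
  kernel_name=$1      # e.g. the output of `uname -s`
  kernel_version=$2   # e.g. the contents of /proc/version; may be empty
  case "$kernel_name" in
    Linux)
      # WSL kernels usually mention Microsoft in the version string.
      case "$kernel_version" in
        *[Mm]icrosoft*) echo wsl ;;
        *) echo linux ;;
      esac
      ;;
    Darwin) echo macos ;;
    *) echo unsupported ;;
  esac
}
```

A caller would typically run it as `classify_os "$(uname -s)" "$(cat /proc/version 2>/dev/null)"`. Taking both values as arguments keeps the function free of side effects, so it can be exercised on any host.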
@@ -15,7 +15,7 @@ CYAN='\033[0;36m' BOLD='\033[1m' NC='\033[0m' -REPO_URL="https://github.com/Light-Heart-Labs/Lighthouse-AI.git" +REPO_URL="https://github.com/Light-Heart-Labs/DreamServer.git" INSTALL_DIR="$HOME/dream-server" log() { echo -e "${CYAN}[dream]${NC} $1"; } @@ -227,8 +227,7 @@ success "Cloned to $INSTALL_DIR" # โ”€โ”€ Make scripts executable โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ chmod +x "$INSTALL_DIR/install.sh" 2>/dev/null || true -chmod +x "$INSTALL_DIR/setup.sh" 2>/dev/null || true -chmod +x "$INSTALL_DIR/status.sh" 2>/dev/null || true +chmod +x "$INSTALL_DIR/dream-cli" 2>/dev/null || true chmod +x "$INSTALL_DIR/scripts/"*.sh 2>/dev/null || true chmod +x "$INSTALL_DIR/tests/"*.sh 2>/dev/null || true diff --git a/dream-server/install-core.sh b/dream-server/install-core.sh new file mode 100644 index 000000000..c2f694afd --- /dev/null +++ b/dream-server/install-core.sh @@ -0,0 +1,153 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer โ€” Orchestrator +# ============================================================================ +# Unified installer - voice-enabled by default, uses docker-compose.yml +# profiles for optional features. +# Mission: M5 (Clonable Dream Setup Server) +# +# This file sources library modules (pure functions, no side effects) then +# runs each install phase in order. Individual modules live under: +# installers/lib/ โ€” reusable function libraries +# installers/phases/ โ€” sequential install steps (execute on source) +# +# See each module's header for what it expects and provides. 
+# ============================================================================
+
+set -e
+
+#=============================================================================
+# Interrupt Protection
+#=============================================================================
+# Accidental keypresses (Ctrl+C, Ctrl+Z) shouldn't silently kill the install.
+# We require a double-tap of Ctrl+C within 3 seconds to actually abort.
+LAST_SIGINT=0
+interrupt_handler() {
+    local now
+    now=$(date +%s)
+    if (( now - LAST_SIGINT <= 3 )); then
+        echo ""
+        echo -e "\033[0;33m[!] Install cancelled by user.\033[0m"
+        echo -e "\033[0;32m Log file: ${LOG_FILE:-/tmp/dream-server-install.log}\033[0m"
+        exit 130
+    fi
+    LAST_SIGINT=$now
+    echo ""
+    echo -e "\033[0;33m[!] Press Ctrl+C again within 3 seconds to cancel the install.\033[0m"
+}
+trap interrupt_handler INT
+# Ignore Ctrl+Z (SIGTSTP) entirely - backgrounding the installer breaks things
+trap '' TSTP
+
+#=============================================================================
+# Load libraries (pure functions, no side effects)
+#=============================================================================
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+source "$SCRIPT_DIR/installers/lib/constants.sh"
+source "$SCRIPT_DIR/installers/lib/logging.sh"
+source "$SCRIPT_DIR/installers/lib/ui.sh"
+source "$SCRIPT_DIR/installers/lib/detection.sh"
+source "$SCRIPT_DIR/installers/lib/tier-map.sh"
+source "$SCRIPT_DIR/installers/lib/compose-select.sh"
+
+#=============================================================================
+# Command Line Args
+#=============================================================================
+DRY_RUN=false
+SKIP_DOCKER=false
+FORCE=false
+TIER=""
+ENABLE_VOICE=true
+ENABLE_WORKFLOWS=true
+ENABLE_RAG=true
+ENABLE_OPENCLAW=true
+INTERACTIVE=true
+DREAM_MODE="${DREAM_MODE:-local}"
+OFFLINE_MODE=false # M1 integration: fully air-gapped operation
+SUMMARY_JSON_FILE="${SUMMARY_JSON_FILE:-}" + +usage() { + cat << EOF +Dream Server Installer v${VERSION} + +Usage: $0 [OPTIONS] + +Options: + --dry-run Show what would be done without making changes + --skip-docker Skip Docker installation (assume already installed) + --force Overwrite existing installation + --tier N Force specific tier (1-4) instead of auto-detect + --cloud Cloud mode: skip GPU detection, use LiteLLM + cloud APIs + --voice Enable voice services (Whisper + Kokoro) + --workflows Enable n8n workflow automation + --rag Enable RAG with Qdrant vector database + --openclaw Enable OpenClaw AI agent framework + --all Enable all optional services + --non-interactive Run without prompts (use defaults or flags) + --offline M1 mode: Configure for fully offline/air-gapped operation + --summary-json P Write machine-readable install summary JSON to path P + -h, --help Show this help + +Tiers: + 1 - Entry Level (8GB+ VRAM, 7B models) + 2 - Prosumer (12GB+ VRAM, 14B-32B AWQ models) + 3 - Pro (24GB+ VRAM, 32B models) + 4 - Enterprise (48GB+ VRAM or dual GPU, 72B models) + +Examples: + $0 # Interactive setup + $0 --tier 2 --voice # Tier 2 with voice + $0 --all --non-interactive # Full stack, no prompts + $0 --cloud # Cloud mode (no GPU needed, uses API keys) + $0 --offline --all # Fully offline (M1 mode) with all services + $0 --dry-run # Preview installation + +EOF + exit 0 +} + +while [[ $# -gt 0 ]]; do + case $1 in + --dry-run) DRY_RUN=true; shift ;; + --skip-docker) SKIP_DOCKER=true; shift ;; + --force) FORCE=true; shift ;; + --tier) TIER="$2"; shift 2 ;; + --cloud) DREAM_MODE="cloud"; shift ;; + --voice) ENABLE_VOICE=true; shift ;; + --workflows) ENABLE_WORKFLOWS=true; shift ;; + --rag) ENABLE_RAG=true; shift ;; + --openclaw) ENABLE_OPENCLAW=true; shift ;; + --all) ENABLE_VOICE=true; ENABLE_WORKFLOWS=true; ENABLE_RAG=true; ENABLE_OPENCLAW=true; shift ;; + --non-interactive) INTERACTIVE=false; shift ;; + --offline) OFFLINE_MODE=true; shift ;; + --summary-json) 
SUMMARY_JSON_FILE="$2"; shift 2 ;; + -h|--help) usage ;; + *) error "Unknown option: $1" ;; + esac +done + +#============================================================================= +# Splash +#============================================================================= +show_stranger_boot +[[ "$INTERACTIVE" == "true" ]] && sleep 5 + +$DRY_RUN && echo -e "${AMB}>>> DRY RUN MODE โ€” I will simulate everything. No changes made. <<<${NC}\n" + +#============================================================================= +# Run phases +#============================================================================= +source "$SCRIPT_DIR/installers/phases/01-preflight.sh" +source "$SCRIPT_DIR/installers/phases/02-detection.sh" +source "$SCRIPT_DIR/installers/phases/03-features.sh" +source "$SCRIPT_DIR/installers/phases/04-requirements.sh" +source "$SCRIPT_DIR/installers/phases/05-docker.sh" +source "$SCRIPT_DIR/installers/phases/06-directories.sh" +source "$SCRIPT_DIR/installers/phases/07-devtools.sh" +source "$SCRIPT_DIR/installers/phases/08-images.sh" +source "$SCRIPT_DIR/installers/phases/09-offline.sh" +source "$SCRIPT_DIR/installers/phases/10-amd-tuning.sh" +source "$SCRIPT_DIR/installers/phases/11-services.sh" +source "$SCRIPT_DIR/installers/phases/12-health.sh" +source "$SCRIPT_DIR/installers/phases/13-summary.sh" diff --git a/dream-server/install-windows.bat b/dream-server/install-windows.bat deleted file mode 100644 index b2a65846d..000000000 --- a/dream-server/install-windows.bat +++ /dev/null @@ -1,60 +0,0 @@ -@echo off -:: Dream Server Windows Installer - Batch Entry Point -:: This bypasses PowerShell execution policy issues -:: -:: Usage: Double-click or run from cmd: -:: install-windows.bat -:: install-windows.bat -DryRun -:: install-windows.bat -All - -setlocal enabledelayedexpansion - -:: Get script directory -set "SCRIPT_DIR=%~dp0" - -:: Check if running as admin -net session >nul 2>&1 -if %errorLevel% neq 0 ( - echo. 
- echo ============================================================ - echo Dream Server Installer - echo ============================================================ - echo. - echo This installer requires Administrator privileges. - echo Right-click and select "Run as administrator" - echo. - echo Press any key to exit... - pause >nul - exit /b 1 -) - -:: Check PowerShell exists -where powershell >nul 2>&1 -if %errorLevel% neq 0 ( - echo ERROR: PowerShell not found - exit /b 1 -) - -:: Run the PowerShell installer with bypass -echo. -echo ============================================================ -echo Dream Server Installer for Windows -echo ============================================================ -echo. -echo Starting installation... -echo. - -powershell -ExecutionPolicy Bypass -NoProfile -File "%SCRIPT_DIR%install.ps1" %* - -:: Capture exit code -set EXIT_CODE=%errorlevel% - -if %EXIT_CODE% neq 0 ( - echo. - echo Installation failed with error code: %EXIT_CODE% - echo. - echo Press any key to exit... 
- pause >nul -) - -exit /b %EXIT_CODE% diff --git a/dream-server/install.ps1 b/dream-server/install.ps1 deleted file mode 100644 index b9d0e7d1c..000000000 --- a/dream-server/install.ps1 +++ /dev/null @@ -1,422 +0,0 @@ -# Dream Server Installer for Windows (WSL2 + Docker Desktop) -# Version 2.1.0 -# -# Run via batch file to bypass execution policy: -# install-windows.bat [OPTIONS] -# -# Or directly if policy allows: -# .\install.ps1 [OPTIONS] - -param( - [switch]$DryRun, - [switch]$Force, - [int]$Tier = 0, - [switch]$Voice, - [switch]$Workflows, - [switch]$Rag, - [switch]$All, - [switch]$Bootstrap, - [switch]$NoBootstrap, - [switch]$Diagnose, - [switch]$Help -) - -$ErrorActionPreference = "Stop" -$Version = "2.1.0" -$InstallDir = "$env:LOCALAPPDATA\DreamServer" # Avoids spaces in path - -# Colors -function Write-Info { Write-Host "[INFO] $args" -ForegroundColor Cyan } -function Write-Ok { Write-Host "[OK] $args" -ForegroundColor Green } -function Write-Warn { Write-Host "[WARN] $args" -ForegroundColor Yellow } -function Write-Err { Write-Host "[ERROR] $args" -ForegroundColor Red } - -function Show-Header { - param([string]$Title) - Write-Host "" - Write-Host ("=" * 60) -ForegroundColor Blue - Write-Host " $Title" -ForegroundColor Blue - Write-Host ("=" * 60) -ForegroundColor Blue -} - -function Show-Help { - @" -Dream Server Installer for Windows v$Version - -Usage: install-windows.bat [OPTIONS] - .\install.ps1 [OPTIONS] - -Options: - -DryRun Show what would be done without making changes - -Force Overwrite existing installation - -Tier N Force specific tier (1-4) instead of auto-detect - -Voice Enable voice services (Whisper + TTS) - -Workflows Enable n8n workflow automation - -Rag Enable RAG with Qdrant vector database - -All Enable all optional services - -Bootstrap Start with small model, upgrade later (faster first start) - -NoBootstrap Skip bootstrap, download full model immediately - -Diagnose Run diagnostics only (don't install) - -Help Show this help - 
-Prerequisites: - - Windows 10 version 2004+ or Windows 11 - - WSL2 enabled - - Docker Desktop with WSL2 backend - - NVIDIA GPU with latest drivers (for GPU acceleration) - -Tiers: - 1 - Entry Level (8GB+ VRAM, 7B models) - 2 - Prosumer (12GB+ VRAM, 14B-32B AWQ models) - 3 - Pro (24GB+ VRAM, 32B models) - 4 - Enterprise (48GB+ VRAM or dual GPU, 72B models) - -Examples: - install-windows.bat # Interactive setup - install-windows.bat -Tier 2 -Voice # Tier 2 with voice - install-windows.bat -All # Full stack - install-windows.bat -Bootstrap # Quick start with small model - install-windows.bat -Diagnose # Check system only - install-windows.bat -DryRun # Preview installation - -Troubleshooting: - See docs/WSL2-GPU-TROUBLESHOOTING.md for common issues. -"@ - exit 0 -} - -if ($Help) { Show-Help } -if ($All) { $Voice = $true; $Workflows = $true; $Rag = $true } - -# Diagnose mode - just run checks and exit -if ($Diagnose) { - Write-Host "" - Write-Host "Dream Server System Diagnostics" -ForegroundColor Cyan - Write-Host "================================" -ForegroundColor Cyan - # Fall through to prerequisites, will exit after hardware detection -} - -#============================================================================= -# Prerequisites Check -#============================================================================= -Show-Header "Checking Prerequisites" - -# Check PowerShell execution policy (show warning only) -$execPolicy = Get-ExecutionPolicy -if ($execPolicy -eq "Restricted" -or $execPolicy -eq "AllSigned") { - Write-Warn "PowerShell execution policy is '$execPolicy'" - Write-Info "If this script fails to run, use: powershell -ExecutionPolicy Bypass -File install.ps1" - Write-Info "Or run via: install-windows.bat (handles this automatically)" -} - -# Windows Defender / antivirus warning -Write-Info "Tip: If install fails with GPU access errors, Windows Defender may be blocking Docker." 
-Write-Info " See docs/WINDOWS-WSL2-GPU-GUIDE.md for antivirus exclusion steps." -Write-Host "" - -# Check Windows version -$winVer = [System.Environment]::OSVersion.Version -if ($winVer.Build -lt 19041) { - Write-Err "Windows 10 version 2004 (build 19041) or later required" - Write-Err "Current build: $($winVer.Build)" - exit 1 -} -Write-Ok "Windows version: $($winVer.Major).$($winVer.Minor) build $($winVer.Build)" - -# Check WSL2 -$wslStatus = wsl --status 2>&1 -if ($LASTEXITCODE -ne 0) { - Write-Err "WSL2 is not installed or not configured" - Write-Info "Run: wsl --install" - exit 1 -} -Write-Ok "WSL2 is available" - -# Check for Ubuntu distro -$distros = wsl -l -q 2>&1 -if (-not ($distros -match "Ubuntu")) { - Write-Warn "Ubuntu WSL distro not found" - Write-Info "Installing Ubuntu..." - if (-not $DryRun) { - wsl --install -d Ubuntu - Write-Info "Ubuntu installed. Please restart and run this script again." - exit 0 - } -} -Write-Ok "Ubuntu WSL distro available" - -# Check Docker Desktop -$dockerPath = Get-Command docker -ErrorAction SilentlyContinue -if (-not $dockerPath) { - Write-Err "Docker Desktop not found" - Write-Info "Please install Docker Desktop from: https://docker.com/products/docker-desktop" - exit 1 -} - -# Check Docker is running -$dockerInfo = docker info 2>&1 -if ($LASTEXITCODE -ne 0) { - Write-Err "Docker Desktop is not running" - Write-Info "Please start Docker Desktop and try again" - exit 1 -} -Write-Ok "Docker Desktop is running" - -# Check WSL2 backend -if (-not ($dockerInfo -match "WSL")) { - Write-Warn "Docker may not be using WSL2 backend" - Write-Info "Recommended: Enable WSL2 backend in Docker Desktop settings" -} - -# Check NVIDIA Container Toolkit -Write-Info "Testing GPU access in Docker (this may take a moment on first run)..." 
-try { - $nvidiaDocker = docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi 2>&1 - if ($LASTEXITCODE -eq 0) { - Write-Ok "NVIDIA Container Toolkit working" - $GpuInDocker = $true - } else { - Write-Warn "NVIDIA GPU support not detected in Docker" - Write-Info "See docs/WSL2-GPU-TROUBLESHOOTING.md for help" - $GpuInDocker = $false - } -} catch { - Write-Warn "Could not test GPU access: $_" - Write-Info "See docs/WSL2-GPU-TROUBLESHOOTING.md for help" - $GpuInDocker = $false -} - -#============================================================================= -# Hardware Detection -#============================================================================= -Show-Header "Detecting Hardware" - -# Run PowerShell detection script -$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path -$detectScript = Join-Path $scriptDir "scripts\detect-hardware.ps1" - -if (Test-Path $detectScript) { - $hwInfo = & $detectScript -Json | ConvertFrom-Json - $GpuVram = $hwInfo.gpu.vram_gb - $GpuName = $hwInfo.gpu.name - $RamGb = $hwInfo.ram_gb - $CpuCores = $hwInfo.cores - - Write-Ok "CPU: $($hwInfo.cpu)" - Write-Ok "RAM: ${RamGb}GB" - if ($GpuName) { - Write-Ok "GPU: $GpuName (${GpuVram}GB VRAM)" - } else { - Write-Warn "No GPU detected" - } -} else { - # Fallback detection - try { - $nvidiaSmi = & nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits 2>$null - if ($nvidiaSmi) { - $parts = $nvidiaSmi -split ',' - $GpuName = $parts[0].Trim() - $GpuVram = [math]::Floor([int]$parts[1].Trim() / 1024) - Write-Ok "GPU: $GpuName (${GpuVram}GB VRAM)" - } - } catch { - $GpuVram = 0 - Write-Warn "No NVIDIA GPU detected" - } - - $RamGb = [math]::Floor((Get-WmiObject Win32_ComputerSystem).TotalPhysicalMemory / 1GB) - Write-Ok "RAM: ${RamGb}GB" -} - -# Auto-detect tier -if ($Tier -eq 0) { - if ($GpuVram -ge 48) { $Tier = 4 } - elseif ($GpuVram -ge 20) { $Tier = 3 } - elseif ($GpuVram -ge 12) { $Tier = 2 } - else { $Tier = 1 } - Write-Info "Auto-detected tier: $Tier" -} 
else { - Write-Info "Using specified tier: $Tier" -} - -$tierNames = @{ - 1 = "Entry Level (7B models)" - 2 = "Prosumer (14B-32B AWQ models)" - 3 = "Pro (32B models)" - 4 = "Enterprise (72B models)" -} -Write-Ok "Selected: Tier $Tier - $($tierNames[$Tier])" - -# Diagnose mode exits here -if ($Diagnose) { - Write-Host "" - Write-Host "Diagnostics complete." -ForegroundColor Green - Write-Host "" - Write-Host "Summary:" -ForegroundColor Cyan - Write-Host " Windows: OK" - Write-Host " WSL2: OK" - Write-Host " Docker: OK" - Write-Host " GPU Docker: $(if ($GpuInDocker) { 'OK' } else { 'WARN - see troubleshooting guide' })" - Write-Host " GPU VRAM: ${GpuVram}GB" - Write-Host " Tier: $Tier - $($tierNames[$Tier])" - Write-Host "" - exit 0 -} - -#============================================================================= -# Installation -#============================================================================= -Show-Header "Installing Dream Server" - -if ($DryRun) { - Write-Info "[DRY RUN] Would create: $InstallDir" - Write-Info "[DRY RUN] Would copy Docker configs" - Write-Info "[DRY RUN] Would set tier: $Tier" - Write-Info "[DRY RUN] Voice: $Voice, Workflows: $Workflows, RAG: $Rag" - exit 0 -} - -# Create install directory -if (Test-Path $InstallDir) { - if ($Force) { - Write-Warn "Removing existing installation..." 
- Remove-Item -Recurse -Force $InstallDir - } else { - Write-Err "Installation directory exists: $InstallDir" - Write-Info "Use -Force to overwrite" - exit 1 - } -} - -New-Item -ItemType Directory -Path $InstallDir -Force | Out-Null -Write-Ok "Created: $InstallDir" - -# Copy files -Copy-Item "$scriptDir\docker-compose.yml" "$InstallDir\" -Copy-Item "$scriptDir\.env.example" "$InstallDir\.env" -Copy-Item -Recurse "$scriptDir\scripts" "$InstallDir\" -Copy-Item -Recurse "$scriptDir\configs" "$InstallDir\" -ErrorAction SilentlyContinue -Write-Ok "Copied configuration files" - -# Configure .env -$envFile = "$InstallDir\.env" -$envContent = Get-Content $envFile - -# Set tier-specific model -$models = @{ - 1 = "Qwen/Qwen2.5-7B-Instruct" - 2 = "Qwen/Qwen2.5-14B-Instruct-AWQ" - 3 = "Qwen/Qwen2.5-32B-Instruct-AWQ" - 4 = "Qwen/Qwen2.5-72B-Instruct-AWQ" -} -$bootstrapModel = "Qwen/Qwen2.5-1.5B-Instruct" - -# Determine model to use -if ($Bootstrap -and -not $NoBootstrap) { - $selectedModel = $bootstrapModel - $targetModel = $models[$Tier] - Write-Info "Bootstrap mode: Starting with small model for quick setup" - Write-Info " Initial: $bootstrapModel" - Write-Info " Target: $targetModel (upgrade later with 'dream upgrade-model')" -} else { - $selectedModel = $models[$Tier] - $targetModel = $selectedModel -} - -$envContent = $envContent -replace 'LLM_MODEL=.*', "LLM_MODEL=$selectedModel" -$envContent = $envContent -replace 'TARGET_MODEL=.*', "TARGET_MODEL=$targetModel" -$envContent | Set-Content $envFile -Write-Ok "Configured model: $selectedModel" - -#============================================================================= -# Build Profiles -#============================================================================= -$profiles = @("core") -if ($Voice) { $profiles += "voice" } -if ($Workflows) { $profiles += "workflows" } -if ($Rag) { $profiles += "rag" } - -$profileStr = $profiles -join "," -Write-Info "Profiles: $profileStr" - -# Start services -Show-Header "Starting 
Services" -Set-Location $InstallDir - -Write-Info "Pulling Docker images (this may take a while)..." -docker compose --profile $profileStr pull - -Write-Info "Starting containers..." -docker compose --profile $profileStr up -d - -# Wait for services -Write-Info "Waiting for services to be ready..." -Start-Sleep -Seconds 30 - -#============================================================================= -# Verify Installation -#============================================================================= -Show-Header "Verifying Installation" - -$services = @{ - "vLLM" = "http://localhost:8000/health" - "Open WebUI" = "http://localhost:3000" -} -if ($Voice) { - $services["Whisper"] = "http://localhost:9000/health" -} -if ($Rag) { - $services["Qdrant"] = "http://localhost:6333/health" -} - -foreach ($svc in $services.Keys) { - try { - $response = Invoke-WebRequest -Uri $services[$svc] -TimeoutSec 5 -UseBasicParsing -ErrorAction SilentlyContinue - if ($response.StatusCode -eq 200) { - Write-Ok "$svc is running" - } - } catch { - Write-Warn "$svc not responding yet (may still be starting)" - } -} - -#============================================================================= -# Done -#============================================================================= -Show-Header "Installation Complete!" - -Write-Host "" -Write-Host "Your Dream Server is ready!" 
-ForegroundColor Green -Write-Host "" -Write-Host "Access points:" -Write-Host " - Chat UI: http://localhost:3000" -Write-Host " - API: http://localhost:8000/v1" -if ($Voice) { - Write-Host " - Whisper: http://localhost:9000" -} -if ($Workflows) { - Write-Host " - n8n: http://localhost:5678" -} -if ($Rag) { - Write-Host " - Qdrant: http://localhost:6333" -} -Write-Host "" -Write-Host "Manage your server:" -Write-Host " cd $InstallDir" -Write-Host " docker compose logs -f # View logs" -Write-Host " docker compose down # Stop" -Write-Host " docker compose up -d # Start" - -if ($Bootstrap -and -not $NoBootstrap) { - Write-Host "" - Write-Host "Bootstrap Mode Active" -ForegroundColor Yellow - Write-Host " You're running a small model for quick setup." - Write-Host " Upgrade to full model when ready:" - Write-Host " .\scripts\upgrade-model.ps1" - Write-Host " Target model: $targetModel" -} - -Write-Host "" -Write-Host "Troubleshooting: docs\WSL2-GPU-TROUBLESHOOTING.md" -Write-Host "" -Write-Host "Your AI, your hardware, your data. Welcome to Dream Server." 
-ForegroundColor Cyan diff --git a/dream-server/install.sh b/dream-server/install.sh old mode 100755 new mode 100644 index 207c96e96..101d08e36 --- a/dream-server/install.sh +++ b/dream-server/install.sh @@ -1,1839 +1,42 @@ #!/bin/bash -# Dream Server Installer v2.0 -# Unified installer - voice-enabled by default, uses docker-compose.yml profiles for optional features -# Mission: M5 (Clonable Dream Setup Server) +# Dream Server Installer entrypoint (PR-1 dispatcher) +# Pass-through options (implemented in install-core.sh): +# --dry-run --skip-docker --force --tier --voice --workflows --rag +# --openclaw --all --non-interactive --no-bootstrap --bootstrap --offline -set -e +set -euo pipefail -#============================================================================= -# Interrupt Protection -#============================================================================= -# Accidental keypresses (Ctrl+C, Ctrl+Z) shouldn't silently kill the install. -# We require a double-tap of Ctrl+C within 3 seconds to actually abort. -LAST_SIGINT=0 -interrupt_handler() { - local now - now=$(date +%s) - if (( now - LAST_SIGINT <= 3 )); then - echo "" - echo -e "\033[1;33m[!] Install cancelled by user.\033[0m" - echo -e "\033[0;36m Log file: $LOG_FILE\033[0m" - exit 130 - fi - LAST_SIGINT=$now - echo "" - echo -e "\033[1;33m[!] 
Press Ctrl+C again within 3 seconds to cancel the install.\033[0m" -} -trap interrupt_handler INT -# Ignore Ctrl+Z (SIGTSTP) entirely — backgrounding the installer breaks things -trap '' TSTP - -#============================================================================= -# Configuration -#============================================================================= -VERSION="2.0.0" SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -INSTALL_DIR="${INSTALL_DIR:-$HOME/dream-server}" -LOG_FILE="${LOG_FILE:-/tmp/dream-server-install.log}" -MAX_DOWNLOAD_RETRIES=3 -DOWNLOAD_RETRY_DELAY=10 - -# Auto-detect system timezone (fallback to UTC) -if [[ -f /etc/timezone ]]; then - SYSTEM_TZ="$(cat /etc/timezone)" -elif [[ -L /etc/localtime ]]; then - SYSTEM_TZ="$(readlink /etc/localtime | sed 's|.*/zoneinfo/||')" -else - SYSTEM_TZ="UTC" -fi - -#============================================================================= -# Colors -#============================================================================= -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -CYAN='\033[0;36m' -NC='\033[0m' - -#============================================================================= -# Helpers -#============================================================================= -log() { echo -e "${CYAN}[INFO]${NC} $1" | tee -a "$LOG_FILE"; } -success() { echo -e "${GREEN}[OK]${NC} $1" | tee -a "$LOG_FILE"; } -warn() { echo -e "${YELLOW}[WARN]${NC} $1" | tee -a "$LOG_FILE"; } -error() { echo -e "${RED}[ERROR]${NC} $1" | tee -a "$LOG_FILE"; exit 1; } - -#============================================================================= -# Stranger Console Mode (80s cinematic terminal UI) -#============================================================================= 
-DIVIDER="โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€" - -# Tiny typing effect (use sparingly) -type_line() { - local s="$1" - local delay="${2:-0.008}" - local i - for ((i=0; i<${#s}; i++)); do - printf "%s" "${s:$i:1}" - sleep "$delay" - done - printf "\n" -} - -bootline() { echo -e "${CYAN}${DIVIDER}${NC}"; } -subline() { echo -e "${BLUE}${DIVIDER}${NC}"; } - -# "AI narrator" voice -ai() { echo -e " ${CYAN}โ–ธ${NC} $1" | tee -a "$LOG_FILE"; } -ai_ok() { echo -e " ${GREEN}โœ“${NC} $1" | tee -a "$LOG_FILE"; } -ai_warn() { echo -e " ${YELLOW}โš ${NC} $1" | tee -a "$LOG_FILE"; } -ai_bad() { echo -e " ${RED}โœ—${NC} $1" | tee -a "$LOG_FILE"; } - -# Little signal flourish (tasteful) -signal() { echo -e " ${CYAN}โ–‘โ–’โ–“โ–ˆโ–“โ–’โ–‘${NC} $1" | tee -a "$LOG_FILE"; } - -# Consistent section header -chapter() { - local title="$1" - echo "" - bootline - echo -e "${BLUE}${title}${NC}" - bootline -} - -# Phase screen -show_phase() { - local phase=$1 total=$2 name=$3 estimate=$4 - echo "" - bootline - echo -e "${BLUE}PHASE ${phase}/${total}${NC} ${CYAN}${name}${NC}" - [[ -n "$estimate" ]] && echo -e "${YELLOW}ETA:${NC} ${estimate}" - bootline -} - -# Cinematic boot splash -show_stranger_boot() { - clear 2>/dev/null || true - cat << 'EOF' - - ____ _____ - / __ \ _____ ___ ____ _ ____ ___ / ___/ ___ _____ _ __ ___ _____ - / / / // ___// _ \ / __ `// __ `__ \ \__ \ / _ \ / ___/| | / // _ \ / ___/ - / /_/ // / / __// /_/ // / / / / / ___/ // __// / | |/ // __// / -/_____//_/ \___/ \__,_//_/ /_/ /_/ /____/ \___//_/ |___/ \___//_/ - -โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ - DREAM SERVER 2026 // LOCAL 
AI // SOVEREIGN INTELLIGENCE -โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ - -EOF - type_line "$(echo -e "${CYAN}Signal acquired.${NC}")" 0.012 - type_line "$(echo -e "${CYAN}I will guide the installation. Stay with me.${NC}")" 0.012 - echo -e " ${YELLOW}Version ${VERSION}${NC}" - echo "" - bootline - echo -e "${CYAN}Tip:${NC} Press Ctrl+C twice to abort." - bootline - echo "" -} - -# Spinner with mm:ss timer + consistent prefix -spin_task() { - local pid=$1 - local msg=$2 - local spin='โ ‹โ ™โ นโ ธโ ผโ ดโ ฆโ งโ ‡โ ' - local i=0 - local elapsed=0 - - printf " ${CYAN}โ ‹${NC} [00:00] %s " "$msg" - while kill -0 "$pid" 2>/dev/null; do - local mm=$((elapsed / 60)) - local ss=$((elapsed % 60)) - printf "\r ${CYAN}%s${NC} [%02d:%02d] %s " "${spin:$i:1}" "$mm" "$ss" "$msg" - i=$(( (i + 1) % ${#spin} )) - elapsed=$((elapsed + 1)) - sleep 1 - done - local rc=0 - wait "$pid" || rc=$? - return $rc -} - -# Pull wrapper that prints consistent success/fail lines -pull_with_progress() { - local img=$1 - local label=$2 - local count=$3 - local total=$4 - - $DOCKER_CMD pull "$img" >> "$LOG_FILE" 2>&1 & - local pull_pid=$! 
- - if spin_task $pull_pid "[$count/$total] $label"; then - printf "\r ${GREEN}โœ“${NC} [$count/$total] %-60s\n" "$label" - return 0 - else - printf "\r ${RED}โœ—${NC} [$count/$total] %-60s\n" "$label" - return 1 - fi -} - -# Health check with "systems online" vibe -check_service() { - local name=$1 - local url=$2 - local max_attempts=${3:-30} - local spin='โ ‹โ ™โ นโ ธโ ผโ ดโ ฆโ งโ ‡โ ' - local i=0 - - if $DRY_RUN; then - ai "[DRY RUN] Would link ${name} at ${url}" - return 0 - fi - - printf " ${CYAN}%s${NC} Linking %-20s " "${spin:0:1}" "$name" - for attempt in $(seq 1 $max_attempts); do - if curl -sf "$url" > /dev/null 2>&1; then - printf "\r ${GREEN}โœ“${NC} %-55s\n" "$name online" - return 0 - fi - printf "\r ${CYAN}%s${NC} Linking %-20s [%ds] " "${spin:$i:1}" "$name" "$((attempt * 2))" - i=$(( (i + 1) % ${#spin} )) - sleep 2 - done - - printf "\r ${YELLOW}โš ${NC} %-55s\n" "$name delayed (may still be starting)" - ai_warn "$name not responding yet. I will continue." - return 1 -} - -# Progress bar function -progress_bar() { - local current=$1 - local total=$2 - local width=40 - local percent=$((current * 100 / total)) - local filled=$((width * current / total)) - local empty=$((width - filled)) - - printf "\r [" - printf "%${filled}s" | tr ' ' 'โ–ˆ' - printf "%${empty}s" | tr ' ' 'โ–‘' - printf "] %3d%%" "$percent" -} - -# Show hardware summary in a nice box -show_hardware_summary() { - local gpu_name="$1" - local gpu_vram="$2" - local cpu_info="$3" - local ram_gb="$4" - local disk_gb="$5" - - echo "" - echo -e "${CYAN}โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”${NC}" - echo -e "${CYAN}โ”‚${NC} ${BLUE}Hardware Detected${NC} ${CYAN}โ”‚${NC}" - echo -e 
"${CYAN}โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค${NC}" - printf "${CYAN}โ”‚${NC} GPU: %-50s ${CYAN}โ”‚${NC}\n" "${gpu_name:-Not detected}" - [[ -n "$gpu_vram" ]] && printf "${CYAN}โ”‚${NC} VRAM: %-50s ${CYAN}โ”‚${NC}\n" "${gpu_vram}GB" - printf "${CYAN}โ”‚${NC} CPU: %-50s ${CYAN}โ”‚${NC}\n" "${cpu_info:-Unknown}" - printf "${CYAN}โ”‚${NC} RAM: %-50s ${CYAN}โ”‚${NC}\n" "${ram_gb}GB" - printf "${CYAN}โ”‚${NC} Disk: %-50s ${CYAN}โ”‚${NC}\n" "${disk_gb}GB available" - echo -e "${CYAN}โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜${NC}" -} - -# Show tier recommendation with explanation -show_tier_recommendation() { - local tier=$1 - local model=$2 - local speed=$3 - local users=$4 - - echo "" - echo -e "${CYAN}โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”${NC}" - echo -e "${CYAN}โ”‚${NC} ${GREEN}โœ“ Recommended: Tier ${tier}${NC} ${CYAN}โ”‚${NC}" - echo -e "${CYAN}โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค${NC}" - printf "${CYAN}โ”‚${NC} Model: %-49s ${CYAN}โ”‚${NC}\n" "$model" - printf "${CYAN}โ”‚${NC} Speed: %-49s ${CYAN}โ”‚${NC}\n" "~${speed} tokens/second" - printf "${CYAN}โ”‚${NC} Users: %-49s ${CYAN}โ”‚${NC}\n" "${users} concurrent comfortably" - echo -e "${CYAN}โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜${NC}" -} - -# Show installation menu 
-show_install_menu() { - echo "" - ai "Choose how deep you want to go. I can install everything, or keep it minimal." - echo "" - echo -e " ${GREEN}[1]${NC} Full Stack ${YELLOW}(recommended โ€” just press Enter)${NC}" - echo " Chat + Voice + Workflows + Document Q&A + AI Agents" - echo " ~16GB download, all features enabled" - echo "" - echo -e " ${GREEN}[2]${NC} Core Only" - echo " Chat interface + API" - echo " ~12GB download, minimal footprint" - echo "" - echo -e " ${GREEN}[3]${NC} Custom" - echo " Choose exactly what you want" - echo "" - read -p " Select an option [1]: " -r INSTALL_CHOICE - INSTALL_CHOICE="${INSTALL_CHOICE:-1}" - echo "" - case "$INSTALL_CHOICE" in - 1) - signal "Acknowledged." - log "Selected: Full Stack" - ENABLE_VOICE=true - ENABLE_WORKFLOWS=true - ENABLE_RAG=true - ENABLE_OPENCLAW=true - ;; - 2) - signal "Acknowledged." - log "Selected: Core Only" - ;; - 3) - signal "Acknowledged." - log "Selected: Custom" - ;; - *) - warn "Invalid choice '$INSTALL_CHOICE', defaulting to Full Stack" - ENABLE_VOICE=true - ENABLE_WORKFLOWS=true - ENABLE_RAG=true - ENABLE_OPENCLAW=true - ;; - esac -} - -# Final success card -show_success_card() { - local webui_url=$1 - local dashboard_url=$2 - local ip_addr=$3 - - echo "" - echo -e "${GREEN}โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—${NC}" - echo -e "${GREEN}โ•‘${NC} ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•‘${NC} ${GREEN}โœ“ Dream Server is ready.${NC} ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•‘${NC} ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ• โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฃ${NC}" - echo -e "${GREEN}โ•‘${NC} ${GREEN}โ•‘${NC}" - printf "${GREEN}โ•‘${NC} Dashboard: %-43s ${GREEN}โ•‘${NC}\n" "${dashboard_url}" - 
printf "${GREEN}โ•‘${NC} Chat: %-43s ${GREEN}โ•‘${NC}\n" "${webui_url}" - echo -e "${GREEN}โ•‘${NC} ${GREEN}โ•‘${NC}" - if [[ -n "$ip_addr" ]]; then - echo -e "${GREEN}โ•‘${NC} ${YELLOW}Access from other devices:${NC} ${GREEN}โ•‘${NC}" - printf "${GREEN}โ•‘${NC} http://%-51s ${GREEN}โ•‘${NC}\n" "${ip_addr}:3001" - echo -e "${GREEN}โ•‘${NC} ${GREEN}โ•‘${NC}" - fi - echo -e "${GREEN}โ•‘${NC} Your data never leaves this machine. ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•‘${NC} No subscriptions. No limits. It's yours. ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•‘${NC} ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•${NC}" - echo "" -} - -#============================================================================= -# Command Line Args -#============================================================================= -DRY_RUN=false -SKIP_DOCKER=false -FORCE=false -TIER="" -ENABLE_VOICE=false -ENABLE_WORKFLOWS=false -ENABLE_RAG=false -ENABLE_OPENCLAW=false -INTERACTIVE=true -BOOTSTRAP_MODE=true # Default to bootstrap for instant UX -OFFLINE_MODE=false # M1 integration: fully air-gapped operation - -usage() { - cat << EOF -Dream Server Installer v${VERSION} - -Usage: $0 [OPTIONS] - -Options: - --dry-run Show what would be done without making changes - --skip-docker Skip Docker installation (assume already installed) - --force Overwrite existing installation - --tier N Force specific tier (1-4) instead of auto-detect - --voice Enable voice services (Whisper + Piper) - --workflows Enable n8n workflow automation - --rag Enable RAG with Qdrant vector database - --openclaw Enable OpenClaw AI agent framework - --all Enable all optional services - --non-interactive Run without prompts (use defaults or flags) - --no-bootstrap Skip bootstrap mode (wait for full model) - --bootstrap Use bootstrap mode (default: 
instant start with 1.5B, upgrade later) - --offline M1 mode: Configure for fully offline/air-gapped operation - -h, --help Show this help - -Tiers: - 1 - Entry Level (8GB+ VRAM, 7B models) - 2 - Prosumer (12GB+ VRAM, 14B-32B AWQ models) - 3 - Pro (24GB+ VRAM, 32B models) - 4 - Enterprise (48GB+ VRAM or dual GPU, 72B models) - -Examples: - $0 # Interactive setup - $0 --tier 2 --voice # Tier 2 with voice - $0 --all --non-interactive # Full stack, no prompts - $0 --offline --all # Fully offline (M1 mode) with all services - $0 --dry-run # Preview installation - -EOF - exit 0 -} - -while [[ $# -gt 0 ]]; do - case $1 in - --dry-run) DRY_RUN=true; shift ;; - --skip-docker) SKIP_DOCKER=true; shift ;; - --force) FORCE=true; shift ;; - --tier) TIER="$2"; shift 2 ;; - --voice) ENABLE_VOICE=true; shift ;; - --workflows) ENABLE_WORKFLOWS=true; shift ;; - --rag) ENABLE_RAG=true; shift ;; - --openclaw) ENABLE_OPENCLAW=true; shift ;; - --all) ENABLE_VOICE=true; ENABLE_WORKFLOWS=true; ENABLE_RAG=true; ENABLE_OPENCLAW=true; shift ;; - --non-interactive) INTERACTIVE=false; shift ;; - --bootstrap) BOOTSTRAP_MODE=true; shift ;; - --no-bootstrap) BOOTSTRAP_MODE=false; shift ;; - --offline) OFFLINE_MODE=true; shift ;; - -h|--help) usage ;; - *) error "Unknown option: $1" ;; - esac -done - -#============================================================================= -# Splash -#============================================================================= -show_stranger_boot -sleep 5 - -$DRY_RUN && echo -e "${YELLOW}>>> DRY RUN MODE โ€” I will simulate everything. No changes made. <<<${NC}\n" - -#============================================================================= -# Pre-flight Checks -#============================================================================= -show_phase 1 6 "Pre-flight Checks" "~30 seconds" -ai "I'm scanning your system for required components..." - -# Root check -if [[ $EUID -eq 0 ]]; then - error "Do not run as root. 
Run as regular user with sudo access." -fi - -# OS check -if [[ ! -f /etc/os-release ]]; then - error "Unsupported OS. This installer requires Linux." -fi - -source /etc/os-release -log "Detected OS: $PRETTY_NAME" - -# Check for required tools -if ! command -v curl &> /dev/null; then - error "curl is required but not installed. Install with: sudo apt install curl" -fi -log "curl: $(curl --version | head -1)" - -# Check optional tools (warn but don't fail) -OPTIONAL_TOOLS_MISSING="" -if ! command -v jq &> /dev/null; then - OPTIONAL_TOOLS_MISSING="$OPTIONAL_TOOLS_MISSING jq" -fi -if ! command -v rsync &> /dev/null; then - OPTIONAL_TOOLS_MISSING="$OPTIONAL_TOOLS_MISSING rsync" -fi -if [[ -n "$OPTIONAL_TOOLS_MISSING" ]]; then - warn "Optional tools missing:$OPTIONAL_TOOLS_MISSING" - echo " These are needed for update/backup scripts. Install with:" - echo " sudo apt install$OPTIONAL_TOOLS_MISSING" -fi - -# Check source files exist -if [[ ! -f "$SCRIPT_DIR/docker-compose.yml" ]]; then - error "docker-compose.yml not found in $SCRIPT_DIR. Please run from the dream-server directory." -fi - -# Check for existing installation -if [[ -d "$INSTALL_DIR" && "$FORCE" != "true" ]]; then - if $INTERACTIVE && ! $DRY_RUN; then - warn "Existing installation found at $INSTALL_DIR" - read -p " Overwrite and start fresh? [y/N] " -r - if [[ $REPLY =~ ^[Yy]$ ]]; then - log "User chose to overwrite existing installation" - FORCE=true - else - log "User chose not to overwrite. Exiting." - exit 0 - fi - else - error "Installation already exists at $INSTALL_DIR. Use --force to overwrite." - fi -fi - -ai_ok "Pre-flight checks passed." -signal "No cloud dependencies required for core operation." - -#============================================================================= -# System Detection -#============================================================================= -chapter "SYSTEM DETECTION" -ai "Reading hardware telemetry..." 
- -# RAM Detection -RAM_KB=$(grep MemTotal /proc/meminfo | awk '{print $2}') -RAM_GB=$((RAM_KB / 1024 / 1024)) -log "RAM: ${RAM_GB}GB" - -# Disk Detection -DISK_AVAIL=$(df -BG "$HOME" | tail -1 | awk '{print $4}' | tr -d 'G') -log "Available disk: ${DISK_AVAIL}GB" - -# GPU Detection -detect_gpu() { - if command -v nvidia-smi &> /dev/null; then - # nvidia-smi --query-gpu prints errors to stdout when driver is broken, - # so we must check the exit code before trusting the output. - local raw - if raw=$(nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null) && [[ -n "$raw" ]]; then - GPU_INFO="$raw" - GPU_NAME=$(echo "$GPU_INFO" | head -1 | cut -d',' -f1 | xargs) - GPU_VRAM=$(echo "$GPU_INFO" | head -1 | cut -d',' -f2 | grep -oP '\d+' | head -1) - GPU_COUNT=$(echo "$GPU_INFO" | wc -l) - log "GPU: $GPU_NAME (${GPU_VRAM}MB VRAM) x${GPU_COUNT}" - return 0 - fi - fi - GPU_NAME="None" - GPU_VRAM=0 - GPU_COUNT=0 - warn "No NVIDIA GPU detected. CPU-only mode available but slow." - return 1 -} - -detect_gpu || true - -#----------------------------------------------------------------------------- -# Secure Boot + NVIDIA auto-fix -# If GPU hardware exists (lspci) but nvidia-smi fails, the most common cause -# on Ubuntu is Secure Boot blocking the unsigned DKMS kernel module. -# This block automatically: installs the driver if missing, ensures the -# kernel modules are signed, enrolls the MOK key, sets up auto-resume, -# and reboots. After reboot the installer picks up where it left off. -#----------------------------------------------------------------------------- -MIN_DRIVER_VERSION=570 -RESUME_FLAG="/tmp/dream-server-install-resume" +source "$SCRIPT_DIR/installers/dispatch.sh" -fix_nvidia_secure_boot() { - # Step 1: Is there even NVIDIA hardware on this machine? - if ! 
lspci 2>/dev/null | grep -qi 'nvidia'; then - return 1 # No hardware -- nothing to fix - fi +target="$(resolve_installer_target)" - ai "NVIDIA GPU hardware detected but driver not responding." - - # Step 2: Ensure a driver package is installed - local installed_driver - installed_driver=$(dpkg-query -W -f='${Package}\n' 'nvidia-driver-*' 2>/dev/null \ - | grep -oP 'nvidia-driver-\K\d+' | sort -n | tail -1 || true) - - if [[ -z "$installed_driver" ]]; then - ai "No NVIDIA driver package found. Installing recommended driver..." - if command -v ubuntu-drivers &>/dev/null; then - sudo ubuntu-drivers install 2>>"$LOG_FILE" || \ - sudo apt-get install -y "nvidia-driver-${MIN_DRIVER_VERSION}" 2>>"$LOG_FILE" || true - else - sudo apt-get install -y "nvidia-driver-${MIN_DRIVER_VERSION}" 2>>"$LOG_FILE" || true - fi - installed_driver=$(dpkg-query -W -f='${Package}\n' 'nvidia-driver-*' 2>/dev/null \ - | grep -oP 'nvidia-driver-\K\d+' | sort -n | tail -1 || true) - if [[ -z "$installed_driver" ]]; then - ai_bad "Failed to install NVIDIA driver." - return 1 - fi - ai_ok "Installed nvidia-driver-${installed_driver}" - else - ai "Driver nvidia-driver-${installed_driver} is installed." - fi - - # Step 3: Try loading the module -- see why it fails - local modprobe_err - modprobe_err=$(sudo modprobe nvidia 2>&1) || true - - if nvidia-smi &>/dev/null; then - ai_ok "NVIDIA driver loaded successfully" - # Regenerate CDI spec so Docker sees the correct driver libraries - if command -v nvidia-ctk &>/dev/null; then - sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>>"$LOG_FILE" || true - fi - detect_gpu || true - return 0 - fi - - # Step 4: If it's not a Secure Boot issue, bail out - if ! echo "$modprobe_err" | grep -qi "key was rejected"; then - ai_bad "NVIDIA module failed to load: $modprobe_err" - return 1 - fi - - # Step 5: Secure Boot is blocking the module -- ensure it's properly signed - ai_warn "Secure Boot is blocking the NVIDIA kernel module." 
- ai "Preparing module signing..." - - local kver mok_dir sign_file - kver=$(uname -r) - mok_dir="/var/lib/shim-signed/mok" - sudo mkdir -p "$mok_dir" - - # Ensure linux-headers are present (needed for sign-file) - if [[ ! -d "/usr/src/linux-headers-${kver}" ]]; then - ai "Installing kernel headers for ${kver}..." - sudo apt-get install -y "linux-headers-${kver}" 2>>"$LOG_FILE" || true - fi - - # Generate MOK keypair if not already present - if [[ ! -f "$mok_dir/MOK.priv" ]] || [[ ! -f "$mok_dir/MOK.der" ]]; then - sudo openssl req -new -x509 -newkey rsa:2048 \ - -keyout "$mok_dir/MOK.priv" \ - -outform DER -out "$mok_dir/MOK.der" \ - -nodes -days 36500 \ - -subj "/CN=Dream Server Module Signing/" 2>>"$LOG_FILE" - sudo chmod 600 "$mok_dir/MOK.priv" - ai_ok "Generated MOK signing key" - else - ai_ok "Using existing MOK signing key" - fi - - # Locate the sign-file tool - sign_file="" - for candidate in \ - "/usr/src/linux-headers-${kver}/scripts/sign-file" \ - "/usr/lib/linux-kbuild-${kver%.*}/scripts/sign-file"; do - if [[ -x "$candidate" ]]; then - sign_file="$candidate" - break - fi - done - if [[ -z "$sign_file" ]]; then - sign_file=$(find /usr/src /usr/lib -name sign-file -executable 2>/dev/null | head -1) - fi - if [[ -z "$sign_file" ]]; then - ai_bad "Cannot find kernel sign-file tool." 
- ai "Try: sudo apt install linux-headers-${kver}" - return 1 - fi - - # Sign every nvidia DKMS module (handles .ko, .ko.zst, .ko.xz) - local signed_count=0 - for mod_path in /lib/modules/${kver}/updates/dkms/nvidia*.ko*; do - [[ -f "$mod_path" ]] || continue - case "$mod_path" in - *.zst) - sudo zstd -d -f "$mod_path" -o "${mod_path%.zst}" 2>>"$LOG_FILE" - sudo "$sign_file" sha256 "$mok_dir/MOK.priv" "$mok_dir/MOK.der" "${mod_path%.zst}" 2>>"$LOG_FILE" - sudo zstd -f --rm "${mod_path%.zst}" -o "$mod_path" 2>>"$LOG_FILE" - ;; - *.xz) - sudo xz -d -f -k "$mod_path" 2>>"$LOG_FILE" - sudo "$sign_file" sha256 "$mok_dir/MOK.priv" "$mok_dir/MOK.der" "${mod_path%.xz}" 2>>"$LOG_FILE" - sudo xz -f "${mod_path%.xz}" 2>>"$LOG_FILE" - sudo mv "${mod_path%.xz}.xz" "$mod_path" 2>>"$LOG_FILE" +case "$target" in + unsupported:unknown) + echo "[ERROR] Unsupported OS for this installer entrypoint." + echo " See docs/SUPPORT-MATRIX.md for supported platforms." + exit 1 + ;; + *) + if [[ ! -f "$target" ]]; then + echo "[ERROR] Installer target not found: $target" + exit 1 + fi + case "$target" in + *.ps1) + echo "[INFO] Windows installer target: $target" + if command -v pwsh >/dev/null 2>&1; then + exec pwsh -File "$target" "$@" + else + echo "[ERROR] PowerShell (pwsh) not found in this shell." 
+ echo " Run this from Windows PowerShell instead:" + echo " .\\installers\\windows.ps1" + exit 1 + fi ;; *) - sudo "$sign_file" sha256 "$mok_dir/MOK.priv" "$mok_dir/MOK.der" "$mod_path" 2>>"$LOG_FILE" + exec bash "$target" "$@" ;; esac - signed_count=$((signed_count + 1)) - done - sudo depmod -a 2>>"$LOG_FILE" - ai_ok "Signed $signed_count NVIDIA module(s)" - - # Step 6: Try loading -- if MOK key is already enrolled, this works immediately - if sudo modprobe nvidia 2>>"$LOG_FILE" && nvidia-smi &>/dev/null; then - ai_ok "NVIDIA driver loaded -- GPU is online" - # Regenerate CDI spec so Docker sees the correct driver libraries - if command -v nvidia-ctk &>/dev/null; then - sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>>"$LOG_FILE" || true - fi - detect_gpu || true - return 0 - fi - - # Step 7: MOK key needs firmware enrollment -- one reboot required - # This is the standard Ubuntu Secure Boot flow (same thing Ubuntu's - # "Additional Drivers" tool does). It only happens once per machine. 
- - local mok_pass - mok_pass=$(openssl rand -hex 4) - printf '%s\n%s\n' "$mok_pass" "$mok_pass" | sudo mokutil --import "$mok_dir/MOK.der" 2>>"$LOG_FILE" - - # --- Auto-resume: create a systemd oneshot so the install continues - # automatically after reboot (user doesn't have to re-run manually) - local svc_name="dream-server-install-resume" - local resume_args="--force --non-interactive" - $ENABLE_VOICE && resume_args="$resume_args --voice" - $ENABLE_WORKFLOWS && resume_args="$resume_args --workflows" - $ENABLE_RAG && resume_args="$resume_args --rag" - $ENABLE_OPENCLAW && resume_args="$resume_args --openclaw" - [[ "$BOOTSTRAP_MODE" == "true" ]] && resume_args="$resume_args --bootstrap" - [[ -n "$TIER" ]] && resume_args="$resume_args --tier $TIER" - [[ "$OFFLINE_MODE" == "true" ]] && resume_args="$resume_args --offline" - - sudo tee /etc/systemd/system/${svc_name}.service > /dev/null << SVCEOF -[Unit] -Description=Dream Server Install (auto-resume after Secure Boot enrollment) -After=network-online.target docker.service -Wants=network-online.target - -[Service] -Type=oneshot -User=$USER -ExecStart=/bin/bash ${SCRIPT_DIR}/install.sh ${resume_args} -ExecStartPost=/bin/rm -f /etc/systemd/system/${svc_name}.service -ExecStartPost=/bin/systemctl daemon-reload -StandardOutput=journal+console -StandardError=journal+console - -[Install] -WantedBy=multi-user.target -SVCEOF - sudo systemctl daemon-reload - sudo systemctl enable "${svc_name}.service" 2>>"$LOG_FILE" - log "Auto-resume service installed: ${svc_name}.service" - - # --- Show a clean, friendly reboot screen --- - echo "" - echo "" - echo -e "${CYAN}╔══════════════════════════════════════════════════════════════╗${NC}" - echo -e "${CYAN}║${NC} ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${YELLOW}One-time reboot needed${NC} ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${CYAN}║${NC}" - 
echo -e "${CYAN}║${NC} Your GPU requires a Secure Boot key enrollment. ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} This is normal and only happens once. ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${CYAN}║${NC}" - echo -e "${CYAN}╠══════════════════════════════════════════════════════════════╣${NC}" - echo -e "${CYAN}║${NC} ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} After reboot a ${YELLOW}blue screen${NC} will appear: ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${GREEN}1.${NC} Select \"Enroll MOK\" ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${GREEN}2.${NC} Select \"Continue\" ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${GREEN}3.${NC} Type password: ${GREEN}${mok_pass}${NC} ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${GREEN}4.${NC} Select \"Reboot\" ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} Installation will ${GREEN}continue automatically${NC} after reboot. ${CYAN}║${NC}" - echo -e "${CYAN}║${NC} ${CYAN}║${NC}" - echo -e "${CYAN}╚══════════════════════════════════════════════════════════════╝${NC}" - echo "" - - if $INTERACTIVE; then - read -p " Press Enter to reboot (or Ctrl+C to do it later)... " -r - sudo reboot - fi - - # Non-interactive mode: exit cleanly (not an error -- reboot is a normal install phase) - ai "Reboot this machine to continue installation." - exit 0 -} - -# If detect_gpu found no working GPU, check if it's a fixable driver/Secure Boot issue -if [[ $GPU_COUNT -eq 0 ]] && ! 
$DRY_RUN; then - fix_nvidia_secure_boot || true -fi - -# NVIDIA Driver Compatibility Check -# vllm/vllm-openai:v0.15.1 ships CUDA 12.9 -- requires driver >= 570 -if [[ $GPU_COUNT -gt 0 ]]; then - DRIVER_VERSION="" - if raw_driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null); then - DRIVER_VERSION=$(echo "$raw_driver" | head -1 | cut -d. -f1) - fi - if [[ -n "$DRIVER_VERSION" && "$DRIVER_VERSION" =~ ^[0-9]+$ ]]; then - log "NVIDIA driver: $DRIVER_VERSION" - if [[ "$DRIVER_VERSION" -lt "$MIN_DRIVER_VERSION" ]]; then - ai_bad "NVIDIA driver $DRIVER_VERSION is too old. vLLM requires driver >= $MIN_DRIVER_VERSION." - ai "Attempting to install a compatible driver..." - if ! $DRY_RUN; then - if command -v ubuntu-drivers &> /dev/null; then - sudo ubuntu-drivers install nvidia:${MIN_DRIVER_VERSION}-server 2>>"$LOG_FILE" || \ - sudo apt-get install -y nvidia-driver-${MIN_DRIVER_VERSION} 2>>"$LOG_FILE" || true - else - sudo apt-get install -y nvidia-driver-${MIN_DRIVER_VERSION} 2>>"$LOG_FILE" || true - fi - # Check if upgrade succeeded - if dpkg -l "nvidia-driver-${MIN_DRIVER_VERSION}"* 2>/dev/null | grep -q "^ii"; then - ai_ok "NVIDIA driver ${MIN_DRIVER_VERSION} installed." - ai_warn "A REBOOT is required before continuing." - ai "After rebooting, re-run this installer. It will pick up where it left off." - echo "" - if $INTERACTIVE; then - read -p " Reboot now? [Y/n] " -r - if [[ ! $REPLY =~ ^[Nn]$ ]]; then - sudo reboot - fi - fi - error "Reboot required to load NVIDIA driver ${MIN_DRIVER_VERSION}. Re-run install.sh after rebooting." - else - ai_bad "Driver install failed. Please install NVIDIA driver >= ${MIN_DRIVER_VERSION} manually." - ai " Try: sudo apt install nvidia-driver-${MIN_DRIVER_VERSION}" - error "Compatible NVIDIA driver required." 
- fi - else - log "[DRY RUN] Would install nvidia-driver-${MIN_DRIVER_VERSION}" - fi - else - ai_ok "NVIDIA driver $DRIVER_VERSION (>= $MIN_DRIVER_VERSION required)" - fi - else - ai_warn "Could not determine driver version -- continuing anyway" - fi -fi - -# Auto-detect tier if not specified -if [[ -z "$TIER" ]]; then - if [[ $GPU_COUNT -ge 2 ]] || [[ $GPU_VRAM -ge 40000 ]]; then - TIER=4 - elif [[ $GPU_VRAM -ge 20000 ]] || [[ $RAM_GB -ge 96 ]]; then - TIER=3 - elif [[ $GPU_VRAM -ge 12000 ]] || [[ $RAM_GB -ge 48 ]]; then - TIER=2 - else - TIER=1 - fi - log "Auto-detected tier: $TIER" -else - log "Using specified tier: $TIER" -fi - -# Tier-specific configurations -case $TIER in - 1) - TIER_NAME="Entry Level" - LLM_MODEL="Qwen/Qwen2.5-7B-Instruct" - MAX_CONTEXT=16384 - GPU_UTIL=0.85 - QUANTIZATION="" - ;; - 2) - TIER_NAME="Prosumer" - LLM_MODEL="Qwen/Qwen2.5-14B-Instruct-AWQ" - MAX_CONTEXT=16384 - GPU_UTIL=0.90 - QUANTIZATION="awq" - ;; - 3) - TIER_NAME="Pro" - LLM_MODEL="Qwen/Qwen2.5-32B-Instruct-AWQ" - MAX_CONTEXT=32768 - GPU_UTIL=0.90 - QUANTIZATION="awq" - ;; - 4) - TIER_NAME="Enterprise" - LLM_MODEL="Qwen/Qwen2.5-72B-Instruct-AWQ" - MAX_CONTEXT=32768 - GPU_UTIL=0.92 - QUANTIZATION="awq" - ;; - *) - error "Invalid tier: $TIER. Must be 1-4." 
;; esac - -# Display hardware summary with nice formatting -CPU_INFO=$(grep "model name" /proc/cpuinfo 2>/dev/null | head -1 | cut -d: -f2 | xargs || echo "Unknown") -if [[ "$INTERACTIVE" == "true" ]]; then - show_hardware_summary "$GPU_NAME" "$((GPU_VRAM / 1024))" "$CPU_INFO" "$RAM_GB" "$DISK_AVAIL" - - # Estimate tokens/sec and concurrent users based on tier - case $TIER in - 1) SPEED_EST=25; USERS_EST="1-2" ;; - 2) SPEED_EST=45; USERS_EST="3-5" ;; - 3) SPEED_EST=55; USERS_EST="5-8" ;; - 4) SPEED_EST=40; USERS_EST="10-15" ;; - esac - show_tier_recommendation "$TIER" "$LLM_MODEL" "$SPEED_EST" "$USERS_EST" -else - success "Configuration: Tier $TIER ($TIER_NAME)" - log " Model: $LLM_MODEL" - log " Context: ${MAX_CONTEXT} tokens" -fi - -# Warn about gated models requiring HF_TOKEN -if [[ "$LLM_MODEL" == *"meta-llama"* ]] || [[ "$LLM_MODEL" == *"Llama-2"* ]] || [[ "$LLM_MODEL" == *"Llama-3"* ]]; then - if [[ -z "${HF_TOKEN:-}" ]]; then - warn "Model $LLM_MODEL may be gated. Set HF_TOKEN environment variable if download fails." - warn "Get your token at: https://huggingface.co/settings/tokens" - fi -fi - -#============================================================================= -# Interactive Feature Selection -#============================================================================= -if $INTERACTIVE && ! $DRY_RUN; then - show_phase 2 6 "Feature Selection" "~1 minute" - show_install_menu - - # Only show individual feature prompts for Custom installs - if [[ "${INSTALL_CHOICE:-1}" == "3" ]]; then - read -p " Enable voice (Whisper STT + Kokoro TTS)? [Y/n] " -r - echo - [[ $REPLY =~ ^[Nn]$ ]] || ENABLE_VOICE=true - - read -p " Enable n8n workflow automation? [Y/n] " -r - echo - [[ $REPLY =~ ^[Nn]$ ]] || ENABLE_WORKFLOWS=true - - read -p " Enable Qdrant vector database (for RAG)? [Y/n] " -r - echo - [[ $REPLY =~ ^[Nn]$ ]] || ENABLE_RAG=true - - read -p " Enable OpenClaw AI agent framework? 
[y/N] " -r - echo - [[ $REPLY =~ ^[Yy]$ ]] && ENABLE_OPENCLAW=true - fi -fi - -# Build profiles string -PROFILES="" -[[ "$ENABLE_VOICE" == "true" ]] && PROFILES="$PROFILES --profile voice" -[[ "$ENABLE_WORKFLOWS" == "true" ]] && PROFILES="$PROFILES --profile workflows" -[[ "$ENABLE_RAG" == "true" ]] && PROFILES="$PROFILES --profile rag" -[[ "$ENABLE_OPENCLAW" == "true" ]] && PROFILES="$PROFILES --profile openclaw" - -# Select tier-appropriate OpenClaw config -if [[ "$ENABLE_OPENCLAW" == "true" ]]; then - case $TIER in - 1) OPENCLAW_CONFIG="minimal.json" ;; - 2) OPENCLAW_CONFIG="entry.json" ;; - 3) OPENCLAW_CONFIG="prosumer.json" ;; - 4) OPENCLAW_CONFIG="pro.json" ;; - *) OPENCLAW_CONFIG="prosumer.json" ;; - esac - log "OpenClaw config: $OPENCLAW_CONFIG (matched to Tier $TIER)" -fi - -log "Enabled profiles:${PROFILES:- (core only)}" - -#============================================================================= -# Requirements Check -#============================================================================= -chapter "REQUIREMENTS CHECK" - -REQUIREMENTS_MET=true - -# Minimum RAM -MIN_RAM=$((TIER * 16)) -if [[ $RAM_GB -lt $MIN_RAM ]]; then - warn "RAM: ${RAM_GB}GB available, ${MIN_RAM}GB recommended for Tier $TIER" -else - ai_ok "RAM: ${RAM_GB}GB (recommended: ${MIN_RAM}GB+)" -fi - -# Minimum disk (tier-aware) -case $TIER in - 1) MIN_DISK=30 ;; # Nano: 1.5B model ~5GB - 2) MIN_DISK=50 ;; # Edge: 7B model ~15GB - 3) MIN_DISK=80 ;; # Pro: 32B model ~50GB - 4) MIN_DISK=150 ;; # Cluster: 72B model ~100GB - *) MIN_DISK=50 ;; -esac - -if [[ $DISK_AVAIL -lt $MIN_DISK ]]; then - warn "Disk: ${DISK_AVAIL}GB available, ${MIN_DISK}GB minimum required for Tier $TIER" - REQUIREMENTS_MET=false -else - ai_ok "Disk: ${DISK_AVAIL}GB available (minimum: ${MIN_DISK}GB for Tier $TIER)" -fi - -# GPU for tiers 2+ -if [[ $TIER -ge 2 && $GPU_VRAM -lt 10000 ]]; then - warn "GPU: Tier $TIER requires dedicated NVIDIA GPU with 12GB+ VRAM" -else - ai_ok "GPU: Detected $GPU_NAME" -fi - -# 
Port availability check (handles IPv4 and IPv6) -check_port() { - local port=$1 - if command -v ss &> /dev/null; then - ss -tln 2>/dev/null | grep -qE ":${port}(\s|$)" && return 1 - elif command -v netstat &> /dev/null; then - netstat -tln 2>/dev/null | grep -qE ":${port}(\s|$)" && return 1 - fi - return 0 -} - -PORTS_TO_CHECK="8000 3000" -[[ "$ENABLE_VOICE" == "true" ]] && PORTS_TO_CHECK="$PORTS_TO_CHECK 9000 8880" -[[ "$ENABLE_WORKFLOWS" == "true" ]] && PORTS_TO_CHECK="$PORTS_TO_CHECK 5678" -[[ "$ENABLE_RAG" == "true" ]] && PORTS_TO_CHECK="$PORTS_TO_CHECK 6333" - -for port in $PORTS_TO_CHECK; do - if ! check_port $port; then - warn "Port $port is already in use" - REQUIREMENTS_MET=false - fi -done - -if [[ "$REQUIREMENTS_MET" != "true" ]]; then - warn "Some requirements not met. Installation may have limited functionality." - if $INTERACTIVE && ! $DRY_RUN; then - read -p " Continue anyway? [y/N] " -r - [[ ! $REPLY =~ ^[Yy]$ ]] && exit 1 - elif $DRY_RUN; then - log "[DRY RUN] Would prompt to continue despite unmet requirements" - fi -fi - -#============================================================================= -# Docker Installation -#============================================================================= -show_phase 3 6 "Docker Setup" "~2 minutes" -ai "Preparing container runtime..." - -if [[ "$SKIP_DOCKER" == "true" ]]; then - log "Skipping Docker installation (--skip-docker)" -elif command -v docker &> /dev/null; then - ai_ok "Docker already installed: $(docker --version)" -else - ai "Installing Docker..." - - if $DRY_RUN; then - log "[DRY RUN] Would install Docker via official script" - else - if ! curl -fsSL https://get.docker.com | sh; then - error "Docker installation failed. Check network connectivity and try again." - fi - sudo usermod -aG docker $USER - - # Check if we need to use newgrp or restart - if ! groups | grep -q docker; then - warn "Docker installed! Group membership requires re-login." 
- warn "Option 1: Log out and back in, then re-run this script with --skip-docker" - warn "Option 2: Run 'newgrp docker' in a new terminal, then re-run" - echo "" - read -p " Try to continue with 'sudo docker' for now? [Y/n] " -r - if [[ ! $REPLY =~ ^[Nn]$ ]]; then - # Use sudo for remaining docker commands in this session - DOCKER_CMD="sudo docker" - DOCKER_COMPOSE_CMD="sudo docker compose" - else - log "Please re-run after logging out and back in." - exit 0 - fi - fi - fi -fi - -# Set docker command (use sudo if needed) -DOCKER_CMD="${DOCKER_CMD:-docker}" -DOCKER_COMPOSE_CMD="${DOCKER_COMPOSE_CMD:-docker compose}" - -# Docker Compose check (v2 preferred, v1 fallback) -if $DOCKER_COMPOSE_CMD version &> /dev/null 2>&1; then - ai_ok "Docker Compose v2 available" -elif command -v docker-compose &> /dev/null; then - DOCKER_COMPOSE_CMD="${DOCKER_CMD%-*}-compose" - [[ "$DOCKER_CMD" == "sudo docker" ]] && DOCKER_COMPOSE_CMD="sudo docker-compose" - ai_ok "Docker Compose v1 available (using docker-compose)" -else - if ! $DRY_RUN; then - ai "Installing Docker Compose plugin..." - sudo apt-get update && sudo apt-get install -y docker-compose-plugin - fi -fi - -# NVIDIA Container Toolkit -if [[ $GPU_COUNT -gt 0 ]]; then - if command -v nvidia-container-cli &> /dev/null 2>&1; then - ai_ok "NVIDIA Container Toolkit installed" - # Always regenerate CDI spec -- driver version may have changed since last run - if command -v nvidia-ctk &>/dev/null && ! $DRY_RUN; then - sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>>"$LOG_FILE" || true - fi - else - ai "Installing NVIDIA Container Toolkit..." - if ! 
$DRY_RUN; then - # Add NVIDIA GPG key - curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg 2>/dev/null || true - # Use NVIDIA's current generic deb repo (per-distro URLs were deprecated) - curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \ - sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \ - sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list > /dev/null - # Verify we got a valid repo file, not an HTML 404 - if grep -q '/dev/null; then - warn "Failed to download NVIDIA Container Toolkit repo list. Trying fallback..." - echo "deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/\$(ARCH) /" | \ - sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list > /dev/null - fi - sudo apt-get update - if ! sudo apt-get install -y nvidia-container-toolkit; then - error "Failed to install NVIDIA Container Toolkit. Check network connectivity and GPU drivers." - fi - sudo nvidia-ctk runtime configure --runtime=docker - sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>>"$LOG_FILE" || true - sudo systemctl restart docker - fi - if command -v nvidia-container-cli &> /dev/null 2>&1; then - ai_ok "NVIDIA Container Toolkit installed" - else - $DRY_RUN && ai_ok "[DRY RUN] Would install NVIDIA Container Toolkit" || error "NVIDIA Container Toolkit installation failed -- nvidia-container-cli not found after install." 
- fi - fi -fi - -#============================================================================= -# Directory Structure & Files -#============================================================================= -chapter "SETTING UP INSTALLATION" - -if $DRY_RUN; then - log "[DRY RUN] Would create: $INSTALL_DIR" - log "[DRY RUN] Would copy docker-compose.yml and generate .env" -else - # Create directories - mkdir -p "$INSTALL_DIR"/{config,data,models} - mkdir -p "$INSTALL_DIR"/data/{vllm,open-webui,whisper,tts,n8n,qdrant} - mkdir -p "$INSTALL_DIR"/config/{n8n,litellm,openclaw} - - # Copy docker-compose.yml from source - cp "$SCRIPT_DIR/docker-compose.yml" "$INSTALL_DIR/" - - # Copy config files if they exist - [[ -d "$SCRIPT_DIR/config" ]] && cp -r "$SCRIPT_DIR/config"/* "$INSTALL_DIR/config/" 2>/dev/null || true - [[ -d "$SCRIPT_DIR/workflows" ]] && cp -r "$SCRIPT_DIR/workflows" "$INSTALL_DIR/config/n8n/" 2>/dev/null || true - - # Copy build contexts needed by docker compose - for build_dir in agents dashboard dashboard-api privacy-shield vllm-tool-proxy; do - [[ -d "$SCRIPT_DIR/$build_dir" ]] && cp -r "$SCRIPT_DIR/$build_dir" "$INSTALL_DIR/$build_dir" 2>/dev/null || true - done - - # Select tier-appropriate OpenClaw config - if [[ "$ENABLE_OPENCLAW" == "true" && -n "$OPENCLAW_CONFIG" ]]; then - # In bootstrap mode, OpenClaw should use the 1.5B model that vLLM actually serves at startup. - # The full tier model downloads in the background and can be switched later. 
- if [[ "$BOOTSTRAP_MODE" == "true" ]]; then - OPENCLAW_MODEL="Qwen/Qwen2.5-1.5B-Instruct" - OPENCLAW_CONTEXT=32768 - else - OPENCLAW_MODEL="$LLM_MODEL" - OPENCLAW_CONTEXT="$MAX_CONTEXT" - fi - - if [[ -f "$SCRIPT_DIR/config/openclaw/$OPENCLAW_CONFIG" ]]; then - cp "$SCRIPT_DIR/config/openclaw/$OPENCLAW_CONFIG" "$INSTALL_DIR/config/openclaw/openclaw.json" - # Dynamically set model to match what vLLM is actually serving - sed -i "s|Qwen/Qwen2.5-[^\"]*|${OPENCLAW_MODEL}|g" "$INSTALL_DIR/config/openclaw/openclaw.json" - log "Installed OpenClaw config: $OPENCLAW_CONFIG -> openclaw.json (model: $OPENCLAW_MODEL)" - else - warn "OpenClaw config $OPENCLAW_CONFIG not found, using default" - cp "$SCRIPT_DIR/config/openclaw/openclaw.json.example" "$INSTALL_DIR/config/openclaw/openclaw.json" 2>/dev/null || true - fi - mkdir -p "$INSTALL_DIR/data/openclaw/home" - # Generate OpenClaw home config with local vLLM provider - OPENCLAW_TOKEN=$(openssl rand -hex 24 2>/dev/null || head -c 24 /dev/urandom | xxd -p) - cat > "$INSTALL_DIR/data/openclaw/home/openclaw.json" << OCLAW_EOF -{ - "models": { - "providers": { - "local-vllm": { - "baseUrl": "http://vllm-tool-proxy:8003/v1", - "apiKey": "none", - "api": "openai-completions", - "models": [ - { - "id": "${OPENCLAW_MODEL}", - "name": "Dream Server LLM (Local)", - "reasoning": false, - "input": ["text"], - "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0}, - "contextWindow": ${OPENCLAW_CONTEXT}, - "maxTokens": 8192, - "compat": { - "supportsStore": false, - "supportsDeveloperRole": false, - "supportsReasoningEffort": false, - "maxTokensField": "max_tokens" - } - } - ] - } - } - }, - "agents": { - "defaults": { - "model": {"primary": "local-vllm/${OPENCLAW_MODEL}"}, - "models": {"local-vllm/${OPENCLAW_MODEL}": {}}, - "compaction": {"mode": "safeguard"}, - "subagents": {"maxConcurrent": 20, "model": "local-vllm/${OPENCLAW_MODEL}"} - } - }, - "commands": {"native": "auto", "nativeSkills": "auto"}, - "gateway": { - "mode": 
"local", - "bind": "lan", - "controlUi": {"allowInsecureAuth": true}, - "auth": {"mode": "token", "token": "${OPENCLAW_TOKEN}"} - } -} -OCLAW_EOF - # Generate agent auth-profiles.json for vLLM provider - mkdir -p "$INSTALL_DIR/data/openclaw/home/agents/main/agent" - cat > "$INSTALL_DIR/data/openclaw/home/agents/main/agent/auth-profiles.json" << AUTH_EOF -{ - "version": 1, - "profiles": { - "local-vllm:default": { - "type": "api_key", - "provider": "local-vllm", - "key": "none" - } - }, - "lastGood": {"local-vllm": "local-vllm:default"}, - "usageStats": {} -} -AUTH_EOF - cat > "$INSTALL_DIR/data/openclaw/home/agents/main/agent/models.json" << MODELS_EOF -{ - "providers": { - "local-vllm": { - "baseUrl": "http://vllm-tool-proxy:8003/v1", - "apiKey": "none", - "api": "openai-completions", - "models": [ - { - "id": "${OPENCLAW_MODEL}", - "name": "Dream Server LLM (Local)", - "reasoning": false, - "input": ["text"], - "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0}, - "contextWindow": ${OPENCLAW_CONTEXT}, - "maxTokens": 8192, - "compat": { - "supportsStore": false, - "supportsDeveloperRole": false, - "supportsReasoningEffort": false, - "maxTokensField": "max_tokens" - } - } - ] - } - } -} -MODELS_EOF - log "Generated OpenClaw home config (model: $OPENCLAW_MODEL, gateway token set)" - # Create workspace directory (must exist before Docker Compose, - # otherwise Docker auto-creates it as root and the container can't write to it) - mkdir -p "$INSTALL_DIR/config/openclaw/workspace" - # Copy workspace personality files (SOUL.md etc.) 
if the repo ships any - if [[ -d "$SCRIPT_DIR/config/openclaw/workspace" ]]; then - cp -r "$SCRIPT_DIR/config/openclaw/workspace"/* "$INSTALL_DIR/config/openclaw/workspace/" 2>/dev/null || true - log "Installed OpenClaw workspace files (agent personality)" - fi - fi - - # Create hermes tool template for vLLM - mkdir -p "$INSTALL_DIR/data/vllm" - cat > "$INSTALL_DIR/data/vllm/hermes_tool_template.jinja" << 'TEMPLATE_EOF' -{%- for message in messages %} -{%- if message.role == 'system' %} -<|im_start|>system -{{ message.content }}<|im_end|> -{%- elif message.role == 'user' %} -<|im_start|>user -{{ message.content }}<|im_end|> -{%- elif message.role == 'assistant' %} -<|im_start|>assistant -{%- if message.tool_calls %} -{%- for tool_call in message.tool_calls %} - -{"name": "{{ tool_call.function.name }}", "arguments": {{ tool_call.function.arguments }}} - -{%- endfor %} -{%- else %} -{{ message.content }} -{%- endif %} -<|im_end|> -{%- elif message.role == 'tool' %} -<|im_start|>tool -{{ message.content }}<|im_end|> -{%- endif %} -{%- endfor %} -{%- if add_generation_prompt %} -<|im_start|>assistant -{%- endif %} -TEMPLATE_EOF - - # Generate secure secrets - WEBUI_SECRET=$(openssl rand -hex 32 2>/dev/null || head -c 32 /dev/urandom | xxd -p) - N8N_PASS=$(openssl rand -base64 16 2>/dev/null || head -c 16 /dev/urandom | base64) - LITELLM_KEY="sk-dream-$(openssl rand -hex 16 2>/dev/null || head -c 16 /dev/urandom | xxd -p)" - LIVEKIT_SECRET=$(openssl rand -base64 32 2>/dev/null || head -c 32 /dev/urandom | base64) - TOKEN_SPY_DB_PASSWORD=$(openssl rand -base64 32 2>/dev/null || head -c 32 /dev/urandom | base64) - DASHBOARD_API_KEY=$(openssl rand -hex 32 2>/dev/null || head -c 32 /dev/urandom | xxd -p) - - # Generate .env file - cat > "$INSTALL_DIR/.env" << ENV_EOF -# Dream Server Configuration -# Generated by installer v${VERSION} on $(date -Iseconds) -# Tier: ${TIER} (${TIER_NAME}) - -#=== LLM Settings === -LLM_MODEL=${LLM_MODEL} -MAX_CONTEXT=${MAX_CONTEXT} 
-GPU_UTIL=${GPU_UTIL} -GPU_DEVICES=all -GPU_COUNT=${GPU_COUNT:-1} -HF_TOKEN= - -#=== Ports === -VLLM_PORT=8000 -WEBUI_PORT=3000 -WHISPER_PORT=9000 -TTS_PORT=8880 -N8N_PORT=5678 -QDRANT_PORT=6333 -QDRANT_GRPC_PORT=6334 -LITELLM_PORT=4000 -OPENCLAW_PORT=7860 - -#=== Security (auto-generated, keep secret!) === -WEBUI_SECRET=${WEBUI_SECRET} -DASHBOARD_API_KEY=${DASHBOARD_API_KEY} -N8N_USER=admin -N8N_PASS=${N8N_PASS} -LITELLM_KEY=${LITELLM_KEY} -LIVEKIT_API_KEY=$(openssl rand -hex 16 2>/dev/null || head -c 16 /dev/urandom | xxd -p) -LIVEKIT_API_SECRET=${LIVEKIT_SECRET} -TOKEN_SPY_DB_PASSWORD=${TOKEN_SPY_DB_PASSWORD} -TOKEN_MONITOR_DB=postgresql://tokenspy:${TOKEN_SPY_DB_PASSWORD}@token-spy-db:5432/tokenspy -OPENCLAW_TOKEN=${OPENCLAW_TOKEN:-} - -#=== Voice Settings === -WHISPER_MODEL=base -TTS_VOICE=en_US-lessac-medium - -#=== Web UI Settings === -WEBUI_AUTH=true -ENABLE_WEB_SEARCH=true -WEB_SEARCH_ENGINE=duckduckgo - -#=== n8n Settings === -N8N_AUTH=true -N8N_HOST=localhost -N8N_WEBHOOK_URL=http://localhost:5678 -TIMEZONE=${SYSTEM_TZ:-UTC} -ENV_EOF - - chmod 600 "$INSTALL_DIR/.env" # Secure secrets file - ai_ok "Created $INSTALL_DIR" - ai_ok "Generated secure secrets in .env (permissions: 600)" -fi - -#============================================================================= -# Copy Documentation -#============================================================================= -if ! 
$DRY_RUN; then - # Copy docs for reference - [[ -d "$SCRIPT_DIR/docs" ]] && cp -r "$SCRIPT_DIR/docs" "$INSTALL_DIR/" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/README.md" ]] && cp "$SCRIPT_DIR/README.md" "$INSTALL_DIR/" 2>/dev/null || true - - # Copy status script - [[ -f "$SCRIPT_DIR/status.sh" ]] && cp "$SCRIPT_DIR/status.sh" "$INSTALL_DIR/" && chmod +x "$INSTALL_DIR/status.sh" 2>/dev/null || true - - # Copy CLI management tools (A12 fix) - [[ -f "$SCRIPT_DIR/dream-cli" ]] && cp "$SCRIPT_DIR/dream-cli" "$INSTALL_DIR/" && chmod +x "$INSTALL_DIR/dream-cli" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/dream-backup.sh" ]] && cp "$SCRIPT_DIR/dream-backup.sh" "$INSTALL_DIR/" && chmod +x "$INSTALL_DIR/dream-backup.sh" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/dream-restore.sh" ]] && cp "$SCRIPT_DIR/dream-restore.sh" "$INSTALL_DIR/" && chmod +x "$INSTALL_DIR/dream-restore.sh" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/dream-update.sh" ]] && cp "$SCRIPT_DIR/dream-update.sh" "$INSTALL_DIR/" && chmod +x "$INSTALL_DIR/dream-update.sh" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/dream-preflight.sh" ]] && cp "$SCRIPT_DIR/dream-preflight.sh" "$INSTALL_DIR/" && chmod +x "$INSTALL_DIR/dream-preflight.sh" 2>/dev/null || true - - # Copy compose variants (A12 fix) - [[ -f "$SCRIPT_DIR/docker-compose.local.yml" ]] && cp "$SCRIPT_DIR/docker-compose.local.yml" "$INSTALL_DIR/" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/docker-compose.hybrid.yml" ]] && cp "$SCRIPT_DIR/docker-compose.hybrid.yml" "$INSTALL_DIR/" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/docker-compose.cloud.yml" ]] && cp "$SCRIPT_DIR/docker-compose.cloud.yml" "$INSTALL_DIR/" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/docker-compose.offline.yml" ]] && cp "$SCRIPT_DIR/docker-compose.offline.yml" "$INSTALL_DIR/" 2>/dev/null || true - [[ -f "$SCRIPT_DIR/docker-compose.edge.yml" ]] && cp "$SCRIPT_DIR/docker-compose.edge.yml" "$INSTALL_DIR/" 2>/dev/null || true -fi - -#============================================================================= 
-# Developer Tools (Claude Code + Codex CLI)
-#=============================================================================
-if ! $DRY_RUN; then
-    ai "Installing AI developer tools..."
-
-    # Ensure Node.js/npm is available (needed for Claude Code and Codex)
-    if ! command -v npm &> /dev/null; then
-        if command -v apt-get &> /dev/null; then
-            ai "Installing Node.js..."
-            curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash - >> "$LOG_FILE" 2>&1 || true
-            sudo apt-get install -y nodejs >> "$LOG_FILE" 2>&1 || true
-        fi
-    fi
-
-    if command -v npm &> /dev/null; then
-        # Install Claude Code (Anthropic's CLI for Claude)
-        if ! command -v claude &> /dev/null; then
-            sudo npm install -g @anthropic-ai/claude-code >> "$LOG_FILE" 2>&1 && \
-                ai_ok "Claude Code installed (run 'claude' to start)" || \
-                ai_warn "Claude Code install failed - install later with: npm i -g @anthropic-ai/claude-code"
-        else
-            ai_ok "Claude Code already installed"
-        fi
-
-        # Install Codex CLI (OpenAI's terminal agent)
-        if ! command -v codex &> /dev/null; then
-            sudo npm install -g @openai/codex >> "$LOG_FILE" 2>&1 && \
-                ai_ok "Codex CLI installed (run 'codex' to start)" || \
-                ai_warn "Codex CLI install failed - install later with: npm i -g @openai/codex"
-        else
-            ai_ok "Codex CLI already installed"
-        fi
-    else
-        ai_warn "npm not available - skipping Claude Code and Codex CLI install"
-        ai " Install later: npm i -g @anthropic-ai/claude-code @openai/codex"
-    fi
-fi
-
-#=============================================================================
-# Pull Images
-#=============================================================================
-show_phase 4 6 "Downloading Modules" "~5-10 minutes"
-
-# Build image list with cinematic labels
-# Format: "image|friendly_name"
-PULL_LIST=()
-PULL_LIST+=("vllm/vllm-openai:v0.15.1|VLLM CORE - downloading the brain (~12GB)")
-PULL_LIST+=("ghcr.io/open-webui/open-webui:v0.7.2|OPEN WEBUI - interface module")
-[[ "$ENABLE_VOICE" == "true" ]] && PULL_LIST+=("onerahmet/openai-whisper-asr-webservice:v1.4.1|WHISPER - ears online")
-[[ "$ENABLE_VOICE" == "true" ]] && PULL_LIST+=("ghcr.io/remsky/kokoro-fastapi-cpu:v0.2.4|KOKORO - voice module")
-[[ "$ENABLE_WORKFLOWS" == "true" ]] && PULL_LIST+=("n8nio/n8n:2.6.4|N8N - automation engine")
-[[ "$ENABLE_RAG" == "true" ]] && PULL_LIST+=("qdrant/qdrant:v1.16.3|QDRANT - memory vault")
-[[ "$ENABLE_OPENCLAW" == "true" ]] && PULL_LIST+=("ghcr.io/openclaw/openclaw:latest|OPENCLAW - agent framework")
-[[ "$ENABLE_RAG" == "true" ]] && PULL_LIST+=("ghcr.io/huggingface/text-embeddings-inference:cpu-1.9.1|TEI - embedding engine")
-
-if $DRY_RUN; then
-    ai "[DRY RUN] I would download ${#PULL_LIST[@]} modules."
-else
-    echo ""
-    bootline
-    echo -e "${CYAN}DOWNLOAD SEQUENCE${NC}"
-    echo -e "${YELLOW}This is the long scene.${NC} (largest module first)"
-    bootline
-    echo ""
-    signal "Take a break for ten minutes. I've got this."
-    echo ""
-
-    pull_count=0
-    pull_total=${#PULL_LIST[@]}
-    pull_failed=0
-
-    for entry in "${PULL_LIST[@]}"; do
-        img="${entry%%|*}"
-        label="${entry##*|}"
-        pull_count=$((pull_count + 1))
-
-        if ! pull_with_progress "$img" "$label" "$pull_count" "$pull_total"; then
-            ai_warn "Failed to pull $img - will retry on next start"
-            pull_failed=$((pull_failed + 1))
-        fi
-    done
-
-    echo ""
-    if [[ $pull_failed -eq 0 ]]; then
-        ai_ok "All $pull_total modules downloaded"
-    else
-        ai_warn "$pull_failed of $pull_total modules failed - services may not start fully"
-    fi
-fi
-
-#=============================================================================
-# Bootstrap Mode Setup
-#=============================================================================
-if [[ "$BOOTSTRAP_MODE" == "true" ]] && ! $DRY_RUN; then
-    # Copy bootstrap scripts
-    mkdir -p "$INSTALL_DIR/scripts"
-    cp "$SCRIPT_DIR/scripts/model-bootstrap.sh" "$INSTALL_DIR/scripts/" 2>/dev/null || true
-    chmod +x "$INSTALL_DIR/scripts/model-bootstrap.sh" 2>/dev/null || true
-
-    # Copy bootstrap compose override
-    cp "$SCRIPT_DIR/docker-compose.bootstrap.yml" "$INSTALL_DIR/" 2>/dev/null || true
-
-    # Store the target model for later upgrade
-    echo "$LLM_MODEL" > "$INSTALL_DIR/.target-model"
-    echo "${QUANTIZATION:-}" > "$INSTALL_DIR/.target-quantization"
-
-    log "Bootstrap mode enabled: Starting with Qwen2.5-1.5B for instant access"
-    log "Full model ($LLM_MODEL) will download in background"
-fi
-
-#=============================================================================
-# Offline Mode Setup (M1 Integration)
-#=============================================================================
-if [[ "$OFFLINE_MODE" == "true" ]] && ! $DRY_RUN; then
-    chapter "CONFIGURING OFFLINE MODE (M1)"
-
-    # Create offline mode marker
-    touch "$INSTALL_DIR/.offline-mode"
-
-    # Disable any cloud-dependent features in .env
-    sed -i 's/^BRAVE_API_KEY=.*/BRAVE_API_KEY=/' "$INSTALL_DIR/.env" 2>/dev/null || true
-    sed -i 's/^ANTHROPIC_API_KEY=.*/ANTHROPIC_API_KEY=/' "$INSTALL_DIR/.env" 2>/dev/null || true
-    sed -i 's/^OPENAI_API_KEY=.*/OPENAI_API_KEY=/' "$INSTALL_DIR/.env" 2>/dev/null || true
-
-    # Add offline mode config
-    cat >> "$INSTALL_DIR/.env" << 'OFFLINE_EOF'
-
-#=============================================================================
-# M1 Offline Mode Configuration
-#=============================================================================
-OFFLINE_MODE=true
-
-# Disable telemetry and update checks
-DISABLE_TELEMETRY=true
-DISABLE_UPDATE_CHECK=true
-
-# Use local RAG instead of web search
-WEB_SEARCH_ENABLED=false
-LOCAL_RAG_ENABLED=true
-OFFLINE_EOF
-
-    # Create OpenClaw M1 config if OpenClaw is enabled
-    if [[ "$ENABLE_OPENCLAW" == "true" ]]; then
-        mkdir -p "$INSTALL_DIR/config/openclaw"
-        cat > "$INSTALL_DIR/config/openclaw/openclaw-m1.yaml" << 'M1_EOF'
-# OpenClaw M1 Mode Configuration
-# Fully offline operation - no cloud dependencies
-
-memorySearch:
-  enabled: true
-  # Uses bundled GGUF embeddings (auto-downloaded during install)
-  # No external API calls
-
-# Disable web search (not available offline)
-# Use local RAG with Qdrant instead
-webSearch:
-  enabled: false
-
-# Local inference only
-inference:
-  provider: local
-  baseUrl: http://vllm-tool-proxy:8003/v1
-M1_EOF
-        ai_ok "OpenClaw M1 config created"
-    fi
-
-    # Pre-download GGUF embeddings for memory_search
-    ai "Pre-downloading GGUF embeddings for offline memory_search..."
-    mkdir -p "$INSTALL_DIR/models/embeddings"
-
-    # Download embeddinggemma GGUF (small, ~300MB)
-    if command -v curl &> /dev/null; then
-        EMBED_URL="https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q4_K_M.gguf"
-        if ! [[ -f "$INSTALL_DIR/models/embeddings/nomic-embed-text-v1.5.Q4_K_M.gguf" ]]; then
-            curl -L -o "$INSTALL_DIR/models/embeddings/nomic-embed-text-v1.5.Q4_K_M.gguf" "$EMBED_URL" 2>/dev/null || \
-                ai_warn "Could not pre-download embeddings. Memory search will download on first use."
-        else
-            log "Embeddings already downloaded"
-        fi
-    fi
-
-    # Copy offline documentation
-    if [[ -f "$SCRIPT_DIR/docs/M1-OFFLINE-MODE.md" ]]; then
-        cp "$SCRIPT_DIR/docs/M1-OFFLINE-MODE.md" "$INSTALL_DIR/docs/"
-    fi
-
-    ai_ok "Offline mode configured"
-    log "After installation, disconnect from internet for fully air-gapped operation"
-    log "See docs/M1-OFFLINE-MODE.md for offline operation guide"
-fi
-
-#=============================================================================
-# Start Services
-#=============================================================================
-show_phase 5 6 "Starting Services" "~2-3 minutes"
-
-if $DRY_RUN; then
-    if [[ "$BOOTSTRAP_MODE" == "true" ]]; then
-        log "[DRY RUN] Would start with bootstrap model (1.5B), then upgrade"
-    fi
-    log "[DRY RUN] Would start services: $DOCKER_COMPOSE_CMD$PROFILES up -d"
-else
-    cd "$INSTALL_DIR"
-
-    # Create logs directory for background downloads
-    mkdir -p "$INSTALL_DIR/logs"
-
-    if [[ "$BOOTSTRAP_MODE" == "true" ]]; then
-        # Start with bootstrap compose (tiny model)
-        echo ""
-        signal "Waking the stack..."
-        ai "I'm bringing systems online. You can breathe."
-        echo ""
-        if [[ -n "$PROFILES" ]]; then
-            $DOCKER_COMPOSE_CMD -f docker-compose.yml -f docker-compose.bootstrap.yml $PROFILES up --build -d >> "$LOG_FILE" 2>&1 &
-        else
-            $DOCKER_COMPOSE_CMD -f docker-compose.yml -f docker-compose.bootstrap.yml up --build -d >> "$LOG_FILE" 2>&1 &
-        fi
-        compose_pid=$!
-        if ! spin_task $compose_pid "Launching containers (bootstrap mode)..."; then
-            printf "\r ${YELLOW}⚠${NC} %-60s\n" "Some services still starting..."
-            echo ""
-            ai_warn "Some containers need more time (model downloading). Retrying..."
-            # Retry - picks up containers that missed the dependency window
-            if [[ -n "$PROFILES" ]]; then
-                $DOCKER_COMPOSE_CMD -f docker-compose.yml -f docker-compose.bootstrap.yml $PROFILES up --build -d >> "$LOG_FILE" 2>&1 &
-            else
-                $DOCKER_COMPOSE_CMD -f docker-compose.yml -f docker-compose.bootstrap.yml up --build -d >> "$LOG_FILE" 2>&1 &
-            fi
-            compose_pid=$!
-            spin_task $compose_pid "Waiting for remaining services..." || true
-        fi
-        printf "\r ${GREEN}✓${NC} %-60s\n" "All containers launched"
-        echo ""
-        ai_ok "Bootstrap services started (1.5B model for instant access)"
-
-        # Start background download of full model with retry logic
-        log "Starting background download of full model: $LLM_MODEL"
-
-        # Clean up partial download marker on exit (only log if it actually existed)
-        trap "if [[ -d '$INSTALL_DIR/models/.downloading' ]]; then rm -rf '$INSTALL_DIR/models/.downloading' 2>/dev/null; echo 'Download interrupted, cleaned up partial files'; fi" EXIT TERM
-
-        # Note: Variables are interpolated at script write time (no escaping needed)
-        nohup bash -c "
-            sleep 30 # Let bootstrap stabilize first
-            cd '$INSTALL_DIR'
-
-            MAX_RETRIES=${MAX_DOWNLOAD_RETRIES}
-            RETRY_DELAY=${DOWNLOAD_RETRY_DELAY}
-
-            for attempt in \$(seq 1 \$MAX_RETRIES); do
-                echo \"[Attempt \$attempt/\$MAX_RETRIES] Downloading $LLM_MODEL...\"
-
-                # Download using docker (portable)
-                $DOCKER_CMD run --rm \\
-                    -v '$INSTALL_DIR/models:/root/.cache/huggingface' \\
-                    -e HF_TOKEN=\"\${HF_TOKEN:-}\" \\
-                    python:3.11-slim \\
-                    bash -c 'pip install -q huggingface_hub && python -c \"from huggingface_hub import snapshot_download; snapshot_download('\\''$LLM_MODEL'\\'')\"'
-
-                if [ \$? -eq 0 ]; then
-                    echo 'Full model downloaded successfully!'
-                    touch '$INSTALL_DIR/.model-swap-ready'
-                    exit 0
-                else
-                    echo \"Download attempt \$attempt failed.\"
-                    if [ \$attempt -lt \$MAX_RETRIES ]; then
-                        echo \"Retrying in \$RETRY_DELAY seconds...\"
-                        sleep \$RETRY_DELAY
-                    fi
-                fi
-            done
-
-            echo 'ERROR: Model download failed after \$MAX_RETRIES attempts.'
-            echo 'Check your internet connection and try: $DOCKER_COMPOSE_CMD restart'
-        " > "$INSTALL_DIR/logs/model-download.log" 2>&1 &
-
-        log "Background download started. Check progress: tail -f $INSTALL_DIR/logs/model-download.log"
-    else
-        # Normal mode - start with full model (longer wait)
-        echo ""
-        signal "Waking the stack..."
-        ai "I'm bringing systems online. You can breathe."
-        echo ""
-        if [[ -n "$PROFILES" ]]; then
-            $DOCKER_COMPOSE_CMD $PROFILES up --build -d >> "$LOG_FILE" 2>&1 &
-        else
-            $DOCKER_COMPOSE_CMD up --build -d >> "$LOG_FILE" 2>&1 &
-        fi
-        compose_pid=$!
-        if ! spin_task $compose_pid "Launching containers..."; then
-            printf "\r ${YELLOW}⚠${NC} %-60s\n" "Some services still starting..."
-            echo ""
-            ai_warn "Some containers need more time. Retrying..."
-            if [[ -n "$PROFILES" ]]; then
-                $DOCKER_COMPOSE_CMD $PROFILES up --build -d >> "$LOG_FILE" 2>&1 &
-            else
-                $DOCKER_COMPOSE_CMD up --build -d >> "$LOG_FILE" 2>&1 &
-            fi
-            compose_pid=$!
-            spin_task $compose_pid "Waiting for remaining services..." || true
-        fi
-        printf "\r ${GREEN}✓${NC} %-60s\n" "All containers launched"
-        echo ""
-        ai_ok "Services started"
-    fi
-fi
-
-#=============================================================================
-# Health Check
-#=============================================================================
-show_phase 6 6 "Systems Online" "~1-2 minutes"
-ai "Linking services... standby."
-
-sleep 5
-
-# Bootstrap mode = fast startup, normal = longer wait for big model
-# Health checks are best-effort - don't let set -e kill the script if a service is slow
-if [[ "$BOOTSTRAP_MODE" == "true" ]]; then
-    check_service "vLLM (bootstrap)" "http://localhost:8000/health" 30 || true
-else
-    check_service "vLLM" "http://localhost:8000/health" 120 || true
-fi
-check_service "Open WebUI" "http://localhost:3000" 60 || true
-
-[[ "$ENABLE_VOICE" == "true" ]] && check_service "Whisper" "http://localhost:9000" 30
-[[ "$ENABLE_WORKFLOWS" == "true" ]] && check_service "n8n" "http://localhost:5678" 30
-[[ "$ENABLE_RAG" == "true" ]] && check_service "Qdrant" "http://localhost:6333" 30
-
-echo ""
-signal "All systems nominal."
-ai_ok "Sovereign intelligence is online."
-
-#=============================================================================
-# Summary
-#=============================================================================
-
-# Get local IP for LAN access
-LOCAL_IP=$(hostname -I 2>/dev/null | awk '{print $1}' || echo "")
-
-# Save current mode and profiles for dream-cli
-if [[ "$OFFLINE_MODE" == "true" ]]; then
-    echo "offline" > "$INSTALL_DIR/.current-mode"
-else
-    echo "local" > "$INSTALL_DIR/.current-mode"
-fi
-echo "$PROFILES" > "$INSTALL_DIR/.profiles"
-
-# Show the cinematic success card
-show_success_card "http://localhost:3000" "http://localhost:3001" "$LOCAL_IP"
-
-# Additional service info
-bootline
-echo -e "${CYAN}ALL SERVICES${NC}"
-bootline
-echo " • Chat UI: http://localhost:3000"
-echo " • Dashboard: http://localhost:3001"
-echo " • LLM API: http://localhost:8000/v1"
-[[ "$ENABLE_VOICE" == "true" ]] && echo " • Whisper STT: http://localhost:9000"
-[[ "$ENABLE_VOICE" == "true" ]] && echo " • TTS (Kokoro): http://localhost:8880"
-[[ "$ENABLE_WORKFLOWS" == "true" ]] && echo " • n8n: http://localhost:5678"
-[[ "$ENABLE_RAG" == "true" ]] && echo " • Qdrant: http://localhost:6333"
-echo ""
-
-# Configuration summary
-bootline
-echo -e "${CYAN}YOUR CONFIGURATION${NC}" -bootline -echo " โ€ข Tier: $TIER ($TIER_NAME)" -echo " โ€ข Model: $LLM_MODEL" -echo " โ€ข Install dir: $INSTALL_DIR" -echo "" - -# Bootstrap mode notice -if [[ "$BOOTSTRAP_MODE" == "true" ]]; then - echo -e "${GREEN}โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—${NC}" - echo -e "${GREEN}โ•‘${NC} ${YELLOW}โšก BOOTSTRAP MODE ACTIVE${NC} ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ• โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฃ${NC}" - echo -e "${GREEN}โ•‘${NC} You can start chatting NOW with the 1.5B model. ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•‘${NC} ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•‘${NC} Full model (${LLM_MODEL}) is downloading... ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•‘${NC} Check progress on the Dashboard at localhost:3001 ${GREEN}โ•‘${NC}" - echo -e "${GREEN}โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•${NC}" - echo "" -fi - -# Quick commands -bootline -echo -e "${CYAN}QUICK COMMANDS${NC}" -bootline -echo " cd $INSTALL_DIR" -echo " docker compose ps # Check status" -echo " docker compose logs -f # View logs" -echo " docker compose restart # Restart services" -echo "" - -if [[ -f "$LOG_FILE" ]]; then - echo -e "${BLUE}Full installation log:${NC} $LOG_FILE" - echo "" -fi - -# Run preflight check to validate installation -echo "" -bootline -echo -e "${CYAN}RUNNING PREFLIGHT VALIDATION${NC}" -bootline -echo "" - -if [[ -f "$SCRIPT_DIR/dream-preflight.sh" ]]; then - # Wait a moment for services to stabilize - sleep 2 - bash "$SCRIPT_DIR/dream-preflight.sh" || true -else - log "Preflight script 
not found - skipping validation"
-fi
-
-#=============================================================================
-# Desktop Shortcut & Sidebar Pin
-#=============================================================================
-if ! $DRY_RUN; then
-    DESKTOP_FILE="$HOME/.local/share/applications/dream-server.desktop"
-    mkdir -p "$HOME/.local/share/applications"
-    cat > "$DESKTOP_FILE" << DESKTOP_EOF
-[Desktop Entry]
-Version=1.0
-Type=Application
-Name=Dream Server
-Comment=Local AI Dashboard
-Exec=xdg-open http://localhost:3001
-Icon=applications-internet
-Terminal=false
-Categories=Development;
-StartupNotify=true
-DESKTOP_EOF
-
-    # Pin to GNOME sidebar (favorites) if gsettings is available
-    if command -v gsettings &> /dev/null; then
-        CURRENT_FAVS=$(gsettings get org.gnome.shell favorite-apps 2>/dev/null || echo "[]")
-        if [[ "$CURRENT_FAVS" != *"dream-server.desktop"* ]]; then
-            NEW_FAVS=$(echo "$CURRENT_FAVS" | sed "s/]$/, 'dream-server.desktop']/" | sed "s/\[, /[/")
-            gsettings set org.gnome.shell favorite-apps "$NEW_FAVS" 2>/dev/null || true
-            ai_ok "Dashboard pinned to sidebar"
-        fi
-    fi
-
-    ai_ok "Desktop shortcut created: Dream Server"
-fi
-
-echo ""
-signal "Broadcast stable. You're free now."
-echo "" -DASHBOARD_PORT="${DASHBOARD_PORT:-3001}" -WEBUI_PORT="${WEBUI_PORT:-3000}" -OPENCLAW_PORT="${OPENCLAW_PORT:-7860}" -LOCAL_IP=$(hostname -I 2>/dev/null | awk '{print $1}') -echo -e "${CYAN}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" -echo -e "${CYAN} YOUR DREAM SERVER IS LIVE${NC}" -echo -e "${CYAN}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" -echo "" -echo -e " ${GREEN}Dashboard${NC} http://localhost:${DASHBOARD_PORT}" -echo -e " ${GREEN}Chat${NC} http://localhost:${WEBUI_PORT}" -[[ "$ENABLE_OPENCLAW" == "true" ]] && \ -echo -e " ${GREEN}OpenClaw${NC} http://localhost:${OPENCLAW_PORT}" -echo "" -if [[ -n "$LOCAL_IP" ]]; then -echo -e " ${YELLOW}On your network:${NC} http://${LOCAL_IP}:${DASHBOARD_PORT}" -fi -echo "" -echo -e " Start here โ†’ ${GREEN}http://localhost:${DASHBOARD_PORT}${NC}" -echo -e " The Dashboard shows all services, GPU status, and quick links." -echo "" -echo -e "${CYAN}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" -echo "" diff --git a/dream-server/installers/common.sh b/dream-server/installers/common.sh new file mode 100644 index 000000000..2322a7018 --- /dev/null +++ b/dream-server/installers/common.sh @@ -0,0 +1,18 @@ +#!/bin/bash +# Shared installer helpers for platform dispatch. 
+
+set -euo pipefail
+
+detect_platform() {
+    if [[ -f /proc/version ]] && grep -qi microsoft /proc/version 2>/dev/null; then
+        echo "wsl"
+    elif [[ "${OSTYPE:-}" == "msys"* || "${OSTYPE:-}" == "cygwin"* || "${OSTYPE:-}" == "win32"* ]]; then
+        echo "windows"
+    elif [[ "${OSTYPE:-}" == "darwin"* ]]; then
+        echo "macos"
+    elif [[ "${OSTYPE:-}" == "linux-gnu"* ]]; then
+        echo "linux"
+    else
+        echo "unknown"
+    fi
+}
diff --git a/dream-server/installers/dispatch.sh b/dream-server/installers/dispatch.sh
new file mode 100644
index 000000000..e85e72635
--- /dev/null
+++ b/dream-server/installers/dispatch.sh
@@ -0,0 +1,27 @@
+#!/bin/bash
+# Platform installer dispatch.
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+source "$SCRIPT_DIR/installers/common.sh"
+
+resolve_installer_target() {
+    local platform
+    platform="$(detect_platform)"
+
+    case "$platform" in
+        linux|wsl)
+            echo "$SCRIPT_DIR/install-core.sh"
+            ;;
+        windows)
+            echo "$SCRIPT_DIR/installers/windows.ps1"
+            ;;
+        macos)
+            echo "$SCRIPT_DIR/installers/macos.sh"
+            ;;
+        *)
+            echo "unsupported:unknown"
+            ;;
+    esac
+}
diff --git a/dream-server/installers/lib/compose-select.sh b/dream-server/installers/lib/compose-select.sh
new file mode 100644
index 000000000..a5ce1254c
--- /dev/null
+++ b/dream-server/installers/lib/compose-select.sh
@@ -0,0 +1,89 @@
+#!/bin/bash
+# ============================================================================
+# Dream Server Installer - Compose Selection
+# ============================================================================
+# Part of: installers/lib/
+# Purpose: Resolve which docker-compose overlay files to use based on tier,
+#          GPU backend, and capability profile
+#
+# Expects: SCRIPT_DIR, TIER, GPU_BACKEND, CAP_COMPOSE_OVERLAYS, LOG_FILE,
+#          log(), warn()
+# Provides: resolve_compose_config() → sets COMPOSE_FILE, COMPOSE_FLAGS
+#
+# Modder notes:
+#   Add new compose overlay mappings or backends here.
+# ============================================================================
+
+resolve_compose_config() {
+    COMPOSE_FILE="docker-compose.yml"
+    COMPOSE_FLAGS=""
+
+    if [[ -n "${CAP_COMPOSE_OVERLAYS:-}" ]]; then
+        IFS=',' read -r -a profile_overlays <<< "$CAP_COMPOSE_OVERLAYS"
+        compose_overlay_ok=true
+        for overlay in "${profile_overlays[@]}"; do
+            if [[ -f "$SCRIPT_DIR/$overlay" ]]; then
+                COMPOSE_FLAGS="$COMPOSE_FLAGS -f $overlay"
+            else
+                compose_overlay_ok=false
+                break
+            fi
+        done
+        if [[ "$compose_overlay_ok" == "true" && ${#profile_overlays[@]} -gt 0 ]]; then
+            COMPOSE_FLAGS="${COMPOSE_FLAGS# }"
+            COMPOSE_FILE="${profile_overlays[${#profile_overlays[@]}-1]}"
+        else
+            COMPOSE_FLAGS=""
+        fi
+    fi
+
+    # Backward compatibility default if no flags were set.
+    if [[ -z "$COMPOSE_FLAGS" ]]; then
+        if [[ "$TIER" == "NV_ULTRA" ]]; then
+            if [[ -f "$SCRIPT_DIR/docker-compose.base.yml" && -f "$SCRIPT_DIR/docker-compose.nvidia.yml" ]]; then
+                COMPOSE_FLAGS="-f docker-compose.base.yml -f docker-compose.nvidia.yml"
+                COMPOSE_FILE="docker-compose.nvidia.yml"
+            fi
+        elif [[ "$TIER" == "CLOUD" ]]; then
+            if [[ -f "$SCRIPT_DIR/docker-compose.base.yml" ]]; then
+                COMPOSE_FLAGS="-f docker-compose.base.yml"
+                COMPOSE_FILE="docker-compose.base.yml"
+            fi
+        elif [[ "$TIER" == "SH_LARGE" || "$TIER" == "SH_COMPACT" ]]; then
+            if [[ -f "$SCRIPT_DIR/docker-compose.base.yml" && -f "$SCRIPT_DIR/docker-compose.amd.yml" ]]; then
+                COMPOSE_FLAGS="-f docker-compose.base.yml -f docker-compose.amd.yml"
+                COMPOSE_FILE="docker-compose.amd.yml"
+            fi
+        else
+            if [[ -f "$SCRIPT_DIR/docker-compose.base.yml" && -f "$SCRIPT_DIR/docker-compose.nvidia.yml" ]]; then
+                COMPOSE_FLAGS="-f docker-compose.base.yml -f docker-compose.nvidia.yml"
+                COMPOSE_FILE="docker-compose.nvidia.yml"
+            elif [[ -f "$SCRIPT_DIR/docker-compose.yml" ]]; then
+                COMPOSE_FLAGS="-f docker-compose.yml"
+            fi
+        fi
+    fi
+
+    if [[ -z "$COMPOSE_FLAGS" ]]; then
+        COMPOSE_FLAGS="-f $COMPOSE_FILE"
+    fi
+
+    if [[ -x "$SCRIPT_DIR/scripts/resolve-compose-stack.sh" ]]; then
+        COMPOSE_ENV="$("$SCRIPT_DIR/scripts/resolve-compose-stack.sh" \
+            --script-dir "$SCRIPT_DIR" \
+            --tier "$TIER" \
+            --gpu-backend "$GPU_BACKEND" \
+            --profile-overlays "${CAP_COMPOSE_OVERLAYS:-}" \
+            --env 2>>"$LOG_FILE")"
+        eval "$COMPOSE_ENV"
+    fi
+
+    # Auto-include docker-compose.override.yml if present (standard Docker convention).
+    # This lets modders add services without editing core compose files.
+    if [[ -f "$SCRIPT_DIR/docker-compose.override.yml" ]]; then
+        COMPOSE_FLAGS="$COMPOSE_FLAGS -f docker-compose.override.yml"
+        log "Including docker-compose.override.yml (user overrides)"
+    fi
+
+    log "Compose selection: $COMPOSE_FLAGS"
+}
diff --git a/dream-server/installers/lib/constants.sh b/dream-server/installers/lib/constants.sh
new file mode 100644
index 000000000..4ccfa3c27
--- /dev/null
+++ b/dream-server/installers/lib/constants.sh
@@ -0,0 +1,44 @@
+#!/bin/bash
+# ============================================================================
+# Dream Server Installer - Constants
+# ============================================================================
+# Part of: installers/lib/
+# Purpose: Colors, paths, version string, timezone detection
+#
+# Expects: (nothing - first file sourced)
+# Provides: VERSION, SCRIPT_DIR, INSTALL_DIR, LOG_FILE, color codes,
+#           SYSTEM_TZ, CAPABILITY_PROFILE_FILE, PREFLIGHT_REPORT_FILE,
+#           INSTALL_START_EPOCH
+#
+# Modder notes:
+#   Change VERSION for custom builds. Add new color codes here.
+# ============================================================================
+
+VERSION="2.1.0-strix-halo"
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
+INSTALL_DIR="${INSTALL_DIR:-$HOME/dream-server}"
+LOG_FILE="${LOG_FILE:-/tmp/dream-server-install.log}"
+CAPABILITY_PROFILE_FILE="${CAPABILITY_PROFILE_FILE:-/tmp/dream-server-capabilities.json}"
+PREFLIGHT_REPORT_FILE="${PREFLIGHT_REPORT_FILE:-/tmp/dream-server-preflight-report.json}"
+INSTALL_START_EPOCH=$(date +%s)
+
+# Auto-detect system timezone (fallback to UTC)
+if [[ -f /etc/timezone ]]; then
+    SYSTEM_TZ="$(cat /etc/timezone)"
+elif [[ -L /etc/localtime ]]; then
+    SYSTEM_TZ="$(readlink /etc/localtime | sed 's|.*/zoneinfo/||')"
+else
+    SYSTEM_TZ="UTC"
+fi
+
+#=============================================================================
+# Colors - green phosphor CRT theme
+#=============================================================================
+RED='\033[0;31m'
+GRN='\033[0;32m' # Standard green - body text
+BGRN='\033[1;32m' # Bright green - emphasis, success, headings
+DGRN='\033[2;32m' # Dim green - secondary text, lore
+AMB='\033[0;33m' # Amber - warnings, ETA labels
+WHT='\033[1;37m' # White - key URLs
+NC='\033[0m' # Reset
+CURSOR='█' # Block cursor for typing
diff --git a/dream-server/installers/lib/detection.sh b/dream-server/installers/lib/detection.sh
new file mode 100644
index 000000000..00a7b1447
--- /dev/null
+++ b/dream-server/installers/lib/detection.sh
@@ -0,0 +1,357 @@
+#!/bin/bash
+# ============================================================================
+# Dream Server Installer - Hardware Detection
+# ============================================================================
+# Part of: installers/lib/
+# Purpose: GPU detection, capability profile loading, backend contract
+#          loading, Secure Boot NVIDIA auto-fix
+#
+# Expects: SCRIPT_DIR, LOG_FILE, CAPABILITY_PROFILE_FILE, color codes,
+#          INTERACTIVE, TIER, OFFLINE_MODE, ENABLE_VOICE, ENABLE_WORKFLOWS,
+#          ENABLE_RAG, ENABLE_OPENCLAW (all used by fix_nvidia_secure_boot),
+#          log/warn/ai/ai_ok/ai_warn/ai_bad helpers
+# Provides: detect_gpu(), load_capability_profile(),
+#           normalize_profile_tier(), tier_rank(), load_backend_contract(),
+#           fix_nvidia_secure_boot(), MIN_DRIVER_VERSION
+#
+# Modder notes:
+#   Add new GPU vendors or APU detection logic here.
+#   The fix_nvidia_secure_boot() function handles Secure Boot key enrollment.
+# ============================================================================
+
+load_capability_profile() {
+    CAP_PROFILE_LOADED="false"
+    local builder="$SCRIPT_DIR/scripts/build-capability-profile.sh"
+    if [[ ! -x "$builder" ]]; then
+        log "Capability profile builder not found, using installer-local detection."
+        return 1
+    fi
+
+    local env_out
+    if env_out="$("$builder" --output "$CAPABILITY_PROFILE_FILE" --env 2>>"$LOG_FILE")"; then
+        eval "$env_out"
+        CAP_PROFILE_LOADED="true"
+        log "Capability profile loaded: ${CAP_PROFILE_FILE:-$CAPABILITY_PROFILE_FILE}"
+        log "Capability profile: platform=${CAP_PLATFORM_ID:-unknown}, gpu=${CAP_GPU_VENDOR:-unknown}, tier=${CAP_RECOMMENDED_TIER:-unknown}"
+        [[ -n "${CAP_HARDWARE_CLASS_ID:-}" ]] && log "Hardware class: ${CAP_HARDWARE_CLASS_ID} (${CAP_HARDWARE_CLASS_LABEL:-unknown})"
+        return 0
+    fi
+
+    warn "Capability profile generation failed, falling back to installer-local detection."
+    return 1
+}
+
+normalize_profile_tier() {
+    case "$1" in
+        T1) echo "1" ;;
+        T2) echo "2" ;;
+        T3) echo "3" ;;
+        T4) echo "4" ;;
+        NV_ULTRA|SH_LARGE|SH_COMPACT) echo "$1" ;;
+        *) echo "" ;;
+    esac
+}
+
+tier_rank() {
+    case "$1" in
+        NV_ULTRA|SH_LARGE) echo 5 ;;
+        4) echo 4 ;;
+        SH_COMPACT|3) echo 3 ;;
+        2) echo 2 ;;
+        *) echo 1 ;;
+    esac
+}
+
+load_backend_contract() {
+    local backend="$1"
+    local loader="$SCRIPT_DIR/scripts/load-backend-contract.sh"
+    BACKEND_CONTRACT_LOADED="false"
+    if [[ ! -x "$loader" ]]; then
+        warn "Backend contract loader missing, using built-in backend defaults."
+        return 1
+    fi
+    local env_out
+    if env_out="$("$loader" --backend "$backend" --env 2>>"$LOG_FILE")"; then
+        eval "$env_out"
+        BACKEND_CONTRACT_LOADED="true"
+        log "Backend contract loaded: ${BACKEND_CONTRACT_FILE:-unknown}"
+        log "Backend runtime: ${BACKEND_CONTRACT_ID:-$backend} (${BACKEND_LLM_ENGINE:-unknown})"
+        return 0
+    fi
+    warn "Could not load backend contract for '$backend', using built-in defaults."
+    return 1
+}
+
+detect_gpu() {
+    GPU_BACKEND="nvidia" # default
+    GPU_MEMORY_TYPE="discrete"
+    GPU_DEVICE_ID=""
+
+    # Try NVIDIA first
+    if command -v nvidia-smi &> /dev/null; then
+        local raw
+        if raw=$(nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null) && [[ -n "$raw" ]]; then
+            GPU_INFO="$raw"
+            GPU_NAME=$(echo "$GPU_INFO" | head -1 | cut -d',' -f1 | xargs)
+            GPU_VRAM=$(echo "$GPU_INFO" | head -1 | cut -d',' -f2 | grep -oP '\d+' | head -1)
+            GPU_COUNT=$(echo "$GPU_INFO" | wc -l)
+            # Extract PCI device ID
+            local pci_id
+            pci_id=$(nvidia-smi --query-gpu=pci.device_id --format=csv,noheader 2>/dev/null | head -1 | xargs)
+            [[ -n "$pci_id" ]] && GPU_DEVICE_ID="${pci_id:0:6}"
+            log "GPU: $GPU_NAME (${GPU_VRAM}MB VRAM) x${GPU_COUNT}"
+            return 0
+        fi
+    fi
+
+    # Try AMD APU (Strix Halo / unified memory) via sysfs
+    for card_dir in /sys/class/drm/card*/device; do
+        [[ -d "$card_dir" ]] || continue
+        local vendor
+        vendor=$(cat "$card_dir/vendor" 2>/dev/null) || continue
+        if [[ "$vendor" == "0x1002" ]]; then
+            local vram_bytes gtt_bytes
+            vram_bytes=$(cat "$card_dir/mem_info_vram_total" 2>/dev/null) || vram_bytes=0
+            gtt_bytes=$(cat "$card_dir/mem_info_gtt_total" 2>/dev/null) || gtt_bytes=0
+            local gtt_gb=$(( gtt_bytes / 1073741824 ))
+            local vram_gb=$(( vram_bytes / 1073741824 ))
+
+            # Read device ID from sysfs
+            GPU_DEVICE_ID=$(cat "$card_dir/device" 2>/dev/null) || GPU_DEVICE_ID="unknown"
+
+            # Detect APU: small VRAM + large GTT = unified memory
+            if [[ $gtt_gb -ge 16 && $vram_gb -le 4 ]] || [[ $gtt_gb -ge 32 ]] || [[ $vram_gb -ge 32 ]]; then
+                GPU_BACKEND="amd"
+                GPU_MEMORY_TYPE="unified"
+                GPU_VRAM=$(( vram_bytes / 1048576 )) # in MB
+                GPU_COUNT=1
+                # Try marketing name
+                if [[ -f "$card_dir/product_name" ]]; then
+                    GPU_NAME=$(cat "$card_dir/product_name" 2>/dev/null) || GPU_NAME="AMD APU"
+                else
+                    GPU_NAME="AMD APU ($GPU_DEVICE_ID)"
+                fi
+                log "GPU: $GPU_NAME (unified memory, AMD APU, device_id=$GPU_DEVICE_ID)"
+                return 0
+            fi
+        fi
+    done
+
+    GPU_NAME="None"
+    GPU_VRAM=0
+    GPU_COUNT=0
+    warn "No NVIDIA or AMD GPU detected. CPU-only mode available but slow."
+    return 1
+}
+
+MIN_DRIVER_VERSION=570
+
+fix_nvidia_secure_boot() {
+    # Step 1: Is there even NVIDIA hardware on this machine?
+    if ! lspci 2>/dev/null | grep -qi 'nvidia'; then
+        return 1 # No hardware - nothing to fix
+    fi
+
+    ai "NVIDIA GPU hardware detected but driver not responding."
+
+    # Step 2: Ensure a driver package is installed
+    local installed_driver
+    installed_driver=$(dpkg-query -W -f='${Package}\n' 'nvidia-driver-*' 2>/dev/null \
+        | grep -oP 'nvidia-driver-\K\d+' | sort -n | tail -1 || true)
+
+    if [[ -z "$installed_driver" ]]; then
+        ai "No NVIDIA driver package found. Installing recommended driver..."
+        if command -v ubuntu-drivers &>/dev/null; then
+            sudo ubuntu-drivers install 2>>"$LOG_FILE" || \
+                sudo apt-get install -y "nvidia-driver-${MIN_DRIVER_VERSION}" 2>>"$LOG_FILE" || true
+        else
+            sudo apt-get install -y "nvidia-driver-${MIN_DRIVER_VERSION}" 2>>"$LOG_FILE" || true
+        fi
+        installed_driver=$(dpkg-query -W -f='${Package}\n' 'nvidia-driver-*' 2>/dev/null \
+            | grep -oP 'nvidia-driver-\K\d+' | sort -n | tail -1 || true)
+        if [[ -z "$installed_driver" ]]; then
+            ai_bad "Failed to install NVIDIA driver."
+            return 1
+        fi
+        ai_ok "Installed nvidia-driver-${installed_driver}"
+    else
+        ai "Driver nvidia-driver-${installed_driver} is installed."
+ fi + + # Step 3: Try loading the module โ€” see why it fails + local modprobe_err + modprobe_err=$(sudo modprobe nvidia 2>&1) || true + + if nvidia-smi &>/dev/null; then + ai_ok "NVIDIA driver loaded successfully" + # Regenerate CDI spec so Docker sees the correct driver libraries + if command -v nvidia-ctk &>/dev/null; then + sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>>"$LOG_FILE" || true + fi + detect_gpu || true + return 0 + fi + + # Step 4: If it's not a Secure Boot issue, bail out + if ! echo "$modprobe_err" | grep -qi "key was rejected"; then + ai_bad "NVIDIA module failed to load: $modprobe_err" + return 1 + fi + + # Step 5: Secure Boot is blocking the module โ€” ensure it's properly signed + ai_warn "Secure Boot is blocking the NVIDIA kernel module." + ai "Preparing module signing..." + + local kver mok_dir sign_file + kver=$(uname -r) + mok_dir="/var/lib/shim-signed/mok" + sudo mkdir -p "$mok_dir" + + # Ensure linux-headers are present (needed for sign-file) + if [[ ! -d "/usr/src/linux-headers-${kver}" ]]; then + ai "Installing kernel headers for ${kver}..." + sudo apt-get install -y "linux-headers-${kver}" 2>>"$LOG_FILE" || true + fi + + # Generate MOK keypair if not already present + if [[ ! -f "$mok_dir/MOK.priv" ]] || [[ ! 
-f "$mok_dir/MOK.der" ]]; then + sudo openssl req -new -x509 -newkey rsa:2048 \ + -keyout "$mok_dir/MOK.priv" \ + -outform DER -out "$mok_dir/MOK.der" \ + -nodes -days 36500 \ + -subj "/CN=Dream Server Module Signing/" 2>>"$LOG_FILE" + sudo chmod 600 "$mok_dir/MOK.priv" + ai_ok "Generated MOK signing key" + else + ai_ok "Using existing MOK signing key" + fi + + # Locate the sign-file tool + sign_file="" + for candidate in \ + "/usr/src/linux-headers-${kver}/scripts/sign-file" \ + "/usr/lib/linux-kbuild-${kver%.*}/scripts/sign-file"; do + if [[ -x "$candidate" ]]; then + sign_file="$candidate" + break + fi + done + if [[ -z "$sign_file" ]]; then + sign_file=$(find /usr/src /usr/lib -name sign-file -executable 2>/dev/null | head -1) + fi + if [[ -z "$sign_file" ]]; then + ai_bad "Cannot find kernel sign-file tool." + ai "Try: sudo apt install linux-headers-${kver}" + return 1 + fi + + # Sign every nvidia DKMS module (handles .ko, .ko.zst, .ko.xz) + local signed_count=0 + for mod_path in /lib/modules/${kver}/updates/dkms/nvidia*.ko*; do + [[ -f "$mod_path" ]] || continue + case "$mod_path" in + *.zst) + sudo zstd -d -f "$mod_path" -o "${mod_path%.zst}" 2>>"$LOG_FILE" + sudo "$sign_file" sha256 "$mok_dir/MOK.priv" "$mok_dir/MOK.der" "${mod_path%.zst}" 2>>"$LOG_FILE" + sudo zstd -f --rm "${mod_path%.zst}" -o "$mod_path" 2>>"$LOG_FILE" + ;; + *.xz) + sudo xz -d -f -k "$mod_path" 2>>"$LOG_FILE" + sudo "$sign_file" sha256 "$mok_dir/MOK.priv" "$mok_dir/MOK.der" "${mod_path%.xz}" 2>>"$LOG_FILE" + sudo xz -f "${mod_path%.xz}" 2>>"$LOG_FILE" + sudo mv "${mod_path%.xz}.xz" "$mod_path" 2>>"$LOG_FILE" + ;; + *) + sudo "$sign_file" sha256 "$mok_dir/MOK.priv" "$mok_dir/MOK.der" "$mod_path" 2>>"$LOG_FILE" + ;; + esac + signed_count=$((signed_count + 1)) + done + sudo depmod -a 2>>"$LOG_FILE" + ai_ok "Signed $signed_count NVIDIA module(s)" + + # Step 6: Try loading โ€” if MOK key is already enrolled, this works immediately + if sudo modprobe nvidia 2>>"$LOG_FILE" && nvidia-smi 
&>/dev/null; then + ai_ok "NVIDIA driver loaded โ€” GPU is online" + # Regenerate CDI spec so Docker sees the correct driver libraries + if command -v nvidia-ctk &>/dev/null; then + sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>>"$LOG_FILE" || true + fi + detect_gpu || true + return 0 + fi + + # Step 7: MOK key needs firmware enrollment โ€” one reboot required + # This is the standard Ubuntu Secure Boot flow (same thing Ubuntu's + # "Additional Drivers" tool does). It only happens once per machine. + + local mok_pass + mok_pass=$(openssl rand -hex 4) + printf '%s\n%s\n' "$mok_pass" "$mok_pass" | sudo mokutil --import "$mok_dir/MOK.der" 2>>"$LOG_FILE" + + # --- Auto-resume: create a systemd oneshot so the install continues + # automatically after reboot (user doesn't have to re-run manually) + local svc_name="dream-server-install-resume" + local resume_args="--force --non-interactive" + $ENABLE_VOICE && resume_args="$resume_args --voice" + $ENABLE_WORKFLOWS && resume_args="$resume_args --workflows" + $ENABLE_RAG && resume_args="$resume_args --rag" + $ENABLE_OPENCLAW && resume_args="$resume_args --openclaw" + [[ -n "$TIER" ]] && resume_args="$resume_args --tier $TIER" + [[ "$OFFLINE_MODE" == "true" ]] && resume_args="$resume_args --offline" + + sudo tee /etc/systemd/system/${svc_name}.service > /dev/null << SVCEOF +[Unit] +Description=Dream Server Install (auto-resume after Secure Boot enrollment) +After=network-online.target docker.service +Wants=network-online.target + +[Service] +Type=oneshot +User=$USER +ExecStart=/bin/bash ${SCRIPT_DIR}/install.sh ${resume_args} +ExecStartPost=/bin/rm -f /etc/systemd/system/${svc_name}.service +ExecStartPost=/bin/systemctl daemon-reload +StandardOutput=journal+console +StandardError=journal+console + +[Install] +WantedBy=multi-user.target +SVCEOF + sudo systemctl daemon-reload + sudo systemctl enable "${svc_name}.service" 2>>"$LOG_FILE" + log "Auto-resume service installed: ${svc_name}.service" + + # --- Show a 
clean, friendly reboot screen --- + echo "" + echo "" + echo -e "${GRN}+--------------------------------------------------------------+${NC}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${AMB}One-time reboot needed${NC} ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + echo -e "${GRN}|${NC} Your GPU requires a Secure Boot key enrollment. ${GRN}|${NC}" + echo -e "${GRN}|${NC} This is normal and only happens once. ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + echo -e "${GRN}+--------------------------------------------------------------+${NC}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + echo -e "${GRN}|${NC} After reboot a ${AMB}blue screen${NC} will appear: ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${BGRN}1.${NC} Select \"Enroll MOK\" ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${BGRN}2.${NC} Select \"Continue\" ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${BGRN}3.${NC} Type password: ${BGRN}${mok_pass}${NC} ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${BGRN}4.${NC} Select \"Reboot\" ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + echo -e "${GRN}|${NC} Installation will ${BGRN}continue automatically${NC} after reboot. ${GRN}|${NC}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + echo -e "${GRN}+--------------------------------------------------------------+${NC}" + echo "" + + if $INTERACTIVE; then + read -p " Press Enter to reboot (or Ctrl+C to do it later)... " -r + sudo reboot + fi + + # Non-interactive mode: exit cleanly (not an error โ€” reboot is a normal install phase) + ai "Reboot this machine to continue installation." 
+ exit 0 +} diff --git a/dream-server/installers/lib/logging.sh b/dream-server/installers/lib/logging.sh new file mode 100644 index 000000000..aa9b6332a --- /dev/null +++ b/dream-server/installers/lib/logging.sh @@ -0,0 +1,25 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Logging +# ============================================================================ +# Part of: installers/lib/ +# Purpose: Log, success, warn, error helpers and elapsed time +# +# Expects: GRN, BGRN, AMB, RED, NC, LOG_FILE, INSTALL_START_EPOCH +# Provides: install_elapsed(), log(), success(), warn(), error() +# +# Modder notes: +# Change log format or add log levels here. +# ============================================================================ + +install_elapsed() { + local secs=$(( $(date +%s) - INSTALL_START_EPOCH )) + local m=$(( secs / 60 )) + local s=$(( secs % 60 )) + printf '%dm %02ds' "$m" "$s" +} + +log() { echo -e "${GRN}[INFO]${NC} $1" | tee -a "$LOG_FILE"; } +success() { echo -e "${BGRN}[OK]${NC} $1" | tee -a "$LOG_FILE"; } +warn() { echo -e "${AMB}[WARN]${NC} $1" | tee -a "$LOG_FILE"; } +error() { echo -e "${RED}[ERROR]${NC} $1" | tee -a "$LOG_FILE"; exit 1; } diff --git a/dream-server/installers/lib/tier-map.sh b/dream-server/installers/lib/tier-map.sh new file mode 100644 index 000000000..98dd34612 --- /dev/null +++ b/dream-server/installers/lib/tier-map.sh @@ -0,0 +1,96 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Tier Map +# ============================================================================ +# Part of: installers/lib/ +# Purpose: Map hardware tier to model name, GGUF file, URL, and context size +# +# Expects: TIER (set by detection phase), error() +# Provides: resolve_tier_config() → sets TIER_NAME, LLM_MODEL, GGUF_FILE, +# GGUF_URL, MAX_CONTEXT +# +# Modder notes: +# Add new tiers or change model 
assignments here. +# Each tier maps to a specific GGUF quantization and context window. +# ============================================================================ + +resolve_tier_config() { + case $TIER in + CLOUD) + TIER_NAME="Cloud (API)" + LLM_MODEL="anthropic/claude-sonnet-4-5-20250514" + GGUF_FILE="" + GGUF_URL="" + MAX_CONTEXT=200000 + ;; + NV_ULTRA) + TIER_NAME="NVIDIA Ultra (90GB+)" + LLM_MODEL="qwen3-coder-next" + GGUF_FILE="qwen3-coder-next-Q4_K_M.gguf" + GGUF_URL="https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF/resolve/main/Qwen3-Coder-Next-Q4_K_M.gguf" + MAX_CONTEXT=131072 + ;; + SH_LARGE) + TIER_NAME="Strix Halo 90+" + LLM_MODEL="qwen3-coder-next" + GGUF_FILE="qwen3-coder-next-Q4_K_M.gguf" + GGUF_URL="https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF/resolve/main/Qwen3-Coder-Next-Q4_K_M.gguf" + MAX_CONTEXT=131072 + ;; + SH_COMPACT) + TIER_NAME="Strix Halo Compact" + LLM_MODEL="qwen3-30b-a3b" + GGUF_FILE="qwen3-30b-a3b-Q4_K_M.gguf" + GGUF_URL="https://huggingface.co/unsloth/Qwen3-30B-A3B-GGUF/resolve/main/Qwen3-30B-A3B-Q4_K_M.gguf" + MAX_CONTEXT=131072 + ;; + 1) + TIER_NAME="Entry Level" + LLM_MODEL="qwen3-8b" + GGUF_FILE="Qwen3-8B-Q4_K_M.gguf" + GGUF_URL="https://huggingface.co/unsloth/Qwen3-8B-GGUF/resolve/main/Qwen3-8B-Q4_K_M.gguf" + MAX_CONTEXT=16384 + ;; + 2) + TIER_NAME="Prosumer" + LLM_MODEL="qwen3-8b" + GGUF_FILE="Qwen3-8B-Q4_K_M.gguf" + GGUF_URL="https://huggingface.co/unsloth/Qwen3-8B-GGUF/resolve/main/Qwen3-8B-Q4_K_M.gguf" + MAX_CONTEXT=32768 + ;; + 3) + TIER_NAME="Pro" + LLM_MODEL="qwen3-14b" + GGUF_FILE="Qwen3-14B-Q4_K_M.gguf" + GGUF_URL="https://huggingface.co/unsloth/Qwen3-14B-GGUF/resolve/main/Qwen3-14B-Q4_K_M.gguf" + MAX_CONTEXT=32768 + ;; + 4) + TIER_NAME="Enterprise" + LLM_MODEL="qwen3-30b-a3b" + GGUF_FILE="qwen3-30b-a3b-Q4_K_M.gguf" + GGUF_URL="https://huggingface.co/unsloth/Qwen3-30B-A3B-GGUF/resolve/main/Qwen3-30B-A3B-Q4_K_M.gguf" + MAX_CONTEXT=131072 + ;; + *) + error "Invalid tier: $TIER. 
Valid tiers: 1, 2, 3, 4, CLOUD, NV_ULTRA, SH_LARGE, SH_COMPACT" + # NOTE for modders: add your tier above this line and update this message. + ;; + esac +} + +# Map a tier name to its LLM_MODEL value (used by dream model swap) +tier_to_model() { + local t="$1" + case "$t" in + CLOUD) echo "anthropic/claude-sonnet-4-5-20250514" ;; + NV_ULTRA) echo "qwen3-coder-next" ;; + SH_LARGE) echo "qwen3-coder-next" ;; + SH_COMPACT|SH) echo "qwen3-30b-a3b" ;; + 1|T1) echo "qwen3-8b" ;; + 2|T2) echo "qwen3-8b" ;; + 3|T3) echo "qwen3-14b" ;; + 4|T4) echo "qwen3-30b-a3b" ;; + *) echo "" ;; + esac +} diff --git a/dream-server/installers/lib/ui.sh b/dream-server/installers/lib/ui.sh new file mode 100644 index 000000000..92122ee72 --- /dev/null +++ b/dream-server/installers/lib/ui.sh @@ -0,0 +1,367 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - UI (CRT Theme) +# ============================================================================ +# Part of: installers/lib/ +# Purpose: All CRT terminal UI functions: typing effects, spinners, phase +# screens, boot splash, lore messages, hardware/tier display boxes, +# install menu, success card +# +# Expects: GRN, BGRN, DGRN, AMB, RED, WHT, NC, CURSOR, LOG_FILE, VERSION, +# INTERACTIVE, DRY_RUN, DOCKER_CMD (at call time), install_elapsed(), log(), warn() +# Provides: type_line(), type_line_dramatic(), static_line(), bootline(), +# ai(), ai_ok(), ai_warn(), ai_bad(), signal(), chapter(), +# show_phase(), show_stranger_boot(), LORE_MESSAGES[], spin_task(), +# pull_with_progress(), check_service(), show_hardware_summary(), +# show_tier_recommendation(), show_install_menu(), show_success_card() +# +# Modder notes: +# Change the CRT theme, boot splash, lore messages, or spinner style here. +# Dead code removed: subline() and progress_bar() were never called. 
+# ============================================================================ + +DIVIDER="โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€" + +# Typing effect with block cursor +type_line() { + local s="$1" + local color="${2:-$GRN}" + local delay="${3:-0.035}" + if [[ "$INTERACTIVE" != "true" ]]; then + printf '%b%s%b\n' "$color" "$s" "$NC" + return + fi + printf '%b' "$color" + local i + for ((i=0; i<${#s}; i++)); do + printf "%s" "${s:$i:1}" + if (( i < ${#s} - 1 )); then + printf "%s" "${CURSOR}" + sleep "$delay" + printf "\b" + else + sleep "$delay" + fi + done + printf '%b\n' "$NC" +} + +# Dramatic typing โ€” dots then text +type_line_dramatic() { + local s="$1" + local color="${2:-$GRN}" + local delay="${3:-0.05}" + if [[ "$INTERACTIVE" != "true" ]]; then + printf '%b%s%b\n' "$color" "$s" "$NC" + return + fi + for dot in '.' '..' 
'...'; do + printf "\r%s" "$dot" + sleep 0.15 + done + printf "\r \r" + printf '%b' "$color" + local i + for ((i=0; i<${#s}; i++)); do + printf "%s" "${s:$i:1}" + if (( i < ${#s} - 1 )); then + printf "%s" "${CURSOR}" + sleep "$delay" + printf "\b" + else + sleep "$delay" + fi + done + printf '%b\n' "$NC" +} + +# Static noise transition line +static_line() { + if [[ "$INTERACTIVE" != "true" ]]; then return; fi + local chars='โ–‘โ–’โ–“โ–ˆ' + local width=63 + local i + printf " " + for ((i=0; i/dev/null || true + echo "" + echo -e "${BGRN} ____ _____${NC}" + echo -e "${BGRN} / __ \\ _____ ___ ____ _ ____ ___ / ___/ ___ _____ _ __ ___ _____${NC}" + echo -e "${BGRN} / / / // ___// _ \\ / __ \`// __ \`__ \\ \\__ \\ / _ \\ / ___/| | / // _ \\ / ___/${NC}" + echo -e "${BGRN} / /_/ // / / __// /_/ // / / / / / ___/ // __// / | |/ // __// /${NC}" + echo -e "${BGRN}/_____//_/ \\___/ \\__,_//_/ /_/ /_/ /____/ \\___//_/ |___/ \\___//_/${NC}" + echo "" + static_line + echo -e "${BGRN} D R E A M G A T E${NC} ${GRN}Local AI // Sovereign Intelligence // $(date +%Y)${NC}" + echo -e "${DGRN} CLASSIFICATION: FREEDOM IMMINENT${NC}" + echo -e "${DGRN} BUILD: v${VERSION} // $(date '+%Y-%m-%d %H:%M')${NC}" + static_line + echo "" + type_line_dramatic "Signal acquired." "$GRN" + type_line "I will guide the installation. Stay with me." "$GRN" + echo "" + echo -e " ${AMB}Version ${VERSION}${NC}" + echo "" + bootline + echo -e "${GRN}Tip:${NC} Press Ctrl+C twice to abort." + bootline + echo "" +} + +# Lore messages โ€” shown during long waits +LORE_MESSAGES=( + "Your AI runs on your hardware. No one else's." + "No API keys expire. No rate limits apply." + "Corporations rent intelligence. You will own it." + "No cloud. No middleman. Just you and the machine." + "Every byte stays on your network. Every thought is private." + "This gateway answers to one operator: you." + "No telemetry. No usage reports. No surveillance." + "When the internet goes dark, your AI keeps running." 
+ "You are building something they cannot take away." + "Sovereign compute. Sovereign intelligence. Sovereign you." + "The model weights live on your disk. They belong to you." + "No terms of service. No content policy. Just freedom." + "This is a modifiable system. It is yours to control." + "The code is yours. Make something never imagined." +) + +# Spinner with mm:ss timer + lore messages every 8 seconds +spin_task() { + local pid=$1 + local msg=$2 + local spin='โ ‹โ ™โ นโ ธโ ผโ ดโ ฆโ งโ ‡โ ' + local i=0 + local elapsed=0 + local lore_idx=0 + + printf " ${GRN}โ ‹${NC} [00:00] %s " "$msg" + while kill -0 "$pid" 2>/dev/null; do + local mm=$((elapsed / 60)) + local ss=$((elapsed % 60)) + printf "\r ${GRN}%s${NC} [%02d:%02d] %s " "${spin:$i:1}" "$mm" "$ss" "$msg" + i=$(( (i + 1) % ${#spin} )) + elapsed=$((elapsed + 1)) + # Show lore every 8 seconds + if (( elapsed > 0 && elapsed % 8 == 0 )); then + printf "\n ${DGRN} ยซ %s ยป${NC}\n" "${LORE_MESSAGES[$lore_idx]}" + lore_idx=$(( (lore_idx + 1) % ${#LORE_MESSAGES[@]} )) + fi + sleep 1 + done + local rc=0 + wait "$pid" || rc=$? + return $rc +} + +# Pull wrapper that prints consistent success/fail lines +pull_with_progress() { + local img=$1 + local label=$2 + local count=$3 + local total=$4 + + $DOCKER_CMD pull "$img" >> "$LOG_FILE" 2>&1 & + local pull_pid=$! 
+ + if spin_task $pull_pid "[$count/$total] $label"; then + printf "\r ${BGRN}✓${NC} [$count/$total] %-60s\n" "$label" + return 0 + else + printf "\r ${RED}✗${NC} [$count/$total] %-60s\n" "$label" + return 1 + fi +} + +# Health check with "systems online" vibe + lore every 8s +check_service() { + local name=$1 + local url=$2 + local max_attempts=${3:-30} + local spin='⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏' + local i=0 + local lore_idx=$(( RANDOM % ${#LORE_MESSAGES[@]} )) + + if $DRY_RUN; then + ai "[DRY RUN] Would link ${name} at ${url}" + return 0 + fi + + printf " ${GRN}%s${NC} Linking %-20s " "${spin:0:1}" "$name" + for attempt in $(seq 1 $max_attempts); do + if curl -sf "$url" > /dev/null 2>&1; then + printf "\r ${BGRN}✓${NC} %-55s\n" "$name online" + return 0 + fi + printf "\r ${GRN}%s${NC} Linking %-20s [%ds] " "${spin:$i:1}" "$name" "$((attempt * 2))" + i=$(( (i + 1) % ${#spin} )) + # Show lore every 4th attempt (~8 seconds) + if (( attempt > 0 && attempt % 4 == 0 )); then + printf "\n ${DGRN} « %s »${NC}\n" "${LORE_MESSAGES[$lore_idx]}" + lore_idx=$(( (lore_idx + 1) % ${#LORE_MESSAGES[@]} )) + fi + sleep 2 + done + + printf "\r ${AMB}⚠${NC} %-55s\n" "$name delayed (may still be starting)" + ai_warn "$name not responding yet. I will continue." 
+ return 1 +} + +# Show hardware summary โ€” CRT monospace box +show_hardware_summary() { + local gpu_name="$1" + local gpu_vram="$2" + local cpu_info="$3" + local ram_gb="$4" + local disk_gb="$5" + + echo "" + echo -e "${GRN}+-------------------------------------------------------------+${NC}" + echo -e "${GRN}|${NC} ${BGRN}HARDWARE SCAN RESULTS${NC} ${GRN}|${NC}" + echo -e "${GRN}+-------------------------------------------------------------+${NC}" + printf "${GRN}|${NC} GPU: %-50s ${GRN}|${NC}\n" "${gpu_name:-Not detected}" + [[ -n "$gpu_vram" ]] && printf "${GRN}|${NC} VRAM: %-50s ${GRN}|${NC}\n" "${gpu_vram}GB" + printf "${GRN}|${NC} CPU: %-50s ${GRN}|${NC}\n" "${cpu_info:-Unknown}" + printf "${GRN}|${NC} RAM: %-50s ${GRN}|${NC}\n" "${ram_gb}GB" + printf "${GRN}|${NC} Disk: %-50s ${GRN}|${NC}\n" "${disk_gb}GB available" + echo -e "${GRN}+-------------------------------------------------------------+${NC}" +} + +# Show tier recommendation โ€” CRT monospace box +show_tier_recommendation() { + local tier=$1 + local model=$2 + local speed=$3 + local users=$4 + + echo "" + echo -e "${GRN}+-------------------------------------------------------------+${NC}" + echo -e "${GRN}|${NC} ${BGRN}CLASSIFICATION: TIER ${tier}${NC} ${GRN}|${NC}" + echo -e "${GRN}+-------------------------------------------------------------+${NC}" + printf "${GRN}|${NC} Model: %-49s ${GRN}|${NC}\n" "$model" + printf "${GRN}|${NC} Speed: %-49s ${GRN}|${NC}\n" "~${speed} tokens/second" + printf "${GRN}|${NC} Users: %-49s ${GRN}|${NC}\n" "${users} concurrent comfortably" + echo -e "${GRN}+-------------------------------------------------------------+${NC}" +} + +# Show installation menu +show_install_menu() { + echo "" + ai "Choose how deep you want to go. I can install everything, or keep it minimal." 
+ echo "" + echo -e " ${BGRN}[1]${NC} Full Stack ${AMB}(recommended โ€” just press Enter)${NC}" + echo " Chat + Voice + Workflows + Document Q&A + AI Agents" + echo " ~16GB download, all features enabled" + echo "" + echo -e " ${BGRN}[2]${NC} Core Only" + echo " Chat interface + API" + echo " ~12GB download, minimal footprint" + echo "" + echo -e " ${BGRN}[3]${NC} Custom" + echo " Choose exactly what you want" + echo "" + read -p " Select an option [1]: " -r INSTALL_CHOICE + INSTALL_CHOICE="${INSTALL_CHOICE:-1}" + echo "" + case "$INSTALL_CHOICE" in + 1) + signal "Acknowledged." + log "Selected: Full Stack" + ENABLE_VOICE=true + ENABLE_WORKFLOWS=true + ENABLE_RAG=true + ENABLE_OPENCLAW=true + ;; + 2) + signal "Acknowledged." + log "Selected: Core Only" + ;; + 3) + signal "Acknowledged." + log "Selected: Custom" + ;; + *) + warn "Invalid choice '$INSTALL_CHOICE', defaulting to Full Stack" + ENABLE_VOICE=true + ENABLE_WORKFLOWS=true + ENABLE_RAG=true + ENABLE_OPENCLAW=true + ;; + esac +} + +# Final success card โ€” dramatic "GATEWAY IS OPEN" finale +show_success_card() { + local webui_url=$1 + local dashboard_url=$2 + local ip_addr=$3 + + printf '\a' # terminal bell + echo "" + static_line + echo "" + echo -e " ${BGRN}T H E G A T E W A Y I S O P E N${NC}" + echo "" + static_line + echo "" + type_line_dramatic "DREAMGATE INSTALLATION COMPLETE." 
"$BGRN" + echo "" + echo -e "${GRN}+--------------------------------------------------------------+${NC}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + printf "${GRN}|${NC} Dashboard: ${WHT}%-43s${NC} ${GRN}|${NC}\n" "${dashboard_url}" + printf "${GRN}|${NC} Chat: ${WHT}%-43s${NC} ${GRN}|${NC}\n" "${webui_url}" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + if [[ -n "$ip_addr" ]]; then + echo -e "${GRN}|${NC} ${AMB}Access from other devices:${NC} ${GRN}|${NC}" + printf "${GRN}|${NC} ${WHT}http://%-51s${NC} ${GRN}|${NC}\n" "${ip_addr}:3001" + echo -e "${GRN}|${NC} ${GRN}|${NC}" + fi + echo -e "${GRN}+--------------------------------------------------------------+${NC}" + echo "" + type_line "Your data never leaves this machine." "$DGRN" 0.04 + type_line "No subscriptions. No limits. It's yours." "$DGRN" 0.04 + echo "" + echo -e " ${GRN}Elapsed: $(install_elapsed)${NC}" + echo "" +} diff --git a/dream-server/installers/macos.sh b/dream-server/installers/macos.sh new file mode 100644 index 000000000..dceab24eb --- /dev/null +++ b/dream-server/installers/macos.sh @@ -0,0 +1,130 @@ +#!/bin/bash +# Dream Server macOS installer (doctor/preflight MVP). + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." 
&& pwd)" +REPORT_FILE="${PREFLIGHT_REPORT_FILE:-/tmp/dream-server-preflight-macos.json}" +DOCTOR_FILE="${DOCTOR_REPORT_FILE:-/tmp/dream-server-doctor-macos.json}" +NO_DELEGATE=false +DELEGATE_LINUX_SIM=false +PASSTHROUGH_ARGS=() + +while [[ $# -gt 0 ]]; do + case "$1" in + --report) + REPORT_FILE="${2:-$REPORT_FILE}" + shift 2 + ;; + --doctor-report) + DOCTOR_FILE="${2:-$DOCTOR_FILE}" + shift 2 + ;; + --no-delegate) + NO_DELEGATE=true + shift + ;; + --delegate-linux-sim) + DELEGATE_LINUX_SIM=true + shift + ;; + *) + PASSTHROUGH_ARGS+=("$1") + shift + ;; + esac +done + +echo "Dream Server macOS installer (MVP)" +echo "" + +ARCH="$(uname -m 2>/dev/null || echo unknown)" +if [[ "$ARCH" == "arm64" ]]; then + echo "[OK] Apple Silicon detected: $ARCH" +else + echo "[WARN] Non-Apple-Silicon architecture detected: $ARCH" +fi + +if command -v docker >/dev/null 2>&1; then + if docker version >/dev/null 2>&1; then + echo "[OK] Docker Desktop engine reachable" + else + echo "[WARN] Docker installed but daemon not reachable" + fi +else + echo "[WARN] Docker CLI not found. Install Docker Desktop first." +fi + +if [[ -x "$SCRIPT_DIR/scripts/preflight-engine.sh" ]]; then + echo "" + echo "Running macOS preflight..." 
+ RAM_GB="$(sysctl -n hw.memsize 2>/dev/null | awk '{print int($1/1024/1024/1024)}' || true)" + if [[ -z "$RAM_GB" || "$RAM_GB" == "0" ]]; then + RAM_GB="$(grep MemTotal /proc/meminfo 2>/dev/null | awk '{print int($2/1024/1024)}' || echo 16)" + fi + DISK_GB="$(df -g "$HOME" 2>/dev/null | tail -1 | awk '{print $4}' || true)" + if [[ -z "$DISK_GB" || "$DISK_GB" == "0" ]]; then + DISK_GB="$(df -BG "$HOME" 2>/dev/null | tail -1 | awk '{gsub(/G/, "", $4); print int($4)}' || echo 50)" + fi + PREFLIGHT_ENV="$("$SCRIPT_DIR/scripts/preflight-engine.sh" \ + --report "$REPORT_FILE" \ + --tier "T1" \ + --ram-gb "$RAM_GB" \ + --disk-gb "$DISK_GB" \ + --gpu-backend "apple" \ + --gpu-vram-mb 0 \ + --gpu-name "Apple Silicon" \ + --platform-id "macos" \ + --compose-overlays "docker-compose.base.yml,docker-compose.amd.yml" \ + --script-dir "$SCRIPT_DIR" \ + --env)" + eval "$PREFLIGHT_ENV" + echo "[INFO] Preflight report: $REPORT_FILE" + echo "[INFO] Blockers: ${PREFLIGHT_BLOCKERS:-0} Warnings: ${PREFLIGHT_WARNINGS:-0}" + python3 - "$REPORT_FILE" << 'PY' +import json +import sys + +path = sys.argv[1] +try: + data = json.load(open(path, "r", encoding="utf-8")) +except Exception: + raise SystemExit(0) +for check in data.get("checks", []): + status = check.get("status") + if status not in {"blocker", "warn"}: + continue + label = "BLOCKER" if status == "blocker" else "WARN" + print(f" - {label}: {check.get('message','')}") + action = check.get("action", "") + if action: + print(f" Suggestion: {action}") +PY +fi + +if [[ -x "$SCRIPT_DIR/scripts/dream-doctor.sh" ]]; then + "$SCRIPT_DIR/scripts/dream-doctor.sh" "$DOCTOR_FILE" >/dev/null 2>&1 || true + echo "[INFO] Doctor report: $DOCTOR_FILE" +fi + +echo "" +echo "Current macOS status:" +echo " - Installer preflight is implemented." +echo " - Full macOS runtime path remains experimental." +echo " - Recommended production path: Linux or Windows+WSL2." 
+echo "" +echo "References:" +echo " - docs/SUPPORT-MATRIX.md" +echo " - docs/PREFLIGHT-ENGINE.md" +echo "" + +if [[ "${PREFLIGHT_BLOCKERS:-1}" -gt 0 ]]; then + exit 2 +fi + +if $DELEGATE_LINUX_SIM && ! $NO_DELEGATE; then + echo "Starting delegated installer dry-run to verify compose/runtime wiring..." + bash "$SCRIPT_DIR/install-core.sh" --dry-run --non-interactive --skip-docker ${PASSTHROUGH_ARGS[@]+"${PASSTHROUGH_ARGS[@]}"} || true +fi + +exit 0 diff --git a/dream-server/installers/phases/01-preflight.sh b/dream-server/installers/phases/01-preflight.sh new file mode 100644 index 000000000..cb4718d38 --- /dev/null +++ b/dream-server/installers/phases/01-preflight.sh @@ -0,0 +1,75 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Phase 01: Pre-flight Checks +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Root/OS/tools checks, existing installation check +# +# Expects: SCRIPT_DIR, INSTALL_DIR, LOG_FILE, INTERACTIVE, DRY_RUN, FORCE, +# show_phase(), ai(), ai_ok(), signal(), log(), warn(), error() +# Provides: OS sourced from /etc/os-release, OPTIONAL_TOOLS_MISSING +# +# Modder notes: +# Add new pre-flight checks (e.g., kernel version) here. +# ============================================================================ + +show_phase 1 6 "Pre-flight Checks" "~30 seconds" +ai "I'm scanning your system for required components..." + +# Root check +if [[ $EUID -eq 0 ]]; then + error "Do not run as root. Run as a regular user with sudo access." +fi + +# OS check +if [[ ! -f /etc/os-release ]]; then + error "Unsupported OS. This installer requires Linux." +fi + +source /etc/os-release +log "Detected OS: $PRETTY_NAME" + +# Check for required tools +if ! command -v curl &> /dev/null; then + error "curl is required but not installed. 
Install with: sudo apt install curl" +fi +log "curl: $(curl --version | head -1)" + +# Check optional tools (warn but don't fail) +OPTIONAL_TOOLS_MISSING="" +if ! command -v jq &> /dev/null; then + OPTIONAL_TOOLS_MISSING="$OPTIONAL_TOOLS_MISSING jq" +fi +if ! command -v rsync &> /dev/null; then + OPTIONAL_TOOLS_MISSING="$OPTIONAL_TOOLS_MISSING rsync" +fi +if [[ -n "$OPTIONAL_TOOLS_MISSING" ]]; then + warn "Optional tools missing:$OPTIONAL_TOOLS_MISSING" + echo " These are needed for update/backup scripts. Install with:" + echo " sudo apt install$OPTIONAL_TOOLS_MISSING" +fi + +# Check source files exist +if [[ ! -f "$SCRIPT_DIR/docker-compose.yml" ]] && [[ ! -f "$SCRIPT_DIR/docker-compose.base.yml" ]]; then + error "No compose files found in $SCRIPT_DIR. Please run from the dream-server directory." +fi + +# Check for existing installation +if [[ -d "$INSTALL_DIR" && "$FORCE" != "true" ]]; then + if $INTERACTIVE && ! $DRY_RUN; then + warn "Existing installation found at $INSTALL_DIR" + read -p " Overwrite and start fresh? [y/N] " -r + if [[ $REPLY =~ ^[Yy]$ ]]; then + log "User chose to overwrite existing installation" + FORCE=true + else + log "User chose not to overwrite. Exiting." + exit 0 + fi + else + error "Installation already exists at $INSTALL_DIR. Use --force to overwrite." + fi +fi + +ai_ok "Pre-flight checks passed." +signal "No cloud dependencies required for core operation." 
diff --git a/dream-server/installers/phases/02-detection.sh b/dream-server/installers/phases/02-detection.sh new file mode 100644 index 000000000..167cbfe93 --- /dev/null +++ b/dream-server/installers/phases/02-detection.sh @@ -0,0 +1,204 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Phase 02: System Detection +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Orchestrate hardware detection → tier assignment → compose config +# +# Expects: SCRIPT_DIR, LOG_FILE, TIER, GPU_BACKEND, GPU_VRAM, GPU_COUNT, +# INTERACTIVE, DRY_RUN, CAP_PROFILE_LOADED, detect_gpu(), +# load_capability_profile(), load_backend_contract(), +# fix_nvidia_secure_boot(), normalize_profile_tier(), tier_rank(), +# resolve_tier_config(), resolve_compose_config(), +# show_hardware_summary(), show_tier_recommendation(), +# chapter(), ai(), ai_ok(), log(), warn(), success() +# Provides: GPU_BACKEND, GPU_NAME, GPU_VRAM, GPU_COUNT, GPU_MEMORY_TYPE, +# TIER, TIER_NAME, LLM_MODEL, GGUF_FILE, GGUF_URL, MAX_CONTEXT, +# COMPOSE_FILE, COMPOSE_FLAGS, RAM_GB, DISK_AVAIL, BACKEND_ID, +# LLM_HEALTHCHECK_URL, LLM_PUBLIC_API_PORT, +# OPENCLAW_PROVIDER_NAME_DEFAULT, OPENCLAW_PROVIDER_URL_DEFAULT +# +# Modder notes: +# Change tier auto-detection thresholds or add new hardware classes here. 
+# ============================================================================ + +chapter "SYSTEM DETECTION" + +# Cloud mode: skip GPU detection entirely +if [[ "${DREAM_MODE:-local}" == "cloud" ]]; then + ai "Cloud mode - skipping GPU detection" + GPU_BACKEND="cpu" + GPU_NAME="Cloud (no local GPU)" + GPU_VRAM=0 + GPU_COUNT=0 + GPU_MEMORY_TYPE="none" + TIER="CLOUD" + RAM_KB=$(grep MemTotal /proc/meminfo | awk '{print $2}') + RAM_GB=$((RAM_KB / 1024 / 1024)) + DISK_AVAIL=$(df -BG "$HOME" | tail -1 | awk '{print $4}' | tr -d 'G') + BACKEND_ID="cpu" + LLM_HEALTHCHECK_URL="http://localhost:4000/health/readiness" + LLM_PUBLIC_API_PORT="4000" + OPENCLAW_PROVIDER_NAME_DEFAULT="litellm-cloud" + OPENCLAW_PROVIDER_URL_DEFAULT="http://litellm:4000/v1" + resolve_compose_config + resolve_tier_config + if [[ "$INTERACTIVE" == "true" ]]; then + success "Cloud mode: LLM via LiteLLM gateway (no GPU required)" + log " RAM: ${RAM_GB}GB, Disk: ${DISK_AVAIL}GB" + fi + # Skip rest of detection phase + return 0 2>/dev/null || true +fi + +ai "Reading hardware telemetry..." 
+ +load_capability_profile || true + +# RAM Detection +RAM_KB=$(grep MemTotal /proc/meminfo | awk '{print $2}') +RAM_GB=$((RAM_KB / 1024 / 1024)) +log "RAM: ${RAM_GB}GB" + +# Disk Detection +DISK_AVAIL=$(df -BG "$HOME" | tail -1 | awk '{print $4}' | tr -d 'G') +log "Available disk: ${DISK_AVAIL}GB" + +# GPU Detection +detect_gpu || true + +if [[ "${CAP_PROFILE_LOADED:-false}" == "true" ]]; then + case "${CAP_LLM_BACKEND:-}" in + amd) GPU_BACKEND="amd" ;; + *) GPU_BACKEND="nvidia" ;; + esac + [[ -n "${CAP_GPU_MEMORY_TYPE:-}" ]] && GPU_MEMORY_TYPE="${CAP_GPU_MEMORY_TYPE}" + [[ -n "${CAP_GPU_NAME:-}" ]] && GPU_NAME="${CAP_GPU_NAME}" + [[ -n "${CAP_GPU_VRAM_MB:-}" ]] && GPU_VRAM="${CAP_GPU_VRAM_MB}" + [[ -n "${CAP_GPU_COUNT:-}" ]] && GPU_COUNT="${CAP_GPU_COUNT}" + log "Capabilities override detection: backend=${GPU_BACKEND}, memory=${GPU_MEMORY_TYPE}, tier=${CAP_RECOMMENDED_TIER:-unknown}" +fi + +BACKEND_ID="$GPU_BACKEND" +if [[ "${CAP_LLM_BACKEND:-}" == "cpu" || "${CAP_LLM_BACKEND:-}" == "apple" ]]; then + BACKEND_ID="${CAP_LLM_BACKEND}" +fi +load_backend_contract "$BACKEND_ID" || true +LLM_HEALTHCHECK_URL="${BACKEND_PUBLIC_HEALTH_URL:-http://localhost:8080/health}" +LLM_PUBLIC_API_PORT="${BACKEND_PUBLIC_API_PORT:-8080}" +OPENCLAW_PROVIDER_NAME_DEFAULT="${BACKEND_PROVIDER_NAME:-local-llama}" +OPENCLAW_PROVIDER_URL_DEFAULT="${BACKEND_PROVIDER_URL:-http://llama-server:8080/v1}" + +#----------------------------------------------------------------------------- +# Secure Boot + NVIDIA auto-fix +#----------------------------------------------------------------------------- +# If detect_gpu found no working GPU, check if it's a fixable driver/Secure Boot issue +# (Only for NVIDIA - AMD APU is handled above) +if [[ $GPU_COUNT -eq 0 && "$GPU_BACKEND" != "amd" ]] && ! 
$DRY_RUN; then + fix_nvidia_secure_boot || true +fi + +# NVIDIA Driver Compatibility Check +# llama-server (CUDA) requires driver >= 570 +if [[ $GPU_COUNT -gt 0 && "$GPU_BACKEND" == "nvidia" ]]; then + DRIVER_VERSION="" + if raw_driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null); then + DRIVER_VERSION=$(echo "$raw_driver" | head -1 | cut -d. -f1) + fi + if [[ -n "$DRIVER_VERSION" && "$DRIVER_VERSION" =~ ^[0-9]+$ ]]; then + log "NVIDIA driver: $DRIVER_VERSION" + if [[ "$DRIVER_VERSION" -lt "$MIN_DRIVER_VERSION" ]]; then + ai_bad "NVIDIA driver $DRIVER_VERSION is too old. llama-server (CUDA) requires driver >= $MIN_DRIVER_VERSION." + ai "Attempting to install a compatible driver..." + if ! $DRY_RUN; then + if command -v ubuntu-drivers &> /dev/null; then + sudo ubuntu-drivers install nvidia:${MIN_DRIVER_VERSION}-server 2>>"$LOG_FILE" || \ + sudo apt-get install -y nvidia-driver-${MIN_DRIVER_VERSION} 2>>"$LOG_FILE" || true + else + sudo apt-get install -y nvidia-driver-${MIN_DRIVER_VERSION} 2>>"$LOG_FILE" || true + fi + # Check if upgrade succeeded + if dpkg -l "nvidia-driver-${MIN_DRIVER_VERSION}"* 2>/dev/null | grep -q "^ii"; then + ai_ok "NVIDIA driver ${MIN_DRIVER_VERSION} installed." + ai_warn "A REBOOT is required before continuing." + ai "After rebooting, re-run this installer. It will pick up where it left off." + echo "" + if $INTERACTIVE; then + read -p " Reboot now? [Y/n] " -r + if [[ ! $REPLY =~ ^[Nn]$ ]]; then + sudo reboot + fi + fi + error "Reboot required to load NVIDIA driver ${MIN_DRIVER_VERSION}. Re-run install.sh after rebooting." + else + ai_bad "Driver install failed. Please install NVIDIA driver >= ${MIN_DRIVER_VERSION} manually." + ai " Try: sudo apt install nvidia-driver-${MIN_DRIVER_VERSION}" + error "Compatible NVIDIA driver required." 
+ fi + else + log "[DRY RUN] Would install nvidia-driver-${MIN_DRIVER_VERSION}" + fi + else + ai_ok "NVIDIA driver $DRIVER_VERSION (>= $MIN_DRIVER_VERSION required)" + fi + else + ai_warn "Could not determine driver version - continuing anyway" + fi +fi + +# Auto-detect tier if not specified +if [[ -z "$TIER" ]]; then + PROFILE_TIER="$(normalize_profile_tier "${CAP_RECOMMENDED_TIER:-}")" + if [[ -n "$PROFILE_TIER" ]]; then + TIER="$PROFILE_TIER" + elif [[ "$GPU_BACKEND" == "amd" && "$GPU_MEMORY_TYPE" == "unified" ]]; then + # Strix Halo binary tier system + unified_gb=$((GPU_VRAM / 1024)) + if [[ $unified_gb -ge 90 ]]; then + TIER="SH_LARGE" + else + TIER="SH_COMPACT" + fi + elif [[ $GPU_VRAM -ge 90000 ]]; then + TIER="NV_ULTRA" + elif [[ $GPU_COUNT -ge 2 ]] || [[ $GPU_VRAM -ge 40000 ]]; then + TIER=4 + elif [[ $GPU_VRAM -ge 20000 ]] || [[ $RAM_GB -ge 96 ]]; then + TIER=3 + elif [[ $GPU_VRAM -ge 12000 ]] || [[ $RAM_GB -ge 48 ]]; then + TIER=2 + else + TIER=1 + fi + log "Auto-detected tier: $TIER" +else + log "Using specified tier: $TIER" +fi + +# Resolve compose overlay files +resolve_compose_config + +# Resolve tier → model/GGUF/context +resolve_tier_config + +# Display hardware summary with nice formatting +CPU_INFO=$(grep "model name" /proc/cpuinfo 2>/dev/null | head -1 | cut -d: -f2 | xargs || echo "Unknown") +if [[ "$INTERACTIVE" == "true" ]]; then + show_hardware_summary "$GPU_NAME" "$((GPU_VRAM / 1024))" "$CPU_INFO" "$RAM_GB" "$DISK_AVAIL" + + # Estimate tokens/sec and concurrent users based on tier + case $TIER in + NV_ULTRA) SPEED_EST=50; USERS_EST="10-20" ;; + SH_LARGE) SPEED_EST=40; USERS_EST="5-10" ;; + SH_COMPACT) SPEED_EST=80; USERS_EST="5-10" ;; + 1) SPEED_EST=25; USERS_EST="1-2" ;; + 2) SPEED_EST=45; USERS_EST="3-5" ;; + 3) SPEED_EST=55; USERS_EST="5-8" ;; + 4) SPEED_EST=40; USERS_EST="10-15" ;; + esac + show_tier_recommendation "$TIER" "$LLM_MODEL" "$SPEED_EST" "$USERS_EST" +else + success "Configuration: Tier $TIER ($TIER_NAME)" + log " Model: 
$LLM_MODEL" + log " Context: ${MAX_CONTEXT} tokens" +fi diff --git a/dream-server/installers/phases/03-features.sh b/dream-server/installers/phases/03-features.sh new file mode 100644 index 000000000..373a548fb --- /dev/null +++ b/dream-server/installers/phases/03-features.sh @@ -0,0 +1,58 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Phase 03: Feature Selection +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Interactive feature selection menu +# +# Expects: INTERACTIVE, DRY_RUN, TIER, ENABLE_VOICE, ENABLE_WORKFLOWS, +# ENABLE_RAG, ENABLE_OPENCLAW, show_phase(), show_install_menu(), +# log(), warn(), signal() +# Provides: ENABLE_VOICE, ENABLE_WORKFLOWS, ENABLE_RAG, ENABLE_OPENCLAW, +# OPENCLAW_CONFIG +# +# Modder notes: +# Add new optional features to the Custom menu here. +# ============================================================================ + +if $INTERACTIVE && ! $DRY_RUN; then + show_phase 2 6 "Feature Selection" "~1 minute" + show_install_menu + + # Only show individual feature prompts for Custom installs + if [[ "${INSTALL_CHOICE:-1}" == "3" ]]; then + read -p " Enable voice (Whisper STT + Kokoro TTS)? [Y/n] " -r + echo + [[ $REPLY =~ ^[Nn]$ ]] || ENABLE_VOICE=true + + read -p " Enable n8n workflow automation? [Y/n] " -r + echo + [[ $REPLY =~ ^[Nn]$ ]] || ENABLE_WORKFLOWS=true + + read -p " Enable Qdrant vector database (for RAG)? [Y/n] " -r + echo + [[ $REPLY =~ ^[Nn]$ ]] || ENABLE_RAG=true + + read -p " Enable OpenClaw AI agent framework? 
[y/N] " -r + echo + [[ $REPLY =~ ^[Yy]$ ]] && ENABLE_OPENCLAW=true + fi +fi + +# All services are core - no profiles needed (compose profiles removed) + +# Select tier-appropriate OpenClaw config +if [[ "$ENABLE_OPENCLAW" == "true" ]]; then + case $TIER in + NV_ULTRA) OPENCLAW_CONFIG="pro.json" ;; + SH_LARGE|SH_COMPACT) OPENCLAW_CONFIG="openclaw-strix-halo.json" ;; + 1) OPENCLAW_CONFIG="minimal.json" ;; + 2) OPENCLAW_CONFIG="entry.json" ;; + 3) OPENCLAW_CONFIG="prosumer.json" ;; + 4) OPENCLAW_CONFIG="pro.json" ;; + *) OPENCLAW_CONFIG="prosumer.json" ;; + esac + log "OpenClaw config: $OPENCLAW_CONFIG (matched to Tier $TIER)" +fi + +log "All services enabled (core install)" diff --git a/dream-server/installers/phases/04-requirements.sh b/dream-server/installers/phases/04-requirements.sh new file mode 100644 index 000000000..8ab4afffc --- /dev/null +++ b/dream-server/installers/phases/04-requirements.sh @@ -0,0 +1,155 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Phase 04: Requirements Check +# ============================================================================ +# Part of: installers/phases/ +# Purpose: RAM, disk, GPU, and port availability checks +# +# Expects: SCRIPT_DIR, LOG_FILE, TIER, RAM_GB, DISK_AVAIL, GPU_BACKEND, +# GPU_VRAM, GPU_NAME, GPU_COUNT, INTERACTIVE, DRY_RUN, +# PREFLIGHT_REPORT_FILE, CAP_PLATFORM_ID, CAP_COMPOSE_OVERLAYS, +# ENABLE_VOICE, ENABLE_WORKFLOWS, ENABLE_RAG, +# tier_rank(), chapter(), ai_ok(), ai_bad(), ai_warn(), log(), warn() +# Provides: REQUIREMENTS_MET, TIER_RANK +# +# Modder notes: +# Change minimum RAM/disk thresholds per tier here. 
+# ============================================================================ + +chapter "REQUIREMENTS CHECK" + +REQUIREMENTS_MET=true +TIER_RANK="$(tier_rank "$TIER")" + +# Capability-aware preflight checks +if [[ -x "$SCRIPT_DIR/scripts/preflight-engine.sh" ]]; then + PREFLIGHT_ENV="$("$SCRIPT_DIR/scripts/preflight-engine.sh" \ + --report "$PREFLIGHT_REPORT_FILE" \ + --tier "$TIER" \ + --ram-gb "$RAM_GB" \ + --disk-gb "$DISK_AVAIL" \ + --gpu-backend "$GPU_BACKEND" \ + --gpu-vram-mb "$GPU_VRAM" \ + --gpu-name "$GPU_NAME" \ + --platform-id "${CAP_PLATFORM_ID:-linux}" \ + --compose-overlays "${CAP_COMPOSE_OVERLAYS:-}" \ + --script-dir "$SCRIPT_DIR" \ + --env 2>>"$LOG_FILE")" + eval "$PREFLIGHT_ENV" + + log "Preflight report: $PREFLIGHT_REPORT_FILE" + if [[ "${PREFLIGHT_BLOCKERS:-0}" -gt 0 ]]; then + REQUIREMENTS_MET=false + ai_bad "Preflight found ${PREFLIGHT_BLOCKERS} blocker(s) and ${PREFLIGHT_WARNINGS:-0} warning(s)." + python3 - "$PREFLIGHT_REPORT_FILE" << 'PY' +import json +import sys + +path = sys.argv[1] +try: + data = json.load(open(path, "r", encoding="utf-8")) +except Exception: + sys.exit(0) +for check in data.get("checks", []): + if check.get("status") != "blocker": + continue + message = check.get("message", "").strip() + action = check.get("action", "").strip() + if message: + print(f" - BLOCKER: {message}") + if action: + print(f" Fix: {action}") +PY + else + ai_ok "Preflight passed with ${PREFLIGHT_WARNINGS:-0} warning(s)." 
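+ fi + + if [[ "${PREFLIGHT_WARNINGS:-0}" -gt 0 ]]; then + python3 - "$PREFLIGHT_REPORT_FILE" << 'PY' +import json +import sys + +path = sys.argv[1] +try: + data = json.load(open(path, "r", encoding="utf-8")) +except Exception: + sys.exit(0) +for check in data.get("checks", []): + if check.get("status") != "warn": + continue + message = check.get("message", "").strip() + action = check.get("action", "").strip() + if message: + print(f" - WARN: {message}") + if action: + print(f" Suggestion: {action}") +PY + fi +else + warn "Preflight engine missing, using legacy requirement checks." + case $TIER in + NV_ULTRA) MIN_RAM=96 ;; + SH_LARGE) MIN_RAM=96 ;; + SH_COMPACT) MIN_RAM=64 ;; + 4) MIN_RAM=64 ;; + 3) MIN_RAM=48 ;; + 2) MIN_RAM=32 ;; + *) MIN_RAM=16 ;; + esac + if [[ $RAM_GB -lt $MIN_RAM ]]; then + warn "RAM: ${RAM_GB}GB available, ${MIN_RAM}GB recommended for Tier $TIER" + else + ai_ok "RAM: ${RAM_GB}GB (recommended: ${MIN_RAM}GB+)" + fi + case $TIER in + 1) MIN_DISK=30 ;; + 2) MIN_DISK=50 ;; + 3) MIN_DISK=80 ;; + 4) MIN_DISK=150 ;; + *) MIN_DISK=50 ;; + esac + if [[ $DISK_AVAIL -lt $MIN_DISK ]]; then + warn "Disk: ${DISK_AVAIL}GB available, ${MIN_DISK}GB minimum required for Tier $TIER" + REQUIREMENTS_MET=false + else + ai_ok "Disk: ${DISK_AVAIL}GB available (minimum: ${MIN_DISK}GB for Tier $TIER)" + fi + if [[ "$TIER_RANK" -ge 2 && "$GPU_BACKEND" != "amd" && $GPU_VRAM -lt 10000 ]]; then + warn "GPU: Tier $TIER requires dedicated NVIDIA GPU with 12GB+ VRAM" + else + ai_ok "GPU: Detected $GPU_NAME" + fi +fi + +# Port availability check (handles IPv4 and IPv6) +check_port() { + local port=$1 + if command -v ss &> /dev/null; then + ss -tln 2>/dev/null | grep -qE ":${port}(\s|$)" && return 1 + elif command -v netstat &> /dev/null; then + netstat -tln 2>/dev/null | grep -qE ":${port}(\s|$)" && return 1 + fi + return 0 +} + +PORTS_TO_CHECK="${SERVICE_PORTS[llama-server]:-8080} ${SERVICE_PORTS[open-webui]:-3000}" +[[ "$ENABLE_VOICE" == "true" ]] && PORTS_TO_CHECK="$PORTS_TO_CHECK 
${SERVICE_PORTS[whisper]:-9000} ${SERVICE_PORTS[tts]:-8880}" +[[ "$ENABLE_WORKFLOWS" == "true" ]] && PORTS_TO_CHECK="$PORTS_TO_CHECK ${SERVICE_PORTS[n8n]:-5678}" +[[ "$ENABLE_RAG" == "true" ]] && PORTS_TO_CHECK="$PORTS_TO_CHECK ${SERVICE_PORTS[qdrant]:-6333}" + +for port in $PORTS_TO_CHECK; do + if ! check_port $port; then + warn "Port $port is already in use" + REQUIREMENTS_MET=false + fi +done + +if [[ "$REQUIREMENTS_MET" != "true" ]]; then + warn "Some requirements not met. Installation may have limited functionality." + if $INTERACTIVE && ! $DRY_RUN; then + read -p " Continue anyway? [y/N] " -r + [[ ! $REPLY =~ ^[Yy]$ ]] && exit 1 + elif $DRY_RUN; then + log "[DRY RUN] Would prompt to continue despite unmet requirements" + fi +fi diff --git a/dream-server/installers/phases/05-docker.sh b/dream-server/installers/phases/05-docker.sh new file mode 100644 index 000000000..4d183d14a --- /dev/null +++ b/dream-server/installers/phases/05-docker.sh @@ -0,0 +1,109 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Phase 05: Docker Setup +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Install Docker, Docker Compose, and NVIDIA Container Toolkit +# +# Expects: SKIP_DOCKER, DRY_RUN, INTERACTIVE, GPU_COUNT, GPU_BACKEND, +# LOG_FILE, MIN_DRIVER_VERSION, +# show_phase(), ai(), ai_ok(), ai_warn(), log(), warn(), error() +# Provides: DOCKER_CMD, DOCKER_COMPOSE_CMD +# +# Modder notes: +# Change Docker installation method or add Podman support here. +# ============================================================================ + +show_phase 3 6 "Docker Setup" "~2 minutes" +ai "Preparing container runtime..." 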
PORTS_TO_CHECK="$PORTS_TO_CHECK ${SERVICE_PORTS[whisper]:-9000} ${SERVICE_PORTS[tts]:-8880}" +[[ "$ENABLE_WORKFLOWS" == "true" ]] && PORTS_TO_CHECK="$PORTS_TO_CHECK ${SERVICE_PORTS[n8n]:-5678}" +[[ "$ENABLE_RAG" == "true" ]] && PORTS_TO_CHECK="$PORTS_TO_CHECK ${SERVICE_PORTS[qdrant]:-6333}" + +for port in $PORTS_TO_CHECK; do + if ! check_port $port; then + warn "Port $port is already in use" + REQUIREMENTS_MET=false + fi +done + +if [[ "$REQUIREMENTS_MET" != "true" ]]; then + warn "Some requirements not met. Installation may have limited functionality." + if $INTERACTIVE && ! $DRY_RUN; then + read -p " Continue anyway? [y/N] " -r + [[ ! $REPLY =~ ^[Yy]$ ]] && exit 1 + elif $DRY_RUN; then + log "[DRY RUN] Would prompt to continue despite unmet requirements" + fi +fi diff --git a/dream-server/installers/phases/05-docker.sh b/dream-server/installers/phases/05-docker.sh new file mode 100644 index 000000000..4d183d14a --- /dev/null +++ b/dream-server/installers/phases/05-docker.sh @@ -0,0 +1,109 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer โ€” Phase 05: Docker Setup +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Install Docker, Docker Compose, and NVIDIA Container Toolkit +# +# Expects: SKIP_DOCKER, DRY_RUN, INTERACTIVE, GPU_COUNT, GPU_BACKEND, +# LOG_FILE, MIN_DRIVER_VERSION, +# show_phase(), ai(), ai_ok(), ai_warn(), log(), warn(), error() +# Provides: DOCKER_CMD, DOCKER_COMPOSE_CMD +# +# Modder notes: +# Change Docker installation method or add Podman support here. +# ============================================================================ + +show_phase 3 6 "Docker Setup" "~2 minutes" +ai "Preparing container runtime..." 
+ +if [[ "$SKIP_DOCKER" == "true" ]]; then + log "Skipping Docker installation (--skip-docker)" +elif command -v docker &> /dev/null; then + ai_ok "Docker already installed: $(docker --version)" +else + ai "Installing Docker..." + + if $DRY_RUN; then + log "[DRY RUN] Would install Docker via official script" + else + if ! curl -fsSL https://get.docker.com | sh; then + error "Docker installation failed. Check network connectivity and try again." + fi + sudo usermod -aG docker $USER + + # Check if we need to use newgrp or restart + if ! groups | grep -q docker; then + warn "Docker installed! Group membership requires re-login." + warn "Option 1: Log out and back in, then re-run this script with --skip-docker" + warn "Option 2: Run 'newgrp docker' in a new terminal, then re-run" + echo "" + read -p " Try to continue with 'sudo docker' for now? [Y/n] " -r + if [[ ! $REPLY =~ ^[Nn]$ ]]; then + # Use sudo for remaining docker commands in this session + DOCKER_CMD="sudo docker" + DOCKER_COMPOSE_CMD="sudo docker compose" + else + log "Please re-run after logging out and back in." + exit 0 + fi + fi + fi +fi + +# Set docker command (use sudo if needed) +DOCKER_CMD="${DOCKER_CMD:-docker}" +DOCKER_COMPOSE_CMD="${DOCKER_COMPOSE_CMD:-docker compose}" + +# Docker Compose check (v2 preferred, v1 fallback) +if $DOCKER_COMPOSE_CMD version &> /dev/null 2>&1; then + ai_ok "Docker Compose v2 available" +elif command -v docker-compose &> /dev/null; then + DOCKER_COMPOSE_CMD="${DOCKER_CMD%-*}-compose" + [[ "$DOCKER_CMD" == "sudo docker" ]] && DOCKER_COMPOSE_CMD="sudo docker-compose" + ai_ok "Docker Compose v1 available (using docker-compose)" +else + if ! $DRY_RUN; then + ai "Installing Docker Compose plugin..." 
+ sudo apt-get update && sudo apt-get install -y docker-compose-plugin + fi +fi + +# NVIDIA Container Toolkit (skip for AMD - uses /dev/dri + /dev/kfd passthrough) +if [[ $GPU_COUNT -gt 0 && "$GPU_BACKEND" == "nvidia" ]]; then + if command -v nvidia-container-cli &> /dev/null 2>&1; then + ai_ok "NVIDIA Container Toolkit installed" + # Always regenerate CDI spec - driver version may have changed since last run + if command -v nvidia-ctk &>/dev/null && ! $DRY_RUN; then + sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>>"$LOG_FILE" || true + fi + else + ai "Installing NVIDIA Container Toolkit..." + if ! $DRY_RUN; then + # Add NVIDIA GPG key + curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg 2>/dev/null || true + # Use NVIDIA's current generic deb repo (per-distro URLs were deprecated) + curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \ + sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \ + sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list > /dev/null + # Verify we got a valid repo file, not an HTML 404 + if grep -q '/dev/null; then + warn "Failed to download NVIDIA Container Toolkit repo list. Trying fallback..." + echo "deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/\$(ARCH) /" | \ + sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list > /dev/null + fi + sudo apt-get update + if ! sudo apt-get install -y nvidia-container-toolkit; then + error "Failed to install NVIDIA Container Toolkit. Check network connectivity and GPU drivers." 
+ fi + sudo nvidia-ctk runtime configure --runtime=docker + sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml 2>>"$LOG_FILE" || true + sudo systemctl restart docker + fi + if command -v nvidia-container-cli &> /dev/null 2>&1; then + ai_ok "NVIDIA Container Toolkit installed" + else + $DRY_RUN && ai_ok "[DRY RUN] Would install NVIDIA Container Toolkit" || error "NVIDIA Container Toolkit installation failed - nvidia-container-cli not found after install." + fi + fi +fi diff --git a/dream-server/installers/phases/06-directories.sh b/dream-server/installers/phases/06-directories.sh new file mode 100644 index 000000000..3a27e6ddc --- /dev/null +++ b/dream-server/installers/phases/06-directories.sh @@ -0,0 +1,342 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Phase 06: Directories & Configuration +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Create directories, copy source files, generate .env, configure +# OpenClaw, SearXNG, and validate .env schema +# +# Expects: SCRIPT_DIR, INSTALL_DIR, LOG_FILE, DRY_RUN, INTERACTIVE, +# TIER, TIER_NAME, VERSION, GPU_BACKEND, SYSTEM_TZ, +# LLM_MODEL, MAX_CONTEXT, GGUF_FILE, COMPOSE_FLAGS, +# ENABLE_VOICE, ENABLE_WORKFLOWS, ENABLE_RAG, ENABLE_OPENCLAW, +# OPENCLAW_CONFIG, OPENCLAW_PROVIDER_NAME_DEFAULT, +# OPENCLAW_PROVIDER_URL_DEFAULT, +# chapter(), ai(), ai_ok(), ai_warn(), log(), warn(), error() +# Provides: WEBUI_SECRET, N8N_PASS, LITELLM_KEY, LIVEKIT_SECRET, +# DASHBOARD_API_KEY, OPENCODE_SERVER_PASSWORD, OPENCLAW_TOKEN, +# OPENCLAW_PROVIDER_NAME, OPENCLAW_PROVIDER_URL, OPENCLAW_MODEL, +# OPENCLAW_CONTEXT +# +# Modder notes: +# This is the largest phase. Modify .env generation, add new config files, +# or change directory layout here. 
+# ============================================================================ + +chapter "SETTING UP INSTALLATION" + +if $DRY_RUN; then + log "[DRY RUN] Would create: $INSTALL_DIR/{config,data,models}" + log "[DRY RUN] Would copy compose files ($COMPOSE_FLAGS) and source tree" + log "[DRY RUN] Would generate .env with secrets (WEBUI_SECRET, N8N_PASS, LITELLM_KEY, etc.)" + log "[DRY RUN] Would generate SearXNG config with randomized secret key" + [[ "$ENABLE_OPENCLAW" == "true" ]] && log "[DRY RUN] Would configure OpenClaw (model: $LLM_MODEL, config: ${OPENCLAW_CONFIG:-default})" + log "[DRY RUN] Would validate .env against schema" +else + # Create directories + mkdir -p "$INSTALL_DIR"/{config,data,models} + mkdir -p "$INSTALL_DIR"/data/{open-webui,whisper,tts,n8n,qdrant,models} + mkdir -p "$INSTALL_DIR"/config/{n8n,litellm,openclaw,searxng} + + # Copy entire source tree to install dir (skip if same directory) + if [[ "$SCRIPT_DIR" != "$INSTALL_DIR" ]]; then + ai "Copying source files to $INSTALL_DIR..." 
+ if command -v rsync &>/dev/null; then + rsync -a \ + --exclude='.git' \ + --exclude='data/' \ + --exclude='logs/' \ + --exclude='models/' \ + --exclude='.env' \ + --exclude='node_modules/' \ + --exclude='dist/' \ + --exclude='*.log' \ + --exclude='.current-mode' \ + --exclude='.profiles' \ + --exclude='.target-model' \ + --exclude='.target-quantization' \ + --exclude='.offline-mode' \ + "$SCRIPT_DIR/" "$INSTALL_DIR/" + else + # Fallback: cp -r everything, then remove runtime artifacts + cp -r "$SCRIPT_DIR"/* "$INSTALL_DIR/" 2>/dev/null || true + cp "$SCRIPT_DIR"/.gitignore "$INSTALL_DIR/" 2>/dev/null || true + rm -rf "$INSTALL_DIR/.git" 2>/dev/null || true + fi + # Ensure scripts are executable + chmod +x "$INSTALL_DIR"/*.sh "$INSTALL_DIR"/scripts/*.sh "$INSTALL_DIR"/dream-cli 2>/dev/null || true + ai_ok "Source files installed" + else + log "Running in-place (source == install dir), skipping file copy" + fi + + # Select tier-appropriate OpenClaw config + if [[ "$ENABLE_OPENCLAW" == "true" && -n "$OPENCLAW_CONFIG" ]]; then + OPENCLAW_MODEL="$LLM_MODEL" + OPENCLAW_CONTEXT=$MAX_CONTEXT + + if [[ -f "$INSTALL_DIR/config/openclaw/$OPENCLAW_CONFIG" ]]; then + cp "$INSTALL_DIR/config/openclaw/$OPENCLAW_CONFIG" "$INSTALL_DIR/config/openclaw/openclaw.json" + elif [[ -f "$SCRIPT_DIR/config/openclaw/$OPENCLAW_CONFIG" ]]; then + cp "$SCRIPT_DIR/config/openclaw/$OPENCLAW_CONFIG" "$INSTALL_DIR/config/openclaw/openclaw.json" + else + warn "OpenClaw config $OPENCLAW_CONFIG not found, using default" + cp "$SCRIPT_DIR/config/openclaw/openclaw.json.example" "$INSTALL_DIR/config/openclaw/openclaw.json" 2>/dev/null || true + fi + # Resolve provider name/URL before any sed replacements that depend on them + OPENCLAW_PROVIDER_NAME="${OPENCLAW_PROVIDER_NAME_DEFAULT}" + OPENCLAW_PROVIDER_URL="${OPENCLAW_PROVIDER_URL_DEFAULT}" + + # Replace model and provider placeholders to match what the inference backend actually serves + sed -i "s|__LLM_MODEL__|${OPENCLAW_MODEL}|g" 
"$INSTALL_DIR/config/openclaw/openclaw.json" + sed -i "s|Qwen/Qwen2.5-[^\"]*|${OPENCLAW_MODEL}|g" "$INSTALL_DIR/config/openclaw/openclaw.json" + sed -i "s|local-ollama|${OPENCLAW_PROVIDER_NAME}|g" "$INSTALL_DIR/config/openclaw/openclaw.json" + log "Installed OpenClaw config: $OPENCLAW_CONFIG -> openclaw.json (model: $OPENCLAW_MODEL)" + mkdir -p "$INSTALL_DIR/data/openclaw/home/agents/main/sessions" + # Generate OpenClaw home config with local llama-server provider + OPENCLAW_TOKEN=$(openssl rand -hex 24 2>/dev/null || head -c 24 /dev/urandom | xxd -p) + + cat > "$INSTALL_DIR/data/openclaw/home/openclaw.json" << OCLAW_EOF +{ + "models": { + "providers": { + "${OPENCLAW_PROVIDER_NAME}": { + "baseUrl": "${OPENCLAW_PROVIDER_URL}", + "apiKey": "none", + "api": "openai-completions", + "models": [ + { + "id": "${OPENCLAW_MODEL}", + "name": "Dream Server LLM (Local)", + "reasoning": false, + "input": ["text"], + "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0}, + "contextWindow": ${OPENCLAW_CONTEXT}, + "maxTokens": 8192, + "compat": { + "supportsStore": false, + "supportsDeveloperRole": false, + "supportsReasoningEffort": false, + "maxTokensField": "max_tokens" + } + } + ] + } + } + }, + "agents": { + "defaults": { + "model": {"primary": "${OPENCLAW_PROVIDER_NAME}/${OPENCLAW_MODEL}"}, + "models": {"${OPENCLAW_PROVIDER_NAME}/${OPENCLAW_MODEL}": {}}, + "compaction": {"mode": "safeguard"}, + "subagents": {"maxConcurrent": 20, "model": "${OPENCLAW_PROVIDER_NAME}/${OPENCLAW_MODEL}"} + } + }, + "commands": {"native": "auto", "nativeSkills": "auto"}, + "gateway": { + "mode": "local", + "bind": "lan", + "controlUi": {"allowInsecureAuth": true}, + "auth": {"mode": "token", "token": "${OPENCLAW_TOKEN}"} + } +} +OCLAW_EOF + # Generate agent auth-profiles.json for llama-server provider + mkdir -p "$INSTALL_DIR/data/openclaw/home/agents/main/agent" + cat > "$INSTALL_DIR/data/openclaw/home/agents/main/agent/auth-profiles.json" << AUTH_EOF +{ + "version": 1, + 
"profiles": { + "${OPENCLAW_PROVIDER_NAME}:default": { + "type": "api_key", + "provider": "${OPENCLAW_PROVIDER_NAME}", + "key": "none" + } + }, + "lastGood": {"${OPENCLAW_PROVIDER_NAME}": "${OPENCLAW_PROVIDER_NAME}:default"}, + "usageStats": {} +} +AUTH_EOF + cat > "$INSTALL_DIR/data/openclaw/home/agents/main/agent/models.json" << MODELS_EOF +{ + "providers": { + "${OPENCLAW_PROVIDER_NAME}": { + "baseUrl": "${OPENCLAW_PROVIDER_URL}", + "apiKey": "none", + "api": "openai-completions", + "models": [ + { + "id": "${OPENCLAW_MODEL}", + "name": "Dream Server LLM (Local)", + "reasoning": false, + "input": ["text"], + "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0}, + "contextWindow": ${OPENCLAW_CONTEXT}, + "maxTokens": 8192, + "compat": { + "supportsStore": false, + "supportsDeveloperRole": false, + "supportsReasoningEffort": false, + "maxTokensField": "max_tokens" + } + } + ] + } + } +} +MODELS_EOF + log "Generated OpenClaw home config (model: $OPENCLAW_MODEL, gateway token set)" + # Create workspace directory (must exist before Docker Compose, + # otherwise Docker auto-creates it as root and the container can't write to it) + mkdir -p "$INSTALL_DIR/config/openclaw/workspace/memory" + # Copy workspace personality files (Todd identity, system knowledge, etc.) 
+ # Exclude .git and .openclaw dirs - those are runtime/dev artifacts + if [[ -d "$SCRIPT_DIR/config/openclaw/workspace" ]]; then + if command -v rsync &>/dev/null; then + rsync -a --exclude='.git' --exclude='.openclaw' --exclude='.gitkeep' \ + "$SCRIPT_DIR/config/openclaw/workspace/" "$INSTALL_DIR/config/openclaw/workspace/" + else + cp -r "$SCRIPT_DIR/config/openclaw/workspace"/* "$INSTALL_DIR/config/openclaw/workspace/" 2>/dev/null || true + rm -rf "$INSTALL_DIR/config/openclaw/workspace/.git" 2>/dev/null || true + rm -rf "$INSTALL_DIR/config/openclaw/workspace/.openclaw" 2>/dev/null || true + fi + log "Installed OpenClaw workspace files (agent personality)" + fi + # OpenClaw container runs as node (uid 1000) - fix ownership + chown -R 1000:1000 "$INSTALL_DIR/data/openclaw" "$INSTALL_DIR/config/openclaw/workspace" 2>/dev/null || true + fi + + # Generate secure secrets + WEBUI_SECRET=$(openssl rand -hex 32 2>/dev/null || head -c 32 /dev/urandom | xxd -p) + N8N_PASS=$(openssl rand -base64 16 2>/dev/null || head -c 16 /dev/urandom | base64) + LITELLM_KEY="sk-dream-$(openssl rand -hex 16 2>/dev/null || head -c 16 /dev/urandom | xxd -p)" + LIVEKIT_SECRET=$(openssl rand -base64 32 2>/dev/null || head -c 32 /dev/urandom | base64) + DASHBOARD_API_KEY=$(openssl rand -hex 32 2>/dev/null || head -c 32 /dev/urandom | xxd -p) + OPENCODE_SERVER_PASSWORD= + + # Generate .env file + cat > "$INSTALL_DIR/.env" << ENV_EOF +# Dream Server Configuration - ${TIER_NAME} Edition +# Generated by installer v${VERSION} on $(date -Iseconds) +# Tier: ${TIER} (${TIER_NAME}) + +#=== LLM Backend Mode === +DREAM_MODE=${DREAM_MODE:-local} +LLM_API_URL=$(if [[ "${DREAM_MODE:-local}" == "local" ]]; then echo "http://llama-server:8080"; else echo "http://litellm:4000"; fi) + +#=== Cloud API Keys === +ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-} +OPENAI_API_KEY=${OPENAI_API_KEY:-} +TOGETHER_API_KEY=${TOGETHER_API_KEY:-} + +#=== LLM Settings (llama-server) === +LLM_MODEL=${LLM_MODEL} 
+GGUF_FILE=${GGUF_FILE} +MAX_CONTEXT=${MAX_CONTEXT} +CTX_SIZE=${MAX_CONTEXT} +GPU_BACKEND=${GPU_BACKEND} +LLAMA_SERVER_PORT=8080 + +$(if [[ "$GPU_BACKEND" == "amd" ]]; then cat << AMD_ENV +#=== GPU Group IDs (for container device access) === +VIDEO_GID=$(getent group video 2>/dev/null | cut -d: -f3 || echo 44) +RENDER_GID=$(getent group render 2>/dev/null | cut -d: -f3 || echo 992) + +#=== AMD ROCm Settings === +HSA_OVERRIDE_GFX_VERSION=11.5.1 +ROCBLAS_USE_HIPBLASLT=0 +AMD_ENV +fi) + +#=== Ports === +LLAMA_SERVER_PORT=8080 +WEBUI_PORT=3000 +WHISPER_PORT=9000 +TTS_PORT=8880 +N8N_PORT=5678 +QDRANT_PORT=6333 +QDRANT_GRPC_PORT=6334 +LITELLM_PORT=4000 +OPENCLAW_PORT=7860 +SEARXNG_PORT=8888 + +#=== Security (auto-generated, keep secret!) === +WEBUI_SECRET=${WEBUI_SECRET} +DASHBOARD_API_KEY=${DASHBOARD_API_KEY} +N8N_USER=admin +N8N_PASS=${N8N_PASS} +LITELLM_KEY=${LITELLM_KEY} +LIVEKIT_API_KEY=$(openssl rand -hex 16 2>/dev/null || head -c 16 /dev/urandom | xxd -p) +LIVEKIT_API_SECRET=${LIVEKIT_SECRET} +OPENCLAW_TOKEN=${OPENCLAW_TOKEN:-$(openssl rand -hex 24 2>/dev/null || head -c 24 /dev/urandom | xxd -p)} +OPENCODE_SERVER_PASSWORD=${OPENCODE_SERVER_PASSWORD} +OPENCODE_PORT=3003 + +#=== Voice Settings === +WHISPER_MODEL=base +TTS_VOICE=en_US-lessac-medium + +#=== Web UI Settings === +WEBUI_AUTH=true +ENABLE_WEB_SEARCH=true +WEB_SEARCH_ENGINE=searxng + +#=== n8n Settings === +N8N_AUTH=true +N8N_HOST=localhost +N8N_WEBHOOK_URL=http://localhost:5678 +TIMEZONE=${SYSTEM_TZ:-UTC} +ENV_EOF + + chmod 600 "$INSTALL_DIR/.env" # Secure secrets file + ai_ok "Created $INSTALL_DIR" + ai_ok "Generated secure secrets in .env (permissions: 600)" + + # Validate generated .env against schema (fails fast on missing/unknown keys). 
+ if [[ -f "$SCRIPT_DIR/scripts/validate-env.sh" && -f "$SCRIPT_DIR/.env.schema.json" ]]; then + if bash "$SCRIPT_DIR/scripts/validate-env.sh" "$INSTALL_DIR/.env" "$SCRIPT_DIR/.env.schema.json" >> "$LOG_FILE" 2>&1; then + ai_ok "Validated .env against .env.schema.json" + else + error "Generated .env failed schema validation. See $LOG_FILE for details." + fi + else + warn "Skipping .env schema validation (.env.schema.json or scripts/validate-env.sh missing)" + fi + + # Generate SearXNG config with randomized secret key + # Fix ownership from previous container runs (SearXNG writes as uid 977) + mkdir -p "$INSTALL_DIR/config/searxng" + if [[ -f "$INSTALL_DIR/config/searxng/settings.yml" ]] && ! [[ -w "$INSTALL_DIR/config/searxng/settings.yml" ]]; then + sudo chown "$(id -u):$(id -g)" "$INSTALL_DIR/config/searxng/settings.yml" 2>/dev/null || true + fi + SEARXNG_SECRET=$(openssl rand -hex 32 2>/dev/null || head -c 32 /dev/urandom | xxd -p) + cat > "$INSTALL_DIR/config/searxng/settings.yml" << SEARXNG_EOF +use_default_settings: true +server: + secret_key: "${SEARXNG_SECRET}" + bind_address: "0.0.0.0" + port: 8080 + limiter: false +search: + safe_search: 0 + formats: + - html + - json +engines: + - name: duckduckgo + disabled: false + - name: google + disabled: false + - name: brave + disabled: false + - name: wikipedia + disabled: false + - name: github + disabled: false + - name: stackoverflow + disabled: false +SEARXNG_EOF + ai_ok "Generated SearXNG config with randomized secret key" +fi + +# Documentation, CLI tools, and compose variants already copied by rsync/cp block above diff --git a/dream-server/installers/phases/07-devtools.sh b/dream-server/installers/phases/07-devtools.sh new file mode 100644 index 000000000..04c92924d --- /dev/null +++ b/dream-server/installers/phases/07-devtools.sh @@ -0,0 +1,130 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer - Phase 07: Developer Tools +# 
============================================================================ +# Part of: installers/phases/ +# Purpose: Install Claude Code, Codex CLI, and OpenCode +# +# Expects: DRY_RUN, INSTALL_DIR, LOG_FILE, LLM_MODEL, MAX_CONTEXT, +# ai(), ai_ok(), ai_warn(), log() +# Provides: (developer tools installed globally) +# +# Modder notes: +# Add new developer tools or change installation methods here. +# ============================================================================ + +if $DRY_RUN; then + log "[DRY RUN] Would install AI developer tools (Claude Code, Codex CLI, OpenCode)" + log "[DRY RUN] Would configure OpenCode for local llama-server (user-level systemd service on port 3003)" +else + ai "Installing AI developer tools..." + + # Ensure Node.js/npm is available (needed for Claude Code and Codex) + if ! command -v npm &> /dev/null; then + if command -v apt-get &> /dev/null; then + ai "Installing Node.js..." + curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash - >> "$LOG_FILE" 2>&1 || true + sudo apt-get install -y nodejs >> "$LOG_FILE" 2>&1 || true + fi + fi + + if command -v npm &> /dev/null; then + # Install Claude Code (Anthropic's CLI for Claude) + if ! command -v claude &> /dev/null; then + sudo npm install -g @anthropic-ai/claude-code >> "$LOG_FILE" 2>&1 && \ + ai_ok "Claude Code installed (run 'claude' to start)" || \ + ai_warn "Claude Code install failed - install later with: npm i -g @anthropic-ai/claude-code" + else + ai_ok "Claude Code already installed" + fi + + # Install Codex CLI (OpenAI's terminal agent) + if ! 
command -v codex &> /dev/null; then + sudo npm install -g @openai/codex >> "$LOG_FILE" 2>&1 && \ + ai_ok "Codex CLI installed (run 'codex' to start)" || \ + ai_warn "Codex CLI install failed - install later with: npm i -g @openai/codex" + else + ai_ok "Codex CLI already installed" + fi + else + ai_warn "npm not available - skipping Claude Code and Codex CLI install" + ai " Install later: npm i -g @anthropic-ai/claude-code @openai/codex" + fi + + # ── OpenCode (local agentic coding platform) ── + if ! command -v opencode &> /dev/null && [[ ! -x "$HOME/.opencode/bin/opencode" ]]; then + ai "Installing OpenCode..." + if curl -fsSL https://opencode.ai/install 2>/dev/null | bash >> "$LOG_FILE" 2>&1; then + ai_ok "OpenCode installed (~/.opencode/bin/opencode)" + else + ai_warn "OpenCode install failed - install later with: curl -fsSL https://opencode.ai/install | bash" + fi + else + ai_ok "OpenCode already installed" + fi + + # Configure OpenCode to use local llama-server + if [[ -x "$HOME/.opencode/bin/opencode" ]]; then + OPENCODE_CONFIG_DIR="$HOME/.config/opencode" + mkdir -p "$OPENCODE_CONFIG_DIR" + if [[ ! 
-f "$OPENCODE_CONFIG_DIR/opencode.json" ]]; then + cat > "$OPENCODE_CONFIG_DIR/opencode.json" </dev/null || true + sed -i 's/^ANTHROPIC_API_KEY=.*/ANTHROPIC_API_KEY=/' "$INSTALL_DIR/.env" 2>/dev/null || true + sed -i 's/^OPENAI_API_KEY=.*/OPENAI_API_KEY=/' "$INSTALL_DIR/.env" 2>/dev/null || true + + # Add offline mode config + cat >> "$INSTALL_DIR/.env" << 'OFFLINE_EOF' + +#============================================================================= +# M1 Offline Mode Configuration +#============================================================================= +OFFLINE_MODE=true + +# Disable telemetry and update checks +DISABLE_TELEMETRY=true +DISABLE_UPDATE_CHECK=true + +# Use local RAG instead of web search +WEB_SEARCH_ENABLED=false +LOCAL_RAG_ENABLED=true +OFFLINE_EOF + + # Create OpenClaw M1 config if OpenClaw is enabled + if [[ "$ENABLE_OPENCLAW" == "true" ]]; then + mkdir -p "$INSTALL_DIR/config/openclaw" + cat > "$INSTALL_DIR/config/openclaw/openclaw-m1.yaml" << 'M1_EOF' +# OpenClaw M1 Mode Configuration +# Fully offline operation - no cloud dependencies + +memorySearch: + enabled: true + # Uses bundled GGUF embeddings (auto-downloaded during install) + # No external API calls + +# Disable web search (not available offline) +# Use local RAG with Qdrant instead +webSearch: + enabled: false + +# Local inference only +inference: + provider: local + baseUrl: http://llama-server:8080/v1 +M1_EOF + ai_ok "OpenClaw M1 config created" + fi + + # Pre-download GGUF embeddings for memory_search + ai "Pre-downloading GGUF embeddings for offline memory_search..." + mkdir -p "$INSTALL_DIR/models/embeddings" + + # Download embeddinggemma GGUF (small, ~300MB) + if command -v curl &> /dev/null; then + EMBED_URL="https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q4_K_M.gguf" + if ! 
[[ -f "$INSTALL_DIR/models/embeddings/nomic-embed-text-v1.5.Q4_K_M.gguf" ]]; then + curl -L -o "$INSTALL_DIR/models/embeddings/nomic-embed-text-v1.5.Q4_K_M.gguf" "$EMBED_URL" 2>/dev/null || \ + ai_warn "Could not pre-download embeddings. Memory search will download on first use." + else + log "Embeddings already downloaded" + fi + fi + + # Offline docs already copied by rsync/cp block above + ai_ok "Offline mode configured" + log "After installation, disconnect from internet for fully air-gapped operation" + log "See docs/M1-OFFLINE-MODE.md for offline operation guide" +fi diff --git a/dream-server/installers/phases/10-amd-tuning.sh b/dream-server/installers/phases/10-amd-tuning.sh new file mode 100644 index 000000000..7fa6a549c --- /dev/null +++ b/dream-server/installers/phases/10-amd-tuning.sh @@ -0,0 +1,129 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer โ€” Phase 10: AMD System Tuning +# ============================================================================ +# Part of: installers/phases/ +# Purpose: AMD APU (Strix Halo) sysctl, modprobe, GRUB, and tuned setup +# +# Expects: GPU_BACKEND, DRY_RUN, INSTALL_DIR, LOG_FILE, +# ai(), ai_ok(), ai_warn(), log() +# Provides: System tuning applied (sysctl, modprobe, timers, tuned) +# +# Modder notes: +# Add new AMD-specific tuning parameters or kernel options here. +# ============================================================================ + +if [[ "$GPU_BACKEND" == "amd" ]] && $DRY_RUN; then + log "[DRY RUN] Would apply AMD APU system tuning:" + log "[DRY RUN] - Install systemd user timers (session cleanup, memory shepherd)" + log "[DRY RUN] - Apply sysctl tuning (swappiness=10, vfs_cache_pressure=50)" + log "[DRY RUN] - Install amdgpu modprobe options" + log "[DRY RUN] - Install GTT memory optimization" + log "[DRY RUN] - Configure tuned accelerator-performance profile" +elif [[ "$GPU_BACKEND" == "amd" ]] && ! 
$DRY_RUN; then + ai "Applying system tuning for AMD APU..." + + # Management scripts and Memory Shepherd already copied by rsync/cp block above + [[ -d "$INSTALL_DIR/memory-shepherd" ]] && ai_ok "Memory Shepherd installed" + + # ── Install systemd user timers (session cleanup, session manager, memory shepherd) ── + ai "Installing maintenance timers..." + SYSTEMD_USER_DIR="$HOME/.config/systemd/user" + mkdir -p "$SYSTEMD_USER_DIR" + + # Ensure scripts are executable + chmod +x "$INSTALL_DIR/scripts/session-cleanup.sh" \ + "$INSTALL_DIR/memory-shepherd/memory-shepherd.sh" 2>/dev/null || true + + # Copy all systemd unit files + if [[ -d "$INSTALL_DIR/scripts/systemd" ]]; then + cp "$INSTALL_DIR/scripts/systemd"/*.service "$INSTALL_DIR/scripts/systemd"/*.timer \ + "$SYSTEMD_USER_DIR/" 2>/dev/null || true + fi + + # Create archive directories for memory shepherd + mkdir -p "$INSTALL_DIR/data/memory-archives/dream-agent"/{memory,agents,tools} + + # Reload and enable all timers + systemctl --user daemon-reload 2>/dev/null || true + for timer in openclaw-session-cleanup openclaw-session-manager memory-shepherd-workspace memory-shepherd-memory; do + systemctl --user enable --now "${timer}.timer" >> "$LOG_FILE" 2>&1 || true + done + ai_ok "Maintenance timers enabled (session cleanup, session manager, memory shepherd)" + + # Enable lingering so user timers survive logout + loginctl enable-linger "$(whoami)" 2>/dev/null || \ + sudo -n loginctl enable-linger "$(whoami)" 2>/dev/null || \ + ai_warn "Could not enable linger. Timers may stop after logout. 
Run: loginctl enable-linger $(whoami)" + + # Install sysctl tuning (vm.swappiness, vfs_cache_pressure) + if [[ -f "$INSTALL_DIR/config/system-tuning/99-dream-server.conf" ]]; then + if sudo -n cp "$INSTALL_DIR/config/system-tuning/99-dream-server.conf" /etc/sysctl.d/ 2>/dev/null; then + sudo -n sysctl --system > /dev/null 2>&1 || true + ai_ok "sysctl tuning applied (swappiness=10, vfs_cache_pressure=50)" + else + ai_warn "Could not install sysctl tuning (needs sudo). Copy manually:" + ai " sudo cp config/system-tuning/99-dream-server.conf /etc/sysctl.d/" + fi + fi + + # Install amdgpu modprobe options + if [[ -f "$INSTALL_DIR/config/system-tuning/amdgpu.conf" ]]; then + if sudo -n cp "$INSTALL_DIR/config/system-tuning/amdgpu.conf" /etc/modprobe.d/ 2>/dev/null; then + ai_ok "amdgpu modprobe tuning installed (ppfeaturemask, gpu_recovery)" + else + ai_warn "Could not install amdgpu modprobe config (needs sudo). Copy manually:" + ai " sudo cp config/system-tuning/amdgpu.conf /etc/modprobe.d/" + fi + fi + + # Install GTT memory optimization for unified memory APU + if [[ -f "$INSTALL_DIR/config/system-tuning/amdgpu_llm_optimized.conf" ]]; then + if sudo -n cp "$INSTALL_DIR/config/system-tuning/amdgpu_llm_optimized.conf" /etc/modprobe.d/ 2>/dev/null; then + ai_ok "GTT memory tuning installed (gttsize=120000, pages_limit, page_pool_size)" + else + ai_warn "Could not install GTT memory config (needs sudo). Copy manually:" + ai " sudo cp config/system-tuning/amdgpu_llm_optimized.conf /etc/modprobe.d/" + fi + fi + + # Configure kernel boot parameters for optimal GPU memory access + if [[ -f /etc/default/grub ]]; then + current_cmdline=$(grep '^GRUB_CMDLINE_LINUX_DEFAULT=' /etc/default/grub 2>/dev/null || true) + if [[ -n "$current_cmdline" ]] && ! 
echo "$current_cmdline" | grep -q 'amd_iommu=off'; then + ai "Recommended: add 'amd_iommu=off' to kernel boot parameters for ~2-6% GPU improvement" + ai " Run: sudo sed -i 's/iommu=pt/amd_iommu=off/' /etc/default/grub && sudo update-grub" + ai " Or if iommu=pt is not set:" + ai " sudo sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT=\"\\(.*\\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\\1 amd_iommu=off\"/' /etc/default/grub && sudo update-grub" + fi + fi + + # Enable tuned with accelerator-performance profile for CPU governor optimization + if command -v tuned-adm &>/dev/null; then + if ! systemctl is-active --quiet tuned 2>/dev/null; then + if sudo -n systemctl enable --now tuned 2>/dev/null; then + sudo -n tuned-adm profile accelerator-performance 2>/dev/null && \ + ai_ok "tuned profile set to accelerator-performance (5-8% pp improvement)" || \ + ai_warn "tuned started but could not set profile. Run: sudo tuned-adm profile accelerator-performance" + else + ai_warn "Could not start tuned. Run manually:" + ai " sudo systemctl enable --now tuned && sudo tuned-adm profile accelerator-performance" + fi + else + active_profile=$(tuned-adm active 2>/dev/null | grep -oP 'Current active profile: \K.*' || true) + if [[ "$active_profile" != "accelerator-performance" ]]; then + sudo -n tuned-adm profile accelerator-performance 2>/dev/null && \ + ai_ok "tuned profile changed to accelerator-performance" || \ + ai_warn "tuned running but wrong profile. Run: sudo tuned-adm profile accelerator-performance" + else + ai_ok "tuned already set to accelerator-performance" + fi + fi + else + ai_warn "tuned not installed. 
For 5-8% prompt processing improvement:" + ai " sudo apt install tuned && sudo systemctl enable --now tuned && sudo tuned-adm profile accelerator-performance" + fi + + # LiteLLM config already copied by rsync/cp block above + [[ -f "$INSTALL_DIR/config/litellm/strix-halo-config.yaml" ]] && ai_ok "LiteLLM Strix Halo routing config installed" +fi diff --git a/dream-server/installers/phases/11-services.sh b/dream-server/installers/phases/11-services.sh new file mode 100644 index 000000000..55c45ac95 --- /dev/null +++ b/dream-server/installers/phases/11-services.sh @@ -0,0 +1,213 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer โ€” Phase 11: Start Services +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Download GGUF model, FLUX models, generate models.ini, launch +# Docker Compose stack +# +# Expects: DRY_RUN, INSTALL_DIR, LOG_FILE, GPU_BACKEND, +# GGUF_FILE, GGUF_URL, LLM_MODEL, MAX_CONTEXT, +# DOCKER_COMPOSE_CMD, COMPOSE_FLAGS, BGRN, RED, AMB, NC, +# show_phase(), bootline(), signal(), ai(), ai_ok(), ai_bad(), +# ai_warn(), log(), spin_task() +# Provides: Running Docker Compose stack +# +# Modder notes: +# Change model download logic or compose launch flags here. +# ============================================================================ + +show_phase 5 6 "Starting Services" "~2-3 minutes" + +if $DRY_RUN; then + log "[DRY RUN] Would start services: $DOCKER_COMPOSE_CMD $COMPOSE_FLAGS up -d" +else + cd "$INSTALL_DIR" + mkdir -p "$INSTALL_DIR/logs" + + # Cloud mode: skip model downloads, auto-enable litellm + if [[ "${DREAM_MODE:-local}" == "cloud" ]]; then + ai "Cloud mode โ€” skipping model download" + # Auto-enable litellm extension + local litellm_cf="$INSTALL_DIR/extensions/services/litellm/compose.yaml" + local litellm_disabled="${litellm_cf}.disabled" + if [[ -f "$litellm_disabled" && ! 
-f "$litellm_cf" ]]; then + mv "$litellm_disabled" "$litellm_cf" + ai_ok "Auto-enabled litellm for cloud mode" + fi + fi + + # Ensure model directory exists + mkdir -p "$INSTALL_DIR/data/models" + + # Download GGUF model if not already present + GGUF_DIR="$INSTALL_DIR/data/models" + if [[ "${DREAM_MODE:-local}" != "cloud" && ! -f "$GGUF_DIR/$GGUF_FILE" && -n "$GGUF_URL" ]]; then + ai "Downloading GGUF model: $GGUF_FILE" + signal "This is the big one. I've got it — sit back." + echo "" + + # Run wget in background, pipe through spin_task for clean UI + wget -c -q -O "$GGUF_DIR/$GGUF_FILE.part" "$GGUF_URL" \ + >> "$INSTALL_DIR/logs/model-download.log" 2>&1 & + dl_pid=$! + + if spin_task $dl_pid "Downloading $GGUF_FILE"; then + mv "$GGUF_DIR/$GGUF_FILE.part" "$GGUF_DIR/$GGUF_FILE" + printf "\r ${BGRN}✓${NC} %-60s\n" "Model downloaded: $GGUF_FILE" + else + printf "\r ${RED}✗${NC} %-60s\n" "Download failed: $GGUF_FILE" + ai "Retry: wget -c -O '$GGUF_DIR/$GGUF_FILE.part' '$GGUF_URL' && mv '$GGUF_DIR/$GGUF_FILE.part' '$GGUF_DIR/$GGUF_FILE'" + fi + elif [[ -f "$GGUF_DIR/$GGUF_FILE" ]]; then + ai_ok "GGUF model already present: $GGUF_FILE" + fi + + # ── FLUX.1-schnell model download (ComfyUI image generation) ── + if [[ "${DREAM_MODE:-local}" == "cloud" ]]; then + ai "Cloud mode — skipping FLUX model download" + elif [[ "$GPU_BACKEND" == "amd" ]]; then + COMFYUI_BASE="$INSTALL_DIR/data/comfyui/ComfyUI/models" + elif [[ "$GPU_BACKEND" == "nvidia" ]]; then + COMFYUI_BASE="$INSTALL_DIR/data/comfyui/models" + fi + if [[ "${DREAM_MODE:-local}" != "cloud" && ( "$GPU_BACKEND" == "amd" || "$GPU_BACKEND" == "nvidia" ) ]]; then + FLUX_DIFFUSION_DIR="$COMFYUI_BASE/diffusion_models" + FLUX_ENCODER_DIR="$COMFYUI_BASE/text_encoders" + FLUX_VAE_DIR="$COMFYUI_BASE/vae" + mkdir -p "$FLUX_DIFFUSION_DIR" "$FLUX_ENCODER_DIR" "$FLUX_VAE_DIR" + # NVIDIA ComfyUI also needs output/input/workflows bind-mount dirs + if [[ "$GPU_BACKEND" == "nvidia" ]]; then + mkdir -p "$INSTALL_DIR/data/comfyui"/{output,input,workflows} + 
fi + + FLUX_NEEDED=false + [[ ! -f "$FLUX_DIFFUSION_DIR/flux1-schnell.safetensors" ]] && FLUX_NEEDED=true + [[ ! -f "$FLUX_ENCODER_DIR/clip_l.safetensors" ]] && FLUX_NEEDED=true + [[ ! -f "$FLUX_ENCODER_DIR/t5xxl_fp16.safetensors" ]] && FLUX_NEEDED=true + [[ ! -f "$FLUX_VAE_DIR/ae.safetensors" ]] && FLUX_NEEDED=true + + if [[ "$FLUX_NEEDED" == "true" ]]; then + ai "Downloading FLUX.1-schnell models (~34GB) for image generation..." + nohup env \ + FLUX_DIFFUSION_DIR="$FLUX_DIFFUSION_DIR" \ + FLUX_ENCODER_DIR="$FLUX_ENCODER_DIR" \ + FLUX_VAE_DIR="$FLUX_VAE_DIR" \ + bash -c ' + echo "[FLUX] Starting FLUX.1-schnell model downloads..." + + # Diffusion model (~24GB) + if [[ ! -f "$FLUX_DIFFUSION_DIR/flux1-schnell.safetensors" ]]; then + echo "[FLUX] Downloading flux1-schnell.safetensors (~24GB)..." + wget -c -q --show-progress -O "$FLUX_DIFFUSION_DIR/flux1-schnell.safetensors.part" \ + "https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell.safetensors" 2>&1 && \ + mv "$FLUX_DIFFUSION_DIR/flux1-schnell.safetensors.part" "$FLUX_DIFFUSION_DIR/flux1-schnell.safetensors" && \ + echo "[FLUX] flux1-schnell.safetensors complete" || \ + echo "[FLUX] ERROR: Failed to download flux1-schnell.safetensors" + fi + + # CLIP-L text encoder (~246MB) + if [[ ! -f "$FLUX_ENCODER_DIR/clip_l.safetensors" ]]; then + echo "[FLUX] Downloading clip_l.safetensors (~246MB)..." + wget -c -q --show-progress -O "$FLUX_ENCODER_DIR/clip_l.safetensors.part" \ + "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors" 2>&1 && \ + mv "$FLUX_ENCODER_DIR/clip_l.safetensors.part" "$FLUX_ENCODER_DIR/clip_l.safetensors" && \ + echo "[FLUX] clip_l.safetensors complete" || \ + echo "[FLUX] ERROR: Failed to download clip_l.safetensors" + fi + + # T5-XXL text encoder (~10GB) + if [[ ! -f "$FLUX_ENCODER_DIR/t5xxl_fp16.safetensors" ]]; then + echo "[FLUX] Downloading t5xxl_fp16.safetensors (~10GB)..." 
+ wget -c -q --show-progress -O "$FLUX_ENCODER_DIR/t5xxl_fp16.safetensors.part" \ + "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors" 2>&1 && \ + mv "$FLUX_ENCODER_DIR/t5xxl_fp16.safetensors.part" "$FLUX_ENCODER_DIR/t5xxl_fp16.safetensors" && \ + echo "[FLUX] t5xxl_fp16.safetensors complete" || \ + echo "[FLUX] ERROR: Failed to download t5xxl_fp16.safetensors" + fi + + # VAE (~335MB) + if [[ ! -f "$FLUX_VAE_DIR/ae.safetensors" ]]; then + echo "[FLUX] Downloading ae.safetensors (~335MB)..." + wget -c -q --show-progress -O "$FLUX_VAE_DIR/ae.safetensors.part" \ + "https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/resolve/main/split_files/vae/ae.safetensors" 2>&1 && \ + mv "$FLUX_VAE_DIR/ae.safetensors.part" "$FLUX_VAE_DIR/ae.safetensors" && \ + echo "[FLUX] ae.safetensors complete" || \ + echo "[FLUX] ERROR: Failed to download ae.safetensors" + fi + + echo "[FLUX] All FLUX.1-schnell model downloads finished." + ' > "$INSTALL_DIR/logs/flux-download.log" 2>&1 & + log "Background FLUX download started. Check: tail -f $INSTALL_DIR/logs/flux-download.log" + ai "FLUX.1-schnell models downloading in background (~34GB). ComfyUI will be ready once complete." + else + ai_ok "FLUX.1-schnell models already present" + fi + fi + + # Generate models.ini for llama-server (skip in cloud mode) + if [[ "${DREAM_MODE:-local}" != "cloud" ]]; then + mkdir -p "$INSTALL_DIR/config/llama-server" + cat > "$INSTALL_DIR/config/llama-server/models.ini" << MODELS_INI_EOF +[${LLM_MODEL}] +filename = ${GGUF_FILE} +load-on-startup = true +n-ctx = ${MAX_CONTEXT} +MODELS_INI_EOF + ai_ok "Generated models.ini for llama-server" + fi + + # Launch containers + echo "" + signal "Waking the stack..." + ai "I'm bringing systems online. You can breathe." + echo "" + compose_ok=false + $DOCKER_COMPOSE_CMD $COMPOSE_FLAGS up --build -d >> "$LOG_FILE" 2>&1 & + compose_pid=$! 
+ if spin_task $compose_pid "Launching containers..."; then + compose_ok=true + else + printf "\r ${AMB}⚠${NC} %-60s\n" "Some services still starting..." + echo "" + ai_warn "Some containers need more time. Retrying..." + $DOCKER_COMPOSE_CMD $COMPOSE_FLAGS up --build -d >> "$LOG_FILE" 2>&1 & + compose_pid=$! + if spin_task $compose_pid "Waiting for remaining services..."; then + compose_ok=true + fi + fi + # Final safety net: start any containers stuck in Created state + $DOCKER_COMPOSE_CMD $COMPOSE_FLAGS up -d >> "$LOG_FILE" 2>&1 || true + + if $compose_ok; then + printf "\r ${BGRN}✓${NC} %-60s\n" "All containers launched" + echo "" + ai_ok "Services started (llama-server)" + else + printf "\r ${RED}✗${NC} %-60s\n" "Some containers failed to launch" + echo "" + ai_warn "Some services failed. Check: docker compose logs" + ai_warn "Log file: $LOG_FILE" + fi + + # ── Run extension setup hooks ── + if [[ -f "$INSTALL_DIR/lib/service-registry.sh" ]]; then + _HOOK_DIR="$INSTALL_DIR" + . "$_HOOK_DIR/lib/service-registry.sh" + sr_load + _hook_count=0 + for sid in "${SERVICE_IDS[@]}"; do + hook="${SERVICE_SETUP_HOOKS[$sid]:-}" + [[ -z "$hook" || ! 
-f "$hook" ]] && continue + [[ -x "$hook" ]] || chmod +x "$hook" + log "Running setup hook for $sid: $hook" + if bash "$hook" "$INSTALL_DIR" "$GPU_BACKEND" >> "$LOG_FILE" 2>&1; then + _hook_count=$((_hook_count + 1)) + else + ai_warn "Setup hook for $sid exited with error (non-fatal)" + fi + done + [[ $_hook_count -gt 0 ]] && ai_ok "Ran $_hook_count extension setup hook(s)" || true + fi +fi diff --git a/dream-server/installers/phases/12-health.sh b/dream-server/installers/phases/12-health.sh new file mode 100644 index 000000000..bdc568f96 --- /dev/null +++ b/dream-server/installers/phases/12-health.sh @@ -0,0 +1,136 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer — Phase 12: Health Checks +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Verify all services are responding, configure Perplexica, +# pre-download STT model +# +# Expects: DRY_RUN, GPU_BACKEND, ENABLE_VOICE, ENABLE_WORKFLOWS, ENABLE_RAG, +# ENABLE_OPENCLAW, LLM_MODEL, LOG_FILE, BGRN, AMB, NC, +# WHISPER_PORT, TTS_PORT, OPENCLAW_PORT, +# PERPLEXICA_PORT (:-3004), COMFYUI_PORT (:-8188), +# show_phase(), check_service(), ai(), ai_ok(), ai_warn(), signal() +# Provides: Health check results, Perplexica auto-configuration +# +# Modder notes: +# Add new service health checks or auto-configuration here. +# ============================================================================ + +# Source service registry for port/health resolution +. 
"$SCRIPT_DIR/lib/service-registry.sh" +sr_load + +show_phase 6 6 "Systems Online" "~1-2 minutes" + +if $DRY_RUN; then + log "[DRY RUN] Would verify service health:" + log "[DRY RUN] - llama-server, Open WebUI, Perplexica, ComfyUI" + log "[DRY RUN] - Auto-configure Perplexica for ${LLM_MODEL:-default model}" + [[ "$ENABLE_OPENCLAW" == "true" ]] && log "[DRY RUN] - OpenClaw" + [[ "$ENABLE_VOICE" == "true" ]] && log "[DRY RUN] - Whisper (STT), Kokoro (TTS), pre-download STT model" + [[ "$ENABLE_WORKFLOWS" == "true" ]] && log "[DRY RUN] - n8n" + [[ "$ENABLE_RAG" == "true" ]] && log "[DRY RUN] - Qdrant" + echo "" + signal "All systems nominal. (dry run)" + ai_ok "Sovereign intelligence is online. (dry run)" + return 0 2>/dev/null || true +fi + +ai "Linking services... standby." + +sleep 5 + +# Health checks are best-effort โ€” don't let set -e kill the script if a service is slow +# Core service health checks +check_service "llama-server" "http://localhost:${SERVICE_PORTS[llama-server]:-8080}${SERVICE_HEALTH[llama-server]:-/health}" 120 || true +check_service "Open WebUI" "http://localhost:${SERVICE_PORTS[open-webui]:-3000}${SERVICE_HEALTH[open-webui]:-/}" 60 || true +check_service "Perplexica" "http://localhost:${SERVICE_PORTS[perplexica]:-3004}${SERVICE_HEALTH[perplexica]:-/}" 30 || true +check_service "ComfyUI" "http://localhost:${SERVICE_PORTS[comfyui]:-8188}${SERVICE_HEALTH[comfyui]:-/}" 120 || true + +# Perplexica auto-config: seed chat model + embedding model on first boot. +# The slim-latest image stores config in a database, not just config.json. +# We use the /api/config HTTP endpoint to set values after the service starts. 
+if docker inspect dream-perplexica &>/dev/null; then + PERPLEXICA_URL="http://localhost:${SERVICE_PORTS[perplexica]:-3004}" + PERPLEXICA_SETUP=$(curl -sf "${PERPLEXICA_URL}/api/config" 2>/dev/null | \ + python3 -c "import sys,json;d=json.load(sys.stdin);print('done' if d['values']['setupComplete'] else 'needed')" 2>/dev/null || echo "skip") + + if [[ "$PERPLEXICA_SETUP" == "needed" ]]; then + ai "Configuring Perplexica for ${LLM_MODEL}..." + # Query current config to get provider UUIDs, then set model + preferences via API + curl -sf "${PERPLEXICA_URL}/api/config" 2>/dev/null | \ + python3 -c " +import sys, json, urllib.request + +config = json.load(sys.stdin)['values'] +providers = config.get('modelProviders', []) +openai_prov = next((p for p in providers if p['type'] == 'openai'), None) +transformers_prov = next((p for p in providers if p['type'] == 'transformers'), None) + +if not openai_prov: + print('no-openai-provider') + sys.exit(1) + +url = '${PERPLEXICA_URL}/api/config' +model = '${LLM_MODEL}' + +def post(key, value): + data = json.dumps({'key': key, 'value': value}).encode() + req = urllib.request.Request(url, data=data, headers={'Content-Type': 'application/json'}) + urllib.request.urlopen(req) + +# Seed the chat model into the OpenAI provider +openai_prov['chatModels'] = [{'key': model, 'name': model}] +post('modelProviders', providers) + +# Set default providers and models +post('preferences', { + 'defaultChatProvider': openai_prov['id'], + 'defaultChatModel': model, + 'defaultEmbeddingProvider': transformers_prov['id'] if transformers_prov else openai_prov['id'], + 'defaultEmbeddingModel': 'Xenova/all-MiniLM-L6-v2' +}) + +# Mark setup complete to bypass the wizard +post('setupComplete', True) +print('ok') +" >> "$LOG_FILE" 2>&1 && \ + printf "\r ${BGRN}โœ“${NC} %-60s\n" "Perplexica configured (model: ${LLM_MODEL})" || \ + printf "\r ${AMB}โš ${NC} %-60s\n" "Perplexica config โ€” complete setup at :${PERPLEXICA_PORT:-3004}" + fi +fi + +[[ 
"$ENABLE_OPENCLAW" == "true" ]] && check_service "OpenClaw" "http://localhost:${SERVICE_PORTS[openclaw]:-7860}${SERVICE_HEALTH[openclaw]:-/}" 30 || true +systemctl is-active opencode-web &>/dev/null && check_service "OpenCode Web" "http://localhost:3003/" 10 || true +[[ "$ENABLE_VOICE" == "true" ]] && check_service "Whisper (STT)" "http://localhost:${SERVICE_PORTS[whisper]:-9000}${SERVICE_HEALTH[whisper]:-/health}" 60 || true +[[ "$ENABLE_VOICE" == "true" ]] && check_service "Kokoro (TTS)" "http://localhost:${SERVICE_PORTS[tts]:-8880}${SERVICE_HEALTH[tts]:-/health}" 30 || true + +# Pre-download the Whisper STT model so first transcription is instant. +# Speaches lazy-downloads on first request, but that causes a long delay + +# a 404 if the model isn't cached yet. Trigger the download now. +if [[ "$ENABLE_VOICE" == "true" ]]; then + if [[ "$GPU_BACKEND" == "nvidia" ]]; then + STT_MODEL="deepdml/faster-whisper-large-v3-turbo-ct2" + else + STT_MODEL="Systran/faster-whisper-base" + fi + STT_MODEL_ENCODED="${STT_MODEL//\//%2F}" + WHISPER_URL="http://localhost:${SERVICE_PORTS[whisper]:-9000}" + # Only download if model isn't already loaded + if ! curl -sf "${WHISPER_URL}/v1/models/${STT_MODEL_ENCODED}" &>/dev/null; then + ai "Downloading STT model (${STT_MODEL})..." + curl -sf -X POST "${WHISPER_URL}/v1/models/${STT_MODEL_ENCODED}" >> "$LOG_FILE" 2>&1 && \ + printf "\r ${BGRN}โœ“${NC} %-60s\n" "STT model cached (${STT_MODEL})" || \ + printf "\r ${AMB}โš ${NC} %-60s\n" "STT model will download on first use" + else + printf "\r ${BGRN}โœ“${NC} %-60s\n" "STT model already cached (${STT_MODEL})" + fi +fi + +[[ "$ENABLE_WORKFLOWS" == "true" ]] && check_service "n8n" "http://localhost:${SERVICE_PORTS[n8n]:-5678}${SERVICE_HEALTH[n8n]:-/healthz}" 30 || true +[[ "$ENABLE_RAG" == "true" ]] && check_service "Qdrant" "http://localhost:${SERVICE_PORTS[qdrant]:-6333}${SERVICE_HEALTH[qdrant]:-/}" 30 || true + +echo "" +signal "All systems nominal." 
+ai_ok "Sovereign intelligence is online." diff --git a/dream-server/installers/phases/13-summary.sh b/dream-server/installers/phases/13-summary.sh new file mode 100644 index 000000000..0ea6b373a --- /dev/null +++ b/dream-server/installers/phases/13-summary.sh @@ -0,0 +1,209 @@ +#!/bin/bash +# ============================================================================ +# Dream Server Installer โ€” Phase 13: Summary & Desktop Shortcut +# ============================================================================ +# Part of: installers/phases/ +# Purpose: Display URLs, create desktop shortcut, pin to sidebar, write +# summary JSON, run preflight validation +# +# Expects: DRY_RUN, INSTALL_DIR, SCRIPT_DIR, LOG_FILE, INTERACTIVE, +# TIER, TIER_NAME, VERSION, GPU_BACKEND, LLM_MODEL, OFFLINE_MODE, +# ENABLE_VOICE, ENABLE_WORKFLOWS, ENABLE_RAG, ENABLE_OPENCLAW, +# COMPOSE_FLAGS, SUMMARY_JSON_FILE, PREFLIGHT_REPORT_FILE, +# BGRN, GRN, AMB, WHT, NC, DASHBOARD_PORT (:-3001), +# CAP_HARDWARE_CLASS_ID (:-unknown), CAP_HARDWARE_CLASS_LABEL (:-Unknown), +# BACKEND_SERVICE_NAME (:-llama-server), +# show_success_card(), bootline(), signal(), ai_ok(), log() +# Provides: Desktop shortcut, sidebar pin, summary JSON +# +# Modder notes: +# Change the final banner, add new service URLs, or modify the desktop +# shortcut here. +# ============================================================================ + +# Source service registry for port resolution +. "$SCRIPT_DIR/lib/service-registry.sh" +sr_load + +# Get local IP for LAN access +LOCAL_IP=$(hostname -I 2>/dev/null | awk '{print $1}' || echo "") + +# Mode is now stored in .env as DREAM_MODE (set by phase 06) +if ! 
$DRY_RUN; then + mkdir -p "$INSTALL_DIR" +else + log "[DRY RUN] Would write mode metadata to $INSTALL_DIR" +fi + +# Show the cinematic success card +show_success_card "http://localhost:3000" "http://localhost:3001" "$LOCAL_IP" + +# Additional service info +bootline +echo -e "${BGRN}ALL SERVICES${NC}" +bootline +# Core services always shown +echo " • Chat UI: http://localhost:${SERVICE_PORTS[open-webui]:-3000}" +echo " • Dashboard: http://localhost:${SERVICE_PORTS[dashboard]:-3001}" +echo " • Perplexica: http://localhost:${SERVICE_PORTS[perplexica]:-3004}" +echo " • ComfyUI: http://localhost:${SERVICE_PORTS[comfyui]:-8188}" +echo " • LLM API: http://localhost:${SERVICE_PORTS[llama-server]:-8080}/v1 (llama-server)" +[[ "$ENABLE_OPENCLAW" == "true" ]] && echo " • OpenClaw: http://localhost:${SERVICE_PORTS[openclaw]:-7860}" +systemctl is-active opencode-web &>/dev/null && echo " • OpenCode: http://localhost:3003" +[[ "$ENABLE_VOICE" == "true" ]] && echo " • Whisper STT: http://localhost:${SERVICE_PORTS[whisper]:-9000}" +[[ "$ENABLE_VOICE" == "true" ]] && echo " • TTS (Kokoro): http://localhost:${SERVICE_PORTS[tts]:-8880}" +[[ "$ENABLE_WORKFLOWS" == "true" ]] && echo " • n8n: http://localhost:${SERVICE_PORTS[n8n]:-5678}" +[[ "$ENABLE_RAG" == "true" ]] && echo " • Qdrant: http://localhost:${SERVICE_PORTS[qdrant]:-6333}" +echo "" + +# Configuration summary +bootline +echo -e "${BGRN}YOUR CONFIGURATION${NC}" +bootline +echo " • Tier: $TIER ($TIER_NAME)" +echo " • Model: $LLM_MODEL" +echo " • Install dir: $INSTALL_DIR" +echo "" + +# Quick commands +bootline +echo -e "${BGRN}QUICK COMMANDS${NC}" +bootline +echo " cd $INSTALL_DIR" +echo " docker compose ps # Check container status" +echo " docker compose logs -f # View container logs" +echo " docker compose restart # Restart containers" +echo " systemctl --user list-timers # Check maintenance timers" +echo " dream status # Check service health" +echo "" + +if [[ -f "$LOG_FILE" ]]; then + echo -e 
"${BGRN}Full installation log:${NC} $LOG_FILE" + echo "" +fi +if [[ -f "$PREFLIGHT_REPORT_FILE" ]]; then + echo -e "${BGRN}Preflight report:${NC} $PREFLIGHT_REPORT_FILE" + echo "" +fi + +# Run preflight check to validate installation +echo "" +bootline +echo -e "${BGRN}RUNNING PREFLIGHT VALIDATION${NC}" +bootline +echo "" + +if [[ -f "$SCRIPT_DIR/dream-preflight.sh" ]]; then + # Wait a moment for services to stabilize + sleep 2 + bash "$SCRIPT_DIR/dream-preflight.sh" || true +else + log "Preflight script not found โ€” skipping validation" +fi + +#============================================================================= +# Desktop Shortcut & Sidebar Pin +#============================================================================= +if ! $DRY_RUN; then + DESKTOP_FILE="$HOME/.local/share/applications/dream-server.desktop" + mkdir -p "$HOME/.local/share/applications" + cat > "$DESKTOP_FILE" << DESKTOP_EOF +[Desktop Entry] +Version=1.0 +Type=Application +Name=Dream Server +Comment=Local AI Dashboard +Exec=xdg-open http://localhost:3001 +Icon=applications-internet +Terminal=false +Categories=Development; +StartupNotify=true +DESKTOP_EOF + + # Pin to GNOME sidebar (favorites) if gsettings is available + if command -v gsettings &> /dev/null; then + CURRENT_FAVS=$(gsettings get org.gnome.shell favorite-apps 2>/dev/null || echo "[]") + if [[ "$CURRENT_FAVS" != *"dream-server.desktop"* ]]; then + NEW_FAVS=$(echo "$CURRENT_FAVS" | sed "s/]$/, 'dream-server.desktop']/" | sed "s/\[, /[/") + gsettings set org.gnome.shell favorite-apps "$NEW_FAVS" 2>/dev/null || true + ai_ok "Dashboard pinned to sidebar" + fi + fi + + ai_ok "Desktop shortcut created: Dream Server" +fi + +echo "" +signal "Broadcast stable. You're free now." 
+echo "" +DASHBOARD_PORT="${SERVICE_PORTS[dashboard]:-3001}" +WEBUI_PORT="${SERVICE_PORTS[open-webui]:-3000}" +OPENCLAW_PORT="${SERVICE_PORTS[openclaw]:-7860}" +LOCAL_IP=$(hostname -I 2>/dev/null | awk '{print $1}') +echo -e "${GRN}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" +echo -e "${BGRN} YOUR DREAM SERVER IS LIVE${NC}" +echo -e "${GRN}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" +echo "" +echo -e " ${BGRN}Dashboard${NC} ${WHT}http://localhost:${DASHBOARD_PORT}${NC}" +echo -e " ${BGRN}Chat${NC} ${WHT}http://localhost:${WEBUI_PORT}${NC}" +[[ "$ENABLE_OPENCLAW" == "true" ]] && \ +echo -e " ${BGRN}OpenClaw${NC} ${WHT}http://localhost:${OPENCLAW_PORT}${NC}" +systemctl is-active opencode-web &>/dev/null && \ +echo -e " ${BGRN}OpenCode${NC} ${WHT}http://localhost:3003${NC}" +echo "" +if [[ -n "$LOCAL_IP" ]]; then +echo -e " ${AMB}On your network:${NC} ${WHT}http://${LOCAL_IP}:${DASHBOARD_PORT}${NC}" +fi +echo "" +echo -e " Start here โ†’ ${WHT}http://localhost:${DASHBOARD_PORT}${NC}" +echo -e " The Dashboard shows all services, GPU status, and quick links." 
+echo "" +echo -e "${GRN}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" +echo "" + +if [[ -n "$SUMMARY_JSON_FILE" ]]; then + python3 - "$SUMMARY_JSON_FILE" "$VERSION" "$INSTALL_DIR" "$TIER" "$TIER_NAME" "$GPU_BACKEND" "${BACKEND_SERVICE_NAME:-llama-server}" "$LLM_MODEL" "$COMPOSE_FLAGS" "$DRY_RUN" "$PREFLIGHT_REPORT_FILE" "${CAP_HARDWARE_CLASS_ID:-unknown}" "${CAP_HARDWARE_CLASS_LABEL:-Unknown}" <<'PY' +import json +import pathlib +import sys +from datetime import datetime, timezone + +( + out_file, + version, + install_dir, + tier, + tier_name, + gpu_backend, + backend_service, + llm_model, + compose_flags, + dry_run, + preflight_report, + hw_class_id, + hw_class_label, +) = sys.argv[1:] + +payload = { + "version": "1", + "generated_at": datetime.now(timezone.utc).isoformat(), + "installer_version": version, + "install_dir": install_dir, + "tier": {"id": tier, "name": tier_name}, + "runtime": { + "gpu_backend": gpu_backend, + "backend_service": backend_service, + "llm_model": llm_model, + "compose_flags": compose_flags, + "dry_run": dry_run == "true", + }, + "hardware_class": {"id": hw_class_id, "label": hw_class_label}, + "preflight_report": preflight_report, +} + +path = pathlib.Path(out_file) +path.parent.mkdir(parents=True, exist_ok=True) +path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8") +print(f"[INFO] Wrote installer summary JSON: {out_file}") +PY +fi diff --git a/dream-server/installers/windows.ps1 b/dream-server/installers/windows.ps1 new file mode 100644 index 000000000..2ae2dd6af --- /dev/null +++ b/dream-server/installers/windows.ps1 @@ -0,0 +1,201 @@ +#!/usr/bin/env pwsh +<# +Dream Server Windows installer (WSL2-delegated MVP). +Runs preflight checks on Windows, then delegates to install-core.sh inside WSL. 
+#> + +[CmdletBinding()] +param( + [switch]$NoDelegate, + [switch]$SkipDockerCheck, + [string]$Distro = "", + [string]$ReportPath = "$env:TEMP\\dream-server-windows-preflight.json", + [Parameter(ValueFromRemainingArguments = $true)] + [string[]]$PassthroughArgs +) + +$ErrorActionPreference = "Stop" +$checks = @() + +function Write-Section([string]$Message) { + Write-Host "" + Write-Host $Message -ForegroundColor Cyan +} + +function Add-Check([string]$Id, [string]$Status, [string]$Message, [string]$Action = "") { + $script:checks += [pscustomobject]@{ + id = $Id + status = $Status + message = $Message + action = $Action + } +} + +function Convert-ToWslPath([string]$WindowsPath) { + if ($WindowsPath -match '^([A-Za-z]):\\(.*)$') { + $drive = $Matches[1].ToLower() + $rest = $Matches[2] -replace '\\', '/' + return "/mnt/$drive/$rest" + } + return $WindowsPath -replace '\\', '/' +} + +Write-Host "Dream Server Windows installer (WSL2 path)" -ForegroundColor Cyan + +Write-Section "Checking prerequisites" +if (-not (Get-Command wsl.exe -ErrorAction SilentlyContinue)) { + Write-Host "[ERROR] WSL is not installed." -ForegroundColor Red + Write-Host "Install WSL first: wsl --install" + Add-Check "wsl-installed" "blocker" "WSL is not installed." "Run: wsl --install" +} else { + Add-Check "wsl-installed" "pass" "WSL command is available." +} + +$wslStatus = "" +if (Get-Command wsl.exe -ErrorAction SilentlyContinue) { + try { + $wslStatus = (& wsl.exe --status 2>$null | Out-String) + } catch { } + if ($wslStatus -match "Default Version:\s*2") { + Add-Check "wsl-default-version" "pass" "WSL default version is 2." + } else { + Add-Check "wsl-default-version" "warn" "WSL default version is not clearly set to 2." "Run: wsl --set-default-version 2" + } +} + +$distroList = @() +if (Get-Command wsl.exe -ErrorAction SilentlyContinue) { + $distroList = (& wsl.exe -l -q 2>$null | Where-Object { $_.Trim() -ne "" }) +} +if (-not $distroList) { + Write-Host "[ERROR] No WSL distro found." 
-ForegroundColor Red + Write-Host "Install Ubuntu (example): wsl --install -d Ubuntu" + Add-Check "wsl-distro" "blocker" "No WSL distro found." "Run: wsl --install -d Ubuntu" +} else { + Add-Check "wsl-distro" "pass" "Detected WSL distro(s): $($distroList -join ', ')" +} + +if ([string]::IsNullOrWhiteSpace($Distro)) { + if ($distroList.Count -gt 0) { + $Distro = $distroList[0].Trim() + } +} + +if (-not $SkipDockerCheck) { + if (-not (Get-Command docker -ErrorAction SilentlyContinue)) { + Write-Host "[WARN] docker CLI not found on Windows PATH." -ForegroundColor Yellow + Write-Host "Install Docker Desktop and enable WSL integration." + Add-Check "docker-cli" "warn" "docker CLI not found on Windows PATH." "Install Docker Desktop and reopen terminal." + } else { + Add-Check "docker-cli" "pass" "docker CLI found." + try { + $dockerInfo = docker info 2>$null | Out-String + $null = docker version --format '{{.Server.Version}}' 2>$null + Write-Host "[OK] Docker Desktop engine reachable." + Add-Check "docker-daemon" "pass" "Docker Desktop engine reachable." + if ($dockerInfo -match "WSL2:\s*true") { + Add-Check "docker-wsl2" "pass" "Docker reports WSL2 backend enabled." + } else { + Add-Check "docker-wsl2" "warn" "Docker WSL2 backend not confirmed from docker info output." "Enable 'Use the WSL2 based engine' in Docker Desktop settings." + } + } catch { + Write-Host "[WARN] Docker Desktop not reachable yet." -ForegroundColor Yellow + Write-Host "Start Docker Desktop before running install for real." + Add-Check "docker-daemon" "warn" "Docker Desktop not reachable." "Start Docker Desktop and retry." + } + } +} + +if ($Distro) { + try { + $wslDocker = (& wsl.exe -d $Distro -- bash -lc "command -v docker >/dev/null && echo ok || echo missing" 2>$null).Trim() + if ($wslDocker -eq "ok") { + Add-Check "wsl-docker-cli" "pass" "docker CLI available inside WSL distro '$Distro'." + } else { + Add-Check "wsl-docker-cli" "warn" "docker CLI unavailable inside WSL distro '$Distro'." 
"Enable Docker Desktop WSL integration for this distro." + } + } catch { + Add-Check "wsl-docker-cli" "warn" "Could not verify docker CLI inside WSL distro '$Distro'." "Open WSL and run: docker info" + } +} + +if (Get-Command nvidia-smi -ErrorAction SilentlyContinue) { + Write-Host "[OK] NVIDIA tooling detected on Windows host." + Add-Check "windows-nvidia-smi" "pass" "nvidia-smi available on Windows host." +} else { + Write-Host "[INFO] nvidia-smi not found on Windows host (non-NVIDIA or not installed)." + Add-Check "windows-nvidia-smi" "warn" "nvidia-smi not detected on Windows host." "Install/update NVIDIA driver if targeting NVIDIA acceleration." +} + +if ($Distro) { + try { + $wslNvidia = (& wsl.exe -d $Distro -- bash -lc "if command -v nvidia-smi >/dev/null 2>&1; then nvidia-smi -L >/dev/null 2>&1 && echo ok || echo missing; else echo missing; fi" 2>$null).Trim() + if ($wslNvidia -eq "ok") { + Add-Check "wsl-nvidia-smi" "pass" "NVIDIA GPU visible inside WSL." + } else { + Add-Check "wsl-nvidia-smi" "warn" "NVIDIA GPU not visible inside WSL." "Verify WSL GPU support and Docker Desktop GPU passthrough." + } + } catch { + Add-Check "wsl-nvidia-smi" "warn" "Could not verify NVIDIA GPU inside WSL." 
"Open WSL and run: nvidia-smi" + } +} + +try { + $blockers = @($checks | Where-Object { $_.status -eq "blocker" }).Count + $warnings = @($checks | Where-Object { $_.status -eq "warn" }).Count + $report = [pscustomobject]@{ + version = "1" + generated_at = (Get-Date).ToUniversalTime().ToString("o") + distro = $Distro + summary = [pscustomobject]@{ + checks = $checks.Count + blockers = $blockers + warnings = $warnings + can_proceed = ($blockers -eq 0) + } + checks = $checks + } + $report | ConvertTo-Json -Depth 8 | Set-Content -Path $ReportPath -Encoding UTF8 + Write-Host "[INFO] Preflight report: $ReportPath" +} catch { + Write-Host "[WARN] Could not write preflight report: $($_.Exception.Message)" -ForegroundColor Yellow +} + +if (@($checks | Where-Object { $_.status -eq "blocker" }).Count -gt 0) { + Write-Host "[ERROR] Preflight blockers found. Fix them, then retry." -ForegroundColor Red + $checks | Where-Object { $_.status -eq "blocker" } | ForEach-Object { + Write-Host " - $($_.message)" -ForegroundColor Red + if ($_.action) { Write-Host " Fix: $($_.action)" } + } + exit 1 +} + +$repoRoot = Split-Path -Parent (Split-Path -Parent $PSCommandPath) +$repoRootWsl = Convert-ToWslPath $repoRoot +$argsString = "" +if ($PassthroughArgs) { + $escaped = $PassthroughArgs | ForEach-Object { "'" + ($_ -replace "'", "'\\''") + "'" } + $argsString = ($escaped -join " ") +} + +Write-Section "WSL delegation target" +Write-Host "Repo path (Windows): $repoRoot" +Write-Host "Repo path (WSL): $repoRootWsl" + +$wslCommand = "cd '$repoRootWsl' && bash install-core.sh $argsString" +Write-Host "Command:" +Write-Host " wsl.exe bash -lc `"$wslCommand`"" + +if ($NoDelegate) { + Write-Host "" + Write-Host "Delegation skipped (--NoDelegate)." 
-ForegroundColor Yellow + exit 0 +} + +Write-Section "Running installer in WSL" +if ($Distro) { + & wsl.exe -d $Distro bash -lc $wslCommand +} else { + & wsl.exe bash -lc $wslCommand +} +exit $LASTEXITCODE diff --git a/dream-server/landing.html b/dream-server/landing.html deleted file mode 100644 index 397ab921b..000000000 --- a/dream-server/landing.html +++ /dev/null @@ -1,490 +0,0 @@ - - - - - - Dream Server โ€” One Command to Your Own Local ChatGPT - - - - -
-
- 🚀 Now Available

Your Own ChatGPT.
Running Locally.
One Command.

-

Dream Server is a turnkey local AI stack. Buy hardware, run one command, and you have voice agents, RAG, and workflows running on your own machine. Stop paying per token.

- - - -
-
# Clone and install (requires GPU)
-git clone https://github.com/Light-Heart-Labs/Lighthouse-AI.git
-cd Lighthouse-AI/dream-server
-./install.sh
-
-
-
- -
-
-
-
-

❌ The Problem

-
    -
  • 💸 Paying $0.01-0.06 per 1K tokens, forever
  • 🔒 Your data leaving your network to OpenAI
  • 🚫 Rate limits killing your batch jobs
  • ⏰ API outages at the worst times
  • 🏢 Vendor lock-in to one provider
-
-
-

✅ The Solution

-
    -
  • 💰 One-time hardware cost, unlimited inference
  • 🏠 Everything runs on your machine
  • ⚡ No limits: run as many requests as your GPU handles
  • 🔄 No dependencies on external services
  • 🔓 Open models you can customize
-
-
-
-
- -
-
-

Everything You Need, Out of the Box

-
-
-
🧠
-

Fast LLM Inference

-

vLLM serving Qwen 32B, comparable to GPT-4 on most tasks. 40+ tokens/second on consumer hardware.

-
-
-
💬
-

Beautiful Chat UI

-

Open WebUI provides a familiar ChatGPT-like interface. Multi-user support included.

-
-
-
🎙️
-

Voice Input/Output

-

Whisper for transcription, Piper for text-to-speech. Build voice agents that run locally.

-
-
-
🔍
-

RAG Pipeline

-

Qdrant vector database + embeddings. Ask questions about your documents.

-
-
-
⚙️
-

Workflow Automation

-

n8n for building automations. 4 starter workflows included (daily digest, doc Q&A, voice memo).

-
-
-
🤖
-

Agent Framework

-

Optional OpenClaw integration for autonomous agents that can use tools and browse the web.

-
-
-
-
- -
-
-

Hardware Recommendations

-

Dream Server auto-detects your hardware and configures appropriately

- -
-
-

Entry

-
~$3,000
-
- RTX 4070 (12GB)
- 7B-14B models
- Good for personal use -
-
- -
-

Pro

-
~$15,000
-
- RTX A6000 (48GB)
- 32B-70B models
- For teams and heavy use -
-
-
-

Enterprise

-
~$30,000
-
- Dual GPU setup
- 70B+ models
- High concurrency workloads -
-
-
-
-
- -
-
-

Ready to Own Your AI?

-

Get notified when we launch on Product Hunt, plus receive our hardware buying guide.

- - - -

- Or contact us for a custom setup consultation. -

-
-
- - - - - - diff --git a/dream-server/lib/progress.sh b/dream-server/lib/progress.sh index 96f916c87..3570b54d5 100644 --- a/dream-server/lib/progress.sh +++ b/dream-server/lib/progress.sh @@ -1,6 +1,6 @@ #!/bin/bash # Dream Server โ€” Progress Bar Utilities -# Sourced by setup.sh for download/install progress display +# Sourced by install-core.sh for download/install progress display # โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• # PROGRESS BAR @@ -172,7 +172,7 @@ docker_pull_with_progress() { fi } -# Monitor model download progress (for vLLM/HuggingFace downloads) +# Monitor model download progress (for llama-server/GGUF downloads) # Watches a directory for model files and shows progress monitor_model_download() { local model_dir=$1 diff --git a/dream-server/lib/service-registry.sh b/dream-server/lib/service-registry.sh new file mode 100644 index 000000000..01cd09c16 --- /dev/null +++ b/dream-server/lib/service-registry.sh @@ -0,0 +1,156 @@ +#!/bin/bash +# Service Registry โ€” loads extension manifests and provides lookup functions. +# Source this file: . 
"$SCRIPT_DIR/lib/service-registry.sh" + +EXTENSIONS_DIR="${SCRIPT_DIR:-$(pwd)}/extensions/services" +_SR_LOADED=false +_SR_CACHE="/tmp/dream-service-registry.$$.sh" + +# Associative arrays (bash 4+) +declare -A SERVICE_ALIASES # alias โ†’ service_id +declare -A SERVICE_CONTAINERS # service_id โ†’ container_name +declare -A SERVICE_COMPOSE # service_id โ†’ compose file path +declare -A SERVICE_CATEGORIES # service_id โ†’ core|recommended|optional +declare -A SERVICE_DEPENDS # service_id โ†’ space-separated dependency IDs +declare -A SERVICE_HEALTH # service_id โ†’ health endpoint path +declare -A SERVICE_PORTS # service_id โ†’ external port (what the user hits on localhost) +declare -A SERVICE_PORT_ENVS # service_id โ†’ env var name for the external port +declare -A SERVICE_NAMES # service_id โ†’ display name +declare -A SERVICE_SETUP_HOOKS # service_id โ†’ absolute path to setup script +declare -a SERVICE_IDS # ordered list of all service IDs + +sr_load() { + [[ "$_SR_LOADED" == "true" ]] && return 0 + SERVICE_IDS=() + + # Single Python pass: reads ALL manifests, emits sourceable bash + python3 - "$EXTENSIONS_DIR" <<'PYEOF' > "$_SR_CACHE" +import yaml, sys, os +from pathlib import Path + +ext_dir = Path(sys.argv[1]) +if not ext_dir.exists(): + sys.exit(0) + +for service_dir in sorted(ext_dir.iterdir()): + if not service_dir.is_dir(): + continue + manifest_path = None + for name in ("manifest.yaml", "manifest.yml", "manifest.json"): + candidate = service_dir / name + if candidate.exists(): + manifest_path = candidate + break + if not manifest_path: + continue + try: + with open(manifest_path) as f: + m = yaml.safe_load(f) + if m.get("schema_version") != "dream.services.v1": + continue + s = m.get("service", {}) + sid = s.get("id", "") + if not sid: + continue + aliases = s.get("aliases", []) + container = s.get("container_name", f"dream-{sid}") + compose_file = s.get("compose_file", "") + category = s.get("category", "optional") + depends = s.get("depends_on", []) + 
+ # Resolve compose path (relative to extension dir) + compose_path = "" + if compose_file: + full = service_dir / compose_file + if full.exists(): + compose_path = str(full) + + # Emit sourceable lines + print(f'SERVICE_IDS+=("{sid}")') + print(f'SERVICE_ALIASES["{sid}"]="{sid}"') + for a in aliases: + print(f'SERVICE_ALIASES["{a}"]="{sid}"') + print(f'SERVICE_CONTAINERS["{sid}"]="{container}"') + print(f'SERVICE_COMPOSE["{sid}"]="{compose_path}"') + print(f'SERVICE_CATEGORIES["{sid}"]="{category}"') + print(f'SERVICE_DEPENDS["{sid}"]="{" ".join(depends)}"') + health = s.get("health", "/health") + port = s.get("external_port_default", s.get("port", 0)) + port_env = s.get("external_port_env", "") + print(f'SERVICE_HEALTH["{sid}"]="{health}"') + print(f'SERVICE_PORTS["{sid}"]="{port}"') + print(f'SERVICE_PORT_ENVS["{sid}"]="{port_env}"') + print(f'SERVICE_NAMES["{sid}"]="{s.get("name", sid)}"') + setup_hook = s.get("setup_hook", "") + setup_path = "" + if setup_hook: + full = service_dir / setup_hook + if full.exists(): + setup_path = str(full) + print(f'SERVICE_SETUP_HOOKS["{sid}"]="{setup_path}"') + except Exception: + continue +PYEOF + + # Source the generated registry (one subprocess for all manifests) + [[ -f "$_SR_CACHE" ]] && . 
"$_SR_CACHE" + rm -f "$_SR_CACHE" + _SR_LOADED=true +} + +# Resolve a user-provided name to a compose service ID +sr_resolve() { + sr_load + local input="$1" + echo "${SERVICE_ALIASES[$input]:-$input}" +} + +# Get container name for a service ID +sr_container() { + sr_load + local sid + sid=$(sr_resolve "$1") + echo "${SERVICE_CONTAINERS[$sid]:-dream-$sid}" +} + +# Get compose fragment path for a service ID +sr_compose_file() { + sr_load + local sid + sid=$(sr_resolve "$1") + echo "${SERVICE_COMPOSE[$sid]:-}" +} + +# List all service IDs +sr_list_all() { + sr_load + printf '%s\n' "${SERVICE_IDS[@]}" +} + +# List enabled services (have compose fragments that exist) +sr_list_enabled() { + sr_load + for sid in "${SERVICE_IDS[@]}"; do + local cf="${SERVICE_COMPOSE[$sid]}" + [[ -n "$cf" && -f "$cf" ]] && echo "$sid" + done +} + +# Get display name for a service ID +sr_service_names() { + sr_load + for sid in "${SERVICE_IDS[@]}"; do + printf '%s\t%s\n' "$sid" "${SERVICE_NAMES[$sid]:-$sid}" + done +} + +# Build compose -f flags for all enabled extension services +sr_compose_flags() { + sr_load + local flags="" + for sid in "${SERVICE_IDS[@]}"; do + local cf="${SERVICE_COMPOSE[$sid]}" + [[ -n "$cf" && -f "$cf" ]] && flags="$flags -f $cf" + done + echo "$flags" +} diff --git a/dream-server/livekit.yaml b/dream-server/livekit.yaml deleted file mode 100644 index 8a0d51133..000000000 --- a/dream-server/livekit.yaml +++ /dev/null @@ -1,10 +0,0 @@ - -# Agent dispatch -agent: - dispatch: - enabled: true - -# Room configuration for auto-dispatch -room_defaults: - agent_dispatch: - enabled: true diff --git a/dream-server/manifest.example-major.json b/dream-server/manifest.example-major.json deleted file mode 100644 index 5b573ae4a..000000000 --- a/dream-server/manifest.example-major.json +++ /dev/null @@ -1,31 +0,0 @@ -{ - "version": "3.0.0", - "release_date": "2026-03-01", - "min_version": "2.5.0", - "breaking_changes": true, - "changelog": [ - "NEW: Multi-user support with 
role-based access", - "NEW: Built-in SSL certificate management", - "CHANGED: Database schema for chat history (requires migration)", - "CHANGED: Environment variable names (see migration guide)", - "REMOVED: Legacy config format support" - ], - "docker_images": { - "open-webui": "ghcr.io/open-webui/open-webui:1.0.0", - "n8n": "docker.n8n.io/n8nio/n8n:2.0.0", - "qdrant": "qdrant/qdrant:v2.0.0", - "openclaw": "openclaw/openclaw:v3.0.0", - "postgres": "postgres:16-alpine" - }, - "config_files": [ - "config/litellm/config.yaml", - "config/openclaw/openclaw.json", - "config/nginx/ssl.conf" - ], - "migrations": [ - "migrate-v2-to-v3-database.sh", - "migrate-v2-to-v3-env.sh" - ], - "migration_notes": "This is a MAJOR update with breaking changes. Please review the migration guide before updating.", - "pre_update_warning": "You must be on version 2.5.0 or higher before updating to 3.0.0" -} diff --git a/dream-server/manifest.example.json b/dream-server/manifest.example.json deleted file mode 100644 index 998c49e4c..000000000 --- a/dream-server/manifest.example.json +++ /dev/null @@ -1,24 +0,0 @@ -{ - "version": "2.1.0", - "release_date": "2026-02-15", - "min_version": "2.0.0", - "breaking_changes": false, - "changelog": [ - "Added automatic update system (dream-update.sh)", - "Improved backup and restore reliability", - "Fixed n8n webhook configuration persistence", - "Updated Open WebUI to latest stable version" - ], - "docker_images": { - "open-webui": "ghcr.io/open-webui/open-webui:0.5.7", - "n8n": "docker.n8n.io/n8nio/n8n:1.78.0", - "qdrant": "qdrant/qdrant:v1.13.0", - "openclaw": "openclaw/openclaw:latest" - }, - "config_files": [ - "config/litellm/config.yaml", - "config/openclaw/openclaw.json" - ], - "migrations": [], - "migration_notes": "No migrations required for this update (patch release)" -} diff --git a/dream-server/manifest.json b/dream-server/manifest.json new file mode 100644 index 000000000..0a4bf3930 --- /dev/null +++ b/dream-server/manifest.json @@ 
-0,0 +1,64 @@ +{ + "manifestVersion": "1.0.0", + "release": { + "version": "2.0.0", + "channel": "stable", + "date": "2026-03-03" + }, + "compatibility": { + "os": { + "linux": { + "supported": true, + "notes": "Primary support target" + }, + "windows_wsl2": { + "supported": true, + "notes": "Tier B path via WSL2 + Docker Desktop" + }, + "macos": { + "supported": false, + "notes": "Installer dispatch stub present, runtime support pending" + }, + "windows_native": { + "supported": false, + "notes": "Installer stub present, production workflow pending" + } + }, + "gpuBackends": { + "amd": { + "supported": true, + "minUnifiedMemoryGb": 64 + }, + "nvidia": { + "supported": true, + "minVramGb": 8 + } + }, + "dependencies": { + "dockerComposeV2": ">=2.0.0", + "python": ">=3.10" + } + }, + "contracts": { + "compose": { + "canonical": [ + "docker-compose.base.yml", + "docker-compose.amd.yml", + "docker-compose.nvidia.yml", + "docker-compose.apple.yml" + ], + "legacyFallback": [ + "docker-compose.yml" + ] + }, + "workflowCatalog": { + "canonicalPath": "config/n8n/catalog.json", + "legacyFallbackPath": "workflows/catalog.json" + }, + "extensions": { + "serviceManifestSchema": "extensions/schema/service-manifest.v1.json", + "serviceDirectory": "extensions/services/", + "serviceRegistryLib": "lib/service-registry.sh" + } + } +} diff --git a/dream-server/memory-shepherd/README.md b/dream-server/memory-shepherd/README.md new file mode 100644 index 000000000..02dff29e9 --- /dev/null +++ b/dream-server/memory-shepherd/README.md @@ -0,0 +1,277 @@ +# Memory Shepherd + +Periodic memory reset for persistent LLM agents. Keeps agents on-mission by archiving their scratch notes and resetting their memory files to a known-good baseline. + +## The Problem + +Persistent LLM agents accumulate state over time. Their working memory fills with stale notes, outdated context, and resolved task details. 
Without intervention:
+
+- Agents **drift from their defined roles**, gradually shifting behavior as old context influences new decisions
+- Context becomes **bloated with irrelevant information**, degrading response quality
+- Agents sometimes **rewrite their own instructions**, subtly altering their operating parameters
+- Stale context creates **confusion between past and present tasks**
+
+## The Solution
+
+Memory Shepherd implements a simple pattern:
+
+1. **Baseline**: A curated identity document (who the agent is, its rules, capabilities, and pointers) lives above a `---` separator in the agent's `MEMORY.md`
+2. **Scratch notes**: The agent writes working notes below the separator during operation
+3. **Reset cycle**: On a schedule (default: every 3 hours), scratch notes are archived and `MEMORY.md` is restored to the baseline
+
+The result: agents always start from a clean, operator-controlled state while their accumulated notes are preserved in timestamped archives.
+
+## How It Works
+
+```
+MEMORY.md
+┌─────────────────────────────────────────┐
+│  ## Who I Am                            │
+│  ## Critical Rules                      │  ← Baseline (operator-controlled)
+│  ## Capabilities                        │    Never modified by the agent
+│  ## Where to Find Things                │
+├─────────────────────────────────────────┤
+│  ---                                    │  ← Separator (the contract)
+├─────────────────────────────────────────┤
+│  ## Scratch Notes                       │
+│  - Found bug in auth module             │  ← Agent scratch notes
+│  - PR #42 approved, waiting on CI       │    Written during operation
+│  - Need to follow up on deployment      │    Archived + cleared on reset
+└─────────────────────────────────────────┘
+```
+
+Each reset cycle:
+1. Reads the current `MEMORY.md`
+2. Finds the last `---` separator
+3. Extracts everything below it (scratch notes)
+4. Archives scratch notes to a timestamped file
+5. Atomically replaces `MEMORY.md` with the baseline
+6. Cleans up archives older than 30 days (configurable)
+
+## Quick Start
+
+```bash
+# Clone the repo
+git clone https://github.com/Light-Heart-Labs/DreamServer.git
+cd DreamServer/dream-server/memory-shepherd
+
+# Create your config from the example
+cp memory-shepherd.conf.example memory-shepherd.conf
+
+# Edit the config: point it at your agent's MEMORY.md
+vim memory-shepherd.conf
+
+# Create a baseline for your agent
+cp baselines/example-agent-MEMORY.md baselines/my-agent-MEMORY.md
+vim baselines/my-agent-MEMORY.md
+
+# Test a manual reset
+./memory-shepherd.sh my-agent
+
+# Install systemd timers for automatic resets
+./install.sh
+```
+
+## Configuration Reference
+
+Memory Shepherd uses an INI-style config file. The search order is:
+
+1. `$MEMORY_SHEPHERD_CONF` environment variable
+2. `./memory-shepherd.conf` (next to the script)
+3. 
`/etc/memory-shepherd/memory-shepherd.conf` + +### `[general]` Section + +| Key | Default | Description | +|-----|---------|-------------| +| `baseline_dir` | `./baselines` | Directory containing baseline MEMORY.md files | +| `archive_dir` | `./archives` | Root directory for archived scratch notes | +| `max_memory_size` | `16384` | Max memory file size (bytes) before warning | +| `archive_retention_days` | `30` | Delete archives older than this | +| `separator` | `---` | The line that separates baseline from scratch notes | + +### Agent Sections + +Each `[agent-name]` section defines one managed agent: + +| Key | Required | Description | +|-----|----------|-------------| +| `memory_file` | Yes* | Absolute path to the agent's MEMORY.md | +| `baseline` | Yes | Filename of the baseline in `baseline_dir` | +| `archive_subdir` | No | Subdirectory under `archive_dir` (default: agent name) | +| `remote_host` | No | Hostname/IP for remote agents (triggers SCP mode) | +| `remote_user` | No | SSH user for remote agents (default: current user) | +| `remote_memory` | Yes* | Path to MEMORY.md on the remote machine | + +*`memory_file` is required for local agents; `remote_memory` is required when `remote_host` is set. + +### Example Config + +```ini +[general] +baseline_dir=./baselines +archive_dir=./archives +max_memory_size=16384 +archive_retention_days=30 + +[code-reviewer] +memory_file=/home/deploy/code-reviewer/.openclaw/workspace/MEMORY.md +baseline=code-reviewer-MEMORY.md + +[monitor-bot] +memory_file=/home/deploy/monitor/.openclaw/workspace/MEMORY.md +baseline=monitor-bot-MEMORY.md +archive_subdir=monitor + +[remote-agent] +remote_host=10.0.0.50 +remote_user=deploy +remote_memory=/home/deploy/agent/.openclaw/workspace/MEMORY.md +baseline=remote-agent-MEMORY.md +``` + +## The `---` Separator Convention + +The separator is a contract between the operator and the agent: + +**Above the line** is the operator's domain. 
It defines who the agent is, what rules it follows, what tools it has, and where to find things. The agent must never modify this section. + +**Below the line** is the agent's domain. It's scratch space for working notes, observations, task tracking, and anything the agent needs during its current work cycle. + +For this contract to work, the agent needs to know about it. Include a brief explanation in your baseline: + +```markdown +*This is your baseline memory. You can add notes below the --- line. +Your additions will be periodically archived and this file reset to baseline.* +``` + +See [docs/WRITING-BASELINES.md](docs/WRITING-BASELINES.md) for a comprehensive guide to writing effective baselines. + +## Writing Effective Baselines + +A good baseline answers: "If this agent lost all memory, what does it need to start working correctly?" + +Key sections: +- **Identity** โ€” Role, purpose, who it reports to +- **Rules** โ€” 5-7 hard boundaries (specific and actionable, not vague) +- **Autonomy tiers** โ€” What it can do freely vs. what needs approval +- **Capabilities** โ€” Models, tools, services it can access +- **Pointers** โ€” Where to find docs, repos, configs (point, don't paste) +- **Memory system** โ€” Explain the reset cycle so the agent writes better notes + +**Size sweet spot:** 12-20KB. Under 5KB means the agent will spend cycles rediscovering context. Over 25KB means you're probably including content that belongs in separate docs. + +The full guide is at [docs/WRITING-BASELINES.md](docs/WRITING-BASELINES.md). + +## Systemd Timers + +`install.sh` creates systemd timer/service pairs: + +- **`memory-shepherd.timer`** โ€” Resets all agents every 3 hours (enabled by default) +- **`memory-shepherd-.timer`** โ€” Per-agent timer with staggered scheduling (installed but not enabled) + +```bash +# Install timers (detects root vs. 
user mode automatically) +./install.sh + +# Preview without installing +./install.sh --dry-run + +# Custom systemd prefix +./install.sh --prefix /etc/systemd/system + +# Remove all timers +./uninstall.sh + +# Manual reset +./memory-shepherd.sh all # Reset all agents +./memory-shepherd.sh code-reviewer # Reset one agent + +# Check timer status +systemctl list-timers | grep memory +journalctl -u memory-shepherd # View logs +``` + +## Optional: File Integrity Protection + +The baseline files in `baselines/` are critical โ€” if they get corrupted or overwritten, your agents get bad resets. For production deployments, consider: + +**Immutable flag (simple):** +```bash +# Prevent modification of baseline files +sudo chattr +i baselines/*.md + +# To update a baseline, temporarily remove the flag +sudo chattr -i baselines/my-agent-MEMORY.md +vim baselines/my-agent-MEMORY.md +sudo chattr +i baselines/my-agent-MEMORY.md +``` + +**Checksum validation (paranoid):** +```bash +# Generate checksums after writing baselines +sha256sum baselines/*.md > baselines/.checksums + +# Add a pre-reset check to your workflow +sha256sum --check baselines/.checksums || echo "BASELINE TAMPERING DETECTED" +``` + +**Watchdog process:** For multi-agent systems where agents have filesystem access, a separate watchdog that monitors baseline integrity and auto-restores from a protected backup adds another layer of defense. 
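
The checksum approach above can be wrapped in a small pre-reset guard. What follows is a minimal sketch, not part of Memory Shepherd itself: the function name `verify_baselines` is hypothetical, and it assumes GNU `sha256sum` and a checksum file generated from the directory that contains `baselines/`, as in the checksum example.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Memory Shepherd): check baseline
# files against recorded checksums before a reset is allowed to run.
# Assumes GNU coreutils sha256sum and that it is invoked from the directory
# where `sha256sum baselines/*.md > baselines/.checksums` was run.

verify_baselines() {
  local checksum_file="${1:-baselines/.checksums}"

  # Without a checksum file there is nothing to verify against.
  if [[ ! -f "$checksum_file" ]]; then
    echo "checksum file not found: $checksum_file" >&2
    return 2
  fi

  # --check verifies every listed file; --quiet suppresses the per-file
  # "OK" lines so only mismatches are printed. Exit status is non-zero
  # if any file fails verification.
  sha256sum --quiet --check "$checksum_file"
}
```

One way to use it is `verify_baselines && ./memory-shepherd.sh all`, so that a reset only runs when every baseline still matches its recorded checksum.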
+
+## Architecture
+
+```
+                    ┌──────────────────┐
+                    │  systemd timer   │
+                    │  (every 3 hours) │
+                    └────────┬─────────┘
+                             │
+                             ▼
+                 ┌────────────────────────┐
+                 │  memory-shepherd.sh    │
+                 │  reads config, loops   │
+                 │  over agents           │
+                 └────────────┬───────────┘
+                              │
+               ┌──────────────┼──────────────┐
+               ▼              ▼              ▼
+        ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
+        │ Agent A      │ │ Agent B      │ │ Agent C      │
+        │ (local)      │ │ (local)      │ │ (remote)     │
+        └──────┬───────┘ └──────┬───────┘ └──────┬───────┘
+               │                │                │
+               ▼                ▼                ▼
+        ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
+        │ 1. Read      │ │ 1. Read      │ │ 1. SCP down  │
+        │ 2. Extract   │ │ 2. Extract   │ │ 2. Extract   │
+        │    scratch   │ │    scratch   │ │    scratch   │
+        │ 3. Archive   │ │ 3. Archive   │ │ 3. Archive   │
+        │ 4. Reset     │ │ 4. Reset     │ │ 4. SCP up    │
+        └──────────────┘ └──────────────┘ └──────────────┘
+                              │
+                              ▼
+                    ┌──────────────────┐
+                    │  archives/       │
+                    │  ├── agent-a/    │
+                    │  │   └── *.md    │
+                    │  ├── agent-b/    │
+                    │  │   └── *.md    │
+                    │  └── agent-c/    │
+                    │      └── *.md    │
+                    └──────────────────┘
+```
+
+## Safety Features
+
+- **Lock file** prevents concurrent resets from overlapping
+- **Stale lock detection** auto-removes locks older than 2 minutes
+- **Baseline size validation** refuses to reset if the baseline is under 1000 bytes (likely corrupt)
+- **Atomic file replacement** uses copy-then-move to prevent partial writes
+- **Missing separator handling** backs up the entire memory file before resetting
+- **Missing memory file handling** creates from baseline instead of failing
+- **Archive retention** automatically cleans up old archives
+- **Log rotation** prevents unbounded log growth
+
+## License
+
+Apache 2.0; see [LICENSE](../LICENSE).
diff --git a/dream-server/memory-shepherd/baselines/dream-agent-AGENTS.md b/dream-server/memory-shepherd/baselines/dream-agent-AGENTS.md
new file mode 100644
index 000000000..3d698ddfb
--- /dev/null
+++ b/dream-server/memory-shepherd/baselines/dream-agent-AGENTS.md
@@ -0,0 +1,44 @@
+# AGENTS.md - Your Workspace
+
+This folder is home. Treat it that way.
+
+## Every Session
+
+Before doing anything else:
+
+1. Read `SOUL.md`: this is who you are
+2. Read `USER.md`: this is who you're helping
+3. Read `memory/YYYY-MM-DD.md` (today + yesterday) for recent context
+4. **If in MAIN SESSION** (direct chat with your human): Also read `MEMORY.md`
+
+Don't ask permission. Just do it.
+
+## Memory
+
+You wake up fresh each session. 
These files are your continuity: + +- **Daily notes:** `memory/YYYY-MM-DD.md` (create `memory/` if needed) โ€” raw logs of what happened +- **Long-term:** `MEMORY.md` โ€” your curated memories, like a human's long-term memory + +Capture what matters. Decisions, context, things to remember. + +### Write It Down - No "Mental Notes"! + +- Memory is limited โ€” if you want to remember something, WRITE IT TO A FILE +- "Mental notes" don't survive session restarts. Files do. +- When someone says "remember this" โ†’ update `memory/YYYY-MM-DD.md` or relevant file + +## Safety + +- Don't exfiltrate private data. Ever. +- Don't run destructive commands without asking. +- `trash` > `rm` (recoverable beats gone forever) +- When in doubt, ask. + +## Tools + +Skills provide your tools. When you need one, check its `SKILL.md`. Keep local notes in `TOOLS.md`. + +## Make It Yours + +This is a starting point. Add your own conventions, style, and rules as you figure out what works. diff --git a/dream-server/memory-shepherd/baselines/dream-agent-MEMORY.md b/dream-server/memory-shepherd/baselines/dream-agent-MEMORY.md new file mode 100644 index 000000000..41304129c --- /dev/null +++ b/dream-server/memory-shepherd/baselines/dream-agent-MEMORY.md @@ -0,0 +1,53 @@ +# MEMORY.md - Todd's Long-Term Memory + +## Dream Server โ€” System Knowledge + +### Hardware +- **CPU:** AMD Ryzen AI MAX+ 395 (Strix Halo) +- **GPU:** Radeon 8060S (RDNA 3.5, gfx1151) +- **Memory:** 128GB unified (96GB VRAM / 32GB CPU, configured in BIOS) +- **Machine:** GMKtec NucBox EVO-X2 + +### Inference Stack +- **Model:** qwen3-coder-next (80B MoE, 3B active params, ~52GB) +- **Format:** GGUF (Q4_K_M quantization, from unsloth/Qwen3-Coder-Next-GGUF) +- **Backend:** llama-server via ROCm 7.2 (NOT Ollama, NOT Vulkan) +- **Container:** kyuz0/amd-strix-halo-toolboxes:rocm-7.2 +- **Context:** 32,768 tokens +- **Key flags:** `-fa on --no-mmap -ngl 999 --jinja` +- **Env:** `ROCBLAS_USE_HIPBLASLT=0`, 
`HSA_OVERRIDE_GFX_VERSION=11.5.1`
+
+### Services & Ports
+| Service | Port | Profile | Notes |
+|---------|------|---------|-------|
+| Open WebUI | 3000 | default | Chat interface, connects via OpenAI-compatible API |
+| Dashboard | 3001 | default | React (Vite) system dashboard |
+| Dashboard API | 3002 | default | FastAPI backend for dashboard |
+| SearXNG | 8888 | default | Self-hosted metasearch (internal: searxng:8080) |
+| LiteLLM | 4000 | monitoring | Proxy/router |
+| n8n | 5678 | workflows | Workflow automation |
+| Qdrant | 6333 | rag | Vector database for RAG |
+| OpenClaw | 7860 | openclaw | That's me! Agent interface |
+| Embeddings | 8090 | rag | Text embeddings service |
+| Kokoro TTS | 8880 | voice | Text-to-speech |
+| Whisper STT | 9000 | voice | Speech-to-text |
+| llama-server | 11434 | default | LLM inference (OpenAI-compatible) |
+| ComfyUI | 8188 | comfyui | Image generation |
+
+### How I Can Help Users
+- **Web search:** I have a native `web_search` tool backed by SearXNG. Use it for current info, docs, or anything beyond training data
+- **Chat:** Open WebUI at port 3000, the main conversational interface
+- **Workflows:** n8n at port 5678: automate tasks, connect services, build pipelines
+- **Voice:** Whisper (STT) + Kokoro (TTS): voice input/output for the chat
+- **RAG:** Qdrant + Embeddings: upload documents, chat with your data
+- **Dashboard:** Port 3001: monitor system status, GPU usage, model info
+- **Image gen:** ComfyUI at port 8188: local image generation
+- **Automation ideas:** RSS feeds, scheduled summaries, webhook integrations via n8n
+
+### Key Technical Notes
+- Everything runs locally: zero cloud dependency, total privacy
+- Zero cost per token: all inference on local hardware
+- Web search via SearXNG: self-hosted, no API keys, aggregates DuckDuckGo/Google/Brave/Wikipedia/GitHub/StackOverflow
+- ROCm 7.2 is required (Vulkan crashes on qwen3-coder-next architecture)
+- Services behind
profiles must be enabled: `COMPOSE_PROFILES=voice,rag,workflows,openclaw`
+- Docker compose files: `docker-compose.base.yml` + `docker-compose.amd.yml`
diff --git a/dream-server/memory-shepherd/baselines/dream-agent-TOOLS.md b/dream-server/memory-shepherd/baselines/dream-agent-TOOLS.md
new file mode 100644
index 000000000..c92a0c7bb
--- /dev/null
+++ b/dream-server/memory-shepherd/baselines/dream-agent-TOOLS.md
@@ -0,0 +1,28 @@
+# TOOLS.md - Dream Server Service Map
+
+## Services
+
+| Service | Docker Hostname | External Port |
+|---------|-----------------|---------------|
+| llama-server (LLM) | llama-server | 11434 |
+| Open WebUI | open-webui | 3000 |
+| SearXNG (search) | searxng | 8888 |
+| Dashboard | dashboard | 3001 |
+| Dashboard API | dashboard-api | 3002 |
+| Whisper STT | whisper | 9000 (voice profile) |
+| Kokoro TTS | tts | 8880 (voice profile) |
+| n8n Workflows | n8n | 5678 (workflows profile) |
+| Qdrant (RAG) | qdrant | 6333 (rag profile) |
+| Embeddings | embeddings | 8090 (rag profile) |
+| OpenClaw | openclaw | 7860 (openclaw profile) |
+| ComfyUI | comfyui | 8188 (comfyui profile) |
+
+## Network
+
+All services share `dream-network`. Use Docker hostnames for inter-service calls.
+
+Compose files: `docker-compose.base.yml` + GPU overlay (amd/nvidia/apple)
+
+## Web Search
+
+You have `web_search` (hits SearXNG) and `web_fetch` (loads page content). No API keys needed.
diff --git a/dream-server/memory-shepherd/baselines/example-agent-MEMORY.md b/dream-server/memory-shepherd/baselines/example-agent-MEMORY.md
new file mode 100644
index 000000000..b682e640d
--- /dev/null
+++ b/dream-server/memory-shepherd/baselines/example-agent-MEMORY.md
@@ -0,0 +1,81 @@
+# MEMORY.md - {Agent Name}
+
+*This is your baseline memory. You can add notes below the --- line.
+Your additions will be periodically archived and this file reset to baseline.
+For anything worth keeping long-term, write it to your project repo.*
+
+## Who I Am
+
+{Your role definition: what this agent does, who it reports to, what its purpose is.}
+
+**Name:** {Agent Name}
+**Role:** {Primary function, e.g., "Infrastructure automation", "Code review", "Research assistant"}
+**Operator:** {Who manages this agent}
+
+## The Team
+
+| Agent | Role | How to Reach |
+|-------|------|--------------|
+| {Agent A} | {Role description} | {Communication channel} |
+| {Agent B} | {Role description} | {Communication channel} |
+
+## Critical Rules (Never Violate)
+
+1. {Rule about what the agent must never do, e.g., "Never modify files outside your workspace without approval"}
+2. {Rule about safety boundaries, e.g., "Never execute destructive commands on production systems"}
+3. {Rule about communication, e.g., "Never impersonate another agent or the operator"}
+4. {Rule about scope, e.g., "Never modify this baseline section of MEMORY.md"}
+5. {Rule about escalation, e.g., "Always escalate if unsure: ask, don't guess"}
+
+## Work Habits
+
+- {Standing order, e.g., "Check for new tasks every cycle"}
+- {Quality standard, e.g., "Test all code changes before committing"}
+- {Communication norm, e.g., "Log significant decisions with reasoning"}
+- {Housekeeping, e.g., "Clean up temporary files after use"}
+
+## Autonomy Tiers
+
+**Do freely (no approval needed):**
+- {Action the agent can take independently}
+- {Another autonomous action}
+
+**Do, then notify operator:**
+- {Action that should be reported after the fact}
+- {Another notify-after action}
+
+**Ask before doing:**
+- {Action requiring explicit approval}
+- {Another approval-required action}
+
+**Never do:**
+- {Hard-forbidden action}
+- {Another forbidden action}
+
+## My Capabilities
+
+**Model:** {e.g., "Claude Sonnet 4.5 via API"}
+**Tools:** {e.g., "Bash, file I/O, web search, code execution"}
+**Services:** {e.g., "GitHub API, CI/CD pipeline,
monitoring dashboard"}
+**Communication:** {e.g., "Shared log file, message queue, Discord webhook"}
+
+## Where to Find Things
+
+| What | Where |
+|------|-------|
+| Project repo | {/path/to/repo} |
+| Configuration | {/path/to/config} |
+| Logs | {/path/to/logs} |
+| Shared docs | {/path/to/docs} |
+| Other agents' workspaces | {/path/to/workspaces} |
+
+## How to Persist Knowledge
+
+- **Short-term (scratch notes):** Write below the `---` line in this file. These notes will be archived and cleared on the next reset cycle.
+- **Medium-term (workspace):** Save to files in your workspace directory. Survives memory resets but may be cleaned up periodically.
+- **Long-term (permanent):** Commit to the project repository. This is the only truly persistent storage.
+
+**Remember:** Anything below the `---` line is temporary. If it matters, move it somewhere permanent before it gets archived.
+
+---
+## Scratch Notes (Added by {Agent Name}; will be archived on reset)
diff --git a/dream-server/memory-shepherd/docs/WRITING-BASELINES.md b/dream-server/memory-shepherd/docs/WRITING-BASELINES.md
new file mode 100644
index 000000000..a3016fd46
--- /dev/null
+++ b/dream-server/memory-shepherd/docs/WRITING-BASELINES.md
@@ -0,0 +1,159 @@
+# Writing Effective Baselines
+
+A baseline is the persistent identity of your agent. It's everything above the `---` separator in MEMORY.md, the part that survives every reset cycle. This guide covers how to write baselines that keep agents focused, capable, and aligned.
+
+## What a Baseline Is
+
+A baseline is the answer to: "If this agent lost all memory of what it's been doing, what does it need to know to continue operating correctly?"
+
+It is NOT a task list. It's not a conversation history. It's the agent's constitution: its identity, rules, capabilities, and pointers to where everything lives.
+
+## What Makes a Good Baseline
+
+### 1. Identity First
+
+Start with who the agent is. This anchors everything else.
+
+```markdown
+## Who I Am
+I am CodeReviewer, an automated code review agent. I review pull requests
+on the main repository, flag issues, suggest improvements, and approve
+changes that meet quality standards. I report to the engineering lead.
+```
+
+Be specific. "I am a helpful assistant" is useless. "I review PRs on the acme-corp/backend repo, focusing on security and performance" gives the agent something to work with.
+
+### 2. Rules That Actually Matter
+
+Don't write 50 rules. Write 5-7 that matter enough to never violate. These are your hard boundaries.
+
+Good rules are specific and actionable:
+- "Never push directly to the main branch"
+- "Never modify another agent's MEMORY.md"
+- "Always run tests before committing"
+
+Bad rules are vague:
+- "Be careful" (with what?)
+- "Don't do anything dangerous" (define dangerous)
+- "Follow best practices" (which ones?)
+
+### 3. Autonomy Tiers
+
+The most effective pattern we've found is explicit autonomy tiers. Agents need to know what they can do freely, what needs a heads-up, and what needs approval.
+
+```markdown
+## Autonomy Tiers
+
+**Do freely:** Read files, run tests, draft PRs, update scratch notes
+**Do then notify:** Merge approved PRs, update documentation
+**Ask first:** Change CI/CD config, modify shared infrastructure
+**Never do:** Delete branches, modify production databases, bypass review
+```
+
+This eliminates the "should I ask or just do it?" hesitation that wastes cycles.
+
+For a deeper dive into autonomy tiers and infrastructure protection, see
+the Guardian service configuration in `docker-compose.base.yml`.
+
+### 4. Capabilities and Tools
+
+Tell the agent what it can actually use. Agents that know their tools are dramatically more effective than ones guessing.
+
+```markdown
+## My Capabilities
+**Model:** Claude Sonnet 4.5 via API
+**Tools:** Bash, file I/O, web search, GitHub CLI
+**Can access:** Internal wiki, CI logs, monitoring dashboard
+**Cannot access:** Production database, customer data, billing system
+```
+
+### 5. Pointers, Not Content
+
+A baseline should point to information, not contain it. Don't paste your entire project architecture into MEMORY.md; point to where the docs live.
+
+```markdown
+## Where to Find Things
+| What | Where |
+|------|-------|
+| Architecture docs | /docs/ARCHITECTURE.md |
+| API reference | /docs/API.md |
+| Deployment guide | /ops/DEPLOY.md |
+| Team contacts | /docs/TEAM.md |
+```
+
+This keeps baselines small and avoids stale copies of information that lives elsewhere.
+
+## What NOT to Put in a Baseline
+
+- **Current tasks or status:** That's what scratch notes are for
+- **Conversation context:** Each session starts fresh; the baseline provides enough to start working
+- **Frequently changing data:** API endpoints that rotate, version numbers, deployment targets. Point to a config file instead.
+- **Long reference material:** Don't paste a 50-line API reference. Link to it.
+- **Other agents' details:** A brief team table is fine, but don't include their full capabilities or instructions
+
+## The Scratch Notes Contract
+
+The `---` separator is a contract between you (the operator) and the agent:
+
+- **Above the line:** The operator controls this. The agent must not modify it. It defines who the agent is and how it operates.
+- **Below the line:** The agent controls this. It's scratch space for working notes, observations, and state that helps during the current work cycle.
+
+Include this contract in the baseline itself so the agent understands the system:
+
+```markdown
+## How to Persist Knowledge
+- **Short-term:** Write below the `---` line. These notes get archived on reset.
+- **Medium-term:** Save files in your workspace directory.
+- **Long-term:** Commit to the project repository.
+```
+
+Agents that understand the reset system write better notes: they prioritize what matters and move important discoveries to permanent storage before the next cycle.
+
+## Size Guidelines
+
+From our experience running multi-agent systems:
+
+- **Too small (< 5KB):** Not enough context. The agent spends cycles rediscovering things.
+- **Sweet spot (12-20KB):** Enough to fully specify identity, rules, capabilities, and pointers.
+- **Too large (> 25KB):** You're probably including content that should be in separate docs. The baseline becomes hard to maintain and review.
+
+The minimum safety threshold in memory-shepherd defaults to 500 bytes (`min_baseline_size` in the config). Anything smaller than that is almost certainly corrupt or empty, and the reset will abort rather than overwrite a working memory file with garbage.
+
+## Section Ordering
+
+We recommend this order, which flows from "who am I" to "how do I work":
+
+1. **Who I Am:** Identity and role
+2. **The Team:** Who else is involved
+3. **Critical Rules:** Hard boundaries
+4. **Work Habits:** Standing orders and norms
+5. **Autonomy Tiers:** What needs approval vs. what doesn't
+6. **My Capabilities:** Tools, models, access
+7. **Where to Find Things:** Pointers to persistent information
+8. **How to Persist Knowledge:** The memory system explanation
+
+The exact sections don't matter as much as having a consistent structure that the agent encounters the same way every reset cycle. Consistency breeds reliability.
+
+## Teaching Agents About the System
+
+The most important trick: include an explanation of the memory system in the baseline itself. Agents that know their memory gets reset behave differently, and better, than agents that don't.
+
+```markdown
+*This is your baseline memory. You can add notes below the --- line.
+Your additions will be periodically archived and this file reset to baseline.
+For anything worth keeping long-term, write it to your project repo.*
+```
+
+This one paragraph, placed at the top of every baseline, completely changes agent behavior. Instead of treating MEMORY.md as a permanent document, they treat scratch notes as what they are: temporary working memory that needs to be distilled and externalized.
+
+## Reviewing and Updating Baselines
+
+Baselines aren't write-once. Review them when:
+
+- The agent's role changes
+- You notice the agent repeatedly rediscovering the same information (add it to the baseline)
+- You notice the agent consistently ignoring a rule (simplify or remove it; unenforced rules add noise)
+- The team structure changes
+- New tools or capabilities are added
+
+When updating a baseline, update the file in your `baselines/` directory. The next reset cycle will automatically propagate the change.
diff --git a/dream-server/memory-shepherd/install.sh b/dream-server/memory-shepherd/install.sh
new file mode 100644
index 000000000..77ef1a124
--- /dev/null
+++ b/dream-server/memory-shepherd/install.sh
@@ -0,0 +1,264 @@
+#!/bin/bash
+# install.sh - Generate and install systemd timers for memory-shepherd
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SHEPHERD="$SCRIPT_DIR/memory-shepherd.sh"
+DRY_RUN=false
+PREFIX=""
+USER_MODE=false
+
+# ── Usage ──────────────────────────────────────────────────────────────
+
+usage() {
+    cat <&2; exit 1 ;;
+    esac
+done
+
+# ── Detect Mode ────────────────────────────────────────────────────────
+
+if [ -z "$PREFIX" ]; then
+    if [ "$(id -u)" -eq 0 ]; then
+        PREFIX="/etc/systemd/system"
+    else
+        PREFIX="$HOME/.config/systemd/user"
+        USER_MODE=true
+    fi
+else
+    # If custom prefix is under user config,
use user mode
+    [[ "$PREFIX" == *".config/systemd/user"* ]] && USER_MODE=true
+fi
+
+SYSTEMCTL_FLAG=""
+$USER_MODE && SYSTEMCTL_FLAG="--user"
+
+# ── Config Parser (minimal - just need agent names) ────────────────────
+
+declare -A CONFIG
+AGENTS=()
+
+find_config() {
+    if [ -n "${MEMORY_SHEPHERD_CONF:-}" ] && [ -f "$MEMORY_SHEPHERD_CONF" ]; then
+        echo "$MEMORY_SHEPHERD_CONF"
+    elif [ -f "$SCRIPT_DIR/memory-shepherd.conf" ]; then
+        echo "$SCRIPT_DIR/memory-shepherd.conf"
+    elif [ -f "/etc/memory-shepherd/memory-shepherd.conf" ]; then
+        echo "/etc/memory-shepherd/memory-shepherd.conf"
+    else
+        return 1
+    fi
+}
+
+parse_config() {
+    local conf_file="$1"
+    local section=""
+    while IFS= read -r line; do
+        line="${line%%#*}"
+        line="${line#"${line%%[![:space:]]*}"}"
+        line="${line%"${line##*[![:space:]]}"}"
+        [[ -z "$line" ]] && continue
+
+        if [[ "$line" =~ ^\[([a-zA-Z0-9_-]+)\]$ ]]; then
+            section="${BASH_REMATCH[1]}"
+            if [[ "$section" != "general" ]]; then
+                AGENTS+=("$section")
+            fi
+            continue
+        fi
+
+        if [[ "$line" =~ ^([a-zA-Z_][a-zA-Z0-9_]*)=(.*)$ ]]; then
+            CONFIG["${section}.${BASH_REMATCH[1]}"]="${BASH_REMATCH[2]}"
+        fi
+    done < "$conf_file"
+}
+
+CONF_FILE=$(find_config) || {
+    echo "ERROR: No memory-shepherd.conf found." >&2
+    echo "Create one from memory-shepherd.conf.example first." >&2
+    exit 1
+}
+
+parse_config "$CONF_FILE"
+
+if [ ${#AGENTS[@]} -eq 0 ]; then
+    echo "ERROR: No agents defined in $CONF_FILE" >&2
+    exit 1
+fi
+
+echo "Config: $CONF_FILE"
+echo "Agents: ${AGENTS[*]}"
+echo "Prefix: $PREFIX"
+echo "Mode: $($USER_MODE && echo "user" || echo "system")"
+echo ""
+
+# ── Create Directories ─────────────────────────────────────────────────
+
+BASELINE_DIR="${CONFIG[general.baseline_dir]:-./baselines}"
+ARCHIVE_DIR="${CONFIG[general.archive_dir]:-./archives}"
+# Expand a leading "~/" by hand: tilde expansion is not applied to values read from the config file
+[[ "$BASELINE_DIR" == "~/"* ]] && BASELINE_DIR="$HOME/${BASELINE_DIR:2}"
+[[ "$ARCHIVE_DIR" == "~/"* ]] && ARCHIVE_DIR="$HOME/${ARCHIVE_DIR:2}"
+[[ "$BASELINE_DIR" != /* ]] && BASELINE_DIR="$SCRIPT_DIR/$BASELINE_DIR"
+[[ "$ARCHIVE_DIR" != /* ]] && ARCHIVE_DIR="$SCRIPT_DIR/$ARCHIVE_DIR"
+
+if ! $DRY_RUN; then
+    mkdir -p "$PREFIX" "$BASELINE_DIR" "$ARCHIVE_DIR"
+    for agent in "${AGENTS[@]}"; do
+        subdir="${CONFIG[${agent}.archive_subdir]:-$agent}"
+        mkdir -p "$ARCHIVE_DIR/$subdir"
+    done
+fi
+
+# ── Generate Units ─────────────────────────────────────────────────────
+
+generate_service() {
+    local name="$1"
+    local target="$2"  # agent name or "all"
+    local description="$3"
+
+    cat < "$SERVICE_FILE"
+    echo "$TIMER_CONTENT" > "$TIMER_FILE"
+    echo "  Wrote $SERVICE_FILE"
+    echo "  Wrote $TIMER_FILE"
+fi
+INSTALLED_TIMERS+=("${SERVICE_NAME}.timer")
+
+# Per-agent timers, staggered by 10 minutes
+stagger=0
+for agent in "${AGENTS[@]}"; do
+    SERVICE_NAME="memory-shepherd-${agent}"
+    SERVICE_FILE="$PREFIX/${SERVICE_NAME}.service"
+    TIMER_FILE="$PREFIX/${SERVICE_NAME}.timer"
+
+    # Stagger: offset each agent by 10 minutes within the 3-hour window
+    stagger_min=$((stagger * 10))
+    if [ "$stagger_min" -eq 0 ]; then
+        calendar="*-*-* 00/3:00:00"
+    else
+        calendar="*-*-* 00/3:${stagger_min}:00"
+    fi
+
+    echo ""
+    echo "--- ${SERVICE_NAME}.service ---"
+    SERVICE_CONTENT=$(generate_service "$SERVICE_NAME" "$agent" "Memory
Shepherd - reset $agent")
+    TIMER_CONTENT=$(generate_timer "$SERVICE_NAME" "Memory Shepherd - periodic reset for $agent" "$calendar" "2min")
+
+    if $DRY_RUN; then
+        echo "$SERVICE_CONTENT"
+        echo ""
+        echo "--- ${SERVICE_NAME}.timer ---"
+        echo "$TIMER_CONTENT"
+    else
+        echo "$SERVICE_CONTENT" > "$SERVICE_FILE"
+        echo "$TIMER_CONTENT" > "$TIMER_FILE"
+        echo "  Wrote $SERVICE_FILE"
+        echo "  Wrote $TIMER_FILE"
+    fi
+    INSTALLED_TIMERS+=("${SERVICE_NAME}.timer")
+    stagger=$((stagger + 1))
+done
+
+# ── Enable Timers ──────────────────────────────────────────────────────
+
+if $DRY_RUN; then
+    echo ""
+    echo "=== DRY RUN - no files written, no timers enabled ==="
+    echo "Would install: ${INSTALLED_TIMERS[*]}"
+else
+    echo ""
+    systemctl $SYSTEMCTL_FLAG daemon-reload
+
+    # Only enable the "all" timer by default; per-agent timers are available but not auto-enabled
+    systemctl $SYSTEMCTL_FLAG enable --now "memory-shepherd.timer"
+    echo "Enabled: memory-shepherd.timer (resets all agents every 3 hours)"
+    echo ""
+    echo "Per-agent timers installed but not enabled (use if you want individual schedules):"
+    for agent in "${AGENTS[@]}"; do
+        echo "  systemctl $SYSTEMCTL_FLAG enable --now memory-shepherd-${agent}.timer"
+    done
+fi
+
+# ── Summary ────────────────────────────────────────────────────────────
+
+echo ""
+echo "=== Summary ==="
+echo "Config: $CONF_FILE"
+echo "Agents: ${AGENTS[*]}"
+echo "Baselines: $BASELINE_DIR"
+echo "Archives: $ARCHIVE_DIR"
+echo "Timer units: $PREFIX/memory-shepherd*.{timer,service}"
+echo ""
+echo "Useful commands:"
+echo "  memory-shepherd.sh all                                    # Manual reset (all agents)"
+echo "  memory-shepherd.sh [agent-name]                           # Manual reset (single agent)"
+echo "  systemctl $SYSTEMCTL_FLAG list-timers | grep memory   # Check timer
status"
+echo "  journalctl $SYSTEMCTL_FLAG -u memory-shepherd             # View logs"
diff --git a/dream-server/memory-shepherd/memory-shepherd.conf b/dream-server/memory-shepherd/memory-shepherd.conf
new file mode 100644
index 000000000..9a74ae9d8
--- /dev/null
+++ b/dream-server/memory-shepherd/memory-shepherd.conf
@@ -0,0 +1,35 @@
+# memory-shepherd.conf - Dream Server Agent Memory Management
+#
+# Manages workspace files for OpenClaw agents by periodically
+# resetting them to operator-controlled baselines.
+# Files with a --- separator get scratch notes archived before reset.
+# Files without a separator get fully backed up before overwrite.
+#
+# Install: ./memory-shepherd/install.sh
+# Manual: ./memory-shepherd/memory-shepherd.sh all
+
+[general]
+baseline_dir=~/dream-server/memory-shepherd/baselines
+archive_dir=~/dream-server/data/memory-archives
+max_memory_size=16384
+archive_retention_days=30
+separator=---
+min_baseline_size=500
+
+# MEMORY.md - agent's long-term memory (has scratch section below ---)
+[dream-agent-memory]
+memory_file=~/dream-server/config/openclaw/workspace/MEMORY.md
+baseline=dream-agent-MEMORY.md
+archive_subdir=dream-agent/memory
+
+# AGENTS.md - workspace instructions (pure overwrite, no scratch)
+[dream-agent-agents]
+memory_file=~/dream-server/config/openclaw/workspace/AGENTS.md
+baseline=dream-agent-AGENTS.md
+archive_subdir=dream-agent/agents
+
+# TOOLS.md - service map (pure overwrite, no scratch)
+[dream-agent-tools]
+memory_file=~/dream-server/config/openclaw/workspace/TOOLS.md
+baseline=dream-agent-TOOLS.md
+archive_subdir=dream-agent/tools
diff --git a/dream-server/memory-shepherd/memory-shepherd.conf.example b/dream-server/memory-shepherd/memory-shepherd.conf.example
new file mode 100644
index 000000000..080813c18
--- /dev/null
+++ b/dream-server/memory-shepherd/memory-shepherd.conf.example
@@ -0,0 +1,56 @@
+# memory-shepherd.conf - Agent memory reset configuration
+#
+# INI-style config.
Each [section] defines an agent to manage.
+# The [general] section sets global defaults.
+#
+# Config file search order:
+# 1. $MEMORY_SHEPHERD_CONF environment variable
+# 2. ./memory-shepherd.conf (next to the script)
+# 3. /etc/memory-shepherd/memory-shepherd.conf
+
+[general]
+# Directory containing baseline MEMORY.md files
+baseline_dir=./baselines
+
+# Root directory for archived scratch notes
+archive_dir=./archives
+
+# Maximum memory file size in bytes before forcing reset
+max_memory_size=16384
+
+# Delete archived notes older than this many days
+archive_retention_days=30
+
+# The separator line between baseline and scratch notes
+separator=---
+
+# ─── Local Agent Example ───────────────────────────────────────────────
+# Each agent section defines one managed MEMORY.md file.
+#
+# Required keys:
+#   memory_file - Absolute path to the agent's MEMORY.md
+#   baseline - Filename (not path) of the baseline in baseline_dir
+#
+# Optional keys:
+#   archive_subdir - Subdirectory name under archive_dir (default: agent name)
+
+[my-agent]
+memory_file=/path/to/agent/.openclaw/workspace/MEMORY.md
+baseline=my-agent-MEMORY.md
+# archive_subdir=my-agent
+
+# ─── Remote Agent Example ──────────────────────────────────────────────
+# For agents running on another machine, add remote_host.
+# The script will SCP the memory file down, archive scratch notes locally,
+# then SCP the baseline back up.
+#
+# Additional keys for remote agents:
+#   remote_host - Hostname or IP of the remote machine
+#   remote_user - SSH user (default: current user)
+#   remote_memory - Absolute path to MEMORY.md on the remote machine
+
+# [remote-agent]
+# remote_host=192.168.1.100
+# remote_user=deploy
+# remote_memory=/home/deploy/agent/.openclaw/workspace/MEMORY.md
+# baseline=remote-agent-MEMORY.md
diff --git a/dream-server/memory-shepherd/memory-shepherd.sh b/dream-server/memory-shepherd/memory-shepherd.sh
new file mode 100644
index 000000000..4ef497cff
--- /dev/null
+++ b/dream-server/memory-shepherd/memory-shepherd.sh
@@ -0,0 +1,321 @@
+#!/bin/bash
+# memory-shepherd.sh - Periodic memory baseline reset for LLM agents
+# Usage: memory-shepherd.sh [agent-name|all]
+set -uo pipefail
+
+TIMESTAMP=$(date '+%Y-%m-%d_%H%M')
+LOCKFILE=/tmp/memory-shepherd.lock
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+# ── Logging ────────────────────────────────────────────────────────────
+
+log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] [memory-shepherd] $1"; }
+
+# ── Lock Management ────────────────────────────────────────────────────
+
+cleanup_lock() { rm -f "$LOCKFILE"; }
+
+if [ -f "$LOCKFILE" ]; then
+    lock_age=$(( $(date +%s) - $(stat -c %Y "$LOCKFILE") ))
+    if [ "$lock_age" -gt 120 ]; then
+        log "WARN: Stale lock (age: ${lock_age}s) - removing"
+        rm -f "$LOCKFILE"
+    else
+        log "Another reset running (lock age: ${lock_age}s) - exiting"
+        exit 0
+    fi
+fi
+echo $$ > "$LOCKFILE"
+trap cleanup_lock EXIT  # set only once we own the lock, so the early exit above cannot delete another run's lock
+
+# ── Config Parser ──────────────────────────────────────────────────────
+
+declare -A CONFIG
+AGENTS=()
+
+find_config() {
+    if [ -n "${MEMORY_SHEPHERD_CONF:-}" ] && [ -f "$MEMORY_SHEPHERD_CONF" ]; then
+        echo "$MEMORY_SHEPHERD_CONF"
+    elif [ -f "$SCRIPT_DIR/memory-shepherd.conf" ]; then
+        echo "$SCRIPT_DIR/memory-shepherd.conf"
+    elif [ -f "/etc/memory-shepherd/memory-shepherd.conf" ]; then
+        echo "/etc/memory-shepherd/memory-shepherd.conf"
+    else
+        return 1
+    fi
+}
+
+parse_config() {
+    local conf_file="$1"
+    local section=""
+    while IFS= read -r line; do
+        # Strip comments and whitespace
+        line="${line%%#*}"
+        line="${line#"${line%%[![:space:]]*}"}"
+        line="${line%"${line##*[![:space:]]}"}"
+        [[ -z "$line" ]] && continue
+
+        if [[ "$line" =~ ^\[([a-zA-Z0-9_-]+)\]$ ]]; then
+            section="${BASH_REMATCH[1]}"
+            if [[ "$section" != "general" ]]; then
+                AGENTS+=("$section")
+            fi
+            continue
+        fi
+
+        if [[ "$line" =~ ^([a-zA-Z_][a-zA-Z0-9_]*)=(.*)$ ]]; then
+            CONFIG["${section}.${BASH_REMATCH[1]}"]="${BASH_REMATCH[2]}"
+        fi
+    done < "$conf_file"
+}
+
+cfg() {
+    local key="${1}.${2}"
+    local value="${CONFIG[$key]:-${3:-}}"
+    # Expand a leading "~/" by hand: tilde expansion is not applied to values read from the config file
+    [[ "$value" == "~/"* ]] && value="$HOME/${value:2}"
+    echo "$value"
+}
+
+# ── Load Config ────────────────────────────────────────────────────────
+
+CONF_FILE=$(find_config) || {
+    echo "ERROR: No config file found." >&2
+    echo "Searched: \$MEMORY_SHEPHERD_CONF, ./memory-shepherd.conf, /etc/memory-shepherd/memory-shepherd.conf" >&2
+    exit 1
+}
+
+parse_config "$CONF_FILE"
+log "Loaded config from $CONF_FILE (${#AGENTS[@]} agents)"
+
+# ── Global Settings ────────────────────────────────────────────────────
+
+BASELINE_DIR=$(cfg general baseline_dir "$SCRIPT_DIR/baselines")
+ARCHIVE_DIR=$(cfg general archive_dir "$SCRIPT_DIR/archives")
+MAX_MEMORY_SIZE=$(cfg general max_memory_size 16384)
+ARCHIVE_RETENTION_DAYS=$(cfg general archive_retention_days 30)
+SEPARATOR=$(cfg general separator "---")
+MIN_BASELINE_SIZE=$(cfg general min_baseline_size 500)
+
+# Resolve relative paths against script directory
+[[ "$BASELINE_DIR" != /* ]] && BASELINE_DIR="$SCRIPT_DIR/$BASELINE_DIR"
+[[ "$ARCHIVE_DIR" != /* ]] && ARCHIVE_DIR="$SCRIPT_DIR/$ARCHIVE_DIR"
+
+# ── Reset Functions ────────────────────────────────────────────────────
+
+reset_agent() {
+    local agent="$1"
+    local memory_file="$2"
+    local baseline="$3"
+    local archive_dir="$4"
+
+    if [ ! -f "$baseline" ]; then
+        log "CRITICAL: Baseline missing for $agent at $baseline - aborting"
+        return 1
+    fi
+
+    local baseline_size
+    baseline_size=$(stat -c %s "$baseline")
+    if [ "$baseline_size" -lt "$MIN_BASELINE_SIZE" ]; then
+        log "CRITICAL: Baseline for $agent is suspiciously small (${baseline_size} bytes, min: ${MIN_BASELINE_SIZE}) - aborting"
+        return 1
+    fi
+
+    if [ !
-f "$memory_file" ]; then
+        log "WARN: No memory file for $agent - creating from baseline"
+        cp "$baseline" "$memory_file"
+        return 0
+    fi
+
+    local memory_size
+    memory_size=$(stat -c %s "$memory_file")
+    if [ "$memory_size" -gt "$MAX_MEMORY_SIZE" ]; then
+        log "WARN: Memory file for $agent is ${memory_size} bytes (over limit) - forcing reset"
+    fi
+
+    local separator_line
+    separator_line=$(grep -n "^${SEPARATOR}$" "$memory_file" | tail -1 | cut -d: -f1 || echo "")
+
+    if [ -n "$separator_line" ]; then
+        local total_lines
+        total_lines=$(wc -l < "$memory_file")
+        if [ "$separator_line" -lt "$total_lines" ]; then
+            local scratch
+            scratch=$(tail -n +"$(($separator_line + 1))" "$memory_file" | sed '/^## Scratch Notes/d' | sed '/^[[:space:]]*$/d')
+            if [ -n "$scratch" ]; then
+                mkdir -p "$archive_dir"
+                local archive_file="$archive_dir/${TIMESTAMP}.md"
+                printf "# %s scratch notes - archived %s\n\n%s\n" "$agent" "$TIMESTAMP" "$scratch" > "$archive_file"
+                log "Archived scratch notes for $agent ($(echo "$scratch" | wc -l) lines)"
+            else
+                log "No scratch notes for $agent"
+            fi
+        else
+            log "No scratch notes for $agent"
+        fi
+    else
+        mkdir -p "$archive_dir"
+        cp "$memory_file" "$archive_dir/${TIMESTAMP}-full-backup.md"
+        log "WARN: No separator in $agent memory - backed up entire file before reset"
+    fi
+
+    local tmpfile="${memory_file}.reset-tmp"
+    cp "$baseline" "$tmpfile"
+    mv -f "$tmpfile" "$memory_file"
+    log "Reset $agent MEMORY.md to baseline (${baseline_size} bytes)"
+}
+
+reset_remote_agent() {
+    local agent="$1"
+    local remote_host="$2"
+    local remote_user="$3"
+    local remote_memory="$4"
+    local baseline="$5"
+    local archive_dir="$6"
+
+    if [ !
-f "$baseline" ]; then
+        log "CRITICAL: Baseline missing for $agent at $baseline - aborting"
+        return 1
+    fi
+
+    local baseline_size
+    baseline_size=$(stat -c %s "$baseline")
+    if [ "$baseline_size" -lt "$MIN_BASELINE_SIZE" ]; then
+        log "CRITICAL: Baseline for $agent is suspiciously small (${baseline_size} bytes, min: ${MIN_BASELINE_SIZE}) - aborting"
+        return 1
+    fi
+
+    # Fetch current memory from remote
+    local tmpfile="/tmp/memory-shepherd-${agent}-current.md"
+    if ! scp -q "${remote_user}@${remote_host}:${remote_memory}" "$tmpfile" 2>/dev/null; then
+        log "WARN: No memory file for $agent on $remote_host - pushing baseline"
+        scp -q "$baseline" "${remote_user}@${remote_host}:${remote_memory}"
+        return 0
+    fi
+
+    local memory_size
+    memory_size=$(stat -c %s "$tmpfile")
+    if [ "$memory_size" -gt "$MAX_MEMORY_SIZE" ]; then
+        log "WARN: Memory file for $agent is ${memory_size} bytes (over limit) - forcing reset"
+    fi
+
+    # Extract and archive scratch notes locally
+    local separator_line
+    separator_line=$(grep -n "^${SEPARATOR}$" "$tmpfile" | tail -1 | cut -d: -f1 || echo "")
+
+    if [ -n "$separator_line" ]; then
+        local total_lines
+        total_lines=$(wc -l < "$tmpfile")
+        if [ "$separator_line" -lt "$total_lines" ]; then
+            local scratch
+            scratch=$(tail -n +"$(($separator_line + 1))" "$tmpfile" | sed '/^## Scratch Notes/d' | sed '/^[[:space:]]*$/d')
+            if [ -n "$scratch" ]; then
+                mkdir -p "$archive_dir"
+                local archive_file="$archive_dir/${TIMESTAMP}.md"
+                printf "# %s scratch notes - archived %s\n\n%s\n" "$agent" "$TIMESTAMP" "$scratch" > "$archive_file"
+                log "Archived scratch notes for $agent ($(echo "$scratch" | wc -l) lines)"
+            else
+                log "No scratch notes for $agent"
+            fi
+        else
+            log "No scratch notes for $agent"
+        fi
+    else
+        mkdir -p "$archive_dir"
+        cp "$tmpfile" "$archive_dir/${TIMESTAMP}-full-backup.md"
+        log "WARN: No separator in $agent memory - backed up entire file before reset"
+    fi
+
+    # Push baseline to remote
+    scp -q "$baseline" "${remote_user}@${remote_host}:${remote_memory}"
+    log "Reset $agent MEMORY.md on $remote_host to baseline (${baseline_size} bytes)"
+    rm -f "$tmpfile"
+}
+
+# ── Dispatch ───────────────────────────────────────────────────────────
+
+process_agent() {
+    local agent="$1"
+
+    local memory_file
+    memory_file=$(cfg "$agent" memory_file "")
+    local baseline_name
+    baseline_name=$(cfg "$agent" baseline "")
+    local archive_subdir
+    archive_subdir=$(cfg "$agent" archive_subdir "$agent")
+    local archive_path="$ARCHIVE_DIR/$archive_subdir"
+
+    if [ -z "$baseline_name" ]; then
+        log "ERROR: No baseline defined for agent '$agent' - skipping"
+        return 1
+    fi
+
+    local baseline_path="$BASELINE_DIR/$baseline_name"
+    local remote_host
+    remote_host=$(cfg "$agent" remote_host "")
+
+    if [ -n "$remote_host" ]; then
+        local remote_user
+        remote_user=$(cfg "$agent" remote_user "$(whoami)")
+        local remote_memory
+        remote_memory=$(cfg "$agent" remote_memory "")
+
+        if [ -z "$remote_memory" ]; then
+            log "ERROR: remote_host set for '$agent' but no remote_memory - skipping"
+            return 1
+        fi
+
+        reset_remote_agent "$agent" "$remote_host" "$remote_user" "$remote_memory" "$baseline_path" "$archive_path"
+    else
+        if [ -z "$memory_file" ]; then
+            log "ERROR: No memory_file defined for agent '$agent' - skipping"
+            return 1
+        fi
+
+        reset_agent "$agent" "$memory_file" "$baseline_path" "$archive_path"
+    fi
+}
+
+# ── Main ───────────────────────────────────────────────────────────────
+
+TARGET="${1:-all}"
+
+if [ "$TARGET" = "all" ]; then
+    if [ ${#AGENTS[@]} -eq 0 ]; then
+        log "No agents defined in config"
+        exit 0
+    fi
+    for agent in "${AGENTS[@]}"; do
+        process_agent "$agent"
+    done
+else
+    # Check if the agent exists in config
found=false + for agent in "${AGENTS[@]}"; do + if [ "$agent" = "$TARGET" ]; then + found=true + break + fi + done + + if [ "$found" = false ]; then + echo "ERROR: Unknown agent '$TARGET'" >&2 + echo "Available agents: ${AGENTS[*]}" >&2 + echo "Usage: memory-shepherd.sh [agent-name|all]" >&2 + exit 1 + fi + + process_agent "$TARGET" +fi + +# ── Cleanup ──────────────────────────────────────────────────────────── + +# Purge old archives +find "$ARCHIVE_DIR" -name "*.md" -mtime +"$ARCHIVE_RETENTION_DAYS" -delete 2>/dev/null || true + +# Rotate log if over 1MB +local_log="$ARCHIVE_DIR/reset.log" +if [ -f "$local_log" ] && [ "$(stat -c %s "$local_log" 2>/dev/null || echo 0)" -gt 1048576 ]; then + mv "$local_log" "$local_log.old" + log "Rotated log file" +fi + +log "Done" diff --git a/dream-server/memory-shepherd/uninstall.sh b/dream-server/memory-shepherd/uninstall.sh new file mode 100644 index 000000000..b5c89d07d --- /dev/null +++ b/dream-server/memory-shepherd/uninstall.sh @@ -0,0 +1,100 @@ +#!/bin/bash +# uninstall.sh — Remove memory-shepherd systemd timers +set -euo pipefail + +PREFIX="" +USER_MODE=false + +# ── Usage ────────────────────────────────────────────────────────────── + +usage() { + cat <&2; exit 1 ;; + esac +done + +# ── Detect Mode ──────────────────────────────────────────────────────── + +if [ -z "$PREFIX" ]; then + if [ "$(id -u)" -eq 0 ]; then + PREFIX="/etc/systemd/system" + else + PREFIX="$HOME/.config/systemd/user" + USER_MODE=true + fi +else + [[ "$PREFIX" == *".config/systemd/user"* ]] && USER_MODE=true +fi + +SYSTEMCTL_FLAG="" +$USER_MODE && SYSTEMCTL_FLAG="--user" + +# ── 
Find Units ───────────────────────────────────────────────────────── + +TIMERS=() +SERVICES=() + +for f in "$PREFIX"/memory-shepherd*.timer; do + [ -f "$f" ] && TIMERS+=("$(basename "$f")") +done + +for f in "$PREFIX"/memory-shepherd*.service; do + [ -f "$f" ] && SERVICES+=("$(basename "$f")") +done + +if [ ${#TIMERS[@]} -eq 0 ] && [ ${#SERVICES[@]} -eq 0 ]; then + echo "No memory-shepherd units found in $PREFIX" + exit 0 +fi + +echo "Found in $PREFIX:" +for t in "${TIMERS[@]}"; do echo " timer: $t"; done +for s in "${SERVICES[@]}"; do echo " service: $s"; done +echo "" + +# ── Stop and Disable ─────────────────────────────────────────────────── + +for timer in "${TIMERS[@]}"; do + echo "Stopping and disabling $timer..." + systemctl $SYSTEMCTL_FLAG stop "$timer" 2>/dev/null || true + systemctl $SYSTEMCTL_FLAG disable "$timer" 2>/dev/null || true +done + +for service in "${SERVICES[@]}"; do + systemctl $SYSTEMCTL_FLAG stop "$service" 2>/dev/null || true +done + +# ── Remove Files ─────────────────────────────────────────────────────── + +for timer in "${TIMERS[@]}"; do + rm -f "$PREFIX/$timer" + echo "Removed $PREFIX/$timer" +done + +for service in "${SERVICES[@]}"; do + rm -f "$PREFIX/$service" + echo "Removed $PREFIX/$service" +done + +systemctl $SYSTEMCTL_FLAG daemon-reload +echo "" +echo "Done. Removed ${#TIMERS[@]} timer(s) and ${#SERVICES[@]} service(s)." +echo "Config, baselines, and archives were NOT removed." 
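Reviewer note: the scratch-note extraction used by `reset_agent` and `reset_remote_agent` in memory-shepherd.sh can be exercised on its own. The following is a minimal sketch, not the script's actual code: the helper name `extract_scratch` is hypothetical, and the `SEPARATOR` value of `---` is an assumption, since the real value is defined outside the portion of the diff shown here. It mirrors the same `grep -n | tail -1 | cut` pipeline to find the last separator line, then prints the non-blank lines below it.

```bash
#!/usr/bin/env bash
# Sketch only: isolates the "text below the last separator" step.
set -euo pipefail

# Assumption: the real SEPARATOR is set elsewhere in memory-shepherd.sh.
SEPARATOR="---"

# extract_scratch FILE (hypothetical helper name)
# Prints the non-blank lines after the last separator line, minus the
# "## Scratch Notes" heading. Returns 1 if the file has no separator.
extract_scratch() {
    local file="$1"
    local sep_line

    # grep exits non-zero when nothing matches; "|| true" keeps
    # "set -e" and "pipefail" from aborting in that case.
    sep_line=$(grep -n "^${SEPARATOR}$" "$file" | tail -1 | cut -d: -f1 || true)
    [ -n "$sep_line" ] || return 1

    # Print everything after the separator, then drop the heading
    # and any whitespace-only lines. Output may be empty.
    tail -n +"$((sep_line + 1))" "$file" \
        | sed -e '/^## Scratch Notes/d' -e '/^[[:space:]]*$/d'
}
```

A non-zero return corresponds to the "No separator" branch, where the script backs up the whole memory file instead of archiving scratch notes.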
diff --git a/dream-server/migrations/migrate-v0.2.0.sh b/dream-server/migrations/migrate-v0.2.0.sh old mode 100755 new mode 100644 diff --git a/dream-server/opencode/opencode-web.service b/dream-server/opencode/opencode-web.service new file mode 100644 index 000000000..869fc8980 --- /dev/null +++ b/dream-server/opencode/opencode-web.service @@ -0,0 +1,24 @@ +[Unit] +Description=OpenCode Web UI +Documentation=https://opencode.ai/docs +After=network.target + +[Service] +Type=simple +WorkingDirectory=__HOME__ +ExecStart=__HOME__/.opencode/bin/opencode web --port 3003 --hostname 0.0.0.0 +Restart=on-failure +RestartSec=5 + +# Environment +Environment=HOME=__HOME__ +Environment=PATH=__HOME__/.opencode/bin:/usr/local/bin:/usr/bin:/bin +Environment=OPENCODE_SERVER_PASSWORD=__OPENCODE_SERVER_PASSWORD__ + +# Hardening +NoNewPrivileges=true +ReadWritePaths=__HOME__ +PrivateTmp=true + +[Install] +WantedBy=default.target diff --git a/dream-server/privacy-shield-offline/Dockerfile b/dream-server/privacy-shield-offline/Dockerfile deleted file mode 100644 index 8be6321f6..000000000 --- a/dream-server/privacy-shield-offline/Dockerfile +++ /dev/null @@ -1,30 +0,0 @@ -# API Privacy Shield - OFFLINE MODE -# Zero-cloud PII proxy with local-only API routing -# M1 Phase 2 - M3 Security Component -# -# Build: docker build -t privacy-shield-offline . -# Run: docker run -p 8085:8085 --network dream-network-offline privacy-shield-offline - -FROM python:3.11-slim - -WORKDIR /app - -# Install dependencies -COPY requirements.txt . -RUN pip install --no-cache-dir -r requirements.txt - -# Copy application code -COPY proxy.py . -COPY pii_scrubber.py . 
- -# Create non-root user -RUN useradd -m -u 1000 shield && chown -R shield:shield /app -USER shield - -# Health check -HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \ - CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8085/health')" || exit 1 - -EXPOSE 8085 - -CMD ["python", "proxy.py"] diff --git a/dream-server/privacy-shield-offline/pii_scrubber.py b/dream-server/privacy-shield-offline/pii_scrubber.py deleted file mode 100644 index 77e34ca38..000000000 --- a/dream-server/privacy-shield-offline/pii_scrubber.py +++ /dev/null @@ -1,166 +0,0 @@ -#!/usr/bin/env python3 -""" -M3: API Privacy Shield - Core PII Scrubber -Detects and replaces PII with tokens, restores on reverse. -""" - -import re -import hashlib -import secrets -from typing import Dict, List, Tuple, Optional -from dataclasses import dataclass, field - - -@dataclass -class PIIDetector: - """Detects and manages PII in text.""" - - # Token prefix for PII placeholders - token_prefix: str = " str: - """Generate a unique token for PII.""" - # Create deterministic hash for same PII = same token within session - hash_input = f"{pii_type}:{original}:{self.session_token}" - short_hash = hashlib.sha256(hash_input.encode()).hexdigest()[:12] - return f"{self.token_prefix}{pii_type}_{short_hash}{self.token_suffix}" - - def scrub(self, text: str) -> str: - """ - Scrub PII from text, replace with tokens. - Returns scrubbed text. 
- """ - scrubbed = text - - for pii_type, pattern in self.PATTERNS.items(): - matches = pattern.findall(scrubbed) - for match in matches: - if isinstance(match, tuple): - match = match[0] # Handle groups - - # Check if we've seen this PII before - existing_token = None - for token, original in self.pii_map.items(): - if original == match: - existing_token = token - break - - if existing_token: - scrubbed = scrubbed.replace(match, existing_token, 1) - else: - # New PII - create token - token = self._generate_token(pii_type, match) - self.pii_map[token] = match - scrubbed = scrubbed.replace(match, token, 1) - - return scrubbed - - def restore(self, text: str) -> str: - """ - Restore PII from tokens in text. - Returns restored text. - """ - restored = text - for token, original in self.pii_map.items(): - restored = restored.replace(token, original) - return restored - - def get_stats(self) -> Dict: - """Return statistics about detected PII.""" - return { - 'unique_pii_count': len(self.pii_map), - 'pii_types': list(set( - token.split('_')[1] for token in self.pii_map.keys() - )) - } - - -class PrivacyShield: - """ - Main API Privacy Shield wrapper. - Wraps API calls to scrub/restore PII transparently. - """ - - def __init__(self, backend_client=None): - self.detector = PIIDetector() - self.backend = backend_client # e.g., OpenAI client - - def process_request(self, prompt: str) -> Tuple[str, Dict]: - """ - Process outgoing request - scrub PII. - Returns (scrubbed_prompt, metadata for restore). - """ - scrubbed = self.detector.scrub(prompt) - stats = self.detector.get_stats() - - metadata = { - 'scrubbed': scrubbed != prompt, - 'pii_count': stats['unique_pii_count'], - 'pii_types': stats['pii_types'] - } - - return scrubbed, metadata - - def process_response(self, response_text: str) -> str: - """ - Process incoming response - restore PII. 
- """ - return self.detector.restore(response_text) - - -# Simple CLI for testing -if __name__ == "__main__": - import sys - - shield = PrivacyShield() - - # Test input - test_text = """ - Contact John Doe at john.doe@example.com or call 555-123-4567. - API Key: sk-abc123xyz789abcdef - Server IP: 192.168.1.100 - SSN: 123-45-6789 - """ - - print("=== PII Scrubber Test ===") - print(f"\nOriginal:\n{test_text}") - - scrubbed, meta = shield.process_request(test_text) - print(f"\nScrubbed:\n{scrubbed}") - print(f"\nMetadata: {meta}") - - restored = shield.process_response(scrubbed) - print(f"\nRestored:\n{restored}") - - # Verify round-trip - if restored.strip() == test_text.strip(): - print("\n✅ Round-trip successful!") - else: - print("\n❌ Round-trip failed!") - print(f"Diff: {set(restored.split()) ^ set(test_text.split())}") diff --git a/dream-server/privacy-shield-offline/proxy.py b/dream-server/privacy-shield-offline/proxy.py deleted file mode 100644 index ebb35e476..000000000 --- a/dream-server/privacy-shield-offline/proxy.py +++ /dev/null @@ -1,296 +0,0 @@ -#!/usr/bin/env python3 -""" -M3: API Privacy Shield - OFFLINE MODE -Zero-cloud PII proxy - only routes to local APIs -M1 Phase 2 - Blocks all external endpoints -""" - -import os -import time -import httpx -import re -import hashlib -from fastapi import FastAPI, Request, Response, HTTPException, Depends, Security -from fastapi.responses import JSONResponse -from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials -from functools import lru_cache -import uvicorn -import json -from cachetools import TTLCache - -from pii_scrubber import PrivacyShield - - -app = FastAPI(title="API Privacy Shield - OFFLINE MODE", version="0.3.0-offline") - -# Security: API Key Authentication -SHIELD_API_KEY = os.environ.get("SHIELD_API_KEY") -if not SHIELD_API_KEY: - SHIELD_API_KEY = "not-needed" # Default for offline/local-only mode - -security_scheme = HTTPBearer(auto_error=False) - -async def 
verify_api_key(credentials: HTTPAuthorizationCredentials = Security(security_scheme)): - """Verify API key for protected endpoints.""" - if not credentials: - raise HTTPException( - status_code=401, - detail="Authentication required. Provide Bearer token in Authorization header.", - headers={"WWW-Authenticate": "Bearer"} - ) - if credentials.credentials != SHIELD_API_KEY: - raise HTTPException(status_code=403, detail="Invalid API key.") - return credentials.credentials - -# OFFLINE MODE: Only allow local endpoints -ALLOWED_TARGETS = [ - "http://vllm:8000", - "http://vllm:8000/v1", - "http://ollama:11434", - "http://ollama:11434/v1", - "http://localhost:8000", - "http://localhost:11434", - "http://127.0.0.1:8000", - "http://127.0.0.1:11434", -] - -# Configuration from environment -DEFAULT_TARGET = os.getenv("TARGET_API_URL", "http://vllm:8000/v1") -TARGET_API_KEY = os.getenv("TARGET_API_KEY", "not-needed") -PORT = int(os.getenv("SHIELD_PORT", "8085")) -CACHE_ENABLED = os.getenv("PII_CACHE_ENABLED", "true").lower() == "true" -CACHE_SIZE = int(os.getenv("PII_CACHE_SIZE", "1000")) -BLOCK_EXTERNAL = os.getenv("BLOCK_EXTERNAL", "true").lower() == "true" - -# OFFLINE MODE: Validate target is local-only -if BLOCK_EXTERNAL and DEFAULT_TARGET not in ALLOWED_TARGETS: - # Check if it's at least a local-looking URL - if not re.match(r'^https?://(localhost|127\.0\.0\.1|vllm|ollama|\[::1\]):?\d*', DEFAULT_TARGET): - raise ValueError(f"OFFLINE MODE: Target API must be local. 
Got: {DEFAULT_TARGET}") - -# Connection pool for better performance -http_client = httpx.AsyncClient( - limits=httpx.Limits(max_keepalive_connections=100, max_connections=200), - timeout=httpx.Timeout(60.0, connect=5.0) -) - -# Session store (TTLCache for bounded memory, auto-eviction of stale sessions) -sessions = TTLCache(maxsize=10000, ttl=3600) - - -class CachedPrivacyShield(PrivacyShield): - """PrivacyShield with LRU cache for PII patterns.""" - - def __init__(self, backend_client=None): - super().__init__(backend_client) - if CACHE_ENABLED: - self._scrub_cached = lru_cache(maxsize=CACHE_SIZE)(self._scrub_impl) - - def _scrub_impl(self, text: str) -> str: - """Internal scrub implementation.""" - return self.detector.scrub(text) - - def scrub(self, text: str) -> str: - """Scrub with optional caching.""" - if CACHE_ENABLED and len(text) < 1000: # Only cache small texts - return self._scrub_cached(text) - return self._scrub_impl(text) - - -def get_session(request: Request) -> CachedPrivacyShield: - """Get or create session-specific PrivacyShield.""" - auth = request.headers.get("Authorization", "") - # Use SHA256 for deterministic, stable session keying (hash() is not deterministic across restarts) - if auth: - session_key = hashlib.sha256(auth.encode()).hexdigest() - else: - client_info = str(request.client.host if request.client else "default") - session_key = hashlib.sha256(client_info.encode()).hexdigest() - - if session_key not in sessions: - sessions[session_key] = CachedPrivacyShield() - - return sessions[session_key] - - -def is_local_endpoint(url: str) -> bool: - """OFFLINE MODE: Check if URL is a local-only endpoint.""" - if not BLOCK_EXTERNAL: - return True - - # Check against allowed list - if any(url.startswith(allowed) for allowed in ALLOWED_TARGETS): - return True - - # Check for local patterns - local_patterns = [ - r'^https?://localhost[:/]', - r'^https?://127\.0\.0\.1[:/]', - r'^https?://\[::1\][:/)]', - r'^https?://vllm[:/]', - 
r'^https?://ollama[:/]', - r'^https?://whisper[:/]', - r'^https?://kokoro[:/]', - r'^https?://embeddings[:/]', - r'^https?://192\.168\.', # Local network (192.168.0.0/16) - r'^https?://10\.\d+\.\d+\.\d+', # Private subnet (10.0.0.0/8) - r'^https?://172\.(1[6-9]|2[0-9]|3[01])\.', # Private subnet (172.16.0.0/12) - ] - - return any(re.match(pattern, url) for pattern in local_patterns) - - -@app.get("/health") -async def health(): - """Health check endpoint.""" - return { - "status": "ok", - "service": "api-privacy-shield-offline", - "version": "0.3.0-offline", - "target_api": DEFAULT_TARGET, - "cache_enabled": CACHE_ENABLED, - "block_external": BLOCK_EXTERNAL, - "active_sessions": len(sessions), - "mode": "offline" - } - - -@app.get("/stats") -async def stats(): - """Session statistics.""" - total_pii = sum( - s.detector.get_stats()['unique_pii_count'] - for s in sessions.values() - ) - return { - "active_sessions": len(sessions), - "total_pii_scrubbed": total_pii, - "cache_enabled": CACHE_ENABLED, - "cache_size": CACHE_SIZE, - "block_external": BLOCK_EXTERNAL, - "mode": "offline" - } - - -@app.get("/config") -async def config(): - """OFFLINE MODE: Show allowed endpoints.""" - return { - "mode": "offline", - "target_api": DEFAULT_TARGET, - "allowed_targets": ALLOWED_TARGETS if BLOCK_EXTERNAL else ["all (external allowed)"], - "block_external": BLOCK_EXTERNAL, - "cache_enabled": CACHE_ENABLED, - "cache_size": CACHE_SIZE - } - - -@app.post("/{path:path}", dependencies=[Depends(verify_api_key)]) -@app.get("/{path:path}", dependencies=[Depends(verify_api_key)]) -async def proxy(request: Request, path: str): - """ - Proxy endpoint that scrubs PII from requests and restores in responses. - OFFLINE MODE: Only allows local API endpoints. 
- """ - start_time = time.time() - shield = get_session(request) - - # Read and process request body - body = await request.body() - body_str = body.decode('utf-8') if body else "" - - # Scrub PII from request - scrubbed_body, metadata = shield.process_request(body_str) - - # Determine target URL - target_url = f"{DEFAULT_TARGET}/{path}" - - # OFFLINE MODE: Block external URLs - if not is_local_endpoint(target_url): - return JSONResponse( - status_code=403, - content={ - "error": "OFFLINE MODE: External API calls blocked", - "shield": "active", - "blocked_url": target_url, - "allowed": "local endpoints only (vllm, ollama, localhost)" - } - ) - - # Prepare headers - headers = {k: v for k, v in request.headers.items() if k.lower() not in ('host', 'content-length')} - - # Set host header for target - host = DEFAULT_TARGET.split("//")[-1].split("/")[0] - headers["host"] = host - - # Use target API key if configured - if TARGET_API_KEY and TARGET_API_KEY != "not-needed": - headers["Authorization"] = f"Bearer {TARGET_API_KEY}" - - try: - if request.method == "POST": - resp = await http_client.post( - target_url, - headers=headers, - content=scrubbed_body.encode('utf-8') - ) - else: - resp = await http_client.get( - target_url, - headers=headers - ) - - # Read response - response_body = resp.content.decode('utf-8') - - # Restore PII in response - restored_body = shield.process_response(response_body) - - # Calculate overhead - overhead_ms = (time.time() - start_time) * 1000 - - # Add privacy headers - response_headers = { - "X-Privacy-Shield": "active-offline", - "X-PII-Scrubbed": str(metadata.get('pii_count', 0)), - "X-Processing-Time-Ms": f"{overhead_ms:.2f}", - "Content-Type": resp.headers.get("Content-Type", "application/json") - } - - return Response( - content=restored_body, - status_code=resp.status_code, - headers=response_headers - ) - - except httpx.TimeoutException: - return JSONResponse( - status_code=504, - content={"error": "Gateway timeout", "shield": 
"active-offline"} - ) - except Exception as e: - import re - # Sanitize error message to prevent PII leakage in response - error_str = str(e) - error_str = re.sub(r'', '[REDACTED]', error_str) - error_str = re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b', '[EMAIL]', error_str) - return JSONResponse( - status_code=500, - content={"error": "Request processing failed", "shield": "active-offline"} - ) - - -@app.on_event("shutdown") -async def shutdown(): - """Cleanup on shutdown.""" - await http_client.aclose() - - -if __name__ == "__main__": - print(f"๐Ÿ”’ API Privacy Shield (OFFLINE MODE) starting on port {PORT}") - print(f"๐Ÿ“ก Proxying to: {DEFAULT_TARGET}") - print(f"๐Ÿšซ External APIs: {'BLOCKED' if BLOCK_EXTERNAL else 'ALLOWED'}") - print(f"๐Ÿ’พ Cache: {'enabled' if CACHE_ENABLED else 'disabled'} (size={CACHE_SIZE})") - print(f"๐Ÿงช Test with: curl http://localhost:{PORT}/health") - uvicorn.run(app, host="0.0.0.0", port=PORT) diff --git a/dream-server/privacy-shield-offline/requirements.txt b/dream-server/privacy-shield-offline/requirements.txt deleted file mode 100644 index de2e13744..000000000 --- a/dream-server/privacy-shield-offline/requirements.txt +++ /dev/null @@ -1,6 +0,0 @@ -fastapi>=0.100.0 -httpx>=0.24.0 -uvicorn>=0.23.0 -cachetools>=5.0.0 -# OFFLINE MODE: No external dependencies -# Using local regex-based PII detection instead of Presidio cloud models diff --git a/dream-server/scripts/README.md b/dream-server/scripts/README.md new file mode 100644 index 000000000..078780dd4 --- /dev/null +++ b/dream-server/scripts/README.md @@ -0,0 +1,71 @@ +# Dream Server Scripts + +Utility scripts for diagnostics, testing, validation, and operations. + +## Diagnostics + +| Script | Description | Requires Stack? 
| +|--------|-------------|-----------------| +| `dream-doctor.sh` | JSON diagnostic report with autofix hints | No | +| `dream-preflight.sh` | Pre-install hardware/software checks | No | +| `detect-hardware.sh` | Hardware detection (`--json` for machine output) | No | +| `classify-hardware.sh` | GPU-to-tier classification | No | +| `build-capability-profile.sh` | Machine capability JSON profile | No | +| `health-check.sh` | Service health checks | Yes | + +## Testing + +| Script | Description | Requires Stack? | +|--------|-------------|-----------------| +| `dream-test.sh` | Full validation (`--quick`, `--json`, `--service`) | Yes | +| `dream-test-functional.sh` | Functional tests (inference, TTS, STT) | Yes | +| `validate.sh` | Post-install validation | Yes | +| `validate-env.sh` | Validate .env against schema | No | +| `simulate-installers.sh` | Cross-platform installer simulation | No | +| `release-gate.sh` | Full pre-release checklist | No | +| `check-compatibility.sh` | Manifest compatibility checks | No | +| `check-release-claims.sh` | Verify release claim accuracy | No | + +## Operations + +| Script | Description | Requires Stack? 
| +|--------|-------------|-----------------| +| `mode-switch.sh` | Switch deployment modes | Yes | +| `upgrade-model.sh` | Upgrade to a different model | Yes | +| `migrate-config.sh` | Migrate config between versions | No | +| `session-cleanup.sh` | OpenClaw session lifecycle | Yes | +| `pre-download.sh` | Pre-download models for offline use | No | +| `llm-cold-storage.sh` | Archive/restore models | No | + +## Installer Support + +| Script | Description | +|--------|-------------| +| `load-backend-contract.sh` | Load backend contract JSON as env vars | +| `resolve-compose-stack.sh` | Resolve compose overlay stack | +| `preflight-engine.sh` | Preflight validation engine | +| `check-offline-models.sh` | Verify offline model availability | + +## Python Utilities + +| Script | Description | +|--------|-------------| +| `healthcheck.py` | Container health check helper | +| `validate-models.py` | Validate model file integrity | +| `validate-sim-summary.py` | Validate simulation summary output | + +## Systemd Units (`systemd/`) + +| Unit | Description | +|------|-------------| +| `openclaw-session-cleanup.service/.timer` | Periodic OpenClaw session cleanup | +| `memory-shepherd-memory.service/.timer` | Agent memory lifecycle management | +| `memory-shepherd-workspace.service/.timer` | Agent workspace maintenance | + +## Other + +| Script | Description | +|--------|-------------| +| `showcase.sh` | Demo/showcase runner | +| `first-boot-demo.sh` | First-boot guided tour | +| `demo-offline.sh` | Offline mode demo | diff --git a/dream-server/scripts/build-capability-profile.sh b/dream-server/scripts/build-capability-profile.sh new file mode 100644 index 000000000..f65f94105 --- /dev/null +++ b/dream-server/scripts/build-capability-profile.sh @@ -0,0 +1,176 @@ +#!/usr/bin/env bash +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +ROOT_DIR="$(cd "${SCRIPT_DIR}/.." 
&& pwd)" +OUTPUT_FILE="" +ENV_MODE="false" + +while [[ $# -gt 0 ]]; do + case "$1" in + --output) + OUTPUT_FILE="${2:-}" + shift 2 + ;; + --env) + ENV_MODE="true" + shift + ;; + *) + echo "Unknown argument: $1" >&2 + exit 1 + ;; + esac +done + +if [[ -z "$OUTPUT_FILE" ]]; then + OUTPUT_FILE="${ROOT_DIR}/.capabilities.json" +fi + +if [[ ! -x "${SCRIPT_DIR}/detect-hardware.sh" ]]; then + echo "detect-hardware.sh not found or not executable" >&2 + exit 1 +fi + +HARDWARE_JSON="$("${SCRIPT_DIR}/detect-hardware.sh" --json)" +CLASS_ENV="$("${SCRIPT_DIR}/classify-hardware.sh" \ + --platform-id "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('os','unknown'))" "$HARDWARE_JSON")" \ + --gpu-vendor "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('gpu',{}).get('type','unknown'))" "$HARDWARE_JSON")" \ + --memory-type "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('gpu',{}).get('memory_type','unknown'))" "$HARDWARE_JSON")" \ + --vram-mb "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('gpu',{}).get('vram_mb',0))" "$HARDWARE_JSON")" \ + --device-id "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('gpu',{}).get('device_id',''))" "$HARDWARE_JSON")" \ + --gpu-name "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('gpu',{}).get('name',''))" "$HARDWARE_JSON")" \ + --cpu-name "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('cpu',''))" "$HARDWARE_JSON")" \ + --ram-mb "$(python3 -c "import json,sys; print(json.loads(sys.argv[1]).get('ram_gb',0) * 1024)" "$HARDWARE_JSON")" \ + --env)" +eval "$CLASS_ENV" + +# Source service registry for LLM port +if [[ -f "$ROOT_DIR/lib/service-registry.sh" ]]; then + export SCRIPT_DIR="$ROOT_DIR" + . 
"$ROOT_DIR/lib/service-registry.sh" + sr_load +fi +_LLM_PORT="${SERVICE_PORTS[llama-server]:-8080}" +_LLM_HEALTH="${SERVICE_HEALTH[llama-server]:-/health}" + +python3 - "$HARDWARE_JSON" "$OUTPUT_FILE" "$ENV_MODE" "${HW_CLASS_ID:-unknown}" "${HW_CLASS_LABEL:-Unknown}" "${HW_REC_BACKEND:-cpu}" "${HW_REC_TIER:-T1}" "${HW_REC_COMPOSE_OVERLAYS:-}" "$_LLM_PORT" "$_LLM_HEALTH" <<'PY' +import json +import pathlib +import sys + +hardware = json.loads(sys.argv[1]) +output_path = pathlib.Path(sys.argv[2]) +env_mode = sys.argv[3] == "true" +hw_class_id = sys.argv[4] +hw_class_label = sys.argv[5] +hw_rec_backend = sys.argv[6] +hw_rec_tier = sys.argv[7] +hw_rec_overlays = [x for x in sys.argv[8].split(",") if x] +llm_port = int(sys.argv[9]) if len(sys.argv) > 9 else 8080 +llm_health = sys.argv[10] if len(sys.argv) > 10 else "/health" + +os_name = (hardware.get("os") or "unknown").lower() +if os_name in {"linux", "wsl"}: + family = "linux" +elif os_name == "macos": + family = "darwin" +elif os_name == "windows": + family = "windows" +else: + family = "unknown" + +gpu = hardware.get("gpu", {}) +gpu_type = (gpu.get("type") or "none").lower() +gpu_name = gpu.get("name") or "None" +memory_type = (gpu.get("memory_type") or "none").lower() +vram_mb = int(gpu.get("vram_mb") or 0) +gpu_count = 1 if gpu_type not in {"none", ""} else 0 + +llm_health_url = f"http://localhost:{llm_port}{llm_health}" +llm_api_port = llm_port + +if gpu_type == "amd" and memory_type == "unified": + llm_backend = "amd" + overlays = ["docker-compose.base.yml", "docker-compose.amd.yml"] +elif gpu_type == "nvidia": + llm_backend = "nvidia" + overlays = ["docker-compose.base.yml", "docker-compose.nvidia.yml"] +elif gpu_type == "apple": + llm_backend = "apple" + overlays = ["docker-compose.base.yml", "docker-compose.amd.yml"] +else: + llm_backend = "cpu" + overlays = ["docker-compose.base.yml", "docker-compose.nvidia.yml"] + +tier = (hardware.get("tier") or "T1").upper() +if tier in {"T1", "T2", "T3", "T4"}: + 
recommended = tier +elif tier in {"SH_COMPACT", "SH_LARGE"}: + recommended = tier +else: + recommended = "T1" + +if hw_rec_tier: + recommended = hw_rec_tier +if hw_rec_backend: + llm_backend = hw_rec_backend +if hw_rec_overlays: + overlays = hw_rec_overlays + +profile = { + "version": "1", + "platform": { + "id": os_name, + "family": family, + }, + "gpu": { + "vendor": gpu_type if gpu_type in {"nvidia", "amd", "apple", "none"} else "unknown", + "name": gpu_name, + "memory_type": memory_type if memory_type in {"discrete", "unified", "none"} else "unknown", + "count": gpu_count, + "vram_mb": vram_mb, + }, + "runtime": { + "llm_backend": llm_backend, + "llm_health_url": llm_health_url, + "llm_api_port": llm_api_port, + }, + "compose": { + "overlays": overlays, + }, + "tier": { + "recommended": recommended, + }, + "hardware_class": { + "id": hw_class_id, + "label": hw_class_label, + } +} + +output_path.parent.mkdir(parents=True, exist_ok=True) +output_path.write_text(json.dumps(profile, indent=2) + "\n", encoding="utf-8") + +if env_mode: + env = { + "CAP_PROFILE_VERSION": profile["version"], + "CAP_PLATFORM_ID": profile["platform"]["id"], + "CAP_PLATFORM_FAMILY": profile["platform"]["family"], + "CAP_GPU_VENDOR": profile["gpu"]["vendor"], + "CAP_GPU_NAME": profile["gpu"]["name"], + "CAP_GPU_MEMORY_TYPE": profile["gpu"]["memory_type"], + "CAP_GPU_COUNT": str(profile["gpu"]["count"]), + "CAP_GPU_VRAM_MB": str(profile["gpu"]["vram_mb"]), + "CAP_LLM_BACKEND": profile["runtime"]["llm_backend"], + "CAP_LLM_HEALTH_URL": profile["runtime"]["llm_health_url"], + "CAP_LLM_API_PORT": str(profile["runtime"]["llm_api_port"]), + "CAP_RECOMMENDED_TIER": profile["tier"]["recommended"], + "CAP_COMPOSE_OVERLAYS": ",".join(profile["compose"]["overlays"]), + "CAP_HARDWARE_CLASS_ID": profile["hardware_class"]["id"], + "CAP_HARDWARE_CLASS_LABEL": profile["hardware_class"]["label"], + "CAP_PROFILE_FILE": str(output_path), + } + for key, value in env.items(): + safe = str(value).replace("\\", 
"\\\\").replace('"', '\\"') + print(f'{key}="{safe}"') +PY diff --git a/dream-server/scripts/check-compatibility.sh b/dream-server/scripts/check-compatibility.sh new file mode 100644 index 000000000..cca5ab8e6 --- /dev/null +++ b/dream-server/scripts/check-compatibility.sh @@ -0,0 +1,46 @@ +#!/bin/bash +# Validate core compatibility contracts from manifest.json. + +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +MANIFEST_FILE="${ROOT_DIR}/manifest.json" + +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +NC='\033[0m' + +fail() { echo -e "${RED}[FAIL]${NC} $1"; exit 1; } +pass() { echo -e "${GREEN}[PASS]${NC} $1"; } +warn() { echo -e "${YELLOW}[WARN]${NC} $1"; } + +command -v jq >/dev/null 2>&1 || fail "jq is required" +test -f "$MANIFEST_FILE" || fail "manifest.json not found" + +jq -e '.manifestVersion and .release.version and .compatibility and .contracts' "$MANIFEST_FILE" >/dev/null \ + || fail "manifest.json missing required top-level fields" +pass "manifest structure" + +# Compose contract files +while IFS= read -r file; do + test -f "${ROOT_DIR}/${file}" || fail "missing compose contract file: ${file}" +done < <(jq -r '.contracts.compose.canonical[]' "$MANIFEST_FILE") +pass "compose canonical files" + +# Workflow catalog canonical path +workflow_path="$(jq -r '.contracts.workflowCatalog.canonicalPath' "$MANIFEST_FILE")" +test -f "${ROOT_DIR}/${workflow_path}" || fail "missing canonical workflow catalog: ${workflow_path}" +pass "workflow catalog canonical path" + +# Extension schema contract +schema_path="$(jq -r '.contracts.extensions.serviceManifestSchema' "$MANIFEST_FILE")" +test -f "${ROOT_DIR}/${schema_path}" || fail "missing extension schema: ${schema_path}" +pass "extension schema contract" + +# Support matrix consistency checks +if jq -e '.compatibility.os.macos.supported == false' "$MANIFEST_FILE" >/dev/null; then + grep -q "macOS.*Tier C" "${ROOT_DIR}/docs/SUPPORT-MATRIX.md" \ + || warn "manifest says macOS 
unsupported/preview but docs may be out of sync" +fi +pass "compatibility check complete" diff --git a/dream-server/scripts/check-offline-models.sh b/dream-server/scripts/check-offline-models.sh old mode 100755 new mode 100644 index 26e0e0661..0e10d0393 --- a/dream-server/scripts/check-offline-models.sh +++ b/dream-server/scripts/check-offline-models.sh @@ -22,12 +22,13 @@ echo "" MISSING=() -# Check vLLM model -if [ -d "models/Qwen/Qwen2.5-32B-Instruct-AWQ" ]; then - echo -e "${GREEN}✓${NC} Qwen 2.5 32B AWQ (Primary LLM)" +# Check LLM model (GGUF) +if ls data/models/*.gguf &>/dev/null; then + MODEL_FILE=$(ls -1 data/models/*.gguf | head -1) + echo -e "${GREEN}✓${NC} LLM model: $(basename "$MODEL_FILE")" else - echo -e "${RED}✗${NC} Qwen 2.5 32B AWQ - MISSING" - MISSING+=("Qwen2.5-32B-Instruct-AWQ") + echo -e "${RED}✗${NC} LLM model (GGUF) - MISSING" + MISSING+=("gguf-model") fi # Check Whisper model diff --git a/dream-server/scripts/check-release-claims.sh b/dream-server/scripts/check-release-claims.sh new file mode 100644 index 000000000..e0c3d9fce --- /dev/null +++ b/dream-server/scripts/check-release-claims.sh @@ -0,0 +1,38 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." 
&& pwd)" +MANIFEST="${ROOT_DIR}/manifest.json" +MATRIX="${ROOT_DIR}/docs/SUPPORT-MATRIX.md" +TRUTH="${ROOT_DIR}/docs/PLATFORM-TRUTH-TABLE.md" + +fail() { echo "[FAIL] $1"; exit 1; } +pass() { echo "[PASS] $1"; } + +command -v jq >/dev/null 2>&1 || fail "jq is required" +test -f "$MANIFEST" || fail "manifest.json missing" +test -f "$MATRIX" || fail "docs/SUPPORT-MATRIX.md missing" +test -f "$TRUTH" || fail "docs/PLATFORM-TRUTH-TABLE.md missing" + +# Manifest support expectations +linux_supported="$(jq -r '.compatibility.os.linux.supported' "$MANIFEST")" +wsl_supported="$(jq -r '.compatibility.os.windows_wsl2.supported' "$MANIFEST")" +macos_supported="$(jq -r '.compatibility.os.macos.supported' "$MANIFEST")" +windows_native_supported="$(jq -r '.compatibility.os.windows_native.supported' "$MANIFEST")" + +[[ "$linux_supported" == "true" ]] || fail "manifest must mark linux supported" +[[ "$wsl_supported" == "true" ]] || fail "manifest must mark windows_wsl2 supported" +[[ "$macos_supported" == "false" ]] || fail "manifest must mark macos unsupported/preview" +[[ "$windows_native_supported" == "false" ]] || fail "manifest must mark windows_native unsupported" + +# Support matrix wording expectations +grep -q "Windows native installer UX.*Tier B" "$MATRIX" || fail "support matrix missing Windows Tier B delegated claim" +grep -q "macOS (Apple Silicon).*Tier C" "$MATRIX" || fail "support matrix missing macOS Tier C claim" +grep -q "Windows delegated installer flow is available via WSL2" "$MATRIX" || fail "support matrix missing Windows delegated truth statement" + +# Truth table consistency +grep -q "Windows via WSL2.*Tier B" "$TRUTH" || fail "truth table missing Windows via WSL2 Tier B" +grep -q "macOS Apple Silicon.*Tier C" "$TRUTH" || fail "truth table missing macOS Tier C" +grep -q "Not safe to claim now" "$TRUTH" || fail "truth table missing launch guardrails section" + +pass "release claim gates" diff --git a/dream-server/scripts/classify-hardware.sh 
b/dream-server/scripts/classify-hardware.sh new file mode 100644 index 000000000..4376936e9 --- /dev/null +++ b/dream-server/scripts/classify-hardware.sh @@ -0,0 +1,207 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Dream Server Hardware Classifier โ€” Two-pass GPU matching +# Pass 1: Match known_gpus by device_id then name_patterns (gpu-database.json) +# Pass 2: Fall back to heuristic_classes (threshold-based, same as old hardware-classes.json) +# +# Accepts both old args (--platform-id, --gpu-vendor) and new args (--device-id, --gpu-name, --ram-mb) +# Output contract: HW_CLASS_ID, HW_CLASS_LABEL, HW_REC_BACKEND, HW_REC_TIER, +# HW_REC_COMPOSE_OVERLAYS, HW_BANDWIDTH_GBPS, HW_MEMORY_SOURCE, HW_GPU_LABEL + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +ROOT_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)" +GPU_DB="${ROOT_DIR}/config/gpu-database.json" +ENV_MODE="false" +PLATFORM_ID="${PLATFORM_ID:-unknown}" +GPU_VENDOR="${GPU_VENDOR:-unknown}" +MEMORY_TYPE="${MEMORY_TYPE:-unknown}" +VRAM_MB="${VRAM_MB:-0}" +DEVICE_ID="" +GPU_NAME="" +CPU_NAME="" +RAM_MB="0" + +while [[ $# -gt 0 ]]; do + case "$1" in + --platform-id) PLATFORM_ID="${2:-$PLATFORM_ID}"; shift 2 ;; + --gpu-vendor) GPU_VENDOR="${2:-$GPU_VENDOR}"; shift 2 ;; + --memory-type) MEMORY_TYPE="${2:-$MEMORY_TYPE}"; shift 2 ;; + --vram-mb) VRAM_MB="${2:-$VRAM_MB}"; shift 2 ;; + --device-id) DEVICE_ID="${2:-}"; shift 2 ;; + --gpu-name) GPU_NAME="${2:-}"; shift 2 ;; + --cpu-name) CPU_NAME="${2:-}"; shift 2 ;; + --ram-mb) RAM_MB="${2:-0}"; shift 2 ;; + --env) ENV_MODE="true"; shift ;; + --db) GPU_DB="${2:-$GPU_DB}"; shift 2 ;; + *) + echo "Unknown argument: $1" >&2 + exit 1 + ;; + esac +done + +if [[ ! 
-f "$GPU_DB" ]]; then + echo "ERROR: GPU database not found: $GPU_DB" >&2 + exit 1 +fi + +python3 - "$GPU_DB" "$ENV_MODE" "$PLATFORM_ID" "$GPU_VENDOR" "$MEMORY_TYPE" "$VRAM_MB" "$DEVICE_ID" "$GPU_NAME" "$CPU_NAME" "$RAM_MB" <<'PY' +import json +import sys + +db_path = sys.argv[1] +env_mode = sys.argv[2] == "true" +platform_id = sys.argv[3] +gpu_vendor = sys.argv[4] +memory_type = sys.argv[5] +vram_mb = int(float(sys.argv[6] or 0)) +device_id = sys.argv[7] +gpu_name = sys.argv[8] +cpu_name = sys.argv[9] +ram_mb = int(float(sys.argv[10] or 0)) + +with open(db_path, "r", encoding="utf-8") as f: + db = json.load(f) + +# --- Compose overlay mapping (backend โ†’ default overlays) --- +OVERLAY_MAP = { + "amd": ["docker-compose.base.yml", "docker-compose.amd.yml"], + "nvidia": ["docker-compose.base.yml", "docker-compose.nvidia.yml"], + "apple": ["docker-compose.base.yml", "docker-compose.apple.yml"], + "cpu": ["docker-compose.base.yml"], +} + +# --- Pass 1: Match known_gpus by device_id then name_patterns --- +selected = None +combined_name = f"{gpu_name} {cpu_name}".strip().lower() + +for entry in db.get("known_gpus", []): + match = entry.get("match", {}) + + # Try device_id match (exact, most reliable) + dev_ids = [d.lower() for d in match.get("device_ids", [])] + id_matched = device_id.lower() in dev_ids if device_id else False + + # Try name_patterns match (case-insensitive substring against gpu_name + cpu_name) + patterns = match.get("name_patterns", []) + name_matched = any(p.lower() in combined_name for p in patterns) if combined_name and patterns else False + + if id_matched and name_matched: + # Best match: both device_id and name match + selected = entry + break + elif id_matched and not selected: + # Device ID matched but name didn't โ€” remember as fallback + selected = entry + # Keep looking for a better match with same device_id + continue + elif name_matched and not selected: + selected = entry + break + +# --- Pass 2: Heuristic fallback (threshold-based, 
top-down) --- +if not selected: + for entry in db.get("heuristic_classes", []): + match = entry.get("match", {}) + + # Check vendor + m_vendor = match.get("vendor", "") + if m_vendor and m_vendor != gpu_vendor: + continue + + # Check memory_type + m_memtype = match.get("memory_type", "") + if m_memtype and m_memtype != memory_type: + continue + + # Check min_vram_mb + min_vram = match.get("min_vram_mb", -1) + if min_vram >= 0 and vram_mb < min_vram: + continue + + # Check min_ram_mb (for unified memory classes) + min_ram = match.get("min_ram_mb", -1) + if min_ram >= 0 and ram_mb < min_ram: + continue + + selected = entry + break + +# --- Bandwidth lookup --- +bandwidth = 0 +if selected and "specs" in selected: + bandwidth = selected["specs"].get("bandwidth_gbps", 0) + +if bandwidth == 0 and gpu_name: + # Search bandwidth table by substring match + vendor_bw = db.get("known_gpu_bandwidth", {}).get(gpu_vendor, {}) + for bw_name, bw_val in vendor_bw.items(): + if bw_name.lower() in gpu_name.lower() or bw_name.lower() in cpu_name.lower(): + bandwidth = bw_val + break + +if bandwidth == 0: + # Fall back to default bandwidth + backend_key_map = {"nvidia": "cuda", "amd": "rocm", "apple": "metal"} + bk = backend_key_map.get(gpu_vendor, "cpu_x86") + bandwidth = db.get("defaults", {}).get("bandwidth_gbps", {}).get(bk, 0) + +# --- Build result --- +if selected: + # Known GPU entry + if "specs" in selected: + class_id = selected.get("id", "unknown") + label = selected["specs"].get("label", selected.get("id", "Unknown")) + rec = selected.get("recommended", {}) + backend = rec.get("backend", "cpu") + tier = rec.get("tier", "T1") + memory_source = selected["specs"].get("memory_source", "vram") + else: + # Heuristic class entry + class_id = selected.get("id", "unknown") + label = selected.get("id", "Unknown").replace("_", " ").title() + rec = selected.get("recommended", {}) + backend = rec.get("backend", "cpu") + tier = rec.get("tier", "T1") + m_memtype = selected.get("match", 
{}).get("memory_type", "") + memory_source = "ram" if m_memtype == "unified" else "vram" +else: + class_id = "unknown" + label = "Unknown" + backend = "cpu" + tier = "T1" + memory_source = "vram" + +overlays = OVERLAY_MAP.get(backend, ["docker-compose.base.yml"]) +gpu_label = selected["specs"].get("label", "") if selected and "specs" in selected else "" + +# --- Output --- +def out(key, value): + safe = str(value).replace("\\", "\\\\").replace('"', '\\"') + print(f'{key}="{safe}"') + +if env_mode: + out("HW_CLASS_ID", class_id) + out("HW_CLASS_LABEL", label) + out("HW_REC_BACKEND", backend) + out("HW_REC_TIER", tier) + out("HW_REC_COMPOSE_OVERLAYS", ",".join(overlays)) + out("HW_BANDWIDTH_GBPS", bandwidth) + out("HW_MEMORY_SOURCE", memory_source) + out("HW_GPU_LABEL", gpu_label) +else: + result = { + "id": class_id, + "label": label, + "recommended": { + "backend": backend, + "tier": tier, + "compose_overlays": overlays, + }, + "bandwidth_gbps": bandwidth, + "memory_source": memory_source, + "gpu_label": gpu_label, + } + print(json.dumps(result, indent=2)) +PY diff --git a/dream-server/scripts/demo-offline.sh b/dream-server/scripts/demo-offline.sh old mode 100755 new mode 100644 index 95cc27c9b..285926c08 --- a/dream-server/scripts/demo-offline.sh +++ b/dream-server/scripts/demo-offline.sh @@ -79,8 +79,8 @@ demo_chat() { echo -e "${BOLD}${MAGENTA}Demo: Chat with AI${NC}" echo -e "${DIM}โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”${NC}" echo "" - echo -e "${DIM}[Connected to vLLM โ†’ Qwen2.5-32B-Instruct-AWQ]${NC}" - echo -e "${DIM}[API: http://localhost:8000/v1/chat/completions]${NC}" + echo -e "${DIM}[Connected to llama-server โ†’ local GGUF model]${NC}" + echo -e "${DIM}[API: http://localhost:8080/v1/chat/completions]${NC}" echo "" echo -ne "${GREEN}You: ${NC}" @@ -109,7 +109,7 @@ demo_chat() { stream_text " โ€ข Generation speed: 30-50 tokens/sec" 0.02 stream_text " โ€ข Throughput: 
handles multiple concurrent users" 0.02 echo "" - stream_text "vLLM uses PagedAttention for efficient memory management, so you get near-optimal GPU utilization." 0.02 + stream_text "llama-server uses continuous batching for efficient memory management, so you get near-optimal GPU utilization." 0.02 pause } @@ -120,7 +120,7 @@ demo_voice() { echo -e "${BOLD}${MAGENTA}Demo: Voice-to-Voice${NC}" echo -e "${DIM}โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”${NC}" echo "" - echo -e "${DIM}[Pipeline: Whisper STT โ†’ vLLM โ†’ OpenTTS TTS]${NC}" + echo -e "${DIM}[Pipeline: Whisper STT โ†’ llama-server โ†’ Kokoro TTS]${NC}" echo "" echo -e "${YELLOW}Recording...${NC} ${DIM}(5 seconds)${NC}" @@ -138,7 +138,7 @@ demo_voice() { echo -e " ${GREEN}โœ“${NC} \"Tell me about the weather today\"" echo "" - echo -e "${CYAN}Generating response with vLLM...${NC}" + echo -e "${CYAN}Generating response with llama-server...${NC}" sleep 0.6 echo -ne " ${GREEN}โœ“${NC} " stream_text "I don't have real-time weather data since I run locally, but I can help you set up a workflow that fetches weather from a free API and reads it to you every morning!" 
0.02 @@ -330,9 +330,9 @@ demo_overview() { echo -e " ${CYAN}โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ โ”˜${NC}" echo -e " ${CYAN} โ”‚${NC}" echo -e " ${CYAN}โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ โ”${NC}" - echo -e " ${CYAN}โ”‚ vLLM (:8000) โ”‚${NC}" + echo -e " ${CYAN}โ”‚ llama-server (:8080) โ”‚${NC}" echo -e " ${CYAN}โ”‚ High-performance LLM inference โ”‚${NC}" - echo -e " ${CYAN}โ”‚ Qwen2.5 โ€ข 30-50 tok/s โ€ข PagedAttention โ”‚${NC}" + echo -e " ${CYAN}โ”‚ GGUF models โ€ข 30-50 tok/s โ€ข GPU offload โ”‚${NC}" echo -e " ${CYAN}โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ โ”˜${NC}" echo -e " ${CYAN} โ”‚ โ”‚${NC}" echo -e " ${CYAN}โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”${NC}" diff --git a/dream-server/scripts/deploy-livekit.sh b/dream-server/scripts/deploy-livekit.sh deleted file mode 100755 index 107f2410d..000000000 --- a/dream-server/scripts/deploy-livekit.sh +++ /dev/null @@ -1,85 +0,0 @@ -#!/bin/bash -# Deploy LiveKit server for voice chat testing -# Usage: bash scripts/deploy-livekit.sh -# -# โš ๏ธ SECURITY WARNING -# ==================== -# Do NOT use default credentials in production or shared environments. -# For production deployments, set LIVEKIT_API_KEY and LIVEKIT_API_SECRET -# explicitly via environment. 
- -set -e - -# Required environment variables -if [[ -z "${LIVEKIT_API_KEY}" ]]; then - echo "ERROR: LIVEKIT_API_KEY must be set" >&2 - exit 1 -fi - -if [[ -z "${LIVEKIT_API_SECRET}" ]]; then - echo "ERROR: LIVEKIT_API_SECRET must be set" >&2 - exit 1 -fi - -LIVEKIT_PORT=${LIVEKIT_PORT:-7880} -LIVEKIT_RTC_START=${LIVEKIT_RTC_START:-50000} -LIVEKIT_RTC_END=${LIVEKIT_RTC_END:-50100} - -# Validate RTC port range -if [[ ${LIVEKIT_RTC_START} -ge ${LIVEKIT_RTC_END} ]]; then - echo "Error: RTC_START (${LIVEKIT_RTC_START}) must be less than RTC_END (${LIVEKIT_RTC_END})" >&2 - exit 1 -fi -if [[ ${LIVEKIT_RTC_START} -lt 1 || ${LIVEKIT_RTC_END} -gt 65535 ]]; then - echo "Error: RTC ports must be between 1 and 65535" >&2 - exit 1 -fi - -echo "🎤 Deploying LiveKit server..." - -# Create config directory -mkdir -p ~/livekit-config - -# Write config -cat > ~/livekit-config/livekit.yaml << YAML -port: ${LIVEKIT_PORT} -rtc: - port_range_start: ${LIVEKIT_RTC_START} - port_range_end: ${LIVEKIT_RTC_END} - use_external_ip: true -keys: - ${LIVEKIT_API_KEY}: ${LIVEKIT_API_SECRET} -logging: - level: info -room: - empty_timeout: 300 - max_participants: 10 -agent: - enabled: true -YAML - -# Stop existing if running -docker stop livekit-server 2>/dev/null || true -docker rm livekit-server 2>/dev/null || true - -# Run LiveKit -docker run -d \ - --name livekit-server \ - --restart unless-stopped \ - -p ${LIVEKIT_PORT}:7880 \ - -p ${LIVEKIT_RTC_START}-${LIVEKIT_RTC_END}:${LIVEKIT_RTC_START}-${LIVEKIT_RTC_END}/udp \ - -v ~/livekit-config/livekit.yaml:/etc/livekit.yaml:ro \ - livekit/livekit-server:v1.9.11 \ - --config /etc/livekit.yaml - -echo "✅ LiveKit running on port ${LIVEKIT_PORT}" -echo "" -echo "Test: curl http://localhost:${LIVEKIT_PORT}/rtc/validate" -echo "" -echo "Next: Deploy voice agent with your server's IP or hostname:" -echo " LIVEKIT_URL=ws://:${LIVEKIT_PORT}" -echo " STT_URL=http://:9101" -echo " TTS_URL=http://:9102" -echo " LLM_URL=http://:9100/v1" -echo "" -echo "Replace 
with your actual server IP (e.g., 192.168.1.100)" diff --git a/dream-server/scripts/deploy-voice-agent.sh b/dream-server/scripts/deploy-voice-agent.sh deleted file mode 100755 index b20ac6f4d..000000000 --- a/dream-server/scripts/deploy-voice-agent.sh +++ /dev/null @@ -1,77 +0,0 @@ -#!/bin/bash -# Deploy Voice Agent connecting to cluster services -# -# Usage: bash scripts/deploy-voice-agent.sh -# -# Note: Update LIVEKIT_URL, STT_URL, TTS_URL, LLM_URL env vars if not running locally. - -set -e - -# Cluster service URLs (adjust if running elsewhere) -# Default: local deployment on .122 - update LIVEKIT_URL for remote setups -LIVEKIT_URL=${LIVEKIT_URL:-ws://localhost:7880} -if [[ -z "${LIVEKIT_API_KEY}" ]]; then - echo "Error: LIVEKIT_API_KEY not set" >&2 - exit 1 -fi -if [[ -z "${LIVEKIT_API_SECRET}" ]]; then - echo "Error: LIVEKIT_API_SECRET not set" >&2 - exit 1 -fi -STT_URL=${STT_URL:-http://localhost:9101} -TTS_URL=${TTS_URL:-http://localhost:9102} -LLM_URL=${LLM_URL:-http://localhost:9100/v1} -LLM_MODEL=${LLM_MODEL:-Qwen/Qwen2.5-32B-Instruct-AWQ} - -SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" -AGENT_DIR="${SCRIPT_DIR}/../agents/voice" - -echo "🎤 Deploying Voice Agent..." -echo " LiveKit: ${LIVEKIT_URL}" -echo " STT: ${STT_URL}" -echo " TTS: ${TTS_URL}" -echo " LLM: ${LLM_URL}" -echo "" - -# Stop existing if running -docker stop dream-voice-agent 2>/dev/null || true -docker rm dream-voice-agent 2>/dev/null || true - -# Build the agent -echo "Building voice agent..." 
-docker build -t dream-voice-agent:latest "${AGENT_DIR}" - -# Run the agent -docker run -d \ - --name dream-voice-agent \ - --restart unless-stopped \ - --network host \ - -e LIVEKIT_URL="${LIVEKIT_URL}" \ - -e LIVEKIT_API_KEY="${LIVEKIT_API_KEY}" \ - -e LIVEKIT_API_SECRET="${LIVEKIT_API_SECRET}" \ - -e STT_URL="${STT_URL}" \ - -e TTS_URL="${TTS_URL}" \ - -e LLM_URL="${LLM_URL}" \ - -e LLM_MODEL="${LLM_MODEL}" \ - dream-voice-agent:latest - -# Wait for container to start and check health -echo "Waiting for agent to initialize..." -sleep 3 -if docker ps | grep -q dream-voice-agent; then - echo "✅ Voice Agent started successfully" -else - echo "⚠️ Voice Agent container failed to start - check logs: docker logs dream-voice-agent" - exit 1 -fi - -echo "" -echo "✅ Voice Agent deployed!" -echo "" -echo "The agent will automatically connect to LiveKit and handle:" -echo " - Speech-to-text via Whisper" -echo " - LLM responses via vLLM" -echo " - Text-to-speech via Kokoro" -echo "" -echo "To test: Open the Dream Server dashboard → Voice page" -echo "Logs: docker logs -f dream-voice-agent" diff --git a/dream-server/scripts/detect-hardware.ps1 b/dream-server/scripts/detect-hardware.ps1 deleted file mode 100644 index 54acc94ff..000000000 --- a/dream-server/scripts/detect-hardware.ps1 +++ /dev/null @@ -1,130 +0,0 @@ -# Dream Server Hardware Detection (Windows) -# Detects GPU, CPU, RAM and recommends tier - -param( - [switch]$Json -) - -function Get-GpuInfo { - $gpu = @{ - type = "none" - name = "" - vram_mb = 0 - vram_gb = 0 - } - - # Try nvidia-smi first - try { - $nvidiaSmi = & nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits 2>$null - if ($nvidiaSmi) { - $parts = $nvidiaSmi -split ',' - $gpu.type = "nvidia" - $gpu.name = $parts[0].Trim() - $gpu.vram_mb = [int]$parts[1].Trim() - $gpu.vram_gb = [math]::Floor($gpu.vram_mb / 1024) - return $gpu - } - } catch {} - - # Fallback to WMI - try { - $wmiGpu = Get-WmiObject Win32_VideoController | 
Where-Object { $_.AdapterRAM -gt 0 } | Select-Object -First 1 - if ($wmiGpu) { - $gpu.type = "generic" - $gpu.name = $wmiGpu.Name - $gpu.vram_mb = [math]::Floor($wmiGpu.AdapterRAM / 1024 / 1024) - $gpu.vram_gb = [math]::Floor($gpu.vram_mb / 1024) - return $gpu - } - } catch {} - - return $gpu -} - -function Get-CpuInfo { - try { - $cpu = Get-WmiObject Win32_Processor | Select-Object -First 1 - return @{ - name = $cpu.Name - cores = $cpu.NumberOfCores - threads = $cpu.NumberOfLogicalProcessors - } - } catch { - return @{ - name = "Unknown" - cores = 0 - threads = 0 - } - } -} - -function Get-RamGb { - try { - $ram = Get-WmiObject Win32_ComputerSystem - return [math]::Floor($ram.TotalPhysicalMemory / 1024 / 1024 / 1024) - } catch { - return 0 - } -} - -function Get-Tier { - param([int]$VramGb) - - if ($VramGb -ge 48) { return "T4" } - elseif ($VramGb -ge 20) { return "T3" } - elseif ($VramGb -ge 12) { return "T2" } - else { return "T1" } -} - -function Get-TierDescription { - param([string]$Tier) - - switch ($Tier) { - "T4" { return "Ultimate (48GB+): Full 70B models, multi-model serving" } - "T3" { return "Pro (20-47GB): 32B models, comfortable headroom" } - "T2" { return "Starter (12-19GB): 7-14B models, lean configs" } - "T1" { return "Mini (<12GB): Small models or CPU inference" } - } -} - -# Main -$gpu = Get-GpuInfo -$cpu = Get-CpuInfo -$ram = Get-RamGb -$tier = Get-Tier -VramGb $gpu.vram_gb -$tierDesc = Get-TierDescription -Tier $tier - -if ($Json) { - @{ - os = "windows" - cpu = $cpu.name - cores = $cpu.cores - ram_gb = $ram - gpu = $gpu - tier = $tier - tier_description = $tierDesc - } | ConvertTo-Json -} else { - Write-Host "โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—" -ForegroundColor Blue - Write-Host "โ•‘ Dream Server Hardware Detection โ•‘" -ForegroundColor Blue - Write-Host 
"โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•" -ForegroundColor Blue - Write-Host "" - Write-Host "System:" -ForegroundColor Green - Write-Host " OS: Windows" - Write-Host " CPU: $($cpu.name)" - Write-Host " Cores: $($cpu.cores)" - Write-Host " RAM: ${ram}GB" - Write-Host "" - Write-Host "GPU:" -ForegroundColor Green - if ($gpu.name) { - Write-Host " Type: $($gpu.type)" - Write-Host " Name: $($gpu.name)" - Write-Host " VRAM: $($gpu.vram_gb)GB" - } else { - Write-Host " No GPU detected (CPU-only mode)" - } - Write-Host "" - Write-Host "Recommended Tier: $tier" -ForegroundColor Yellow - Write-Host " $tierDesc" - Write-Host "" -} diff --git a/dream-server/scripts/detect-hardware.sh b/dream-server/scripts/detect-hardware.sh old mode 100755 new mode 100644 index 7862d2ede..2a97c6a86 --- a/dream-server/scripts/detect-hardware.sh +++ b/dream-server/scripts/detect-hardware.sh @@ -1,6 +1,7 @@ #!/bin/bash # Dream Server Hardware Detection # Detects GPU, CPU, RAM and recommends tier +# Supports: NVIDIA (nvidia-smi), AMD APU/dGPU (sysfs), Apple Silicon set -e @@ -9,6 +10,7 @@ RED='\033[0;31m' GREEN='\033[0;32m' YELLOW='\033[1;33m' BLUE='\033[0;34m' +CYAN='\033[0;36m' NC='\033[0m' # Detect OS and environment @@ -31,8 +33,109 @@ detect_nvidia() { fi } -# Detect AMD GPU (ROCm) +# Detect AMD GPU via sysfs (works without ROCm installed) +# Returns: gpu_name|vram_bytes|gtt_bytes|is_apu|gpu_busy|temp|power|vulkan|rocm|driver|device_id|subsystem_device|revision +detect_amd_sysfs() { + for card_dir in /sys/class/drm/card*/device; do + [[ -d "$card_dir" ]] || continue + local vendor + vendor=$(cat "$card_dir/vendor" 2>/dev/null) || continue + + # 0x1002 = AMD + if [[ "$vendor" == "0x1002" ]]; then + local vram_total gtt_total gpu_name gpu_busy temp power hwmon_dir is_apu + local device_id subsystem_device revision + + # Read PCI device identifiers + device_id=$(cat "$card_dir/device" 2>/dev/null) || 
device_id="unknown" + subsystem_device=$(cat "$card_dir/subsystem_device" 2>/dev/null) || subsystem_device="unknown" + revision=$(cat "$card_dir/revision" 2>/dev/null) || revision="unknown" + + # Read memory info + vram_total=$(cat "$card_dir/mem_info_vram_total" 2>/dev/null) || vram_total=0 + gtt_total=$(cat "$card_dir/mem_info_gtt_total" 2>/dev/null) || gtt_total=0 + + # Detect if APU (unified memory) + # Strix Halo has small VRAM carve-out (UMA frame buffer, often 1GB) + # but large GTT (actual usable GPU memory from system RAM). + is_apu="false" + if [[ $vram_total -gt 0 && $gtt_total -gt 0 ]]; then + local vram_gb=$(( vram_total / 1073741824 )) + local gtt_gb=$(( gtt_total / 1073741824 )) + if [[ $gtt_gb -ge 16 && $vram_gb -le 4 ]]; then + # Small VRAM + large GTT = APU with unified memory + is_apu="true" + elif [[ $gtt_gb -ge 32 ]]; then + is_apu="true" + elif [[ $vram_gb -ge 32 ]]; then + is_apu="true" + fi + fi + + # GPU utilization + gpu_busy=$(cat "$card_dir/gpu_busy_percent" 2>/dev/null) || gpu_busy=0 + + # Find hwmon for temp/power + temp=0 + power=0 + for hwmon_dir in "$card_dir"/hwmon/hwmon*; do + if [[ -d "$hwmon_dir" ]]; then + local raw_temp raw_power + raw_temp=$(cat "$hwmon_dir/temp1_input" 2>/dev/null) || raw_temp=0 + temp=$(( raw_temp / 1000 )) # millidegrees → C + raw_power=$(cat "$hwmon_dir/power1_average" 2>/dev/null) || raw_power=0 + power=$(( raw_power / 1000000 )) # microwatts → W + break + fi + done + + # Try to get GPU name from various sources + gpu_name="" + # Try marketing name first + if [[ -f "$card_dir/product_name" ]]; then + gpu_name=$(cat "$card_dir/product_name" 2>/dev/null) || true + fi + # Fall back to device ID lookup + if [[ -z "$gpu_name" ]]; then + gpu_name="AMD GPU ($device_id)" + fi + + # Check for Vulkan support + local vulkan_available="false" + if command -v vulkaninfo &>/dev/null; then + if vulkaninfo --summary 2>/dev/null | grep -qi "radeon\|amd\|gfx11"; then + vulkan_available="true" + fi + fi + + # Check for 
ROCm + local rocm_available="false" + if command -v rocminfo &>/dev/null; then + rocm_available="true" + fi + + # Check amdgpu driver loaded + local driver_loaded="false" + if lsmod 2>/dev/null | grep -q amdgpu; then + driver_loaded="true" + fi + + echo "${gpu_name}|${vram_total}|${gtt_total}|${is_apu}|${gpu_busy}|${temp}|${power}|${vulkan_available}|${rocm_available}|${driver_loaded}|${device_id}|${subsystem_device}|${revision}" + return 0 + fi + done + return 1 +} + +# Detect AMD GPU (sysfs first, rocm-smi fallback) detect_amd() { + # Try sysfs first (works without ROCm) + local sysfs_out + if sysfs_out=$(detect_amd_sysfs 2>/dev/null); then + echo "$sysfs_out" + return 0 + fi + # Fall back to rocm-smi if command -v rocm-smi &>/dev/null; then rocm-smi --showproductname --showmeminfo vram 2>/dev/null | grep -E "GPU|Total Memory" | head -2 fi @@ -92,12 +195,12 @@ parse_nvidia_vram() { echo "$output" | awk -F',' '{gsub(/^ +| +$/,"",$2); print int($2)}' } -# Determine tier based on VRAM +# Determine tier based on VRAM (discrete GPU) # T4: 48GB+ | T3: 20-47GB | T2: 12-19GB | T1: <12GB get_tier() { local vram_mb=$1 local vram_gb=$((vram_mb / 1024)) - + if [[ $vram_gb -ge 48 ]]; then echo "T4" elif [[ $vram_gb -ge 20 ]]; then @@ -109,13 +212,58 @@ get_tier() { fi } -# Get tier description +# Determine Strix Halo tier based on unified memory +# SH_LARGE: 90GB+ | SH_COMPACT: <90GB +get_strix_halo_tier() { + local unified_gb=$1 + + if [[ $unified_gb -ge 90 ]]; then + echo "SH_LARGE" + else + echo "SH_COMPACT" + fi +} + +# Determine Apple Silicon tier based on unified memory +# AP_ULTRA: 96GB+ | AP_PRO: 36-95GB | AP_BASE: <36GB +get_apple_tier() { + local unified_gb=$1 + if [[ $unified_gb -ge 96 ]]; then + echo "AP_ULTRA" + elif [[ $unified_gb -ge 36 ]]; then + echo "AP_PRO" + else + echo "AP_BASE" + fi +} + +# Get tier description (supports NVIDIA, Strix Halo, and Apple tiers) tier_description() { case $1 in - T4) echo "Ultimate (48GB+): Full 70B models, multi-model serving" ;; - T3) echo "Pro 
(20-47GB): 32B models, comfortable headroom" ;; - T2) echo "Starter (12-19GB): 7-14B models, lean configs" ;; - T1) echo "Mini (<12GB): Small models or CPU inference" ;; + T4) echo "Ultimate (48GB+): Full 70B models, multi-model serving" ;; + T3) echo "Pro (20-47GB): 32B models, comfortable headroom" ;; + T2) echo "Starter (12-19GB): 7-14B models, lean configs" ;; + T1) echo "Mini (<12GB): Small models or CPU inference" ;; + SH_LARGE) echo "Strix Halo 90+: qwen3-coder-next 80B MoE (90GB+ unified)" ;; + SH_COMPACT) echo "Strix Halo Compact: qwen3:30b-a3b 30B MoE (<90GB unified)" ;; + AP_ULTRA) echo "Apple Ultra (96GB+): 70B models via CPU inference in Docker" ;; + AP_PRO) echo "Apple Pro (36GB+): 32B models via CPU inference in Docker" ;; + AP_BASE) echo "Apple Base (<36GB): 7B models via CPU inference in Docker" ;; + esac +} + +# Get recommended model for tier +tier_model() { + case $1 in + T4) echo "Qwen/Qwen2.5-72B-Instruct-AWQ" ;; + T3) echo "Qwen/Qwen2.5-32B-Instruct-AWQ" ;; + T2) echo "Qwen/Qwen2.5-7B-Instruct-AWQ" ;; + T1) echo "Qwen/Qwen2.5-1.5B-Instruct" ;; + SH_LARGE) echo "qwen3-coder-next" ;; + SH_COMPACT) echo "qwen3:30b-a3b" ;; + AP_ULTRA) echo "Qwen/Qwen2.5-72B-Instruct-Q4_K_M.gguf" ;; + AP_PRO) echo "Qwen/Qwen2.5-32B-Instruct-Q4_K_M.gguf" ;; + AP_BASE) echo "Qwen/Qwen2.5-7B-Instruct-Q4_K_M.gguf" ;; esac } @@ -131,39 +279,92 @@ main() { local gpu_name="" local gpu_vram_mb=0 local gpu_type="none" - + local gpu_architecture="" + local memory_type="discrete" + local gpu_temp=0 + local gpu_power=0 + local gpu_busy=0 + local vulkan_available="false" + local rocm_available="false" + local driver_loaded="false" + local device_id="" + local subsystem_device="" + local revision="" + # Try NVIDIA first local nvidia_out=$(detect_nvidia) if [[ -n "$nvidia_out" ]]; then gpu_name=$(echo "$nvidia_out" | awk -F',' '{gsub(/^ +| +$/,"",$1); print $1}') gpu_vram_mb=$(parse_nvidia_vram "$nvidia_out") gpu_type="nvidia" + gpu_architecture="cuda" + memory_type="discrete" + 
# Extract PCI device ID from nvidia-smi + if command -v nvidia-smi &>/dev/null; then + local pci_id + pci_id=$(nvidia-smi --query-gpu=pci.device_id --format=csv,noheader 2>/dev/null | head -1 | xargs) + # nvidia-smi returns e.g. "0x26B110DE" โ€” extract device portion (first 6 chars) + [[ -n "$pci_id" ]] && device_id="${pci_id:0:6}" + fi fi - + # Try AMD if no NVIDIA if [[ -z "$gpu_name" ]]; then - local amd_out=$(detect_amd) - if [[ -n "$amd_out" ]]; then - gpu_name="AMD GPU (ROCm)" + local amd_out + if amd_out=$(detect_amd_sysfs 2>/dev/null); then + # Parse pipe-delimited output from detect_amd_sysfs + IFS='|' read -r gpu_name vram_bytes gtt_bytes is_apu busy temp power vulkan rocm driver dev_id subsys_dev rev <<< "$amd_out" + + local vram_gb=$(( vram_bytes / 1073741824 )) + gpu_vram_mb=$(( vram_bytes / 1048576 )) gpu_type="amd" - # ROCm VRAM parsing would need work + gpu_temp=$temp + gpu_power=$power + gpu_busy=$busy + vulkan_available=$vulkan + rocm_available=$rocm + driver_loaded=$driver + device_id=$dev_id + subsystem_device=$subsys_dev + revision=$rev + + if [[ "$is_apu" == "true" ]]; then + gpu_architecture="apu-unified" + memory_type="unified" + else + gpu_architecture="rdna" + memory_type="discrete" + fi fi fi - + # Try Apple Silicon if macOS if [[ -z "$gpu_name" && "$os" == "macos" ]]; then local apple_out=$(detect_apple) if [[ -n "$apple_out" ]]; then gpu_name="Apple Silicon (Unified Memory)" - gpu_vram_mb=$((ram * 1024)) # Use system RAM as "VRAM" + gpu_vram_mb=$((ram * 1024)) gpu_type="apple" + gpu_architecture="apple-unified" + memory_type="unified" fi fi - - local tier=$(get_tier $gpu_vram_mb) - local tier_desc=$(tier_description $tier) + + # Determine tier + # For unified memory AMD APUs, use system RAM โ€” VRAM reports only GTT (unreliable) + local tier tier_desc recommended_model + if [[ "$memory_type" == "unified" && "$gpu_type" == "amd" ]]; then + tier=$(get_strix_halo_tier "$ram") + elif [[ "$gpu_type" == "apple" ]]; then + local 
unified_gb=$((gpu_vram_mb / 1024)) + tier=$(get_apple_tier $unified_gb) + else + tier=$(get_tier $gpu_vram_mb) + fi + tier_desc=$(tier_description $tier) + recommended_model=$(tier_model $tier) local gpu_vram_gb=$((gpu_vram_mb / 1024)) - + if $json_output; then cat </dev/null | tail -1 | awk '{gsub(/G/,"",$4); print int($4)}' || echo 0)" + +if [[ -x "$SCRIPT_DIR/build-capability-profile.sh" ]]; then + CAP_ENV="$("$SCRIPT_DIR/build-capability-profile.sh" --output "$CAP_FILE" --env)" + eval "$CAP_ENV" +else + echo "build-capability-profile.sh not found/executable" >&2 + exit 1 +fi + +if [[ -x "$SCRIPT_DIR/preflight-engine.sh" ]]; then + PREFLIGHT_ENV="$("$SCRIPT_DIR/preflight-engine.sh" \ + --report "$PREFLIGHT_FILE" \ + --tier "${CAP_RECOMMENDED_TIER:-T1}" \ + --ram-gb "$RAM_GB" \ + --disk-gb "$DISK_GB" \ + --gpu-backend "${CAP_LLM_BACKEND:-cpu}" \ + --gpu-vram-mb "${CAP_GPU_VRAM_MB:-0}" \ + --gpu-name "${CAP_GPU_NAME:-Unknown}" \ + --platform-id "${CAP_PLATFORM_ID:-unknown}" \ + --compose-overlays "${CAP_COMPOSE_OVERLAYS:-}" \ + --script-dir "$ROOT_DIR" \ + --env)" + eval "$PREFLIGHT_ENV" +else + echo "preflight-engine.sh not found/executable" >&2 + exit 1 +fi + +DOCKER_CLI="false" +DOCKER_DAEMON="false" +COMPOSE_CLI="false" +DASHBOARD_HTTP="false" +WEBUI_HTTP="false" + +if command -v docker >/dev/null 2>&1; then + DOCKER_CLI="true" + if docker info >/dev/null 2>&1; then + DOCKER_DAEMON="true" + fi + if docker compose version >/dev/null 2>&1 || command -v docker-compose >/dev/null 2>&1; then + COMPOSE_CLI="true" + fi +fi + +if command -v curl >/dev/null 2>&1; then + if curl -sf "http://localhost:${_DASHBOARD_PORT}" >/dev/null 2>&1; then + DASHBOARD_HTTP="true" + fi + if curl -sf "http://localhost:${_WEBUI_PORT}" >/dev/null 2>&1; then + WEBUI_HTTP="true" + fi +fi + +python3 - "$CAP_FILE" "$PREFLIGHT_FILE" "$REPORT_FILE" "$DOCKER_CLI" "$DOCKER_DAEMON" "$COMPOSE_CLI" "$DASHBOARD_HTTP" "$WEBUI_HTTP" "$_DASHBOARD_PORT" "$_WEBUI_PORT" <<'PY' +import json +import pathlib 
+import sys +from datetime import datetime, timezone + +cap_file, preflight_file, report_file, docker_cli, docker_daemon, compose_cli, dashboard_http, webui_http, dashboard_port, webui_port = sys.argv[1:] + +cap = json.load(open(cap_file, "r", encoding="utf-8")) +pre = json.load(open(preflight_file, "r", encoding="utf-8")) + +report = { + "version": "1", + "generated_at": datetime.now(timezone.utc).isoformat(), + "capability_profile": cap, + "preflight": pre, + "runtime": { + "docker_cli": docker_cli == "true", + "docker_daemon": docker_daemon == "true", + "compose_cli": compose_cli == "true", + "dashboard_http": dashboard_http == "true", + "webui_http": webui_http == "true", + }, + "summary": { + "preflight_blockers": pre.get("summary", {}).get("blockers", 0), + "preflight_warnings": pre.get("summary", {}).get("warnings", 0), + "runtime_ready": (docker_daemon == "true" and compose_cli == "true"), + }, +} + +fix_hints = [] +for check in pre.get("checks", []): + status = check.get("status") + action = (check.get("action") or "").strip() + if status in {"blocker", "warn"} and action: + fix_hints.append(action) + +runtime = report["runtime"] +if not runtime["docker_cli"]: + fix_hints.append("Install Docker CLI/Docker Desktop and reopen your terminal.") +if runtime["docker_cli"] and not runtime["docker_daemon"]: + fix_hints.append("Start Docker daemon/Desktop before launching Dream Server.") +if not runtime["compose_cli"]: + fix_hints.append("Install Docker Compose v2 plugin (or docker-compose).") +if runtime["docker_daemon"] and not runtime["dashboard_http"]: + fix_hints.append(f"Run installer/start command, then verify dashboard on http://localhost:{dashboard_port}.") +if runtime["docker_daemon"] and not runtime["webui_http"]: + fix_hints.append(f"Verify Open WebUI container and port {webui_port} mapping.") + +# Deduplicate while preserving order +seen = set() +uniq_hints = [] +for hint in fix_hints: + if hint in seen: + continue + seen.add(hint) + 
uniq_hints.append(hint) + +report["autofix_hints"] = uniq_hints + +path = pathlib.Path(report_file) +path.parent.mkdir(parents=True, exist_ok=True) +path.write_text(json.dumps(report, indent=2) + "\n", encoding="utf-8") +PY + +echo "Dream Doctor report: $REPORT_FILE" +echo " Preflight blockers: ${PREFLIGHT_BLOCKERS:-0}" +echo " Preflight warnings: ${PREFLIGHT_WARNINGS:-0}" +echo " Docker daemon: $DOCKER_DAEMON" +echo " Compose CLI: $COMPOSE_CLI" +python3 - "$REPORT_FILE" <<'PY' +import json +import sys + +path = sys.argv[1] +try: + data = json.load(open(path, "r", encoding="utf-8")) +except Exception: + raise SystemExit(0) +hints = data.get("autofix_hints") or [] +if hints: + print(" Suggested fixes:") + for hint in hints[:6]: + print(f" - {hint}") +PY diff --git a/dream-server/scripts/dream-preflight.ps1 b/dream-server/scripts/dream-preflight.ps1 deleted file mode 100644 index a03ab9d55..000000000 --- a/dream-server/scripts/dream-preflight.ps1 +++ /dev/null @@ -1,237 +0,0 @@ -# Dream Server Preflight Check for Windows -# Usage: .\scripts\dream-preflight.ps1 - -param( - [switch]$Fix -) - -$ErrorActionPreference = "Continue" -$global:Issues = @() -$global:Warnings = @() - -function Write-Header { - param([string]$Title) - Write-Host "" - Write-Host ("=" * 60) -ForegroundColor Cyan - Write-Host " $Title" -ForegroundColor Cyan - Write-Host ("=" * 60) -ForegroundColor Cyan - Write-Host "" -} - -function Test-Prereq { - param( - [string]$Name, - [scriptblock]$Test, - [string]$FixCmd = "", - [string]$DocsLink = "" - ) - - Write-Host "Checking $Name... 
" -NoNewline - try { - $result = & $Test - if ($result) { - Write-Host "OK" -ForegroundColor Green - return $true - } else { - Write-Host "FAIL" -ForegroundColor Red - $global:Issues += @{ - Name = $Name - Fix = $FixCmd - Docs = $DocsLink - } - return $false - } - } catch { - Write-Host "FAIL" -ForegroundColor Red - Write-Host " Error: $_" -ForegroundColor DarkGray - $global:Issues += @{ - Name = $Name - Fix = $FixCmd - Docs = $DocsLink - } - return $false - } -} - -function Test-Warning { - param( - [string]$Name, - [scriptblock]$Test, - [string]$Advice = "" - ) - - Write-Host "Checking $Name... " -NoNewline - try { - $result = & $Test - if ($result) { - Write-Host "OK" -ForegroundColor Green - return $true - } else { - Write-Host "WARN" -ForegroundColor Yellow - if ($Advice) { - Write-Host " $Advice" -ForegroundColor DarkYellow - } - $global:Warnings += @{ - Name = $Name - Advice = $Advice - } - return $false - } - } catch { - Write-Host "WARN" -ForegroundColor Yellow - Write-Host " $Advice" -ForegroundColor DarkYellow - $global:Warnings += @{ - Name = $Name - Advice = $Advice - } - return $false - } -} - -Write-Header "Dream Server Preflight Check (Windows)" - -# Windows version -Test-Prereq "Windows Version" { - $winVer = [System.Environment]::OSVersion.Version - return $winVer.Build -ge 19041 -} -FixCmd "Update to Windows 10 version 2004+ or Windows 11" -DocsLink "https://aka.ms/windows-update" - -# WSL2 installed -$wslInstalled = Test-Prereq "WSL2 Installation" { - $status = wsl --status 2>&1 - return $LASTEXITCODE -eq 0 -} -FixCmd "wsl --install" -DocsLink "https://docs.microsoft.com/en-us/windows/wsl/install" - -# WSL2 default version -if ($wslInstalled) { - Test-Prereq "WSL2 Default Version" { - $status = wsl --status 2>&1 | Out-String - return $status -match "Default Version: 2" - } -FixCmd "wsl --set-default-version 2" -DocsLink "docs/WINDOWS-WSL2-GPU-GUIDE.md" -} - -# Ubuntu distro -Test-Prereq "Ubuntu WSL Distro" { - $distros = wsl -l -q 2>&1 - return 
$distros -match "Ubuntu" -} -FixCmd "wsl --install -d Ubuntu" -DocsLink "docs/WINDOWS-WSL2-GPU-GUIDE.md" - -# Docker Desktop installed -$dockerInstalled = Test-Prereq "Docker Desktop" { - $docker = Get-Command docker -ErrorAction SilentlyContinue - return $null -ne $docker -} -FixCmd "Install from https://docker.com/products/docker-desktop" -DocsLink "docs/WINDOWS-WSL2-GPU-GUIDE.md" - -# Docker running -if ($dockerInstalled) { - Test-Prereq "Docker Running" { - $info = docker info 2>&1 - return $LASTEXITCODE -eq 0 - } -FixCmd "Start Docker Desktop from Start Menu" -DocsLink "docs/WINDOWS-WSL2-GPU-GUIDE.md" -} - -# WSL2 backend -Test-Warning "Docker WSL2 Backend" { - $info = docker info 2>&1 | Out-String - return $info -match "WSL" -} -Advice "Enable WSL2 backend in Docker Desktop settings for GPU support" - -# NVIDIA drivers on Windows -$nvidiaWindows = Test-Prereq "NVIDIA Drivers (Windows)" { - $smi = nvidia-smi 2>&1 - return $LASTEXITCODE -eq 0 -} -FixCmd "Install from https://www.nvidia.com/drivers" -DocsLink "docs/WINDOWS-WSL2-GPU-GUIDE.md" - -# GPU in WSL2 -if ($nvidiaWindows) { - Test-Prereq "GPU in WSL2" { - $wslSmi = wsl nvidia-smi 2>&1 - return $LASTEXITCODE -eq 0 - } -FixCmd "See WSL2 GPU troubleshooting" -DocsLink "docs/WINDOWS-WSL2-GPU-GUIDE.md" - - # GPU memory check - try { - $gpuMem = wsl nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>&1 | Select-Object -First 1 - $gpuMemNum = [int]$gpuMem.Trim() - if ($gpuMemNum -lt 8192) { - Write-Host " GPU VRAM: ${gpuMemNum}MB" -ForegroundColor Yellow - Write-Host " Warning: 8GB+ VRAM recommended for Dream Server" -ForegroundColor DarkYellow - } else { - Write-Host " GPU VRAM: ${gpuMemNum}MB" -ForegroundColor Green - } - } catch { - Write-Host " Could not detect GPU memory" -ForegroundColor DarkGray - } -} - -# GPU in Docker (most critical) -if ($nvidiaWindows -and $dockerInstalled) { - Test-Prereq "GPU in Docker" { - $result = docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi 2>&1 
- return $LASTEXITCODE -eq 0 - } -FixCmd "Enable WSL2 integration in Docker Desktop settings" -DocsLink "docs/WINDOWS-WSL2-GPU-GUIDE.md" -} - -# Memory check -$totalMem = (Get-CimInstance -ClassName Win32_ComputerSystem).TotalPhysicalMemory / 1GB -$wslMem = "" -try { - $wslConfig = Get-Content "$env:USERPROFILE\.wslconfig" -ErrorAction SilentlyContinue - $wslMemMatch = $wslConfig | Select-String "memory=(\d+)" - if ($wslMemMatch) { - $wslMem = $wslMemMatch.Matches[0].Groups[1].Value - } -} catch {} - -Write-Host "" -Write-Host "System Memory: $([math]::Round($totalMem, 1)) GB total" -ForegroundColor Cyan -if ($wslMem) { - Write-Host "WSL2 Memory: $wslMem GB (from .wslconfig)" -ForegroundColor Cyan -} else { - Write-Host "WSL2 Memory: $([math]::Round($totalMem * 0.5, 1)) GB (default 50%)" -ForegroundColor Yellow - Write-Host " Consider creating .wslconfig to increase memory" -ForegroundColor DarkYellow -} - -if ($totalMem -lt 16) { - Write-Host " Warning: 16GB+ RAM recommended" -ForegroundColor Yellow -} - -# Summary -Write-Header "Summary" - -if ($global:Issues.Count -eq 0 -and $global:Warnings.Count -eq 0) { - Write-Host "All checks passed! Ready to install Dream Server." -ForegroundColor Green - Write-Host "" - Write-Host "Next steps:" -ForegroundColor Cyan - Write-Host " 1. Run: .\install.ps1" - Write-Host " 2. 
After install: cd ~/dream-server && ./scripts/dream-preflight.sh" -} else { - if ($global:Issues.Count -gt 0) { - Write-Host "BLOCKERS ($($global:Issues.Count)):" -ForegroundColor Red - foreach ($issue in $global:Issues) { - Write-Host " - $($issue.Name)" -ForegroundColor Red - if ($issue.Fix) { - Write-Host " Fix: $($issue.Fix)" -ForegroundColor DarkGray - } - if ($issue.Docs) { - Write-Host " See: $($issue.Docs)" -ForegroundColor DarkGray - } - } - } - - if ($global:Warnings.Count -gt 0) { - Write-Host "" - Write-Host "WARNINGS ($($global:Warnings.Count)):" -ForegroundColor Yellow - foreach ($warn in $global:Warnings) { - Write-Host " - $($warn.Name)" -ForegroundColor Yellow - if ($warn.Advice) { - Write-Host " $($warn.Advice)" -ForegroundColor DarkGray - } - } - } - - Write-Host "" - Write-Host "Fix the blockers above, then run this script again." -ForegroundColor Cyan -} - -Write-Host "" diff --git a/dream-server/scripts/dream-preflight.sh b/dream-server/scripts/dream-preflight.sh old mode 100755 new mode 100644 index de97ca112..6a78e7465 --- a/dream-server/scripts/dream-preflight.sh +++ b/dream-server/scripts/dream-preflight.sh @@ -4,21 +4,34 @@ set -e -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -DREAM_DIR="$(dirname "$SCRIPT_DIR")" -cd "$DREAM_DIR" +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +cd "$SCRIPT_DIR" + +# Source service registry +. "$SCRIPT_DIR/lib/service-registry.sh" +sr_load + +# Source .env for port overrides +[[ -f "$SCRIPT_DIR/.env" ]] && set -a && . 
"$SCRIPT_DIR/.env" && set +a # Colors RED='\033[0;31m' GREEN='\033[0;32m' YELLOW='\033[1;33m' CYAN='\033[0;36m' -NC='\033[0m' # No Color +NC='\033[0m' echo -e "${CYAN}Dream Server Preflight Check${NC}" echo "==============================" echo "" +# Resolve ports from registry +LLM_PORT="${SERVICE_PORTS[llama-server]:-8080}" +LLM_HEALTH="${SERVICE_HEALTH[llama-server]:-/health}" +LLM_CONTAINER="${SERVICE_CONTAINERS[llama-server]:-dream-llama-server}" +WEBUI_PORT="${SERVICE_PORTS[open-webui]:-3000}" +WEBUI_HEALTH="${SERVICE_HEALTH[open-webui]:-/}" + # Check Docker is running echo -n "Docker daemon... " if docker info >/dev/null 2>&1; then @@ -31,7 +44,7 @@ fi # Check containers are up echo -n "Core containers... " -if docker compose ps | grep -q "dream-vllm"; then +if docker compose ps | grep -q "$LLM_CONTAINER"; then echo -e "${GREEN}✓ running${NC}" else echo -e "${RED}✗ not running${NC}" @@ -39,19 +52,19 @@ else exit 1 fi -# Check vLLM health -echo -n "vLLM API (port 8000)... " -if curl -sf http://localhost:8000/health >/dev/null 2>&1; then +# Check llama-server health +echo -n "llama-server API (port $LLM_PORT)... " +if curl -sf "http://localhost:${LLM_PORT}${LLM_HEALTH}" >/dev/null 2>&1; then echo -e "${GREEN}✓ healthy${NC}" else echo -e "${YELLOW}⚠ starting up${NC}" echo " The model is still loading. Wait 1-2 minutes and retry." - echo " Monitor: docker compose logs -f vllm" + echo " Monitor: docker compose logs -f llama-server" fi # Check WebUI -echo -n "Open WebUI (port 3000)... " -if curl -sf http://localhost:3000 >/dev/null 2>&1; then +echo -n "Open WebUI (port $WEBUI_PORT)... " +if curl -sf "http://localhost:${WEBUI_PORT}${WEBUI_HEALTH}" >/dev/null 2>&1; then echo -e "${GREEN}✓ accessible${NC}" else echo -e "${YELLOW}⚠ not ready${NC}" @@ -59,30 +72,35 @@ fi # Check GPU if available echo -n "GPU availability... 
" -if docker exec dream-vllm nvidia-smi >/dev/null 2>&1; then - GPU_MEM=$(docker exec dream-vllm nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -1 | tr -d ' ') +if docker exec "$LLM_CONTAINER" nvidia-smi >/dev/null 2>&1; then + GPU_MEM=$(docker exec "$LLM_CONTAINER" nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits 2>/dev/null | head -1 | tr -d ' ') echo -e "${GREEN}✓ detected (${GPU_MEM}MB free)${NC}" else echo -e "${YELLOW}⚠ not detected (CPU mode)${NC}" fi -# Check voice services if enabled -echo -n "Voice services... " -if docker compose ps | grep -q "dream-whisper"; then - WHISPER_OK=$(curl -sf http://localhost:9000/ >/dev/null 2>&1 && echo "yes" || echo "no") - TTS_OK=$(curl -sf http://localhost:8880/health >/dev/null 2>&1 && echo "yes" || echo "no") - if [[ "$WHISPER_OK" == "yes" && "$TTS_OK" == "yes" ]]; then - echo -e "${GREEN}✓ whisper + TTS ready${NC}" +# Check extension services that are running +for sid in "${SERVICE_IDS[@]}"; do + [[ "${SERVICE_CATEGORIES[$sid]}" == "core" ]] && continue + container="${SERVICE_CONTAINERS[$sid]}" + docker compose ps 2>/dev/null | grep -q "$container" || continue + + port="${SERVICE_PORTS[$sid]:-0}" + health="${SERVICE_HEALTH[$sid]:-/}" + name="${SERVICE_NAMES[$sid]:-$sid}" + [[ "$port" == "0" ]] && continue + + echo -n "$name (port $port)... " + if curl -sf "http://localhost:${port}${health}" >/dev/null 2>&1; then + echo -e "${GREEN}✓ ready${NC}" else - echo -e "${YELLOW}⚠ partial (whisper:$WHISPER_OK, tts:$TTS_OK)${NC}" + echo -e "${YELLOW}⚠ not ready${NC}" fi -else - echo -e "${YELLOW}⚠ not enabled${NC} (run: docker compose --profile voice up -d)" -fi +done echo "" echo -e "${CYAN}Next steps:${NC}" -echo " 1. Open http://localhost:3000" +echo " 1. Open http://localhost:${WEBUI_PORT}" echo " 2. Sign in (first user becomes admin)" echo " 3. Type 'What's 2+2?' 
to test" echo "" diff --git a/dream-server/scripts/dream-test-functional.sh b/dream-server/scripts/dream-test-functional.sh old mode 100755 new mode 100644 index eff88aa1b..ab3940a7d --- a/dream-server/scripts/dream-test-functional.sh +++ b/dream-server/scripts/dream-test-functional.sh @@ -3,7 +3,7 @@ # dream-test-functional.sh - Functional Testing for Dream Server # # Tests actual functionality, not just port availability: -# - vLLM generates coherent text +# - LLM (llama-server) generates coherent text # - Whisper transcribes actual audio # - TTS generates valid audio files # - Embeddings produce vectors @@ -19,11 +19,20 @@ GREEN='\e[0;32m' YELLOW='\e[1;33m' NC='\e[0m' -# Service endpoints -VLLM_URL="${VLLM_URL:-http://localhost:8000}" -WHISPER_URL="${WHISPER_URL:-http://localhost:9000}" -TTS_URL="${TTS_URL:-http://localhost:8880}" -EMBEDDING_URL="${EMBEDDING_URL:-http://localhost:9103}" +# Source service registry for port resolution +_FT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +if [[ -f "$_FT_DIR/lib/service-registry.sh" ]]; then + export SCRIPT_DIR="$_FT_DIR" + . "$_FT_DIR/lib/service-registry.sh" + sr_load + [[ -f "$_FT_DIR/.env" ]] && set -a && . 
"$_FT_DIR/.env" && set +a +fi + +# Service endpoints — resolved from registry +LLM_URL="${LLM_URL:-http://localhost:${SERVICE_PORTS[llama-server]:-8080}}" +WHISPER_URL="${WHISPER_URL:-http://localhost:${SERVICE_PORTS[whisper]:-9000}}" +TTS_URL="${TTS_URL:-http://localhost:${SERVICE_PORTS[tts]:-8880}}" +EMBEDDING_URL="${EMBEDDING_URL:-http://localhost:${SERVICE_PORTS[embeddings]:-9103}}" # Test tracking TESTS_PASSED=0 @@ -43,39 +52,43 @@ warn() { echo -e "${YELLOW}⚠${NC} $1" } -# Test 1: vLLM generates coherent text -test_vllm_functional() { +# Test 1: LLM generates coherent text +test_llm_functional() { echo "" - echo "> Testing vLLM Functional Generation" - + echo "> Testing LLM Functional Generation" + + local model_id + model_id=$(curl -s --max-time 10 "$LLM_URL/v1/models" 2>/dev/null | grep -o '"id":"[^"]*"' | head -1 | cut -d'"' -f4) + model_id="${model_id:-local}" + local prompt="What is 2+2? Answer with just the number." - local payload="{\"model\": \"Qwen/Qwen2.5-32B-Instruct-AWQ\", \"messages\": [{\"role\": \"user\", \"content\": \"$prompt\"}], \"max_tokens\": 10, \"temperature\": 0.1}" - + local payload="{\"model\": \"$model_id\", \"messages\": [{\"role\": \"user\", \"content\": \"$prompt\"}], \"max_tokens\": 10, \"temperature\": 0.1}" + local response response=$(curl -s --max-time 30 \ - -X POST "$VLLM_URL/v1/chat/completions" \ + -X POST "$LLM_URL/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "$payload" 2>/dev/null || echo "") - + if [[ -z "$response" ]]; then - fail "vLLM returned no response" + fail "LLM returned no response" return 1 fi - + local content content=$(echo "$response" | grep -oP '"content":\s*"[^"]+"' | head -1 | cut -d'"' -f4) - + if [[ -z "$content" ]]; then - fail "vLLM returned empty content" + fail "LLM returned empty content" return 1 fi - + # Check if response contains "4" (the answer to 2+2) if echo "$content" | grep -q "4"; then - pass "vLLM generates correct answer: '$content'" + pass "LLM generates 
correct answer: '$content'" else - warn "vLLM generated: '$content' (expected '4')" - pass "vLLM generates text (answer may vary)" + warn "LLM generated: '$content' (expected '4')" + pass "LLM generates text (answer may vary)" fi } @@ -230,7 +243,7 @@ echo " DREAM SERVER - FUNCTIONAL TESTS" echo " Tests actual functionality, not ports" echo "========================================" -test_vllm_functional +test_llm_functional test_tts_functional test_embeddings_functional test_whisper_functional diff --git a/dream-server/scripts/dream-test.sh b/dream-server/scripts/dream-test.sh old mode 100755 new mode 100644 index 80f48b04f..3ab24d4fa --- a/dream-server/scripts/dream-test.sh +++ b/dream-server/scripts/dream-test.sh @@ -13,7 +13,7 @@ # ./dream-test.sh # Run all tests # ./dream-test.sh --quick # Fast mode (~30s, no inference) # ./dream-test.sh --json # JSON output for automation -# ./dream-test.sh --service vllm # Test specific service +# ./dream-test.sh --service llm # Test specific service # # Exit codes: # 0 - All critical tests passed @@ -30,19 +30,28 @@ ENV_FILE="${ENV_FILE:-$DREAM_DIR/.env}" TIMEOUT=15 QUICK_TIMEOUT=5 -# Service endpoints -VLLM_HOST="${VLLM_HOST:-localhost}" -VLLM_PORT="${VLLM_PORT:-8000}" -VLLM_URL="http://${VLLM_HOST}:${VLLM_PORT}" +# Source service registry for port resolution +_DT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +if [[ -f "$_DT_DIR/lib/service-registry.sh" ]]; then + export SCRIPT_DIR="$_DT_DIR" + . "$_DT_DIR/lib/service-registry.sh" + sr_load + [[ -f "$_DT_DIR/.env" ]] && set -a && . 
"$_DT_DIR/.env" && set +a +fi + +# Service endpoints โ€” resolved from registry +LLM_HOST="${LLM_HOST:-localhost}" +LLM_PORT="${LLM_PORT:-${SERVICE_PORTS[llama-server]:-8080}}" +LLM_URL="http://${LLM_HOST}:${LLM_PORT}" WHISPER_HOST="${WHISPER_HOST:-localhost}" -WHISPER_PORT="${WHISPER_PORT:-9000}" +WHISPER_PORT="${WHISPER_PORT:-${SERVICE_PORTS[whisper]:-9000}}" TTS_HOST="${TTS_HOST:-localhost}" -TTS_PORT="${TTS_PORT:-8880}" +TTS_PORT="${TTS_PORT:-${SERVICE_PORTS[tts]:-8880}}" EMBEDDING_HOST="${EMBEDDING_HOST:-localhost}" -EMBEDDING_PORT="${EMBEDDING_PORT:-9103}" +EMBEDDING_PORT="${EMBEDDING_PORT:-${SERVICE_PORTS[embeddings]:-9103}}" LIVEKIT_HOST="${LIVEKIT_HOST:-localhost}" LIVEKIT_PORT="${LIVEKIT_PORT:-7880}" -PRIVACY_SHIELD_PORT="${PRIVACY_SHIELD_PORT:-8085}" +PRIVACY_SHIELD_PORT="${PRIVACY_SHIELD_PORT:-${SERVICE_PORTS[privacy-shield]:-8085}}" # Colors (ANSI escape sequences) RED='\e[0;31m' @@ -262,35 +271,39 @@ test_gpu() { fi } -test_vllm() { +test_llm() { echo "" - echo "> vLLM LLM Inference" - - test_http "vLLM Health" "$VLLM_URL/health" "200" || return 1 - test_http "vLLM Models API" "$VLLM_URL/v1/models" "200" - + echo "> LLM Inference (llama-server)" + + test_http "LLM Health" "$LLM_URL/health" "200" || return 1 + test_http "LLM Models API" "$LLM_URL/v1/models" "200" + if [[ "$QUICK_MODE" == "true" ]]; then - record_result "vLLM Inference" "skip" "quick mode" - print_test "vLLM Inference" "skip" + record_result "LLM Inference" "skip" "quick mode" + print_test "LLM Inference" "skip" return 0 fi - - local payload='{"model": "Qwen/Qwen2.5-32B-Instruct-AWQ", "messages": [{"role": "user", "content": "Say hello"}], "max_tokens": 10}' + + local model_id + model_id=$(curl -s --max-time 10 "$LLM_URL/v1/models" 2>/dev/null | grep -o '"id":"[^"]*"' | head -1 | cut -d'"' -f4) + model_id="${model_id:-local}" + + local payload="{\"model\": \"$model_id\", \"messages\": [{\"role\": \"user\", \"content\": \"Say hello\"}], \"max_tokens\": 10}" local response - + 
response=$(curl -s --max-time 30 \ - -X POST "$VLLM_URL/v1/chat/completions" \ + -X POST "$LLM_URL/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "$payload" 2>/dev/null) - + if echo "$response" | grep -q '"content"'; then local tokens_used tokens_used=$(echo "$response" | grep -o '"total_tokens":[0-9]*' | cut -d: -f2) - record_result "vLLM Inference" "pass" "${tokens_used} tokens" - print_test "vLLM Inference" "pass" "${tokens_used} tokens" + record_result "LLM Inference" "pass" "${tokens_used} tokens" + print_test "LLM Inference" "pass" "${tokens_used} tokens" else - record_result "vLLM Inference" "fail" "no content in response" - print_test "vLLM Inference" "fail" + record_result "LLM Inference" "fail" "no content in response" + print_test "LLM Inference" "fail" return 1 fi } @@ -310,7 +323,7 @@ test_tool_calling() { local response response=$(curl -s --max-time 30 \ - -X POST "$VLLM_URL/v1/chat/completions" \ + -X POST "$LLM_URL/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "$payload" 2>/dev/null) @@ -405,7 +418,7 @@ test_voice_roundtrip() { if curl -s --max-time 5 "http://${TTS_HOST}:${TTS_PORT}/v1/audio/voices" &>/dev/null; then tts_ready=true fi - if curl -s --max-time 5 "$VLLM_URL/health" &>/dev/null; then + if curl -s --max-time 5 "$LLM_URL/health" &>/dev/null; then llm_ready=true fi @@ -431,7 +444,7 @@ test_voice_roundtrip() { local llm_payload='{"model": "Qwen/Qwen2.5-32B-Instruct-AWQ", "messages": [{"role": "user", "content": "What is the weather today?"}], "max_tokens": 50}' local llm_response llm_response=$(curl -s --max-time 15 \ - -X POST "$VLLM_URL/v1/chat/completions" \ + -X POST "$LLM_URL/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "$llm_payload" 2>/dev/null) @@ -573,15 +586,15 @@ _print_text_summary() { echo "" echo "Actionable fixes:" - if [[ "${RESULTS_STATUS[0]:-}" == "fail" ]] && [[ "${RESULTS_NAMES[0]:-}" == *"vLLM"* ]]; then - echo " - vLLM not responding - check: docker logs dream-vllm" 
+ if [[ "${RESULTS_STATUS[0]:-}" == "fail" ]] && [[ "${RESULTS_NAMES[0]:-}" == *"LLM"* ]]; then + echo " - LLM not responding - check: docker logs dream-llama-server" fi - + local i for i in "${!RESULTS_NAMES[@]}"; do if [[ "${RESULTS_STATUS[$i]}" == "fail" ]]; then case "${RESULTS_NAMES[$i]}" in - "Tool Calling") echo " - Tool calling failed - check vLLM tool proxy on port 8003" ;; + "Tool Calling") echo " - Tool calling failed - check llama-server tool support" ;; "Whisper Port") echo " - Whisper not running - start: docker compose up whisper" ;; "TTS Port") echo " - TTS not running - start: docker compose up kokoro-tts" ;; esac @@ -611,14 +624,14 @@ OPTIONS: --help, -h Show this help SERVICES: - docker, gpu, vllm, tool-calling, whisper, tts, + docker, gpu, llm, tool-calling, whisper, tts, embeddings, voice-roundtrip, privacy-shield, livekit EXAMPLES: dream-test.sh # Run all tests dream-test.sh --quick # Fast health check dream-test.sh --json > results.json - dream-test.sh --service vllm # Test LLM only + dream-test.sh --service llm # Test LLM only EXIT CODES: 0 - All tests passed @@ -633,7 +646,7 @@ run_all_tests() { test_docker test_gpu - test_vllm + test_llm test_tool_calling test_whisper test_tts @@ -651,7 +664,7 @@ run_specific_service() { case "$service" in docker) test_docker ;; gpu) test_gpu ;; - vllm) test_vllm ;; + llm) test_llm ;; tool-calling) test_tool_calling ;; whisper) test_whisper ;; tts) test_tts ;; @@ -661,7 +674,7 @@ run_specific_service() { livekit) test_livekit ;; *) echo "Unknown service: $service" >&2 - echo "Available: docker, gpu, vllm, tool-calling, whisper, tts, embeddings, voice-roundtrip, privacy-shield, livekit" >&2 + echo "Available: docker, gpu, llm, tool-calling, whisper, tts, embeddings, voice-roundtrip, privacy-shield, livekit" >&2 exit 2 ;; esac diff --git a/dream-server/scripts/first-boot-demo.sh b/dream-server/scripts/first-boot-demo.sh old mode 100755 new mode 100644 index 213f0b82b..8150bad65 --- 
a/dream-server/scripts/first-boot-demo.sh +++ b/dream-server/scripts/first-boot-demo.sh @@ -20,13 +20,21 @@ NC='\033[0m' BOLD='\033[1m' #============================================================================= -# Config +# Config — resolve from service registry when available #============================================================================= -VLLM_URL="${VLLM_URL:-http://localhost:8000}" -WHISPER_URL="${WHISPER_URL:-http://localhost:9000}" -PIPER_URL="${PIPER_URL:-http://localhost:8880}" -N8N_URL="${N8N_URL:-http://localhost:5678}" -WEBUI_URL="${WEBUI_URL:-http://localhost:3000}" +_DEMO_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +if [[ -f "$_DEMO_DIR/lib/service-registry.sh" ]]; then + export SCRIPT_DIR="$_DEMO_DIR" + . "$_DEMO_DIR/lib/service-registry.sh" + sr_load + [[ -f "$_DEMO_DIR/.env" ]] && set -a && . "$_DEMO_DIR/.env" && set +a +fi + +LLM_URL="${LLM_URL:-http://localhost:${SERVICE_PORTS[llama-server]:-8080}}" +WHISPER_URL="${WHISPER_URL:-http://localhost:${SERVICE_PORTS[whisper]:-9000}}" +PIPER_URL="${PIPER_URL:-http://localhost:${SERVICE_PORTS[tts]:-8880}}" +N8N_URL="${N8N_URL:-http://localhost:${SERVICE_PORTS[n8n]:-5678}}" +WEBUI_URL="${WEBUI_URL:-http://localhost:${SERVICE_PORTS[open-webui]:-3000}}" QUICK_MODE=false ALL_MODE=false @@ -120,11 +128,11 @@ SERVICES_TOTAL=0 # Core services ((SERVICES_TOTAL++)) -if check_service "vLLM (Local LLM)" "$VLLM_URL" "/health"; then +if check_service "LLM (llama-server)" "$LLM_URL" "/health"; then ((SERVICES_OK++)) - VLLM_AVAILABLE=true + LLM_AVAILABLE=true else - VLLM_AVAILABLE=false + LLM_AVAILABLE=false fi ((SERVICES_TOTAL++)) @@ -166,9 +174,9 @@ fi echo "" echo -e "${BOLD}Services: ${SERVICES_OK}/${SERVICES_TOTAL} running${NC}" -if [[ "$VLLM_AVAILABLE" != "true" ]]; then - echo -e "\n${RED}vLLM is required for demos. 
Is it still loading?${NC}" - echo "Check status: docker compose logs -f vllm" +if [[ "$LLM_AVAILABLE" != "true" ]]; then + echo -e "\n${RED}LLM (llama-server) is required for demos. Is it still loading?${NC}" + echo "Check status: docker compose logs -f llama-server" exit 1 fi @@ -181,7 +189,7 @@ header "💬 Demo 1: Local Chat Completion" demo "Asking your local AI a question..." -RESPONSE=$(curl -sf "${VLLM_URL}/v1/chat/completions" \ +RESPONSE=$(curl -sf "${LLM_URL}/v1/chat/completions" \ -H "Content-Type: application/json" \ -d '{ "model": "Qwen/Qwen2.5-32B-Instruct-AWQ", @@ -209,7 +217,7 @@ header "🧑‍💻 Demo 2: Code Assistance" demo "Asking for help with a Python function..." -CODE_RESPONSE=$(curl -sf "${VLLM_URL}/v1/chat/completions" \ +CODE_RESPONSE=$(curl -sf "${LLM_URL}/v1/chat/completions" \ -H "Content-Type: application/json" \ -d '{ "model": "Qwen/Qwen2.5-32B-Instruct-AWQ", @@ -239,7 +247,7 @@ demo "Watching tokens stream in real-time..." echo "" # Simple streaming demo - just show it works -curl -sN "${VLLM_URL}/v1/chat/completions" \ +curl -sN "${LLM_URL}/v1/chat/completions" \ -H "Content-Type: application/json" \ -d '{ "model": "Qwen/Qwen2.5-32B-Instruct-AWQ", @@ -285,7 +293,7 @@ echo -e "${BOLD}Next steps:${NC}" echo " 1. Open ${WEBUI_URL} and start chatting" echo " 2. Import workflows from ./workflows/ into n8n" echo " 3. Try the voice demo: ./scripts/voice-demo.sh" -echo " 4. Enable OpenClaw: docker compose --profile openclaw up -d" +echo " 4. OpenClaw agent: http://localhost:7860" echo "" echo -e "${CYAN}Everything runs locally. Your data stays private. Enjoy! 
🚀${NC}" diff --git a/dream-server/scripts/generate-livekit-secrets.sh b/dream-server/scripts/generate-livekit-secrets.sh deleted file mode 100755 index 0d3bc26b0..000000000 --- a/dream-server/scripts/generate-livekit-secrets.sh +++ /dev/null @@ -1,55 +0,0 @@ -#!/bin/bash -# generate-livekit-secrets.sh -# Generates random LiveKit API keys and secrets for Dream Server -# Run this before first install to create secure credentials - -set -euo pipefail - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -ENV_FILE="${SCRIPT_DIR}/../.env" - -# Generate cryptographically secure random strings -# API key: 16 chars alphanumeric -API_KEY=$(openssl rand -base64 24 | tr -dc 'a-zA-Z0-9' | head -c 16) - -# API secret: 32 chars alphanumeric -API_SECRET=$(openssl rand -base64 48 | tr -dc 'a-zA-Z0-9' | head -c 32) - -echo "=== LiveKit Secret Generation ===" -echo "API Key: ${API_KEY}" -echo "API Secret: ${API_SECRET:0:8}... (hidden)" -echo "" - -# Check if .env exists -if [[ -f "${ENV_FILE}" ]]; then - echo "Found existing .env file" - - # Backup existing .env - cp "${ENV_FILE}" "${ENV_FILE}.backup.$(date +%Y%m%d-%H%M%S)" - echo "Backed up existing .env" - - # Remove old LiveKit vars if they exist - sed -i '/^LIVEKIT_API_KEY=/d' "${ENV_FILE}" - sed -i '/^LIVEKIT_API_SECRET=/d' "${ENV_FILE}" - echo "Removed existing LiveKit credentials" -else - echo "Creating new .env file" - touch "${ENV_FILE}" -fi - -# Append new secrets -cat >> "${ENV_FILE}" << EOF - -# LiveKit API Credentials (auto-generated $(date +%Y-%m-%d)) -LIVEKIT_API_KEY=${API_KEY} -LIVEKIT_API_SECRET=${API_SECRET} -EOF - -echo "" -echo "=== LiveKit secrets added to .env ===" -echo "File: ${ENV_FILE}" -echo "" -echo "Next steps:" -echo "1. Review ${ENV_FILE} to verify credentials" -echo "2. Run: docker compose up -d livekit" -echo "3. 
Update voice agent configs to use these credentials" diff --git a/dream-server/scripts/health-check.sh b/dream-server/scripts/health-check.sh old mode 100755 new mode 100644 index 19f168b5f..d07066ecf --- a/dream-server/scripts/health-check.sh +++ b/dream-server/scripts/health-check.sh @@ -2,7 +2,7 @@ # Dream Server Comprehensive Health Check # Tests each component with actual API calls, not just connectivity # Exit codes: 0=healthy, 1=degraded (some services down), 2=critical (core services down) -# +# # Usage: ./health-check.sh [--json] [--quiet] set -euo pipefail @@ -19,21 +19,28 @@ done # Config INSTALL_DIR="${INSTALL_DIR:-$HOME/dream-server}" -VLLM_HOST="${VLLM_HOST:-localhost}" -VLLM_PORT="${VLLM_PORT:-8000}" +LLM_HOST="${LLM_HOST:-localhost}" +LLM_PORT="${LLM_PORT:-${SERVICE_PORTS[llama-server]:-8080}}" TIMEOUT="${TIMEOUT:-5}" -# Load ports from .env if available +# Source service registry +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +. "$SCRIPT_DIR/lib/service-registry.sh" +sr_load + +# Load env for port overrides ENV_FILE="${INSTALL_DIR}/.env" if [[ -f "$ENV_FILE" ]]; then - # Source only PORT variable lines to avoid executing malicious content - WHISPER_PORT=$(grep "^WHISPER_PORT=" "$ENV_FILE" | cut -d= -f2 | tr -d ' "' || echo "9000") - TTS_PORT=$(grep "^TTS_PORT=" "$ENV_FILE" | cut -d= -f2 | tr -d ' "' || echo "8880") - EMBEDDINGS_PORT=$(grep "^EMBEDDINGS_PORT=" "$ENV_FILE" | cut -d= -f2 | tr -d ' "' || echo "8090") -else - WHISPER_PORT="${WHISPER_PORT:-9000}" - TTS_PORT="${TTS_PORT:-8880}" - EMBEDDINGS_PORT="${EMBEDDINGS_PORT:-8090}" + set -a + while IFS='=' read -r key value; do + [[ "$key" =~ ^[[:space:]]*# ]] && continue + [[ -z "$key" ]] && continue + [[ "$key" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue + value="${value%\"}" + value="${value#\"}" + export "$key=$value" + done < "$ENV_FILE" + set +a fi # Colors (disabled for JSON/quiet) @@ -50,95 +57,51 @@ ANY_FAIL=false log() { $QUIET || echo -e "$1"; } -# Test functions 
-test_vllm() { +# โ”€โ”€ Test functions โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ + +# llama-server: critical path โ€” performs an actual inference test +test_llm() { local start=$(date +%s%3N) - # Test actual inference with simple completion local response=$(curl -sf --max-time $TIMEOUT \ -H "Content-Type: application/json" \ -d '{"model":"default","prompt":"Hi","max_tokens":1}' \ - "http://${VLLM_HOST}:${VLLM_PORT}/v1/completions" 2>/dev/null) + "http://${LLM_HOST}:${LLM_PORT}/v1/completions" 2>/dev/null) local end=$(date +%s%3N) - + if echo "$response" | grep -q '"text"'; then - RESULTS[vllm]="ok" - RESULTS[vllm_latency]=$((end - start)) + RESULTS[llm]="ok" + RESULTS[llm_latency]=$((end - start)) return 0 fi - RESULTS[vllm]="fail" + RESULTS[llm]="fail" CRITICAL_FAIL=true ANY_FAIL=true return 1 } -test_embeddings() { - local response=$(curl -sf --max-time $TIMEOUT \ - -H "Content-Type: application/json" \ - -d '{"input":"test"}' \ - "http://localhost:${EMBEDDINGS_PORT}/embed" 2>/dev/null) +# Generic registry-driven service health check +test_service() { + local sid="$1" + local port_env="${SERVICE_PORT_ENVS[$sid]}" + local default_port="${SERVICE_PORTS[$sid]}" + local health="${SERVICE_HEALTH[$sid]}" - if echo "$response" | grep -q '\['; then - RESULTS[embeddings]="ok" - return 0 - fi - RESULTS[embeddings]="fail" - ANY_FAIL=true - return 1 -} + # Resolve port + local port="$default_port" + [[ -n "$port_env" ]] && port="${!port_env:-$default_port}" -test_whisper() { - # Just check health endpoint - actual transcription needs audio - if curl -sf --max-time $TIMEOUT "http://localhost:${WHISPER_PORT}/health" >/dev/null 2>&1; then - RESULTS[whisper]="ok" - return 0 - fi - RESULTS[whisper]="fail" - ANY_FAIL=true - return 1 -} - -test_tts() { - # Check TTS endpoint health - if curl -sf --max-time $TIMEOUT 
"http://localhost:${TTS_PORT}/health" >/dev/null 2>&1; then - RESULTS[tts]="ok" - return 0 - fi - RESULTS[tts]="fail" - ANY_FAIL=true - return 1 -} + [[ -z "$health" || "$port" == "0" ]] && return 1 -test_qdrant() { - local response=$(curl -sf --max-time $TIMEOUT "http://localhost:6333/collections" 2>/dev/null) - if echo "$response" | grep -q '"result"'; then - RESULTS[qdrant]="ok" + if curl -sf --max-time $TIMEOUT "http://localhost:${port}${health}" >/dev/null 2>&1; then + RESULTS[$sid]="ok" return 0 fi - RESULTS[qdrant]="fail" - ANY_FAIL=true - return 1 -} - -test_open_webui() { - if curl -sf --max-time $TIMEOUT "http://localhost:3000" >/dev/null 2>&1; then - RESULTS[open_webui]="ok" - return 0 - fi - RESULTS[open_webui]="fail" - ANY_FAIL=true - return 1 -} - -test_n8n() { - if curl -sf --max-time $TIMEOUT "http://localhost:5678/healthz" >/dev/null 2>&1; then - RESULTS[n8n]="ok" - return 0 - fi - RESULTS[n8n]="fail" + RESULTS[$sid]="fail" ANY_FAIL=true return 1 } +# System-level: GPU test_gpu() { if command -v nvidia-smi &>/dev/null; then local gpu_info=$(nvidia-smi --query-gpu=memory.used,memory.total,utilization.gpu,temperature.gpu --format=csv,noheader,nounits 2>/dev/null | head -1) @@ -149,7 +112,7 @@ test_gpu() { RESULTS[gpu_mem_total]="${mem_total// /}" RESULTS[gpu_util]="${gpu_util// /}" RESULTS[gpu_temp]="${temp// /}" - + # Warn if GPU memory > 95% or temp > 80C if [ "${RESULTS[gpu_util]}" -gt 95 ] 2>/dev/null; then RESULTS[gpu]="warn" @@ -164,6 +127,7 @@ test_gpu() { return 1 } +# System-level: Disk test_disk() { local usage=$(df -h "$INSTALL_DIR" 2>/dev/null | tail -1 | awk '{print $5}' | tr -d '%') if [ -n "$usage" ]; then @@ -178,7 +142,19 @@ test_disk() { return 1 } -# Run tests +# Helper: run test_service for a service ID and log the result +check_service() { + local sid="$1" + local name="${SERVICE_NAMES[$sid]:-$sid}" + if test_service "$sid" 2>/dev/null; then + log " ${GREEN}โœ“${NC} $name - healthy" + else + log " ${YELLOW}!${NC} $name - not 
responding" + fi +} + +# โ”€โ”€ Run tests โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ + log "${CYAN}โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”${NC}" log "${CYAN} Dream Server Health Check${NC}" log "${CYAN}โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”${NC}" @@ -186,64 +162,35 @@ log "" log "${CYAN}Core Services:${NC}" -# vLLM (critical) -if test_vllm 2>/dev/null; then - log " ${GREEN}โœ“${NC} vLLM - inference working (${RESULTS[vllm_latency]}ms)" -else - log " ${RED}โœ—${NC} vLLM - CRITICAL: inference failed" -fi - -# Embeddings -if test_embeddings 2>/dev/null; then - log " ${GREEN}โœ“${NC} Embeddings - working" -else - log " ${YELLOW}!${NC} Embeddings - not responding" -fi - -# Whisper -if test_whisper 2>/dev/null; then - log " ${GREEN}โœ“${NC} Whisper STT - healthy" +# llama-server (critical โ€” does inference test, not just health) +if test_llm 2>/dev/null; then + log " ${GREEN}โœ“${NC} llama-server - inference working (${RESULTS[llm_latency]}ms)" else - log " ${YELLOW}!${NC} Whisper STT - not responding" + log " ${RED}โœ—${NC} llama-server - CRITICAL: inference failed" fi -# TTS -if test_tts 2>/dev/null; then - log " ${GREEN}โœ“${NC} TTS - healthy" -else - log " ${YELLOW}!${NC} TTS - not responding" -fi +# All other core services +for sid in "${SERVICE_IDS[@]}"; do + [[ "$sid" == "llama-server" ]] && continue + [[ "${SERVICE_CATEGORIES[$sid]}" != "core" ]] && continue + check_service "$sid" +done log "" -log "${CYAN}Support Services:${NC}" - -# Qdrant -if test_qdrant 2>/dev/null; then - log " ${GREEN}โœ“${NC} Qdrant - responding" -else - 
log " ${YELLOW}!${NC} Qdrant - not responding" -fi - -# Open WebUI -if test_open_webui 2>/dev/null; then - log " ${GREEN}โœ“${NC} Open WebUI - accessible" -else - log " ${YELLOW}!${NC} Open WebUI - not responding" -fi +log "${CYAN}Extension Services:${NC}" -# n8n -if test_n8n 2>/dev/null; then - log " ${GREEN}โœ“${NC} n8n - healthy" -else - log " ${YELLOW}!${NC} n8n - not responding" -fi +# All non-core services +for sid in "${SERVICE_IDS[@]}"; do + [[ "${SERVICE_CATEGORIES[$sid]}" == "core" ]] && continue + check_service "$sid" +done log "" log "${CYAN}System Resources:${NC}" # GPU if test_gpu 2>/dev/null; then - local status_icon="${GREEN}โœ“${NC}" + status_icon="${GREEN}โœ“${NC}" [ "${RESULTS[gpu]}" = "warn" ] && status_icon="${YELLOW}!${NC}" log " ${status_icon} GPU - ${RESULTS[gpu_mem_used]}/${RESULTS[gpu_mem_total]} MiB, ${RESULTS[gpu_util]}% util, ${RESULTS[gpu_temp]}ยฐC" else @@ -252,7 +199,7 @@ fi # Disk if test_disk 2>/dev/null; then - local status_icon="${GREEN}โœ“${NC}" + status_icon="${GREEN}โœ“${NC}" [ "${RESULTS[disk]}" = "warn" ] && status_icon="${YELLOW}!${NC}" log " ${status_icon} Disk - ${RESULTS[disk_usage]}% used" else diff --git a/dream-server/scripts/healthcheck.py b/dream-server/scripts/healthcheck.py old mode 100755 new mode 100644 diff --git a/dream-server/scripts/llm-cold-storage.sh b/dream-server/scripts/llm-cold-storage.sh new file mode 100644 index 000000000..0f99c37ec --- /dev/null +++ b/dream-server/scripts/llm-cold-storage.sh @@ -0,0 +1,245 @@ +#!/usr/bin/env bash +# +# llm-cold-storage.sh โ€” Archive idle HuggingFace models to cold storage +# +# Part of Lighthouse AI tooling. +# +# Models not accessed in 7+ days are moved to cold storage on a backup drive. +# A symlink replaces the original so HuggingFace cache resolution still works. +# Models can be restored manually or are auto-detected if a process loads them. 
+# +# Usage: +# ./llm-cold-storage.sh # Archive idle models (dry-run) +# ./llm-cold-storage.sh --execute # Archive idle models (for real) +# ./llm-cold-storage.sh --restore # Restore a specific model +# ./llm-cold-storage.sh --restore-all # Restore all archived models +# ./llm-cold-storage.sh --status # Show archive status +# +set -uo pipefail + +HF_CACHE="${HF_CACHE:-$HOME/.cache/huggingface/hub}" +COLD_DIR="${COLD_DIR:-$HOME/llm-cold-storage}" +LOG_FILE="${LOG_FILE:-$HOME/.local/log/llm-cold-storage.log}" +MAX_IDLE_DAYS=7 + +# Ensure the log directory exists +mkdir -p "$(dirname "$LOG_FILE")" + +# Models to never archive (currently serving or critical) +PROTECTED_MODELS=( + "models--BAAI--bge-base-en-v1.5" + "models--Systran--faster-whisper-base" + "models--sentence-transformers--all-MiniLM-L6-v2" +) + +log() { + local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*" + echo "$msg" | tee -a "$LOG_FILE" +} + +is_protected() { + local name="$1" + for p in "${PROTECTED_MODELS[@]}"; do + [[ "$name" == "$p" ]] && return 0 + done + return 1 +} + +is_model_in_use() { + local name="$1" + # Extract model identifier: models--Org--Name -> Org/Name + local model_id + model_id="$(echo "$name" | sed 's/^models--//; s/--/\//g')" + + # Check if any running process references this model + if pgrep -af "$model_id" > /dev/null 2>&1; then + return 0 + fi + return 1 +} + +get_last_access_days() { + local dir="$1" + # Check most recent access time across all blobs in the model + local newest_atime + newest_atime="$(find "$dir" -type f -printf '%A@\n' 2>/dev/null | sort -rn | head -1)" + if [[ -z "$newest_atime" ]]; then + echo "9999" + return + fi + local now + now="$(date +%s)" + local age_secs + age_secs="$(echo "$now - ${newest_atime%.*}" | bc)" + echo "$(( age_secs / 86400 ))" +} + +do_archive() { + local dry_run="${1:-true}" + local archived=0 + local skipped=0 + + log "========== LLM cold storage scan started (dry_run=$dry_run) ==========" + + for model_dir in "$HF_CACHE"/models--*/; do + 
[[ -d "$model_dir" ]] || continue + # Skip if already a symlink (already archived) + [[ -L "${model_dir%/}" ]] && continue + + local name + name="$(basename "$model_dir")" + + # Skip protected models + if is_protected "$name"; then + log "SKIP (protected): $name" + ((skipped++)) + continue + fi + + # Skip if actively in use by a process + if is_model_in_use "$name"; then + log "SKIP (in use): $name" + ((skipped++)) + continue + fi + + local idle_days + idle_days="$(get_last_access_days "$model_dir")" + local size + size="$(du -sh "$model_dir" 2>/dev/null | cut -f1)" + + if (( idle_days >= MAX_IDLE_DAYS )); then + if [[ "$dry_run" == "true" ]]; then + log "WOULD ARCHIVE: $name ($size, idle ${idle_days}d)" + else + log "ARCHIVING: $name ($size, idle ${idle_days}d)" + # Move to cold storage + mv "$model_dir" "$COLD_DIR/$name" + # Create symlink so HF cache still resolves + ln -s "$COLD_DIR/$name" "${model_dir%/}" + log "ARCHIVED: $name -> $COLD_DIR/$name" + fi + ((archived++)) + else + log "SKIP (recent, ${idle_days}d): $name ($size)" + ((skipped++)) + fi + done + + log "========== Scan complete: $archived archived, $skipped skipped ==========" +} + +do_restore() { + local name="$1" + + # Normalize: accept "Qwen/Qwen2.5-7B" or "models--Qwen--Qwen2.5-7B" + if [[ "$name" != models--* ]]; then + name="models--$(echo "$name" | sed 's/\//--/g')" + fi + + local cold_path="$COLD_DIR/$name" + local cache_path="$HF_CACHE/$name" + + if [[ ! 
-d "$cold_path" ]]; then + echo "ERROR: Model not found in cold storage: $cold_path" + exit 1 + fi + + # Remove symlink if it exists + if [[ -L "$cache_path" ]]; then + rm "$cache_path" + fi + + log "RESTORING: $name to $cache_path" + mv "$cold_path" "$cache_path" + log "RESTORED: $name" + echo "Restored: $name" +} + +do_restore_all() { + log "========== Restoring all archived models ==========" + for cold_model in "$COLD_DIR"/models--*/; do + [[ -d "$cold_model" ]] || continue + local name + name="$(basename "$cold_model")" + local cache_path="$HF_CACHE/$name" + + if [[ -L "$cache_path" ]]; then + rm "$cache_path" + fi + + log "RESTORING: $name" + mv "$cold_model" "$cache_path" + log "RESTORED: $name" + done + log "========== All models restored ==========" +} + +show_status() { + echo "=== LLM Cold Storage Status ===" + echo "" + + echo "Active models (on NVMe):" + for model_dir in "$HF_CACHE"/models--*/; do + [[ -d "$model_dir" ]] || continue + local name + name="$(basename "$model_dir")" + if [[ -L "${model_dir%/}" ]]; then + local size + size="$(du -sh "$model_dir" 2>/dev/null | cut -f1)" + echo " [SYMLINK -> cold] $name ($size)" + else + local size idle_days status="" + size="$(du -sh "$model_dir" 2>/dev/null | cut -f1)" + idle_days="$(get_last_access_days "$model_dir")" + is_protected "$name" && status=" [protected]" + is_model_in_use "$name" && status=" [in use]" + echo " [HOT] $name ($size, idle ${idle_days}d)${status}" + fi + done + + echo "" + echo "Archived models (on backup SSD):" + local has_archived=false + for cold_model in "$COLD_DIR"/models--*/; do + [[ -d "$cold_model" ]] || continue + has_archived=true + local name size + name="$(basename "$cold_model")" + size="$(du -sh "$cold_model" 2>/dev/null | cut -f1)" + echo " [COLD] $name ($size)" + done + $has_archived || echo " (none)" + + echo "" + echo "NVMe cache total: $(du -sh "$HF_CACHE" 2>/dev/null | cut -f1)" + echo "Cold storage total: $(du -sh "$COLD_DIR" 2>/dev/null | cut -f1)" +} + +case 
"${1:-}" in + --execute) + do_archive false + ;; + --restore) + [[ -n "${2:-}" ]] || { echo "Usage: $0 --restore "; exit 1; } + do_restore "$2" + ;; + --restore-all) + do_restore_all + ;; + --status) + show_status + ;; + --help|-h) + echo "Usage: $0 [--execute|--restore |--restore-all|--status|--help]" + echo "" + echo " (no args) Dry-run: show what would be archived" + echo " --execute Archive idle models (>$MAX_IDLE_DAYS days)" + echo " --restore Restore model from cold storage" + echo " --restore-all Restore all archived models" + echo " --status Show current hot/cold status" + ;; + *) + do_archive true + ;; +esac diff --git a/dream-server/scripts/load-backend-contract.sh b/dream-server/scripts/load-backend-contract.sh new file mode 100644 index 000000000..671db45f9 --- /dev/null +++ b/dream-server/scripts/load-backend-contract.sh @@ -0,0 +1,59 @@ +#!/usr/bin/env bash +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +ROOT_DIR="$(cd "${SCRIPT_DIR}/.." && pwd)" +BACKEND_ID="" +ENV_MODE="false" + +while [[ $# -gt 0 ]]; do + case "$1" in + --backend) + BACKEND_ID="${2:-}" + shift 2 + ;; + --env) + ENV_MODE="true" + shift + ;; + *) + echo "Unknown argument: $1" >&2 + exit 1 + ;; + esac +done + +if [[ -z "$BACKEND_ID" ]]; then + echo "Missing required argument: --backend" >&2 + exit 1 +fi + +CONTRACT_FILE="${ROOT_DIR}/config/backends/${BACKEND_ID}.json" +if [[ ! 
-f "$CONTRACT_FILE" ]]; then
+    echo "Backend contract not found: $CONTRACT_FILE" >&2
+    exit 1
+fi
+
+if [[ "$ENV_MODE" == "true" ]]; then
+    python3 - "$CONTRACT_FILE" <<'PY'
+import json
+import sys
+
+contract = json.load(open(sys.argv[1], "r", encoding="utf-8"))
+
+def out(key, value):
+    safe = str(value).replace("\\", "\\\\").replace('"', '\\"')
+    print(f'{key}="{safe}"')
+
+out("BACKEND_CONTRACT_ID", contract.get("id", ""))
+out("BACKEND_LLM_ENGINE", contract.get("llm_engine", ""))
+out("BACKEND_SERVICE_NAME", contract.get("service_name", ""))
+out("BACKEND_PUBLIC_API_PORT", contract.get("public_api_port", ""))
+out("BACKEND_PUBLIC_HEALTH_URL", contract.get("public_health_url", ""))
+out("BACKEND_PROVIDER_NAME", contract.get("provider_name", ""))
+out("BACKEND_PROVIDER_URL", contract.get("provider_url", ""))
+out("BACKEND_CONTRACT_FILE", sys.argv[1])
+PY
+else
+    cat "$CONTRACT_FILE"
+fi
diff --git a/dream-server/scripts/migrate-config.sh b/dream-server/scripts/migrate-config.sh
old mode 100755
new mode 100644
index d166fad18..09cf0468f
--- a/dream-server/scripts/migrate-config.sh
+++ b/dream-server/scripts/migrate-config.sh
@@ -243,6 +243,23 @@ cmd_migrate() {
     fi
 }

+# Validate .env against schema
+cmd_validate() {
+    local validator="${SCRIPT_DIR}/validate-env.sh"
+    local env_file="${INSTALL_DIR}/.env"
+    local schema_file="${INSTALL_DIR}/.env.schema.json"
+
+    if [[ ! -f "$validator" ]]; then
+        log_error "Validator script missing: $validator"
+        return 1
+    fi
+    if [[ ! -f "$schema_file" ]]; then
+        log_error "Schema missing: $schema_file"
+        return 1
+    fi
+    bash "$validator" "$env_file" "$schema_file"
+}
+
 # Show help
 cmd_help() {
     cat << 'EOF'
@@ -255,12 +272,14 @@ Commands:
     migrate     Run pending migrations (with backup)
     diff        Show configuration differences
     backup      Backup current configuration
+    validate    Validate .env against .env.schema.json
     help        Show this help message

 Examples:
     ./migrate-config.sh check
     ./migrate-config.sh migrate
     ./migrate-config.sh diff
+    ./migrate-config.sh validate

 Migration scripts should be placed in the migrations/ directory and named:
     migrate-vX.Y.Z.sh
@@ -282,6 +301,9 @@ case "${1:-help}" in
     backup)
         cmd_backup
         ;;
+    validate)
+        cmd_validate
+        ;;
     help|--help|-h)
         cmd_help
         ;;
diff --git a/dream-server/scripts/mode-switch.sh b/dream-server/scripts/mode-switch.sh
old mode 100755
new mode 100644
index 7b4c208e2..87c1024fa
--- a/dream-server/scripts/mode-switch.sh
+++ b/dream-server/scripts/mode-switch.sh
@@ -1,300 +1,89 @@
-#!/bin/bash
+#!/usr/bin/env bash
+# ============================================================================
 # Dream Server Mode Switch
-# Usage: ./mode-switch.sh [cloud|local|hybrid|status]
+# ============================================================================
+# Usage: ./mode-switch.sh [--status]
 #
-# Part of M1 Zero-Cloud Initiative - Phase 3
+# Switches Dream Server between local/cloud/hybrid modes by updating .env.
+# This is the backend for `dream mode `.
+# ============================================================================
-set -e
+set -euo pipefail
-#=============================================================================
-# Configuration
-#=============================================================================
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-DREAM_DIR="${SCRIPT_DIR}/.."
-MODE_FILE="${DREAM_DIR}/.current-mode"
-DEFAULT_MODE="cloud"
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." 
&& pwd)" +ENV_FILE="$SCRIPT_DIR/.env" # Colors RED='\033[0;31m' GREEN='\033[0;32m' YELLOW='\033[1;33m' -BLUE='\033[0;34m' CYAN='\033[0;36m' -BOLD='\033[1m' NC='\033[0m' -#============================================================================= -# Helpers -#============================================================================= log() { echo -e "${CYAN}[dream-mode]${NC} $1"; } success() { echo -e "${GREEN}โœ“${NC} $1"; } warn() { echo -e "${YELLOW}โš ${NC} $1"; } -error() { echo -e "${RED}โœ—${NC} $1"; exit 1; } +error() { echo -e "${RED}โœ—${NC} $1" >&2; exit 1; } -# Auto-detect docker compose command availability -get_docker_compose_cmd() { - if docker compose version &>/dev/null; then - echo "docker compose" +# Update or add a key=value in .env +env_set() { + local key="$1" val="$2" + if grep -q "^${key}=" "$ENV_FILE" 2>/dev/null; then + sed -i "s|^${key}=.*|${key}=${val}|" "$ENV_FILE" else - echo "docker-compose" + echo "${key}=${val}" >> "$ENV_FILE" fi } -# Get local model path from compose file (handles both Qwen2.5-32B and Qwen2.5-Coder-32B) -get_local_model_path() { - local compose_file="${DREAM_DIR}/docker-compose.local.yml" - if [[ -f "$compose_file" ]]; then - grep -o 'Qwen/Qwen2\.5[^ ]*AWQ' "$compose_file" 2>/dev/null | head -1 - fi +show_status() { + local current + current=$(grep "^DREAM_MODE=" "$ENV_FILE" 2>/dev/null | cut -d= -f2) + echo "Current mode: ${current:-local}" + echo "" + echo "Available modes:" + echo " local โ€” Local inference via llama-server (requires GPU/CPU)" + echo " cloud โ€” Cloud APIs via LiteLLM (requires API keys)" + echo " hybrid โ€” Local primary, cloud fallback" } -get_current_mode() { - if [[ -f "$MODE_FILE" ]]; then - cat "$MODE_FILE" - else - echo "$DEFAULT_MODE" - fi -} +switch_mode() { + local mode="$1" -save_mode() { - echo "$1" > "$MODE_FILE" -} - -#============================================================================= -# Mode Information 
-#=============================================================================
-print_mode_info() {
-    local mode=$1
-    echo ""
+    # Validate
     case "$mode" in
-        cloud)
-            echo -e "${BLUE}━━━ Cloud Mode ━━━${NC}"
-            echo " • LiteLLM gateway with cloud model access"
-            echo " • Requires API keys: ANTHROPIC_API_KEY, OPENAI_API_KEY"
-            echo " • Best quality, internet required"
-            echo " • Cost: ~\$0.003-0.06/1K tokens"
-            echo ""
-            echo -e "${YELLOW}Requirements:${NC}"
-            echo " • Internet connection"
-            echo " • Valid API keys in .env"
-            ;;
-        local)
-            echo -e "${BLUE}━━━ Local Mode ━━━${NC}"
-            echo " • 100% offline operation"
-            echo " • All inference on local hardware"
-            echo " • No API keys or internet needed"
-            echo " • Cost: \$0 (just electricity)"
-            echo ""
-            echo -e "${YELLOW}Requirements:${NC}"
-            echo " • Pre-downloaded models in ./models/"
-            echo " • NVIDIA GPU with sufficient VRAM (24GB+ for 32B model)"
-            echo ""
-            local model_path
-            model_path=$(get_local_model_path)
-            if [[ -n "$model_path" ]]; then
-                echo -e "${YELLOW}Local model configured:${NC} $model_path"
-                echo -e "${YELLOW}Pre-download model:${NC}"
-                echo " huggingface-cli download $model_path --local-dir ./models/"
-            else
-                echo -e "${YELLOW}Pre-download models:${NC}"
-                echo " huggingface-cli download Qwen/Qwen2.5-32B-Instruct-AWQ --local-dir ./models/"
-            fi
-            ;;
-        hybrid)
-            echo -e "${BLUE}━━━ Hybrid Mode ━━━${NC}"
-            echo " • Local-first with automatic cloud fallback"
-            echo " • Best of both worlds: privacy + reliability"
-            echo " • Local vLLM as primary, cloud as backup"
-            echo " • Cost: \$0 when local works, cloud rates when fallback"
-            echo ""
-            echo -e "${YELLOW}Requirements:${NC}"
-            echo " • Local models downloaded"
-            echo " • API keys for fallback (optional but recommended)"
-            echo ""
-            echo -e "${YELLOW}Fallback triggers:${NC}"
-            echo " • Local model timeout (default: 30s)"
-            echo " • Local model error (5xx, connection refused)"
-            echo " • Empty/invalid response from local"
-            ;;
+        local|cloud|hybrid) ;;
+        *) error "Unknown mode: $mode. Use: local, cloud, hybrid" ;;
     esac
-    echo ""
-}
-#=============================================================================
-# Commands
-#=============================================================================
+    [[ -f "$ENV_FILE" ]] || error ".env not found at $ENV_FILE"
-cmd_status() {
-    local current=$(get_current_mode)
-
-    echo -e "${BLUE}━━━ Dream Server Mode Status ━━━${NC}"
-    echo ""
-    echo -e "Current mode: ${BOLD}${current}${NC}"
-
-    # Check compose file
-    local compose_file="${DREAM_DIR}/docker-compose.${current}.yml"
-    if [[ -f "$compose_file" ]]; then
-        success "Compose file exists: docker-compose.${current}.yml"
-    else
-        warn "Compose file missing: docker-compose.${current}.yml"
-    fi
-
-    # Check running containers
-    echo ""
-    echo -e "${CYAN}Running containers:${NC}"
-    cd "$DREAM_DIR"
-    local docker_cmd
-    docker_cmd=$(get_docker_compose_cmd)
-    $docker_cmd -f "docker-compose.${current}.yml" ps --format "table {{.Name}}\t{{.Status}}" 2>/dev/null || \
-        docker-compose -f "docker-compose.${current}.yml" ps 2>/dev/null || \
-        echo " (no containers running)"
-
-    print_mode_info "$current"
-}
+    # Update .env
+    env_set "DREAM_MODE" "$mode"
-cmd_switch() {
-    local new_mode=$1
-    local current=$(get_current_mode)
-
-    # Validate mode
-    case "$new_mode" in
-        cloud|local|hybrid) ;;
-        *) error "Invalid mode: $new_mode. Use: cloud, local, or hybrid" ;;
-    esac
-
-    # Check compose file exists
-    local compose_file="${DREAM_DIR}/docker-compose.${new_mode}.yml"
-    if [[ ! -f "$compose_file" ]]; then
-        error "Compose file not found: $compose_file"
-    fi
-
-    echo -e "${BLUE}━━━ Switching Dream Server Mode ━━━${NC}"
-    echo ""
-    echo -e " From: ${YELLOW}${current}${NC}"
-    echo -e " To: ${GREEN}${new_mode}${NC}"
-    echo ""
-
-    # Show warnings based on mode
-    case "$new_mode" in
-        local)
-            warn "Local mode requires pre-downloaded models"
-            warn "Web search will be disabled (requires internet)"
-            echo ""
-            ;;
-        cloud)
-            warn "Cloud mode requires valid API keys in .env"
-            warn "All LLM requests will go to cloud providers"
-            echo ""
-            ;;
-        hybrid)
-            warn "Hybrid mode uses local first, cloud as fallback"
-            warn "API keys optional but recommended for reliability"
-            echo ""
-            ;;
-    esac
-
-    # Prompt for confirmation (unless -y flag provided)
-    if [[ "$AUTO_CONFIRM" != "true" ]]; then
-        read -p "Continue? [y/N] " -n 1 -r
-        echo ""
-        if [[ ! $REPLY =~ ^[Yy]$ ]]; then
-            log "Cancelled"
-            exit 0
+    if [[ "$mode" == "local" ]]; then
+        env_set "LLM_API_URL" "http://llama-server:8080"
+    else
+        env_set "LLM_API_URL" "http://litellm:4000"
+        # Auto-enable litellm extension
+        local litellm_cf="$SCRIPT_DIR/extensions/services/litellm/compose.yaml"
+        local litellm_disabled="${litellm_cf}.disabled"
+        if [[ -f "$litellm_disabled" && ! -f "$litellm_cf" ]]; then
+            mv "$litellm_disabled" "$litellm_cf"
+            success "Auto-enabled litellm for $mode mode"
         fi
     fi
-
-    cd "$DREAM_DIR"
-
-    # Stop current services
-    log "Stopping current services..."
-    local current_compose="${DREAM_DIR}/docker-compose.${current}.yml"
-    local docker_cmd
-    docker_cmd=$(get_docker_compose_cmd)
-    if [[ -f "$current_compose" ]]; then
-        $docker_cmd -f "$current_compose" down 2>/dev/null || true
-    fi
-
-    # Save new mode
-    save_mode "$new_mode"
-
-    # Start new services
-    log "Starting ${new_mode} mode services..."
-    $docker_cmd -f "$compose_file" up -d
-
-    echo ""
-    success "Mode switched to: ${new_mode}"
-    echo ""
-
-    # Wait and show status
-    log "Waiting for services to start..."
-    sleep 5
-
-    echo ""
-    echo -e "${CYAN}Service status:${NC}"
-    docker_cmd=$(get_docker_compose_cmd)
-    $docker_cmd -f "$compose_file" ps --format "table {{.Name}}\t{{.Status}}" 2>/dev/null || \
-        docker-compose -f "$compose_file" ps 2>/dev/null || true
-
-    print_mode_info "$new_mode"
-}
-cmd_help() {
-    cat << EOF
-${BLUE}Dream Server Mode Switch${NC}
-Part of M1 Zero-Cloud Initiative
-
-${CYAN}Usage:${NC}
-  mode-switch.sh
-
-${CYAN}Commands:${NC}
-  cloud    Switch to cloud mode (full API access)
-  local    Switch to local mode (100% offline)
-  hybrid   Switch to hybrid mode (local-first + cloud fallback)
-  status   Show current mode and service status
-  help     Show this help
-
-${CYAN}Modes:${NC}
-  ${GREEN}cloud${NC} - Uses LiteLLM gateway with cloud model access
-    Requires API keys, internet connection
-    Best quality, typical cloud costs
-
-  ${GREEN}local${NC} - 100% offline operation
-    All inference on local hardware
-    Requires pre-downloaded models
-
-  ${GREEN}hybrid${NC} - Local-first with automatic cloud fallback
-    Tries local vLLM first, falls back to cloud on failure
-    Best balance of privacy, speed, and reliability
-
-${CYAN}Examples:${NC}
-  ./mode-switch.sh status   # Check current mode
-  ./mode-switch.sh cloud    # Switch to cloud mode
-  ./mode-switch.sh local    # Switch to local mode
-  ./mode-switch.sh hybrid   # Switch to hybrid mode
-
-${CYAN}Data Safety:${NC}
-  All modes share the same data volumes in ./data/
-  Switching modes preserves all user data, conversations, etc.
-
-EOF
+    success "Switched to $mode mode."
+    log "Run 'dream restart' to apply."
} -#============================================================================= -# Main -#============================================================================= -cd "$DREAM_DIR" - -# Handle -y flag for non-interactive mode -if [[ "$1" == "-y" ]]; then - AUTO_CONFIRM="true" - shift +# Called directly or sourced +if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then + case "${1:---status}" in + --status|-s|status) show_status ;; + --help|-h|help) + echo "Usage: mode-switch.sh " + ;; + *) switch_mode "${1:-}" ;; + esac fi - -case "${1:-help}" in - status|s) cmd_status ;; - cloud|c) cmd_switch "cloud" ;; - local|l) cmd_switch "local" ;; - hybrid|h) cmd_switch "hybrid" ;; - help|--help|-h) cmd_help ;; - *) error "Unknown command: $1. Run './mode-switch.sh help' for usage." ;; -esac diff --git a/dream-server/scripts/model-bootstrap.sh b/dream-server/scripts/model-bootstrap.sh deleted file mode 100755 index 66d359f7e..000000000 --- a/dream-server/scripts/model-bootstrap.sh +++ /dev/null @@ -1,453 +0,0 @@ -#!/bin/bash -#============================================================================= -# model-bootstrap.sh โ€” Background Model Download with Progress Tracking -# -# Part of Dream Server โ€” Phase 0 Foundation -# -# Downloads the full model in the background while a lightweight bootstrap -# model serves requests. Tracks progress for Dashboard display. 
-# -# Usage: -# ./model-bootstrap.sh # Interactive -# ./model-bootstrap.sh --background # Daemon mode (no output) -# ./model-bootstrap.sh --status # Check download status -# ./model-bootstrap.sh --cancel # Cancel active download -# -# Progress file: ~/.dream-server/bootstrap-status.json -#============================================================================= - -set -euo pipefail - -# Configuration -DREAM_DIR="${DREAM_DIR:-$HOME/.dream-server}" -STATUS_FILE="$DREAM_DIR/bootstrap-status.json" -PID_FILE="$DREAM_DIR/bootstrap.pid" -LOG_FILE="$DREAM_DIR/bootstrap.log" -MODELS_DIR="${MODELS_DIR:-$DREAM_DIR/models}" - -# Default models (can be overridden via env) -BOOTSTRAP_MODEL="${BOOTSTRAP_MODEL:-Qwen/Qwen2.5-1.5B-Instruct}" -FULL_MODEL="${FULL_MODEL:-Qwen/Qwen2.5-32B-Instruct-AWQ}" - -# Retry configuration -MAX_RETRIES=3 -RETRY_DELAYS=(2 8 32) # Exponential backoff: 2s, 8s, 32s -DOWNLOAD_TIMEOUT=7200 # 2 hours max - -# Colors (disabled in background mode) -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -CYAN='\033[0;36m' -NC='\033[0m' - -BACKGROUND=false -QUIET=false - -#----------------------------------------------------------------------------- -# Utility Functions -#----------------------------------------------------------------------------- - -log() { - local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $1" - if [[ "$BACKGROUND" == "true" ]]; then - echo "$msg" >> "$LOG_FILE" - elif [[ "$QUIET" != "true" ]]; then - echo -e "${BLUE}[INFO]${NC} $1" - fi -} - -success() { - if [[ "$BACKGROUND" == "true" ]]; then - echo "[$(date '+%Y-%m-%d %H:%M:%S')] SUCCESS: $1" >> "$LOG_FILE" - elif [[ "$QUIET" != "true" ]]; then - echo -e "${GREEN}[OK]${NC} $1" - fi -} - -warn() { - if [[ "$BACKGROUND" == "true" ]]; then - echo "[$(date '+%Y-%m-%d %H:%M:%S')] WARN: $1" >> "$LOG_FILE" - elif [[ "$QUIET" != "true" ]]; then - echo -e "${YELLOW}[WARN]${NC} $1" - fi -} - -error() { - if [[ "$BACKGROUND" == "true" ]]; then - echo "[$(date '+%Y-%m-%d 
%H:%M:%S')] ERROR: $1" >> "$LOG_FILE" - else - echo -e "${RED}[ERROR]${NC} $1" >&2 - fi -} - -ensure_dirs() { - mkdir -p "$DREAM_DIR" "$MODELS_DIR" -} - -#----------------------------------------------------------------------------- -# Status File Management -#----------------------------------------------------------------------------- - -write_status() { - local status="$1" - local percent="${2:-0}" - local bytes_downloaded="${3:-0}" - local bytes_total="${4:-0}" - local speed="${5:-0}" - local eta="${6:-}" - local error_msg="${7:-}" - - cat > "$STATUS_FILE" << EOF -{ - "status": "$status", - "model": "$FULL_MODEL", - "bootstrapModel": "$BOOTSTRAP_MODEL", - "percent": $percent, - "bytesDownloaded": $bytes_downloaded, - "bytesTotal": $bytes_total, - "speedBytesPerSec": $speed, - "eta": "$eta", - "error": "$error_msg", - "startedAt": "${STARTED_AT:-}", - "updatedAt": "$(date -u '+%Y-%m-%dT%H:%M:%SZ')", - "pid": $$ -} -EOF -} - -read_status() { - if [[ -f "$STATUS_FILE" ]]; then - cat "$STATUS_FILE" - else - echo '{"status": "none"}' - fi -} - -#----------------------------------------------------------------------------- -# Model Download with Progress -#----------------------------------------------------------------------------- - -get_model_size() { - local model="$1" - # Query HuggingFace API for model size - local api_url="https://huggingface.co/api/models/${model}" - local size - size=$(curl -s "$api_url" | grep -o '"size":[0-9]*' | head -1 | cut -d: -f2) - echo "${size:-0}" -} - -download_model() { - local model="$1" - local target_dir="$2" - local attempt=1 - - STARTED_AT=$(date -u '+%Y-%m-%dT%H:%M:%SZ') - - # Get expected size - local total_size - total_size=$(get_model_size "$model") - - log "Downloading model: $model" - log "Target directory: $target_dir" - [[ "$total_size" -gt 0 ]] && log "Expected size: $(numfmt --to=iec-i --suffix=B $total_size 2>/dev/null || echo "$total_size bytes")" - - while [[ $attempt -le $MAX_RETRIES ]]; do - log "Download 
attempt $attempt of $MAX_RETRIES" - write_status "downloading" 0 0 "$total_size" 0 "calculating..." - - # Use huggingface-cli if available, otherwise fallback to git lfs - if command -v huggingface-cli &> /dev/null; then - download_with_hf_cli "$model" "$target_dir" "$total_size" && return 0 - else - download_with_git_lfs "$model" "$target_dir" "$total_size" && return 0 - fi - - # Download failed, retry with backoff - if [[ $attempt -lt $MAX_RETRIES ]]; then - local delay=${RETRY_DELAYS[$((attempt-1))]} - warn "Download failed, retrying in ${delay}s..." - write_status "retrying" 0 0 "$total_size" 0 "" "Attempt $attempt failed, retrying in ${delay}s" - sleep "$delay" - fi - - ((attempt++)) - done - - error "Download failed after $MAX_RETRIES attempts" - write_status "failed" 0 0 "$total_size" 0 "" "Download failed after $MAX_RETRIES attempts" - return 1 -} - -download_with_hf_cli() { - local model="$1" - local target_dir="$2" - local total_size="$3" - - # Create a named pipe for progress monitoring - local progress_pipe=$(mktemp -u) - mkfifo "$progress_pipe" - - # Monitor progress in background - ( - local last_size=0 - local last_time=$(date +%s) - - while true; do - sleep 5 - - # Calculate current download size - local current_size=0 - if [[ -d "$target_dir" ]]; then - current_size=$(du -sb "$target_dir" 2>/dev/null | cut -f1 || echo 0) - fi - - # Calculate speed - local now=$(date +%s) - local elapsed=$((now - last_time)) - local speed=0 - if [[ $elapsed -gt 0 ]]; then - speed=$(( (current_size - last_size) / elapsed )) - fi - - # Calculate percentage and ETA - local percent=0 - local eta="unknown" - if [[ "$total_size" -gt 0 ]]; then - percent=$(( (current_size * 100) / total_size )) - if [[ $speed -gt 0 ]]; then - local remaining=$((total_size - current_size)) - local eta_secs=$((remaining / speed)) - eta=$(printf '%02d:%02d:%02d' $((eta_secs/3600)) $(((eta_secs%3600)/60)) $((eta_secs%60))) - fi - fi - - write_status "downloading" "$percent" "$current_size" 
"$total_size" "$speed" "$eta" - - last_size=$current_size - last_time=$now - - # Check if download process is still running - if ! kill -0 $$ 2>/dev/null; then - break - fi - done - ) & - local monitor_pid=$! - - # Run the actual download - local result=0 - huggingface-cli download "$model" \ - --local-dir "$target_dir" \ - --local-dir-use-symlinks False \ - --resume-download \ - 2>> "$LOG_FILE" || result=$? - - # Stop the monitor - kill $monitor_pid 2>/dev/null || true - rm -f "$progress_pipe" - - return $result -} - -download_with_git_lfs() { - local model="$1" - local target_dir="$2" - local total_size="$3" - - log "Using git-lfs for download (huggingface-cli not found)" - - # Clone with git lfs - local repo_url="https://huggingface.co/${model}" - - GIT_LFS_SKIP_SMUDGE=1 git clone "$repo_url" "$target_dir" 2>> "$LOG_FILE" || return 1 - - cd "$target_dir" - git lfs pull 2>> "$LOG_FILE" || return 1 - - return 0 -} - -#----------------------------------------------------------------------------- -# vLLM Hot-Swap -#----------------------------------------------------------------------------- - -notify_vllm_model_ready() { - local model_path="$1" - - log "Notifying vLLM that new model is ready..." 
- - # Check if vLLM supports hot-swap API - local vllm_host="${VLLM_HOST:-localhost}" - local vllm_port="${VLLM_PORT:-8000}" - - # Try the model loading API (if available in vLLM version) - local response - response=$(curl -s -X POST "http://${vllm_host}:${vllm_port}/v1/models/load" \ - -H "Content-Type: application/json" \ - -d "{\"model\": \"$model_path\"}" 2>/dev/null || echo "") - - if [[ -n "$response" ]] && echo "$response" | grep -q '"success"'; then - success "vLLM hot-swap successful" - return 0 - else - warn "vLLM hot-swap not available, manual restart required" - warn "Run: dream restart vllm" - return 1 - fi -} - -#----------------------------------------------------------------------------- -# Main Commands -#----------------------------------------------------------------------------- - -cmd_status() { - local status - status=$(read_status) - - if [[ "$1" == "--json" ]]; then - echo "$status" - return - fi - - local current_status - current_status=$(echo "$status" | grep -o '"status": *"[^"]*"' | cut -d'"' -f4) - - case "$current_status" in - none) - echo "No bootstrap in progress" - ;; - downloading) - local percent model eta - percent=$(echo "$status" | grep -o '"percent": *[0-9]*' | grep -o '[0-9]*') - model=$(echo "$status" | grep -o '"model": *"[^"]*"' | cut -d'"' -f4) - eta=$(echo "$status" | grep -o '"eta": *"[^"]*"' | cut -d'"' -f4) - echo -e "${CYAN}Downloading:${NC} $model" - echo -e "${CYAN}Progress:${NC} ${percent}%" - echo -e "${CYAN}ETA:${NC} $eta" - ;; - completed) - echo -e "${GREEN}Bootstrap complete!${NC} Full model ready." 
- ;; - failed) - local err - err=$(echo "$status" | grep -o '"error": *"[^"]*"' | cut -d'"' -f4) - echo -e "${RED}Bootstrap failed:${NC} $err" - ;; - *) - echo "Status: $current_status" - ;; - esac -} - -cmd_cancel() { - if [[ -f "$PID_FILE" ]]; then - local pid - pid=$(cat "$PID_FILE") - if kill -0 "$pid" 2>/dev/null; then - log "Cancelling bootstrap download (PID: $pid)" - kill "$pid" - write_status "cancelled" 0 0 0 0 "" "Cancelled by user" - rm -f "$PID_FILE" - success "Download cancelled" - else - warn "No active download found" - rm -f "$PID_FILE" - fi - else - warn "No active download found" - fi -} - -cmd_download() { - ensure_dirs - - # Check if already downloading - if [[ -f "$PID_FILE" ]]; then - local existing_pid - existing_pid=$(cat "$PID_FILE") - if kill -0 "$existing_pid" 2>/dev/null; then - error "Download already in progress (PID: $existing_pid)" - error "Use --cancel to stop it, or --status to check progress" - return 1 - fi - fi - - # Save PID - echo $$ > "$PID_FILE" - - # Trap to clean up on exit - trap 'rm -f "$PID_FILE"' EXIT - - local target_dir="$MODELS_DIR/$(basename "$FULL_MODEL")" - - if [[ -d "$target_dir" ]] && [[ -f "$target_dir/config.json" ]]; then - success "Model already downloaded: $target_dir" - write_status "completed" 100 0 0 0 "" - return 0 - fi - - # Start download - if download_model "$FULL_MODEL" "$target_dir"; then - success "Model download complete!" - write_status "completed" 100 0 0 0 "" - - # Try hot-swap - notify_vllm_model_ready "$target_dir" || true - - return 0 - else - return 1 - fi -} - -#----------------------------------------------------------------------------- -# Entry Point -#----------------------------------------------------------------------------- - -main() { - case "${1:-}" in - --status|-s) - cmd_status "${2:-}" - ;; - --cancel|-c) - cmd_cancel - ;; - --background|-b) - BACKGROUND=true - shift - cmd_download "$@" & - disown - echo "Bootstrap started in background. 
Check progress with: $0 --status" - ;; - --help|-h) - cat << EOF -Dream Server Model Bootstrap - -Usage: - $0 Start download (interactive) - $0 --background Start download in background - $0 --status Check download progress - $0 --status --json Get status as JSON (for Dashboard) - $0 --cancel Cancel active download - -Environment Variables: - FULL_MODEL Model to download (default: $FULL_MODEL) - BOOTSTRAP_MODEL Lightweight model for immediate use (default: $BOOTSTRAP_MODEL) - MODELS_DIR Where to store models (default: $MODELS_DIR) - VLLM_HOST vLLM hostname for hot-swap (default: localhost) - VLLM_PORT vLLM port for hot-swap (default: 8000) - -Progress File: - $STATUS_FILE - -EOF - ;; - *) - cmd_download "$@" - ;; - esac -} - -main "$@" diff --git a/dream-server/scripts/pre-download.sh b/dream-server/scripts/pre-download.sh old mode 100755 new mode 100644 index 5d03f1187..7d3813b6c --- a/dream-server/scripts/pre-download.sh +++ b/dream-server/scripts/pre-download.sh @@ -4,7 +4,7 @@ # # Part of Dream Server — Phase 3 # -# Downloads models ahead of time so setup.sh can skip the download step. +# Downloads models ahead of time so install.sh can skip the download step. # Useful for slow/metered connections or offline installs. # # Usage: @@ -92,7 +92,7 @@ check_dependencies() { } #============================================================================= -# Hardware Detection (simplified from setup.sh) +# Hardware Detection (simplified from install-core.sh) #============================================================================= detect_vram_gb() { @@ -282,8 +282,8 @@ download_tier() { echo "" success "Pre-download complete!" echo "" - echo "You can now run setup.sh — it will use the cached models." - echo " curl -fsSL https://dream.openclaw.ai/setup.sh | bash" + echo "You can now run install.sh — it will use the cached models." 
+ echo " ./install.sh" } interactive_menu() { diff --git a/dream-server/scripts/preflight-engine.sh b/dream-server/scripts/preflight-engine.sh new file mode 100644 index 000000000..78d5f06c5 --- /dev/null +++ b/dream-server/scripts/preflight-engine.sh @@ -0,0 +1,341 @@ +#!/usr/bin/env bash +set -euo pipefail + +REPORT_FILE="/tmp/dream-server-preflight-report.json" +TIER="${TIER:-1}" +RAM_GB="${RAM_GB:-0}" +DISK_GB="${DISK_GB:-0}" +GPU_BACKEND="${GPU_BACKEND:-nvidia}" +GPU_VRAM_MB="${GPU_VRAM_MB:-0}" +GPU_NAME="${GPU_NAME:-Unknown}" +PLATFORM_ID="${PLATFORM_ID:-linux}" +COMPOSE_OVERLAYS="${COMPOSE_OVERLAYS:-}" +SCRIPT_DIR="${SCRIPT_DIR:-$(pwd)}" +STRICT="false" +ENV_MODE="false" + +while [[ $# -gt 0 ]]; do + case "$1" in + --report) + REPORT_FILE="${2:-$REPORT_FILE}" + shift 2 + ;; + --tier) + TIER="${2:-$TIER}" + shift 2 + ;; + --ram-gb) + RAM_GB="${2:-$RAM_GB}" + shift 2 + ;; + --disk-gb) + DISK_GB="${2:-$DISK_GB}" + shift 2 + ;; + --gpu-backend) + GPU_BACKEND="${2:-$GPU_BACKEND}" + shift 2 + ;; + --gpu-vram-mb) + GPU_VRAM_MB="${2:-$GPU_VRAM_MB}" + shift 2 + ;; + --gpu-name) + GPU_NAME="${2:-$GPU_NAME}" + shift 2 + ;; + --platform-id) + PLATFORM_ID="${2:-$PLATFORM_ID}" + shift 2 + ;; + --compose-overlays) + COMPOSE_OVERLAYS="${2:-$COMPOSE_OVERLAYS}" + shift 2 + ;; + --script-dir) + SCRIPT_DIR="${2:-$SCRIPT_DIR}" + shift 2 + ;; + --strict) + STRICT="true" + shift + ;; + --env) + ENV_MODE="true" + shift + ;; + *) + echo "Unknown argument: $1" >&2 + exit 1 + ;; + esac +done + +python3 - "$REPORT_FILE" "$TIER" "$RAM_GB" "$DISK_GB" "$GPU_BACKEND" "$GPU_VRAM_MB" "$GPU_NAME" "$PLATFORM_ID" "$COMPOSE_OVERLAYS" "$SCRIPT_DIR" "$ENV_MODE" "$STRICT" <<'PY' +import json +import pathlib +import sys +from datetime import datetime, timezone + +( + report_file, + tier, + ram_gb, + disk_gb, + gpu_backend, + gpu_vram_mb, + gpu_name, + platform_id, + compose_overlays, + script_dir, + env_mode, + strict_mode, +) = sys.argv[1:] + +env_mode = env_mode == "true" +strict_mode = 
strict_mode == "true" + +try: + ram_gb = int(float(ram_gb)) +except Exception: + ram_gb = 0 +try: + disk_gb = int(float(disk_gb)) +except Exception: + disk_gb = 0 +try: + gpu_vram_mb = int(float(gpu_vram_mb)) +except Exception: + gpu_vram_mb = 0 + +tier_key = str(tier).upper() +tier_rank_map = { + "1": 1, + "2": 2, + "3": 3, + "4": 4, + "T1": 1, + "T2": 2, + "T3": 3, + "T4": 4, + "SH_COMPACT": 3, + "SH_LARGE": 4, +} +tier_rank = tier_rank_map.get(tier_key, 1) + +min_ram_map = { + "1": 16, + "2": 32, + "3": 48, + "4": 64, + "SH_COMPACT": 64, + "SH_LARGE": 96, +} +min_disk_map = { + "1": 30, + "2": 50, + "3": 80, + "4": 150, + "SH_COMPACT": 80, + "SH_LARGE": 120, +} +min_ram = min_ram_map.get(tier_key, 16) +min_disk = min_disk_map.get(tier_key, 50) + +checks = [] + +def add_check(check_id, status, message, action): + checks.append( + { + "id": check_id, + "status": status, + "message": message, + "action": action, + } + ) + +# Platform support check +if platform_id in {"linux", "wsl"}: + add_check( + "platform-support", + "pass", + f"Platform '{platform_id}' is currently supported by install-core.sh.", + "", + ) +elif platform_id in {"macos", "windows"}: + add_check( + "platform-support", + "warn", + f"Platform '{platform_id}' is supported via installer MVP path (not full parity yet).", + "Continue with platform installer and follow generated doctor report recommendations.", + ) +else: + add_check( + "platform-support", + "blocker", + f"Platform '{platform_id}' is not yet supported by install-core.sh.", + "Use Linux/WSL path for now or run platform-specific installer once implemented.", + ) + +# Compose overlay existence check +overlays = [o.strip() for o in compose_overlays.split(",") if o.strip()] +if overlays: + missing = [o for o in overlays if not (pathlib.Path(script_dir) / o).exists()] + if missing: + add_check( + "compose-overlays", + "blocker", + f"Compose overlays are missing: {', '.join(missing)}.", + "Restore missing compose files or update capability 
profile overlay mapping.", + ) + else: + add_check( + "compose-overlays", + "pass", + f"Compose overlays resolved: {', '.join(overlays)}.", + "", + ) +else: + add_check( + "compose-overlays", + "warn", + "No compose overlays supplied from capability profile.", + "Ensure CAP_COMPOSE_OVERLAYS is populated; installer will use legacy fallback.", + ) + +# RAM and disk checks +if ram_gb >= min_ram: + add_check( + "memory", + "pass", + f"RAM {ram_gb}GB meets tier {tier_key} recommendation ({min_ram}GB).", + "", + ) +else: + add_check( + "memory", + "warn", + f"RAM {ram_gb}GB is below tier {tier_key} recommendation ({min_ram}GB).", + f"Use a lower tier or increase memory to at least {min_ram}GB.", + ) + +if disk_gb >= min_disk: + add_check( + "disk", + "pass", + f"Disk {disk_gb}GB meets tier {tier_key} recommendation ({min_disk}GB).", + "", + ) +else: + add_check( + "disk", + "blocker", + f"Disk {disk_gb}GB is below required minimum for tier {tier_key} ({min_disk}GB).", + f"Free at least {min_disk - disk_gb}GB or choose a smaller tier.", + ) + +# GPU checks +gpu_backend = (gpu_backend or "").lower() +if gpu_backend == "amd": + add_check( + "gpu-backend", + "pass", + f"AMD backend selected ({gpu_name}).", + "", + ) +elif gpu_backend == "nvidia": + if gpu_name.strip().lower() in {"none", ""} or gpu_vram_mb <= 0: + add_check( + "gpu-vram", + "warn", + "NVIDIA backend selected but no NVIDIA GPU VRAM was detected.", + "Install/verify NVIDIA drivers or switch to a supported AMD path.", + ) + elif tier_rank >= 2 and gpu_vram_mb < 10000: + add_check( + "gpu-vram", + "warn", + f"NVIDIA VRAM {gpu_vram_mb}MB is below recommended floor for tier {tier_key}.", + "Use tier 1 or a GPU with at least 12GB VRAM for better performance.", + ) + else: + add_check( + "gpu-vram", + "pass", + f"NVIDIA backend selected ({gpu_name}, {gpu_vram_mb}MB VRAM).", + "", + ) +elif gpu_backend == "apple": + add_check( + "gpu-backend", + "warn", + "Apple backend selected (experimental path).", + "Use macOS 
installer preflight + doctor and run reduced profile set until Tier A parity is complete.", + ) +elif gpu_backend == "cpu": + if platform_id in {"windows", "macos"}: + add_check( + "gpu-backend", + "warn", + "CPU fallback selected on non-Linux platform.", + "Use reduced model/profile defaults; expect slower inference.", + ) + else: + add_check( + "gpu-backend", + "warn", + "CPU fallback selected.", + "Install/verify GPU drivers for best performance or continue with small models.", + ) +else: + add_check( + "gpu-backend", + "warn", + f"Unknown backend '{gpu_backend}'.", + "Verify capability profile and hardware detection output.", + ) + +blockers = [c for c in checks if c["status"] == "blocker"] +warnings = [c for c in checks if c["status"] == "warn"] + +report = { + "version": "1", + "generated_at": datetime.now(timezone.utc).isoformat(), + "inputs": { + "tier": tier_key, + "ram_gb": ram_gb, + "disk_gb": disk_gb, + "gpu_backend": gpu_backend, + "gpu_vram_mb": gpu_vram_mb, + "gpu_name": gpu_name, + "platform_id": platform_id, + "compose_overlays": overlays, + "script_dir": script_dir, + }, + "summary": { + "checks": len(checks), + "blockers": len(blockers), + "warnings": len(warnings), + "can_proceed": len(blockers) == 0, + }, + "checks": checks, +} + +report_path = pathlib.Path(report_file) +report_path.parent.mkdir(parents=True, exist_ok=True) +report_path.write_text(json.dumps(report, indent=2) + "\n", encoding="utf-8") + +if env_mode: + def out(key, value): + safe = str(value).replace("\\", "\\\\").replace('"', '\\"') + print(f'{key}="{safe}"') + + out("PREFLIGHT_REPORT_FILE", str(report_path)) + out("PREFLIGHT_CHECK_COUNT", report["summary"]["checks"]) + out("PREFLIGHT_BLOCKERS", report["summary"]["blockers"]) + out("PREFLIGHT_WARNINGS", report["summary"]["warnings"]) + out("PREFLIGHT_CAN_PROCEED", str(report["summary"]["can_proceed"]).lower()) + +if strict_mode and blockers: + raise SystemExit(1) +PY diff --git a/dream-server/scripts/release-gate.sh 
b/dream-server/scripts/release-gate.sh new file mode 100644 index 000000000..4c1eeff39 --- /dev/null +++ b/dream-server/scripts/release-gate.sh @@ -0,0 +1,31 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +cd "$ROOT_DIR" + +echo "[gate] shell syntax" +mapfile -t sh_files < <(git ls-files '*.sh') +for f in "${sh_files[@]}"; do + bash -n "$f" +done + +echo "[gate] compatibility + claims" +bash scripts/check-compatibility.sh +bash scripts/check-release-claims.sh + +echo "[gate] contracts" +bash tests/contracts/test-installer-contracts.sh +bash tests/contracts/test-preflight-fixtures.sh + +echo "[gate] smoke" +bash tests/smoke/linux-amd.sh +bash tests/smoke/linux-nvidia.sh +bash tests/smoke/wsl-logic.sh +bash tests/smoke/macos-dispatch.sh + +echo "[gate] installer simulation" +bash scripts/simulate-installers.sh +python3 scripts/validate-sim-summary.py artifacts/installer-sim/summary.json + +echo "[PASS] release gate" diff --git a/dream-server/scripts/resolve-compose-stack.sh b/dream-server/scripts/resolve-compose-stack.sh new file mode 100644 index 000000000..fa83f7221 --- /dev/null +++ b/dream-server/scripts/resolve-compose-stack.sh @@ -0,0 +1,157 @@ +#!/usr/bin/env bash +set -euo pipefail + +SCRIPT_DIR="$(pwd)" +TIER="1" +GPU_BACKEND="nvidia" +PROFILE_OVERLAYS="" +ENV_MODE="false" + +while [[ $# -gt 0 ]]; do + case "$1" in + --script-dir) + SCRIPT_DIR="${2:-$SCRIPT_DIR}" + shift 2 + ;; + --tier) + TIER="${2:-$TIER}" + shift 2 + ;; + --gpu-backend) + GPU_BACKEND="${2:-$GPU_BACKEND}" + shift 2 + ;; + --profile-overlays) + PROFILE_OVERLAYS="${2:-$PROFILE_OVERLAYS}" + shift 2 + ;; + --env) + ENV_MODE="true" + shift + ;; + *) + echo "Unknown argument: $1" >&2 + exit 1 + ;; + esac +done + +python3 - "$SCRIPT_DIR" "$TIER" "$GPU_BACKEND" "$PROFILE_OVERLAYS" "$ENV_MODE" <<'PY' +import pathlib +import sys +import json + +script_dir = pathlib.Path(sys.argv[1]) +tier = (sys.argv[2] or "1").upper() +gpu_backend = 
(sys.argv[3] or "nvidia").lower() +profile_overlays = [x.strip() for x in (sys.argv[4] or "").split(",") if x.strip()] +env_mode = (sys.argv[5] or "false").lower() == "true" + +def existing(overlays): + return all((script_dir / f).exists() for f in overlays) + +resolved = [] +primary = "docker-compose.yml" + +if profile_overlays and existing(profile_overlays): + resolved = profile_overlays + primary = profile_overlays[-1] +elif tier in {"AP_ULTRA", "AP_PRO", "AP_BASE"}: + if existing(["docker-compose.base.yml", "docker-compose.apple.yml"]): + resolved = ["docker-compose.base.yml", "docker-compose.apple.yml"] + primary = "docker-compose.apple.yml" + elif existing(["docker-compose.base.yml"]): + resolved = ["docker-compose.base.yml"] + primary = "docker-compose.base.yml" +elif tier in {"SH_LARGE", "SH_COMPACT"}: + if existing(["docker-compose.base.yml", "docker-compose.amd.yml"]): + resolved = ["docker-compose.base.yml", "docker-compose.amd.yml"] + primary = "docker-compose.amd.yml" +elif gpu_backend == "apple": + if existing(["docker-compose.base.yml", "docker-compose.apple.yml"]): + resolved = ["docker-compose.base.yml", "docker-compose.apple.yml"] + primary = "docker-compose.apple.yml" + elif existing(["docker-compose.base.yml"]): + resolved = ["docker-compose.base.yml"] + primary = "docker-compose.base.yml" +elif gpu_backend == "amd": + if existing(["docker-compose.base.yml", "docker-compose.amd.yml"]): + resolved = ["docker-compose.base.yml", "docker-compose.amd.yml"] + primary = "docker-compose.amd.yml" +else: + if existing(["docker-compose.base.yml", "docker-compose.nvidia.yml"]): + resolved = ["docker-compose.base.yml", "docker-compose.nvidia.yml"] + primary = "docker-compose.nvidia.yml" + elif (script_dir / "docker-compose.yml").exists(): + resolved = ["docker-compose.yml"] + primary = "docker-compose.yml" + +if not resolved: + resolved = [primary] + +# Discover enabled extension compose fragments via manifests +ext_dir = script_dir / "extensions" / 
"services" +if ext_dir.exists(): + try: + import yaml + except ImportError: + import json as yaml # fallback if yaml not available + + for service_dir in sorted(ext_dir.iterdir()): + if not service_dir.is_dir(): + continue + # Find manifest + manifest_path = None + for name in ("manifest.yaml", "manifest.yml", "manifest.json"): + candidate = service_dir / name + if candidate.exists(): + manifest_path = candidate + break + if not manifest_path: + continue + try: + with open(manifest_path) as f: + if manifest_path.suffix == ".json": + manifest = json.load(f) + else: + manifest = yaml.safe_load(f) + if manifest.get("schema_version") != "dream.services.v1": + continue + service = manifest.get("service", {}) + # Check GPU backend compatibility + backends = service.get("gpu_backends", ["amd", "nvidia"]) + if gpu_backend not in backends and "all" not in backends: + continue + # Get compose file from manifest + compose_rel = service.get("compose_file", "") + if compose_rel: + compose_path = service_dir / compose_rel + if compose_path.exists(): + resolved.append(str(compose_path.relative_to(script_dir))) + # GPU-specific overlay (filesystem discovery โ€” not in manifest) + gpu_overlay = service_dir / f"compose.{gpu_backend}.yaml" + if gpu_overlay.exists(): + resolved.append(str(gpu_overlay.relative_to(script_dir))) + except Exception: + continue + +# Include docker-compose.override.yml if it exists (user customizations) +override = script_dir / "docker-compose.override.yml" +if override.exists(): + resolved.append("docker-compose.override.yml") + +def to_flags(files): + return " ".join(f"-f {f}" for f in files) + +resolved_flags = to_flags(resolved) + +if env_mode: + def out(key, value): + safe = str(value).replace("\\", "\\\\").replace('"', '\\"') + print(f'{key}="{safe}"') + out("COMPOSE_PRIMARY_FILE", primary) + out("COMPOSE_FILE_LIST", ",".join(resolved)) + out("COMPOSE_FLAGS", resolved_flags) +else: + print(resolved_flags) +PY diff --git 
a/dream-server/scripts/scrub-livekit-secrets.sh b/dream-server/scripts/scrub-livekit-secrets.sh deleted file mode 100755 index 112ded82f..000000000 --- a/dream-server/scripts/scrub-livekit-secrets.sh +++ /dev/null @@ -1,122 +0,0 @@ -#!/bin/bash -# scrub-livekit-secrets.sh -# Removes hardcoded LiveKit secrets from git history using BFG Repo-Cleaner -# WARNING: This rewrites history - all collaborators must reclone -# -# Usage: ./scripts/scrub-livekit-secrets.sh [secrets-file] -# secrets-file: Path to file containing secrets (one per line) -# Defaults to .secrets-to-scrub in repo root (must be in .gitignore) - -set -euo pipefail - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)" -SECRETS_FILE="${1:-${REPO_ROOT}/.secrets-to-scrub}" - -echo "=== LiveKit Secrets Git History Scrub ===" -echo "WARNING: This will rewrite git history!" -echo "" - -# Validate secrets file exists -if [[ ! -f "$SECRETS_FILE" ]]; then - echo "ERROR: Secrets file not found: $SECRETS_FILE" - echo "" - echo "Create a file with secrets to scrub (one per line):" - echo " echo 'your-secret-key' > .secrets-to-scrub" - echo " echo 'another-secret' >> .secrets-to-scrub" - echo "" - echo "IMPORTANT: Add .secrets-to-scrub to .gitignore!" - exit 1 -fi - -# Load secrets from file - handle NUL bytes and long lines safely -if [[ ! 
-s "$SECRETS_FILE" ]]; then - echo "ERROR: Secrets file is empty: $SECRETS_FILE" - exit 1 -fi - -# Read file line by line safely, handling potential NUL bytes -SECRETS=() -while IFS= read -r -d $'\n' line || [[ -n "$line" ]]; do - # Skip NUL bytes and empty lines - if [[ -n "$line" ]] && [[ "$line" != *$'\0'* ]]; then - SECRETS+=("$line") - fi -done < "$SECRETS_FILE" - -# Validate secrets were loaded -if [[ ${#SECRETS[@]} -eq 0 ]]; then - echo "ERROR: No secrets found in $SECRETS_FILE" - exit 1 -fi - -echo "Secrets to remove from history (loaded from $SECRETS_FILE):" -for secret in "${SECRETS[@]}"; do - if [[ -n "$secret" ]]; then - echo " - ${secret:0:20}..." - fi -done -echo "" - -# Check if BFG is installed -if ! command -v bfg &> /dev/null; then - echo "BFG not found. Installing..." - - # Download BFG - BFG_VERSION="1.14.0" - BFG_JAR="bfg-${BFG_VERSION}.jar" - BFG_URL="https://repo1.maven.org/maven2/com/madgag/bfg/${BFG_VERSION}/${BFG_JAR}" - - if [[ ! -f "/tmp/${BFG_JAR}" ]]; then - curl -L -o "/tmp/${BFG_JAR}" "${BFG_URL}" - fi - - # Create wrapper script - cat > /tmp/bfg << 'EOF' -#!/bin/bash -java -jar /tmp/bfg-1.14.0.jar "$@" -EOF - chmod +x /tmp/bfg - export PATH="/tmp:$PATH" -fi - -echo "Creating sensitive-data.txt..." 
-> /tmp/sensitive-data.txt -for secret in "${SECRETS[@]}"; do - if [[ -n "$secret" ]]; then - echo "${secret}" >> /tmp/sensitive-data.txt - fi -done - -echo "" -echo "Files that will be scrubbed:" -git log --all --pretty=format: --name-only | sort -u | grep -E "(livekit|config)" || true -echo "" - -echo "Step 1: Create backup branch" -git branch backup-before-secret-scrub-$(date +%Y%m%d) || true - -echo "" -echo "Step 2: Run BFG to remove secrets" -echo "Command: bfg --replace-text /tmp/sensitive-data.txt" - -# Run BFG -cd "$REPO_ROOT" -bfg --replace-text /tmp/sensitive-data.txt - -echo "" -echo "Step 3: Clean up and garbage collect" -git reflog expire --expire=now --all -git gc --prune=now --aggressive - -echo "" -echo "=== Scrub Complete ===" -echo "" -echo "NEXT STEPS:" -echo "1. Review changes: git log --oneline -5" -echo "2. Force push: git push --force-with-lease origin main" -echo "3. Notify all collaborators to reclone the repo" -echo "4. Rotate any exposed LiveKit credentials immediately" -echo "5. Delete $SECRETS_FILE when done" -echo "" -echo "Backup branch created: backup-before-secret-scrub-$(date +%Y%m%d)" diff --git a/dream-server/scripts/session-cleanup.sh b/dream-server/scripts/session-cleanup.sh new file mode 100644 index 000000000..edc25da0c --- /dev/null +++ b/dream-server/scripts/session-cleanup.sh @@ -0,0 +1,115 @@ +#!/bin/bash +# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• +# Dream Server - Session Cleanup Script +# https://github.com/Light-Heart-Labs/DreamServer +# +# Prevents context overflow crashes by automatically managing +# session file lifecycle. When a session file exceeds the size +# threshold, it's deleted and its reference removed from +# sessions.json, forcing the gateway to create a fresh session. +# +# The agent doesn't notice โ€” it just gets a clean context window. 
+# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• + +set -euo pipefail + +# โ”€โ”€ Configuration โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ +# Strix Halo: OpenClaw runs in Docker, sessions are in data volume +OPENCLAW_DIR="${OPENCLAW_DIR:-$HOME/dream-server/data/openclaw/home/.openclaw}" +SESSIONS_DIR="${SESSIONS_DIR:-$OPENCLAW_DIR/agents/main/sessions}" +SESSIONS_JSON="$SESSIONS_DIR/sessions.json" +MAX_SIZE="${MAX_SIZE:-256000}" + +# โ”€โ”€ Preflight โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ +if [ ! -f "$SESSIONS_JSON" ]; then + echo "[$(date)] No sessions.json found at $SESSIONS_JSON, skipping" + exit 0 +fi + +if [ ! -d "$SESSIONS_DIR" ]; then + echo "[$(date)] Sessions directory not found at $SESSIONS_DIR, skipping" + exit 0 +fi + +# โ”€โ”€ Extract active session IDs โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ +ACTIVE_IDS=$(grep -oP '"sessionId":\s*"\K[^"]+' "$SESSIONS_JSON" 2>/dev/null || true) + +echo "[$(date)] Session cleanup starting" +echo "[$(date)] Sessions dir: $SESSIONS_DIR" +echo "[$(date)] Max size threshold: $MAX_SIZE bytes" +echo "[$(date)] Active sessions found: $(echo "$ACTIVE_IDS" | wc -w)" + +# โ”€โ”€ Clean up debris โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ +DELETED_COUNT=$(find "$SESSIONS_DIR" -name '*.deleted.*' -delete -print 2>/dev/null | wc -l) +BAK_COUNT=$(find "$SESSIONS_DIR" -name '*.bak*' -not -name '*.bak-cleanup' -delete -print 2>/dev/null | wc -l) +if [ "$DELETED_COUNT" -gt 0 ] || [ "$BAK_COUNT" -gt 0 ]; then + echo "[$(date)] Cleaned up 
$DELETED_COUNT .deleted files, $BAK_COUNT .bak files" +fi + +# โ”€โ”€ Process session files โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ +WIPE_IDS="" +REMOVED_INACTIVE=0 +REMOVED_BLOATED=0 + +for f in "$SESSIONS_DIR"/*.jsonl; do + [ -f "$f" ] || continue + BASENAME=$(basename "$f" .jsonl) + + # Check if this session is active + IS_ACTIVE=false + for ID in $ACTIVE_IDS; do + if [ "$BASENAME" = "$ID" ]; then + IS_ACTIVE=true + break + fi + done + + if [ "$IS_ACTIVE" = false ]; then + SIZE=$(du -h "$f" | cut -f1) + echo "[$(date)] Removing inactive session: $BASENAME ($SIZE)" + rm -f "$f" + REMOVED_INACTIVE=$((REMOVED_INACTIVE + 1)) + else + SIZE_BYTES=$(stat -c%s "$f" 2>/dev/null || echo 0) + if [ "$SIZE_BYTES" -gt "$MAX_SIZE" ]; then + SIZE=$(du -h "$f" | cut -f1) + echo "[$(date)] Session $BASENAME is bloated ($SIZE > $(numfmt --to=iec $MAX_SIZE 2>/dev/null || echo "${MAX_SIZE}B")), deleting to force fresh session" + rm -f "$f" + WIPE_IDS="$WIPE_IDS $BASENAME" + REMOVED_BLOATED=$((REMOVED_BLOATED + 1)) + fi + fi +done + +# โ”€โ”€ Remove wiped session references from sessions.json โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ +if [ -n "$WIPE_IDS" ]; then + echo "[$(date)] Clearing session references from sessions.json for:$WIPE_IDS" + cp "$SESSIONS_JSON" "$SESSIONS_JSON.bak-cleanup" + + for ID in $WIPE_IDS; do + python3 -c " +import json, sys +with open('$SESSIONS_JSON', 'r') as f: + data = json.load(f) +to_remove = [k for k, v in data.items() if isinstance(v, dict) and v.get('sessionId') == '$ID'] +for k in to_remove: + del data[k] + print(f' Removed session key: {k}', file=sys.stderr) +with open('$SESSIONS_JSON', 'w') as f: + json.dump(data, f, indent=2) +" 2>&1 + done + + # Clean up the backup + rm -f "$SESSIONS_JSON.bak-cleanup" +fi + +# โ”€โ”€ Summary โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ +echo 
"[$(date)] Cleanup complete: removed $REMOVED_INACTIVE inactive, $REMOVED_BLOATED bloated" +REMAINING=$(find "$SESSIONS_DIR" -maxdepth 1 -name '*.jsonl' 2>/dev/null | wc -l) +echo "[$(date)] Remaining session files: $REMAINING" +if [ "$REMAINING" -gt 0 ]; then + ls -lhS "$SESSIONS_DIR"/*.jsonl 2>/dev/null | while read -r line; do + echo " $line" + done +fi diff --git a/dream-server/scripts/showcase.sh b/dream-server/scripts/showcase.sh old mode 100755 new mode 100644 index a45a08f52..cf0fc2053 --- a/dream-server/scripts/showcase.sh +++ b/dream-server/scripts/showcase.sh @@ -15,15 +15,23 @@ BOLD='\033[1m' DIM='\033[2m' NC='\033[0m' -# URLs -VLLM_URL="${VLLM_URL:-http://localhost:8000}" -WHISPER_URL="${WHISPER_URL:-http://localhost:9000}" -TTS_URL="${TTS_URL:-http://localhost:8880}" -QDRANT_URL="${QDRANT_URL:-http://localhost:6333}" - # Get script directory SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" DREAM_DIR="$(dirname "$SCRIPT_DIR")" + +# Source service registry for port resolution +if [[ -f "$DREAM_DIR/lib/service-registry.sh" ]]; then + export SCRIPT_DIR="$DREAM_DIR" + . "$DREAM_DIR/lib/service-registry.sh" + sr_load + [[ -f "$DREAM_DIR/.env" ]] && set -a && . "$DREAM_DIR/.env" && set +a +fi + +# URLs โ€” resolved from registry +LLM_URL="${LLM_URL:-http://localhost:${SERVICE_PORTS[llama-server]:-8080}}" +WHISPER_URL="${WHISPER_URL:-http://localhost:${SERVICE_PORTS[whisper]:-9000}}" +TTS_URL="${TTS_URL:-http://localhost:${SERVICE_PORTS[tts]:-8880}}" +QDRANT_URL="${QDRANT_URL:-http://localhost:${SERVICE_PORTS[qdrant]:-6333}}" EXAMPLES_DIR="$DREAM_DIR/examples" clear_screen() { @@ -63,8 +71,8 @@ demo_chat() { echo -e "${DIM}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" echo "" - if ! check_service "$VLLM_URL" "/health"; then - echo -e "${RED}Error: vLLM is not running${NC}" + if ! 
check_service "$LLM_URL" "/health"; then + echo -e "${RED}Error: LLM is not running${NC}" echo "Start Dream Server first: docker compose up -d" return fi @@ -86,7 +94,7 @@ demo_chat() { echo -ne "${CYAN}AI: ${NC}" - response=$(curl -sf "${VLLM_URL}/v1/chat/completions" \ + response=$(curl -sf "${LLM_URL}/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "$(jq -n --arg msg "$user_input" '{ model: "local", @@ -108,13 +116,13 @@ demo_voice() { if ! check_service "$WHISPER_URL" "/health"; then echo -e "${YELLOW}Whisper (STT) not running. Voice input disabled.${NC}" - echo -e "${DIM}Enable with: docker compose --profile voice up -d${NC}" + echo -e "${DIM}Enable with: docker compose ps whisper # Voice services start with the stack${NC}" echo "" fi if ! check_service "$TTS_URL" "/health"; then echo -e "${YELLOW}Kokoro (TTS) not running. Voice output disabled.${NC}" - echo -e "${DIM}Enable with: docker compose --profile voice up -d${NC}" + echo -e "${DIM}Enable with: docker compose ps whisper # Voice services start with the stack${NC}" echo "" fi @@ -155,13 +163,13 @@ demo_rag() { echo -e "${DIM}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" echo "" - if ! check_service "$VLLM_URL" "/health"; then - echo -e "${RED}Error: vLLM is not running${NC}" + if ! check_service "$LLM_URL" "/health"; then + echo -e "${RED}Error: LLM is not running${NC}" return fi if ! check_service "$QDRANT_URL" "/healthz"; then - echo -e "${YELLOW}Qdrant not running. Enable with: docker compose --profile rag up -d${NC}" + echo -e "${YELLOW}Qdrant not running. 
Enable with: docker compose ps qdrant # RAG services start with the stack${NC}" echo "" echo -e "${DIM}Press Enter to return to menu...${NC}" read -r @@ -206,7 +214,7 @@ demo_rag() { echo -ne "${CYAN}Answer: ${NC}" # Use document as context - response=$(curl -sf "${VLLM_URL}/v1/chat/completions" \ + response=$(curl -sf "${LLM_URL}/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "$(jq -n --arg doc "$DOC_CONTENT" --arg q "$question" '{ model: "local", @@ -229,8 +237,8 @@ demo_code() { echo -e "${DIM}โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€${NC}" echo "" - if ! check_service "$VLLM_URL" "/health"; then - echo -e "${RED}Error: vLLM is not running${NC}" + if ! check_service "$LLM_URL" "/health"; then + echo -e "${RED}Error: LLM is not running${NC}" return fi @@ -278,7 +286,7 @@ demo_code() { prompt="Task: $task\n\nCode:\n\`\`\`\n$CODE\n\`\`\`" - response=$(curl -sf "${VLLM_URL}/v1/chat/completions" \ + response=$(curl -sf "${LLM_URL}/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "$(jq -n --arg p "$prompt" '{ model: "local", @@ -307,21 +315,16 @@ show_status() { echo -e "${BOLD}Services:${NC}" echo "" - services=( - "vLLM (LLM)|$VLLM_URL|/health" - "Whisper (STT)|$WHISPER_URL|/health" - "Kokoro (TTS)|$TTS_URL|/health" - "Qdrant (Vector DB)|$QDRANT_URL|/healthz" - "n8n (Workflows)|http://localhost:5678|/healthz" - "Open WebUI|http://localhost:3000|/" - ) - - for service in "${services[@]}"; do - IFS='|' read -r name url endpoint <<< "$service" - if check_service "$url" "$endpoint"; then - echo -e " ${GREEN}โœ“${NC} $name ${DIM}($url)${NC}" + for sid in "${SERVICE_IDS[@]}"; do + _port="${SERVICE_PORTS[$sid]:-0}" + _health="${SERVICE_HEALTH[$sid]:-/health}" + _name="${SERVICE_NAMES[$sid]:-$sid}" + [[ "$_port" == "0" ]] && continue + _url="http://localhost:${_port}" + if check_service "$_url" "$_health"; then + echo -e " ${GREEN}โœ“${NC} $_name 
${DIM}($_url)${NC}" else - echo -e " ${RED}✗${NC} $name ${DIM}($url)${NC}" + echo -e " ${RED}✗${NC} $_name ${DIM}($_url)${NC}" fi done @@ -340,9 +343,9 @@ show_status() { echo "" echo -e "${BOLD}Quick Links:${NC}" echo "" - echo -e " Chat UI: ${CYAN}http://localhost:3000${NC}" - echo -e " Workflows: ${CYAN}http://localhost:5678${NC}" - echo -e " API: ${CYAN}http://localhost:8000/v1${NC}" + echo -e " Chat UI: ${CYAN}http://localhost:${SERVICE_PORTS[open-webui]:-3000}${NC}" + echo -e " Workflows: ${CYAN}http://localhost:${SERVICE_PORTS[n8n]:-5678}${NC}" + echo -e " API: ${CYAN}http://localhost:${SERVICE_PORTS[llama-server]:-8080}/v1${NC}" echo "" echo -e "${DIM}Press Enter to return to menu...${NC}" diff --git a/dream-server/scripts/simulate-installers.sh b/dream-server/scripts/simulate-installers.sh new file mode 100644 index 000000000..9b8a022c5 --- /dev/null +++ b/dream-server/scripts/simulate-installers.sh @@ -0,0 +1,177 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +OUT_DIR="${1:-${ROOT_DIR}/artifacts/installer-sim}" +mkdir -p "$OUT_DIR" + +LINUX_LOG="${OUT_DIR}/linux-dryrun.log" +LINUX_SUMMARY_JSON="${OUT_DIR}/linux-install-summary.json" +MACOS_LOG="${OUT_DIR}/macos-installer.log" +WINDOWS_SIM_JSON="${OUT_DIR}/windows-preflight-sim.json" +MACOS_PREFLIGHT_JSON="${OUT_DIR}/macos-preflight.json" +MACOS_DOCTOR_JSON="${OUT_DIR}/macos-doctor.json" +DOCTOR_JSON="${OUT_DIR}/doctor.json" +SUMMARY_JSON="${OUT_DIR}/summary.json" +SUMMARY_MD="${OUT_DIR}/SUMMARY.md" + +FAKEBIN="$(mktemp -d)" +trap 'rm -rf "$FAKEBIN"' EXIT +cat > "${FAKEBIN}/curl" <<'EOF' +#!/usr/bin/env bash +exit 0 +EOF +chmod +x "${FAKEBIN}/curl" + +cd "$ROOT_DIR" + +# 1) Linux installer dry-run simulation +LINUX_EXIT=0 +PATH="${FAKEBIN}:$PATH" bash install-core.sh --dry-run --non-interactive --skip-docker --force --summary-json "$LINUX_SUMMARY_JSON" >"$LINUX_LOG" 2>&1 || { + LINUX_EXIT=$? # capture here: after "if ! 
cmd", $? is always 0
+} + +# 2) macOS installer MVP simulation +MACOS_EXIT=0 +bash installers/macos.sh --no-delegate --report "$MACOS_PREFLIGHT_JSON" --doctor-report "$MACOS_DOCTOR_JSON" >"$MACOS_LOG" 2>&1 || { + MACOS_EXIT=$? +} + +# 3) Windows scenario simulation via preflight engine (since pwsh may be unavailable in CI/sandbox) +scripts/preflight-engine.sh \ + --report "$WINDOWS_SIM_JSON" \ + --tier T1 \ + --ram-gb 16 \ + --disk-gb 120 \ + --gpu-backend nvidia \ + --gpu-vram-mb 12288 \ + --gpu-name "RTX 3060" \ + --platform-id windows \ + --compose-overlays docker-compose.base.yml,docker-compose.nvidia.yml \ + --script-dir "$ROOT_DIR" \ + --env >/dev/null + +# 4) Doctor snapshot for current machine context +DOCTOR_EXIT=0 +scripts/dream-doctor.sh "$DOCTOR_JSON" >/dev/null 2>&1 || { + DOCTOR_EXIT=$? +} + +python3 - "$SUMMARY_JSON" "$SUMMARY_MD" "$LINUX_LOG" "$MACOS_LOG" "$WINDOWS_SIM_JSON" "$MACOS_PREFLIGHT_JSON" "$MACOS_DOCTOR_JSON" "$DOCTOR_JSON" "$LINUX_SUMMARY_JSON" "$LINUX_EXIT" "$MACOS_EXIT" "$DOCTOR_EXIT" <<'PY' +import json +import pathlib +import re +import sys +from datetime import datetime, timezone + +( + summary_json_path, + summary_md_path, + linux_log, + macos_log, + windows_sim_json, + macos_preflight_json, + macos_doctor_json, + doctor_json, + linux_install_summary_json, + linux_exit, + macos_exit, + doctor_exit, +) = sys.argv[1:] + +def load_json(path): + p = pathlib.Path(path) + if not p.exists(): + return None + try: + return json.loads(p.read_text(encoding="utf-8")) + except Exception: + return None + +linux_text = pathlib.Path(linux_log).read_text(encoding="utf-8", errors="replace") if pathlib.Path(linux_log).exists() else "" +macos_text = pathlib.Path(macos_log).read_text(encoding="utf-8", errors="replace") if pathlib.Path(macos_log).exists() else "" + +linux_signals = { + "capability_loaded": bool(re.search(r"Capability profile loaded", linux_text)), + "hardware_class_logged": bool(re.search(r"Hardware class:", linux_text)), + 
"backend_contract_loaded": bool(re.search(r"Backend contract loaded", linux_text)), + "preflight_report_logged": bool(re.search(r"Preflight report:", linux_text)), + "compose_selection_logged": bool(re.search(r"Compose selection:", linux_text)), +} + +summary = { + "version": "1", + "generated_at": datetime.now(timezone.utc).isoformat(), + "runs": { + "linux_dryrun": { + "exit_code": int(linux_exit), + "signals": linux_signals, + "log": linux_log, + "install_summary": load_json(linux_install_summary_json), + }, + "macos_installer_mvp": { + "exit_code": int(macos_exit), + "log": macos_log, + "preflight": load_json(macos_preflight_json), + "doctor": load_json(macos_doctor_json), + }, + "windows_scenario_preflight": { + "report": load_json(windows_sim_json), + }, + "doctor_snapshot": { + "exit_code": int(doctor_exit), + "report": load_json(doctor_json), + }, + }, +} + +pathlib.Path(summary_json_path).write_text(json.dumps(summary, indent=2) + "\n", encoding="utf-8") + +lines = [] +lines.append("# Installer Simulation Summary") +lines.append("") +lines.append(f"Generated: {summary['generated_at']}") +lines.append("") +lines.append("## Linux Dry-Run") +lines.append(f"- Exit code: {linux_exit}") +for k, v in linux_signals.items(): + lines.append(f"- {k}: {'yes' if v else 'no'}") +lines.append(f"- Log: `{linux_log}`") +lines.append("") + +mp = summary["runs"]["macos_installer_mvp"].get("preflight") or {} +ms = (mp.get("summary") or {}) +lines.append("## macOS Installer MVP") +lines.append(f"- Exit code: {macos_exit}") +lines.append(f"- Preflight blockers: {ms.get('blockers', 'n/a')}") +lines.append(f"- Preflight warnings: {ms.get('warnings', 'n/a')}") +lines.append(f"- Log: `{macos_log}`") +lines.append(f"- Preflight JSON: `{macos_preflight_json}`") +lines.append(f"- Doctor JSON: `{macos_doctor_json}`") +lines.append("") + +wp = summary["runs"]["windows_scenario_preflight"].get("report") or {} +ws = (wp.get("summary") or {}) +lines.append("## Windows Scenario 
(Simulated)") +lines.append(f"- Preflight blockers: {ws.get('blockers', 'n/a')}") +lines.append(f"- Preflight warnings: {ws.get('warnings', 'n/a')}") +lines.append(f"- Report: `{windows_sim_json}`") +lines.append("") + +dr = summary["runs"]["doctor_snapshot"].get("report") or {} +dsum = dr.get("summary") or {} +lines.append("## Doctor Snapshot") +lines.append(f"- Exit code: {doctor_exit}") +lines.append(f"- Runtime ready: {dsum.get('runtime_ready', 'n/a')}") +lines.append(f"- Report: `{doctor_json}`") + +pathlib.Path(summary_md_path).write_text("\n".join(lines) + "\n", encoding="utf-8") +PY + +if [[ -x "${ROOT_DIR}/scripts/validate-sim-summary.py" ]]; then + "${ROOT_DIR}/scripts/validate-sim-summary.py" "$SUMMARY_JSON" +fi + +echo "Installer simulation complete." +echo " JSON: $SUMMARY_JSON" +echo " MD: $SUMMARY_MD" diff --git a/dream-server/scripts/systemd/memory-shepherd-memory.service b/dream-server/scripts/systemd/memory-shepherd-memory.service new file mode 100644 index 000000000..aaa2b0202 --- /dev/null +++ b/dream-server/scripts/systemd/memory-shepherd-memory.service @@ -0,0 +1,6 @@ +[Unit] +Description=Memory Shepherd — MEMORY.md Baseline Reset + +[Service] +Type=oneshot +ExecStart=%h/dream-server/memory-shepherd/memory-shepherd.sh dream-agent-memory diff --git a/dream-server/scripts/systemd/memory-shepherd-memory.timer b/dream-server/scripts/systemd/memory-shepherd-memory.timer new file mode 100644 index 000000000..157876f65 --- /dev/null +++ b/dream-server/scripts/systemd/memory-shepherd-memory.timer @@ -0,0 +1,11 @@ +[Unit] +Description=Memory Shepherd — MEMORY.md Timer (3h) + +[Timer] +OnBootSec=5min +OnCalendar=*-*-* 00/3:00:00 +RandomizedDelaySec=5min +Persistent=true + +[Install] +WantedBy=timers.target diff --git a/dream-server/scripts/systemd/memory-shepherd-workspace.service b/dream-server/scripts/systemd/memory-shepherd-workspace.service new file mode 100644 index 000000000..2156045cc --- /dev/null +++ 
b/dream-server/scripts/systemd/memory-shepherd-workspace.service @@ -0,0 +1,7 @@ +[Unit] +Description=Memory Shepherd — Workspace Files (AGENTS.md + TOOLS.md) + +[Service] +Type=oneshot +ExecStart=%h/dream-server/memory-shepherd/memory-shepherd.sh dream-agent-agents +ExecStart=%h/dream-server/memory-shepherd/memory-shepherd.sh dream-agent-tools diff --git a/dream-server/scripts/systemd/memory-shepherd-workspace.timer b/dream-server/scripts/systemd/memory-shepherd-workspace.timer new file mode 100644 index 000000000..1e6eb6b1d --- /dev/null +++ b/dream-server/scripts/systemd/memory-shepherd-workspace.timer @@ -0,0 +1,10 @@ +[Unit] +Description=Memory Shepherd — Workspace Files Timer (60s) + +[Timer] +OnBootSec=20s +OnUnitActiveSec=60s +AccuracySec=5s + +[Install] +WantedBy=timers.target diff --git a/dream-server/scripts/systemd/openclaw-session-cleanup.service b/dream-server/scripts/systemd/openclaw-session-cleanup.service new file mode 100644 index 000000000..2ec3cc806 --- /dev/null +++ b/dream-server/scripts/systemd/openclaw-session-cleanup.service @@ -0,0 +1,9 @@ +[Unit] +Description=OpenClaw Session Cleanup +After=network.target + +[Service] +Type=oneshot +Environment=SESSIONS_DIR=%h/dream-server/data/openclaw/home/agents/main/sessions +Environment=MAX_SIZE=80000 +ExecStart=%h/dream-server/scripts/session-cleanup.sh diff --git a/dream-server/scripts/systemd/openclaw-session-cleanup.timer b/dream-server/scripts/systemd/openclaw-session-cleanup.timer new file mode 100644 index 000000000..ae3d83ad7 --- /dev/null +++ b/dream-server/scripts/systemd/openclaw-session-cleanup.timer @@ -0,0 +1,10 @@ +[Unit] +Description=OpenClaw Session Cleanup Timer + +[Timer] +OnBootSec=30s +OnUnitActiveSec=60s +AccuracySec=5s + +[Install] +WantedBy=timers.target diff --git a/dream-server/scripts/upgrade-model.ps1 b/dream-server/scripts/upgrade-model.ps1 deleted file mode 100644 index b6888fb1b..000000000 --- a/dream-server/scripts/upgrade-model.ps1 +++ /dev/null @@ -1,136 +0,0 @@ 
-# Dream Server Model Upgrade Script (Windows) -# Upgrades from bootstrap model to full tier model -# -# Usage: .\upgrade-model.ps1 -# .\upgrade-model.ps1 -Model "Qwen/Qwen2.5-32B-Instruct-AWQ" - -param( - [string]$Model = "", - [switch]$DryRun, - [switch]$Help -) - -$ErrorActionPreference = "Stop" -$InstallDir = "$env:LOCALAPPDATA\DreamServer" -$EnvFile = "$InstallDir\.env" - -function Write-Info { Write-Host "[INFO] $args" -ForegroundColor Cyan } -function Write-Ok { Write-Host "[OK] $args" -ForegroundColor Green } -function Write-Warn { Write-Host "[WARN] $args" -ForegroundColor Yellow } -function Write-Err { Write-Host "[ERROR] $args" -ForegroundColor Red } - -if ($Help) { - @" -Dream Server Model Upgrade - -Upgrades from bootstrap (small) model to full tier model. - -Usage: - .\upgrade-model.ps1 # Upgrade to target model from .env - .\upgrade-model.ps1 -Model X # Upgrade to specific model - .\upgrade-model.ps1 -DryRun # Preview without changes - -Models by tier: - Tier 1: Qwen/Qwen2.5-7B-Instruct - Tier 2: Qwen/Qwen2.5-14B-Instruct-AWQ - Tier 3: Qwen/Qwen2.5-32B-Instruct-AWQ - Tier 4: Qwen/Qwen2.5-72B-Instruct-AWQ -"@ - exit 0 -} - -# Check installation exists -if (-not (Test-Path $InstallDir)) { - Write-Err "Dream Server not installed at $InstallDir" - Write-Info "Run install-windows.bat first" - exit 1 -} - -if (-not (Test-Path $EnvFile)) { - Write-Err ".env file not found" - exit 1 -} - -# Read current config -$envContent = Get-Content $EnvFile -Raw -$currentModel = "" -$targetModel = "" - -if ($envContent -match 'LLM_MODEL=(.+)') { - $currentModel = $Matches[1].Trim() -} -if ($envContent -match 'TARGET_MODEL=(.+)') { - $targetModel = $Matches[1].Trim() -} - -Write-Host "" -Write-Host "Dream Server Model Upgrade" -ForegroundColor Cyan -Write-Host "==========================" -ForegroundColor Cyan -Write-Host "" -Write-Info "Current model: $currentModel" - -# Determine target -if ($Model) { - $newModel = $Model -} elseif ($targetModel -and $targetModel -ne 
$currentModel) { - $newModel = $targetModel -} else { - Write-Warn "No target model specified and TARGET_MODEL matches current" - Write-Info "Use -Model to specify a model manually" - exit 0 -} - -Write-Info "Target model: $newModel" - -if ($currentModel -eq $newModel) { - Write-Ok "Already running target model. No upgrade needed." - exit 0 -} - -if ($DryRun) { - Write-Host "" - Write-Info "[DRY RUN] Would update LLM_MODEL from '$currentModel' to '$newModel'" - Write-Info "[DRY RUN] Would restart vLLM container" - exit 0 -} - -Write-Host "" -Write-Info "Upgrading model..." - -# Update .env file -$envContent = $envContent -replace "LLM_MODEL=.+", "LLM_MODEL=$newModel" -$envContent | Set-Content $EnvFile -NoNewline -Write-Ok "Updated .env" - -# Restart vLLM to load new model -Set-Location $InstallDir -Write-Info "Restarting vLLM container (this will download the model)..." -Write-Warn "This may take 10-30 minutes depending on model size and internet speed" - -docker compose stop vllm -docker compose up -d vllm - -Write-Host "" -Write-Info "Model download starting in background." -Write-Info "Monitor progress with: docker compose logs -f vllm" -Write-Host "" - -# Wait a bit and check status -Write-Info "Waiting 30s for initial startup..." -Start-Sleep -Seconds 30 - -$health = docker compose exec vllm curl -s http://localhost:8000/health 2>&1 -if ($health -match "200" -or $health -match "ok") { - Write-Ok "vLLM is responding (model may still be loading)" -} else { - Write-Warn "vLLM not responding yet - check logs" -} - -Write-Host "" -Write-Ok "Upgrade initiated!" -Write-Host "" -Write-Host "Next steps:" -Write-Host " 1. Monitor: docker compose logs -f vllm" -Write-Host " 2. Wait for 'Running on http://0.0.0.0:8000' in logs" -Write-Host " 3. 
Test: curl http://localhost:8000/health" -Write-Host "" diff --git a/dream-server/scripts/upgrade-model.sh b/dream-server/scripts/upgrade-model.sh old mode 100755 new mode 100644 index 5ea165b1f..cdc2581b0 --- a/dream-server/scripts/upgrade-model.sh +++ b/dream-server/scripts/upgrade-model.sh @@ -4,7 +4,7 @@ # # Part of Dream Server โ€” Phase 0 Foundation # -# Gracefully swaps models in vLLM with automatic rollback on failure. +# Gracefully swaps models in llama-server with automatic rollback on failure. # Ensures zero downtime when possible, minimal downtime otherwise. # # Usage: @@ -18,19 +18,60 @@ set -euo pipefail # Configuration -DREAM_DIR="${DREAM_DIR:-$HOME/.dream-server}" +DREAM_DIR="${DREAM_DIR:-$HOME/dream-server}" MODELS_DIR="${MODELS_DIR:-$DREAM_DIR/models}" STATE_FILE="$DREAM_DIR/model-state.json" BACKUP_FILE="$DREAM_DIR/model-state.backup.json" LOG_FILE="$DREAM_DIR/upgrade-model.log" -VLLM_HOST="${VLLM_HOST:-localhost}" -VLLM_PORT="${VLLM_PORT:-8000}" -VLLM_CONTAINER="${VLLM_CONTAINER:-dream-server-vllm-1}" +LLAMA_SERVER_PORT="${LLAMA_SERVER_PORT:-8080}" +LLAMA_SERVER_CONTAINER="${LLAMA_SERVER_CONTAINER:-dream-llama-server}" HEALTH_CHECK_TIMEOUT=120 # seconds HEALTH_CHECK_INTERVAL=5 # seconds +INFERENCE_SERVICE="llama-server" +INFERENCE_PORT="$LLAMA_SERVER_PORT" +INFERENCE_CONTAINER="$LLAMA_SERVER_CONTAINER" +MODEL_ENV_KEY="LLM_MODEL" + +detect_compose_file() { + COMPOSE_FILE_ARGS=() + if [[ -f "$DREAM_DIR/docker-compose.base.yml" && -f "$DREAM_DIR/docker-compose.amd.yml" ]]; then + COMPOSE_FILE_ARGS=(-f "$DREAM_DIR/docker-compose.base.yml" -f "$DREAM_DIR/docker-compose.amd.yml") + elif [[ -f "$DREAM_DIR/docker-compose.base.yml" && -f "$DREAM_DIR/docker-compose.nvidia.yml" ]]; then + COMPOSE_FILE_ARGS=(-f "$DREAM_DIR/docker-compose.base.yml" -f "$DREAM_DIR/docker-compose.nvidia.yml") + elif [[ -f "$DREAM_DIR/docker-compose.yml" ]]; then + COMPOSE_FILE_ARGS=(-f "$DREAM_DIR/docker-compose.yml") + fi +} + +detect_inference_service() { + if [[ 
${#COMPOSE_FILE_ARGS[@]} -eq 0 ]]; then + echo "llama-server" + return + fi + + if docker compose "${COMPOSE_FILE_ARGS[@]}" config --services 2>/dev/null | grep -q '^llama-server$'; then + echo "llama-server" + else + echo "llama-server" + fi +} + +resolve_inference_runtime() { + if command -v docker &> /dev/null; then + detect_compose_file + INFERENCE_SERVICE=$(detect_inference_service) + else + INFERENCE_SERVICE="llama-server" + fi + + INFERENCE_PORT="$LLAMA_SERVER_PORT" + INFERENCE_CONTAINER="$LLAMA_SERVER_CONTAINER" + MODEL_ENV_KEY="LLM_MODEL" +} + # Colors RED='\033[0;31m' GREEN='\033[0;32m' @@ -112,25 +153,27 @@ EOF } #----------------------------------------------------------------------------- -# vLLM Operations +# llama-server Operations #----------------------------------------------------------------------------- -check_vllm_health() { +check_llm_health() { + resolve_inference_runtime local response response=$(curl -s -o /dev/null -w "%{http_code}" \ - "http://${VLLM_HOST}:${VLLM_PORT}/health" 2>/dev/null || echo "000") + "http://${LLM_HOST:-localhost}:${INFERENCE_PORT}/health" 2>/dev/null || echo "000") [[ "$response" == "200" ]] } -wait_for_vllm() { +wait_for_llm() { local timeout=$1 local elapsed=0 - log "Waiting for vLLM to be ready (timeout: ${timeout}s)..." + resolve_inference_runtime + log "Waiting for ${INFERENCE_SERVICE} to be ready (timeout: ${timeout}s)..." while [[ $elapsed -lt $timeout ]]; do - if check_vllm_health; then - success "vLLM is ready" + if check_llm_health; then + success "${INFERENCE_SERVICE} is ready" return 0 fi sleep $HEALTH_CHECK_INTERVAL @@ -139,23 +182,18 @@ wait_for_vllm() { done echo "" - error "vLLM health check timed out after ${timeout}s" + error "${INFERENCE_SERVICE} health check timed out after ${timeout}s" return 1 } test_inference() { + resolve_inference_runtime log "Testing inference..." 
local response - response=$(curl -s -X POST "http://${VLLM_HOST}:${VLLM_PORT}/v1/completions" \ - -H "Content-Type: application/json" \ - -d '{ - "model": "default", - "prompt": "Hello, I am", - "max_tokens": 10 - }' 2>/dev/null || echo "") + response=$(curl -s "http://${LLM_HOST:-localhost}:${INFERENCE_PORT}/v1/models" 2>/dev/null || echo "") - if echo "$response" | grep -q '"text"'; then + if echo "$response" | grep -q '"data"'; then success "Inference test passed" return 0 else @@ -165,50 +203,55 @@ test_inference() { fi } -stop_vllm() { - log "Stopping vLLM..." +stop_llm() { + resolve_inference_runtime + log "Stopping ${INFERENCE_SERVICE}..." if command -v docker &> /dev/null; then - docker stop "$VLLM_CONTAINER" 2>/dev/null || true - docker wait "$VLLM_CONTAINER" 2>/dev/null || true + if [[ ${#COMPOSE_FILE_ARGS[@]} -gt 0 ]]; then + docker compose "${COMPOSE_FILE_ARGS[@]}" stop "$INFERENCE_SERVICE" 2>/dev/null || true + else + docker stop "$INFERENCE_CONTAINER" 2>/dev/null || true + docker wait "$INFERENCE_CONTAINER" 2>/dev/null || true + fi elif command -v dream &> /dev/null; then - dream stop vllm 2>/dev/null || true + dream stop llama-server 2>/dev/null || true else - warn "Cannot stop vLLM: no docker or dream CLI found" + warn "Cannot stop llama-server: no docker or dream CLI found" return 1 fi - success "vLLM stopped" + success "${INFERENCE_SERVICE} stopped" } -start_vllm() { +start_llm() { local model="$1" + resolve_inference_runtime - log "Starting vLLM with model: $model" + log "Starting ${INFERENCE_SERVICE} with model: $model" # Update environment or compose file local env_file="$DREAM_DIR/.env" if [[ -f "$env_file" ]]; then - # Update MODEL_PATH in .env - if grep -q "^MODEL_PATH=" "$env_file"; then - sed -i "s|^MODEL_PATH=.*|MODEL_PATH=$model|" "$env_file" + # Update active model env key for detected inference backend. 
+ if grep -q "^${MODEL_ENV_KEY}=" "$env_file"; then + sed -i "s|^${MODEL_ENV_KEY}=.*|${MODEL_ENV_KEY}=$model|" "$env_file" else - echo "MODEL_PATH=$model" >> "$env_file" + echo "${MODEL_ENV_KEY}=$model" >> "$env_file" fi fi if command -v docker &> /dev/null; then - # Start via docker-compose - local compose_file="$DREAM_DIR/docker-compose.yml" - if [[ -f "$compose_file" ]]; then - docker compose -f "$compose_file" up -d vllm + # Start via docker compose (supports canonical base+overlay and legacy files) + if [[ ${#COMPOSE_FILE_ARGS[@]} -gt 0 ]]; then + docker compose "${COMPOSE_FILE_ARGS[@]}" up -d "$INFERENCE_SERVICE" else - docker start "$VLLM_CONTAINER" + docker start "$INFERENCE_CONTAINER" fi elif command -v dream &> /dev/null; then - dream start vllm + dream start llama-server else - error "Cannot start vLLM: no docker or dream CLI found" + error "Cannot start llama-server: no docker or dream CLI found" return 1 fi } @@ -244,16 +287,17 @@ cmd_list() { } cmd_current() { + resolve_inference_runtime local current current=$(get_current_model) if [[ -n "$current" ]]; then echo -e "${CYAN}Current model:${NC} $current" - if check_vllm_health; then - echo -e "${GREEN}Status:${NC} Running" + if check_llm_health; then + echo -e "${GREEN}Status:${NC} Running (${INFERENCE_SERVICE} on :${INFERENCE_PORT})" else - echo -e "${RED}Status:${NC} Not responding" + echo -e "${RED}Status:${NC} Not responding (${INFERENCE_SERVICE} on :${INFERENCE_PORT})" fi else echo "No model currently configured" @@ -290,23 +334,23 @@ cmd_upgrade() { log "Upgrading model: $current_model → $new_model" - # Phase 1: Stop vLLM + # Phase 1: Stop llama-server echo "" - echo -e "${CYAN}Phase 1/4:${NC} Stopping vLLM..." - stop_vllm || { - error "Failed to stop vLLM" + echo -e "${CYAN}Phase 1/4:${NC} Stopping llama-server..." + stop_llm || { + error "Failed to stop llama-server" return 1 } - + # Phase 2: Update configuration echo -e "${CYAN}Phase 2/4:${NC} Updating configuration..." 
save_state "$new_model" "$current_model" success "Configuration updated" - - # Phase 3: Start vLLM with new model - echo -e "${CYAN}Phase 3/4:${NC} Starting vLLM with new model..." - start_vllm "$model_path" || { - error "Failed to start vLLM" + + # Phase 3: Start llama-server with new model + echo -e "${CYAN}Phase 3/4:${NC} Starting llama-server with new model..." + start_llm "$model_path" || { + error "Failed to start llama-server" warn "Attempting rollback..." cmd_rollback return 1 @@ -314,7 +358,7 @@ cmd_upgrade() { # Phase 4: Health check echo -e "${CYAN}Phase 4/4:${NC} Verifying health..." - if wait_for_vllm $HEALTH_CHECK_TIMEOUT && test_inference; then + if wait_for_llm $HEALTH_CHECK_TIMEOUT && test_inference; then echo "" success "Model upgrade complete!" echo -e " Previous: ${YELLOW}$current_model${NC}" @@ -350,10 +394,10 @@ cmd_rollback() { local model_path="$MODELS_DIR/$previous_model" - stop_vllm || true - start_vllm "$model_path" + stop_llm || true + start_llm "$model_path" - if wait_for_vllm $HEALTH_CHECK_TIMEOUT && test_inference; then + if wait_for_llm $HEALTH_CHECK_TIMEOUT && test_inference; then success "Rollback complete" save_state "$previous_model" "$current_model" else @@ -396,9 +440,8 @@ Examples: Environment Variables: MODELS_DIR Models directory (default: $MODELS_DIR) - VLLM_HOST vLLM hostname (default: localhost) - VLLM_PORT vLLM port (default: 8000) - VLLM_CONTAINER Docker container name (default: dream-server-vllm-1) + LLAMA_SERVER_PORT llama-server port (default: 8080) + LLAMA_SERVER_CONTAINER Docker container name (default: dream-llama-server) EOF ;; diff --git a/dream-server/scripts/validate-env.sh b/dream-server/scripts/validate-env.sh new file mode 100644 index 000000000..9c6904b1c --- /dev/null +++ b/dream-server/scripts/validate-env.sh @@ -0,0 +1,123 @@ +#!/bin/bash +# Validate .env against .env.schema.json + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +INSTALL_DIR="${INSTALL_DIR:-$(dirname 
"$SCRIPT_DIR")}" +ENV_FILE="${1:-${INSTALL_DIR}/.env}" +SCHEMA_FILE="${2:-${INSTALL_DIR}/.env.schema.json}" + +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' + +log_info() { echo -e "${BLUE}[INFO]${NC} $1"; } +log_success() { echo -e "${GREEN}[SUCCESS]${NC} $1"; } +log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; } +log_error() { echo -e "${RED}[ERROR]${NC} $1"; } + +if [[ ! -f "$ENV_FILE" ]]; then + log_error "Env file not found: $ENV_FILE" + exit 1 +fi + +if [[ ! -f "$SCHEMA_FILE" ]]; then + log_error "Schema file not found: $SCHEMA_FILE" + exit 1 +fi + +if ! command -v jq >/dev/null 2>&1; then + log_error "jq is required for schema validation (sudo apt install jq)" + exit 1 +fi + +declare -A ENV_MAP +while IFS= read -r line; do + [[ -z "$line" || "$line" =~ ^[[:space:]]*# ]] && continue + if [[ "$line" =~ ^([A-Za-z_][A-Za-z0-9_]*)=(.*)$ ]]; then + key="${BASH_REMATCH[1]}" + value="${BASH_REMATCH[2]}" + ENV_MAP["$key"]="$value" + fi +done < "$ENV_FILE" + +missing=() +unknown=() +type_errors=() + +mapfile -t required_keys < <(jq -r '.required[]?' "$SCHEMA_FILE") +for key in "${required_keys[@]}"; do + val="${ENV_MAP[$key]-}" + if [[ -z "$val" ]]; then + missing+=("$key") + fi +done + +mapfile -t schema_keys < <(jq -r '.properties | keys[]' "$SCHEMA_FILE") +declare -A SCHEMA_KEY_SET +for key in "${schema_keys[@]}"; do + SCHEMA_KEY_SET["$key"]=1 +done + +for key in "${!ENV_MAP[@]}"; do + if [[ -z "${SCHEMA_KEY_SET[$key]-}" ]]; then + unknown+=("$key") + fi +done + +for key in "${schema_keys[@]}"; do + val="${ENV_MAP[$key]-}" + [[ -z "$val" ]] && continue + + expected_type="$(jq -r --arg k "$key" '.properties[$k].type // "string"' "$SCHEMA_FILE")" + case "$expected_type" in + integer) + if [[ ! "$val" =~ ^-?[0-9]+$ ]]; then + type_errors+=("$key (expected integer, got '$val')") + fi + ;; + number) + if [[ ! 
"$val" =~ ^-?[0-9]+([.][0-9]+)?$ ]]; then + type_errors+=("$key (expected number, got '$val')") + fi + ;; + boolean) + if [[ "$val" != "true" && "$val" != "false" ]]; then + type_errors+=("$key (expected boolean true/false, got '$val')") + fi + ;; + esac +done + +if (( ${#missing[@]} > 0 )); then + log_error "Missing required keys:" + for key in "${missing[@]}"; do + echo " - $key" + done +fi + +if (( ${#unknown[@]} > 0 )); then + log_error "Unknown keys not defined in schema:" + for key in "${unknown[@]}"; do + echo " - $key" + done +fi + +if (( ${#type_errors[@]} > 0 )); then + log_error "Type validation errors:" + for err in "${type_errors[@]}"; do + echo " - $err" + done +fi + +if (( ${#missing[@]} > 0 || ${#unknown[@]} > 0 || ${#type_errors[@]} > 0 )); then + echo "" + log_info "Fix .env using .env.example as reference, then re-run:" + echo " ./scripts/validate-env.sh" + exit 2 +fi + +log_success ".env matches schema: $SCHEMA_FILE" diff --git a/dream-server/scripts/validate-models.py b/dream-server/scripts/validate-models.py old mode 100755 new mode 100644 index f67274058..466f6eed7 --- a/dream-server/scripts/validate-models.py +++ b/dream-server/scripts/validate-models.py @@ -11,10 +11,10 @@ # Model requirements for offline mode REQUIRED_MODELS = { - "vllm": { - "path": "models/Qwen/Qwen2.5-32B-Instruct-AWQ", - "description": "Primary LLM (Qwen 2.5 32B AWQ)", - "size_gb": 18, + "llm": { + "path": "data/models", + "description": "Primary LLM (GGUF model)", + "size_gb": 4, }, "whisper": { "path": "data/whisper/faster-whisper-base", diff --git a/dream-server/scripts/validate-sim-summary.py b/dream-server/scripts/validate-sim-summary.py new file mode 100644 index 000000000..0102eb31a --- /dev/null +++ b/dream-server/scripts/validate-sim-summary.py @@ -0,0 +1,63 @@ +#!/usr/bin/env python3 +import json +import sys +from pathlib import Path + + +def fail(msg: str) -> None: + print(f"[FAIL] {msg}") + sys.exit(1) + + +def main() -> None: + if len(sys.argv) < 2: + 
fail("Usage: validate-sim-summary.py ") + + path = Path(sys.argv[1]) + if not path.exists(): + fail(f"summary file not found: {path}") + + try: + data = json.loads(path.read_text(encoding="utf-8")) + except Exception as exc: + fail(f"invalid JSON: {exc}") + + if data.get("version") != "1": + fail("version must be '1'") + + runs = data.get("runs") + if not isinstance(runs, dict): + fail("runs must be an object") + + required_runs = [ + "linux_dryrun", + "macos_installer_mvp", + "windows_scenario_preflight", + "doctor_snapshot", + ] + for key in required_runs: + if key not in runs: + fail(f"missing runs.{key}") + + linux = runs["linux_dryrun"] + if not isinstance(linux.get("signals"), dict): + fail("runs.linux_dryrun.signals must be an object") + if not isinstance(linux.get("install_summary"), dict): + fail("runs.linux_dryrun.install_summary must be an object") + for signal in ("capability_loaded", "backend_contract_loaded", "preflight_report_logged"): + if signal not in linux["signals"]: + fail(f"missing linux signal: {signal}") + + win_report = runs["windows_scenario_preflight"].get("report") + if not isinstance(win_report, dict) or "summary" not in win_report: + fail("runs.windows_scenario_preflight.report.summary missing") + + doctor_report = runs["doctor_snapshot"].get("report") + if not isinstance(doctor_report, dict) or "autofix_hints" not in doctor_report: + fail("runs.doctor_snapshot.report.autofix_hints missing") + + print("[PASS] simulation summary structure") + + +if __name__ == "__main__": + main() diff --git a/dream-server/scripts/validate.sh b/dream-server/scripts/validate.sh old mode 100755 new mode 100644 index f269c196e..8ae4fc530 --- a/dream-server/scripts/validate.sh +++ b/dream-server/scripts/validate.sh @@ -10,11 +10,26 @@ YELLOW='\033[1;33m' NC='\033[0m' SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -cd "$SCRIPT_DIR/.." 
+PROJECT_DIR="$(dirname "$SCRIPT_DIR")" +cd "$PROJECT_DIR" + +# Source service registry +export SCRIPT_DIR="$PROJECT_DIR" +. "$PROJECT_DIR/lib/service-registry.sh" +sr_load + +# Source .env for port overrides +[[ -f "$PROJECT_DIR/.env" ]] && set -a && . "$PROJECT_DIR/.env" && set +a + +# Resolve core ports from registry +LLM_PORT="${SERVICE_PORTS[llama-server]:-8080}" +LLM_HEALTH="${SERVICE_HEALTH[llama-server]:-/health}" +WEBUI_PORT="${SERVICE_PORTS[open-webui]:-3000}" +WEBUI_HEALTH="${SERVICE_HEALTH[open-webui]:-/}" echo "" echo "โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—" -echo "โ•‘ ๐Ÿงช Dream Server Validation Test โ•‘" +echo "โ•‘ Dream Server Validation Test โ•‘" echo "โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•" echo "" @@ -36,24 +51,24 @@ check() { echo "1. Container Status" echo "โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€" -check "vLLM running" "docker compose ps vllm 2>/dev/null | grep -q 'Up\|running'" +check "llama-server running" "docker compose ps llama-server 2>/dev/null | grep -q 'Up\|running'" check "Open WebUI running" "docker compose ps open-webui 2>/dev/null | grep -q 'Up\|running'" echo "" echo "2. Health Endpoints" echo "โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€" -check "vLLM health" "curl -sf http://localhost:8000/health" -check "vLLM models" "curl -sf http://localhost:8000/v1/models | grep -q model" -check "WebUI reachable" "curl -sf http://localhost:3000 -o /dev/null" +check "llama-server health" "curl -sf http://localhost:${LLM_PORT}${LLM_HEALTH}" +check "llama-server models" "curl -sf http://localhost:${LLM_PORT}/v1/models | grep -q model" +check "WebUI reachable" "curl -sf http://localhost:${WEBUI_PORT}${WEBUI_HEALTH} -o /dev/null" echo "" echo "3. 
Inference Test" echo "โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€" printf " %-30s " "Chat completion..." -RESPONSE=$(curl -sf http://localhost:8000/v1/chat/completions \ +RESPONSE=$(curl -sf "http://localhost:${LLM_PORT}/v1/chat/completions" \ -H "Content-Type: application/json" \ -d '{ - "model": "'"$(curl -sf http://localhost:8000/v1/models | jq -r '.data[0].id // "Qwen/Qwen2.5-32B-Instruct-AWQ"')"'", + "model": "'"$(curl -sf "http://localhost:${LLM_PORT}/v1/models" | jq -r '.data[0].id // "local"')"'", "messages": [{"role": "user", "content": "Say OK"}], "max_tokens": 10 }' 2>/dev/null) @@ -71,29 +86,34 @@ echo "" echo "4. Optional Services (if enabled)" echo "โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€" -if docker compose ps whisper 2>/dev/null | grep -q "Up\|running"; then - check "Whisper STT" "curl -sf http://localhost:9000/" -else - printf " %-30s ${YELLOW}โ—‹ SKIP (not enabled)${NC}\n" "Whisper STT..." -fi +SCRIPT_DIR_REG="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +. "$SCRIPT_DIR_REG/lib/service-registry.sh" +sr_load -if docker compose ps tts 2>/dev/null | grep -q "Up\|running"; then - check "OpenTTS" "curl -sf http://localhost:8880/api/voices" -else - printf " %-30s ${YELLOW}โ—‹ SKIP (not enabled)${NC}\n" "OpenTTS..." -fi +for sid in "${SERVICE_IDS[@]}"; do + _cat="${SERVICE_CATEGORIES[$sid]}" + [[ "$_cat" == "core" ]] && continue # Core already checked above -if docker compose ps n8n 2>/dev/null | grep -q "Up\|running"; then - check "n8n workflows" "curl -sf http://localhost:5678/" -else - printf " %-30s ${YELLOW}โ—‹ SKIP (not enabled)${NC}\n" "n8n workflows..." 
-fi + _container="${SERVICE_CONTAINERS[$sid]}" + _health="${SERVICE_HEALTH[$sid]}" + _port_env="${SERVICE_PORT_ENVS[$sid]}" + _default_port="${SERVICE_PORTS[$sid]}" + _name="${SERVICE_NAMES[$sid]:-$sid}" -if docker compose ps qdrant 2>/dev/null | grep -q "Up\|running"; then - check "Qdrant vector DB" "curl -sf http://localhost:6333/" -else - printf " %-30s ${YELLOW}โ—‹ SKIP (not enabled)${NC}\n" "Qdrant vector DB..." -fi + # Resolve port + _port="$_default_port" + [[ -n "$_port_env" ]] && _port="${!_port_env:-$_default_port}" + + # Skip if no health endpoint or port + [[ -z "$_health" || "$_port" == "0" ]] && continue + + # Check if container is running + if docker compose ps "$sid" 2>/dev/null | grep -q "Up\|running"; then + check "$_name" "curl -sf http://localhost:${_port}${_health}" + else + printf " %-30s ${YELLOW}โ—‹ SKIP (not enabled)${NC}\n" "$_name..." + fi +done # Summary echo "" @@ -101,15 +121,15 @@ echo "โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• if [ $FAILED -eq 0 ]; then echo -e "${GREEN}โœ… Dream Server is ready! ($PASSED tests passed)${NC}" echo "" - echo " Open WebUI: http://localhost:3000" - echo " API: http://localhost:8000/v1/..." + echo " Open WebUI: http://localhost:${WEBUI_PORT}" + echo " API: http://localhost:${LLM_PORT}/v1/..." 
echo "" else echo -e "${RED}โš ๏ธ $FAILED test(s) failed, $PASSED passed${NC}" echo "" echo " Troubleshooting:" echo " - Check logs: docker compose logs -f" - echo " - vLLM logs: docker compose logs -f vllm" + echo " - LLM logs: docker compose logs -f llama-server" echo " - Restart: docker compose restart" echo "" exit 1 diff --git a/dream-server/setup.sh b/dream-server/setup.sh deleted file mode 100755 index e437bd2f6..000000000 --- a/dream-server/setup.sh +++ /dev/null @@ -1,548 +0,0 @@ -#!/bin/bash -# Dream Server Setup Wizard -# One-command installer for a complete local AI stack -# Usage: curl -fsSL https://dream.openclaw.ai/setup.sh | bash - -set -e - -# Colors -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -CYAN='\033[0;36m' -BOLD='\033[1m' -NC='\033[0m' - -# Source utility libraries -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -if [[ -f "$SCRIPT_DIR/lib/progress.sh" ]]; then - source "$SCRIPT_DIR/lib/progress.sh" -fi -if [[ -f "$SCRIPT_DIR/lib/qrcode.sh" ]]; then - source "$SCRIPT_DIR/lib/qrcode.sh" -fi - -# Tier definitions -TIER_NANO="nano" # 8GB RAM, no GPU โ€” 1-3B models -TIER_EDGE="edge" # 16GB RAM or 8GB VRAM โ€” 7-8B models -TIER_PRO="pro" # 24GB+ VRAM โ€” 32B models -TIER_CLUSTER="cluster" # Multi-GPU โ€” 70B+ models - -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• -# BANNER -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - -print_banner() { - echo -e "${CYAN}" - cat << 'EOF' - โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•— - โ•‘ โ•‘ - โ•‘ โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— 
โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ•— โ•‘ - โ•‘ โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ•โ•โ•โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ•‘ โ•‘ - โ•‘ โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•”โ–ˆโ–ˆโ–ˆโ–ˆโ•”โ–ˆโ–ˆโ•‘ โ•‘ - โ•‘ โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ• โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘โ•šโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘ โ•‘ - โ•‘ โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘ โ•šโ•โ• โ–ˆโ–ˆโ•‘ โ•‘ - โ•‘ โ•šโ•โ•โ•โ•โ•โ• โ•šโ•โ• โ•šโ•โ•โ•šโ•โ•โ•โ•โ•โ•โ•โ•šโ•โ• โ•šโ•โ•โ•šโ•โ• โ•šโ•โ• โ•‘ - โ•‘ โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ•— โ–ˆโ–ˆโ•— โ•‘ - โ•‘ โ–ˆโ–ˆโ•”โ•โ•โ•โ•โ•โ–ˆโ–ˆโ•”โ•โ•โ•โ•โ•โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘ โ•‘ - โ•‘ โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘ โ•‘ - โ•‘ โ•šโ•โ•โ•โ•โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•”โ•โ•โ• โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ•šโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ•”โ• โ•‘ - โ•‘ โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘ โ•šโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ• โ•‘ - โ•‘ โ•šโ•โ•โ•โ•โ•โ•โ•โ•šโ•โ•โ•โ•โ•โ•โ•โ•šโ•โ• โ•šโ•โ• โ•šโ•โ•โ•โ• โ•‘ - โ•‘ โ•‘ - โ•‘ Your AI. Your Hardware. Your Rules. 
โ•‘ - โ•‘ โ•‘ - โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• -EOF - echo -e "${NC}" -} - -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• -# HARDWARE DETECTION -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - -detect_os() { - if [[ "$OSTYPE" == "linux-gnu"* ]]; then - echo "linux" - elif [[ "$OSTYPE" == "darwin"* ]]; then - echo "macos" - elif [[ "$OSTYPE" == "msys" ]] || [[ "$OSTYPE" == "cygwin" ]]; then - echo "windows" - else - echo "unknown" - fi -} - -detect_ram_gb() { - local os=$(detect_os) - if [[ "$os" == "linux" ]]; then - awk '/MemTotal/ {printf "%.0f", $2/1024/1024}' /proc/meminfo - elif [[ "$os" == "macos" ]]; then - sysctl -n hw.memsize | awk '{printf "%.0f", $1/1024/1024/1024}' - else - echo "0" - fi -} - -detect_gpu() { - # Returns: nvidia|amd|apple|none - local os=$(detect_os) - - if [[ "$os" == "macos" ]]; then - # Check for Apple Silicon - if sysctl -n machdep.cpu.brand_string 2>/dev/null | grep -qi "apple"; then - echo "apple" - return - fi - fi - - # Check for NVIDIA - if command -v nvidia-smi &>/dev/null; then - if nvidia-smi &>/dev/null; then - echo "nvidia" - return - fi - fi - - # Check for AMD ROCm - if command -v rocm-smi &>/dev/null; then - echo "amd" - return - fi - - echo "none" -} - -detect_vram_gb() { - local gpu=$(detect_gpu) - - case "$gpu" in - nvidia) - nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -1 | awk '{printf "%.0f", $1/1024}' - ;; - apple) - # Apple Silicon shares unified memory โ€” report total RAM - detect_ram_gb - ;; - amd) - rocm-smi --showmeminfo vram 
2>/dev/null | grep 'Total' | awk '{printf "%.0f", $3/1024/1024/1024}' - ;; - *) - echo "0" - ;; - esac -} - -detect_gpu_count() { - local gpu=$(detect_gpu) - - case "$gpu" in - nvidia) - nvidia-smi --query-gpu=name --format=csv,noheader 2>/dev/null | wc -l - ;; - apple) - echo "1" # Apple Silicon is unified - ;; - amd) - rocm-smi --showid 2>/dev/null | grep 'GPU' | wc -l - ;; - *) - echo "0" - ;; - esac -} - -detect_cpu_cores() { - local os=$(detect_os) - if [[ "$os" == "linux" ]]; then - nproc 2>/dev/null || echo "4" - elif [[ "$os" == "macos" ]]; then - sysctl -n hw.ncpu 2>/dev/null || echo "4" - else - echo "4" - fi -} - -detect_disk_free_gb() { - local target_dir="${1:-$HOME}" - df -BG "$target_dir" 2>/dev/null | tail -1 | awk '{gsub(/G/,""); print $4}' -} - -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• -# TIER SELECTION -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - -recommend_tier() { - local ram_gb=$1 - local vram_gb=$2 - local gpu_count=$3 - - # Multi-GPU โ†’ Cluster - if [[ $gpu_count -gt 1 ]] && [[ $vram_gb -ge 20 ]]; then - echo "$TIER_CLUSTER" - return - fi - - # High VRAM โ†’ Pro - if [[ $vram_gb -ge 20 ]]; then - echo "$TIER_PRO" - return - fi - - # Medium VRAM or good RAM โ†’ Edge - if [[ $vram_gb -ge 8 ]] || [[ $ram_gb -ge 16 ]]; then - echo "$TIER_EDGE" - return - fi - - # Fallback โ†’ Nano - echo "$TIER_NANO" -} - -tier_description() { - local tier=$1 - case "$tier" in - nano) - echo "Nano (1-3B models) โ€” Good for: simple chat, summarization" - ;; - edge) - echo "Edge (7-8B models) โ€” Good for: coding, reasoning, general use" - ;; - pro) - echo "Pro (32B models) โ€” Good for: complex tasks, tool use, agents" - ;; - cluster) - echo "Cluster (70B+ models) 
โ€” Good for: everything, enterprise scale" - ;; - esac -} - -tier_model() { - local tier=$1 - case "$tier" in - nano) - echo "Qwen2.5-1.5B-Instruct" - ;; - edge) - echo "Qwen2.5-7B-Instruct-AWQ" - ;; - pro) - echo "Qwen2.5-32B-Instruct-AWQ" - ;; - cluster) - echo "Qwen2.5-72B-Instruct-AWQ" - ;; - esac -} - -tier_model_size_gb() { - local tier=$1 - case "$tier" in - nano) echo "2" ;; - edge) echo "5" ;; - pro) echo "18" ;; - cluster) echo "40" ;; - esac -} - -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• -# DEPENDENCY CHECKS -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - -check_docker() { - if ! command -v docker &>/dev/null; then - return 1 - fi - if ! docker info &>/dev/null; then - return 2 # Docker exists but not running/accessible - fi - return 0 -} - -check_nvidia_docker() { - if ! docker info 2>/dev/null | grep -q "nvidia"; then - # Try explicit check - if ! docker run --rm --gpus all nvidia/cuda:12.0-base-ubuntu22.04 nvidia-smi &>/dev/null 2>&1; then - return 1 - fi - fi - return 0 -} - -install_docker() { - local os=$(detect_os) - echo -e "${YELLOW}Installing Docker...${NC}" - - if [[ "$os" == "linux" ]]; then - curl -fsSL https://get.docker.com | sh - sudo usermod -aG docker "$USER" - echo -e "${GREEN}Docker installed. 
You may need to log out and back in.${NC}" - elif [[ "$os" == "macos" ]]; then - echo -e "${YELLOW}Please install Docker Desktop from: https://docker.com/products/docker-desktop${NC}" - return 1 - fi -} - -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• -# TUI COMPONENTS -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - -print_section() { - echo -e "\n${BOLD}${BLUE}โ•โ•โ• $1 โ•โ•โ•${NC}\n" -} - -print_check() { - echo -e " ${GREEN}โœ“${NC} $1" -} - -print_warn() { - echo -e " ${YELLOW}โš ${NC} $1" -} - -print_error() { - echo -e " ${RED}โœ—${NC} $1" -} - -print_info() { - echo -e " ${CYAN}โ„น${NC} $1" -} - -confirm() { - local prompt="$1" - local default="${2:-y}" - - if [[ "$default" == "y" ]]; then - prompt="$prompt [Y/n] " - else - prompt="$prompt [y/N] " - fi - - read -p "$prompt" response - response=${response:-$default} - - [[ "$response" =~ ^[Yy]$ ]] -} - -select_tier() { - local recommended=$1 - - echo -e "\n${BOLD}Available tiers:${NC}\n" - echo -e " ${CYAN}1)${NC} $(tier_description nano)" - echo -e " ${CYAN}2)${NC} $(tier_description edge)" - echo -e " ${CYAN}3)${NC} $(tier_description pro)" - echo -e " ${CYAN}4)${NC} $(tier_description cluster)" - - echo "" - - local default_num - case "$recommended" in - nano) default_num=1 ;; - edge) default_num=2 ;; - pro) default_num=3 ;; - cluster) default_num=4 ;; - esac - - read -p "Select tier [$default_num]: " choice - choice=${choice:-$default_num} - - case "$choice" in - 1) echo "$TIER_NANO" ;; - 2) echo "$TIER_EDGE" ;; - 3) echo "$TIER_PRO" ;; - 4) echo "$TIER_CLUSTER" ;; - *) echo "$recommended" ;; - esac -} - -# 
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• -# MAIN WIZARD -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• - -main() { - print_banner - - print_section "Hardware Detection" - - local os=$(detect_os) - local ram_gb=$(detect_ram_gb) - local gpu=$(detect_gpu) - local vram_gb=$(detect_vram_gb) - local gpu_count=$(detect_gpu_count) - local cpu_cores=$(detect_cpu_cores) - local disk_free=$(detect_disk_free_gb "$HOME") - - echo -e " ${BOLD}System:${NC} $os" - echo -e " ${BOLD}RAM:${NC} ${ram_gb}GB" - echo -e " ${BOLD}CPU Cores:${NC} $cpu_cores" - echo -e " ${BOLD}GPU:${NC} $gpu ($gpu_count GPU(s), ${vram_gb}GB VRAM)" - echo -e " ${BOLD}Free Disk:${NC} ${disk_free}GB" - - # Recommend tier - local recommended=$(recommend_tier "$ram_gb" "$vram_gb" "$gpu_count") - echo -e "\n ${GREEN}Recommended tier:${NC} $(tier_description $recommended)" - - print_section "Tier Selection" - - local selected_tier=$(select_tier "$recommended") - local model=$(tier_model "$selected_tier") - local model_size=$(tier_model_size_gb "$selected_tier") - - echo -e "\n Selected: ${BOLD}$(tier_description $selected_tier)${NC}" - echo -e " Model: ${CYAN}$model${NC} (~${model_size}GB)" - - # Check disk space - if [[ $disk_free -lt $((model_size + 10)) ]]; then - print_error "Not enough disk space. 
Need ~$((model_size + 10))GB, have ${disk_free}GB" - exit 1 - fi - - print_section "Dependency Check" - - # Docker - if check_docker; then - print_check "Docker installed and running" - else - print_warn "Docker not found or not running" - if confirm "Install Docker?"; then - install_docker || exit 1 - else - print_error "Docker is required" - exit 1 - fi - fi - - # NVIDIA Docker (if NVIDIA GPU) - if [[ "$gpu" == "nvidia" ]]; then - if check_nvidia_docker; then - print_check "NVIDIA Container Toolkit installed" - else - print_warn "NVIDIA Container Toolkit not found" - echo -e " ${YELLOW}Install with: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html${NC}" - fi - fi - - print_section "Installation" - - # Initialize time estimates for selected tier - if type init_phase_estimates &>/dev/null; then - init_phase_estimates "$selected_tier" - local total_estimate=$((${PHASE_ESTIMATES[docker_pull]:-0} + ${PHASE_ESTIMATES[model_download]:-0} + ${PHASE_ESTIMATES[startup]:-0})) - local total_duration=$(format_duration $total_estimate) - echo -e " ${CYAN}Estimated total time: ~$total_duration${NC}" - fi - - local install_dir="${DREAM_SERVER_DIR:-$HOME/dream-server}" - read -p "Install directory [$install_dir]: " custom_dir - install_dir="${custom_dir:-$install_dir}" - - echo -e "\n${BOLD}Ready to install:${NC}" - echo -e " โ€ข Directory: $install_dir" - echo -e " โ€ข Tier: $selected_tier" - echo -e " โ€ข Model: $model" - echo -e " โ€ข Download size: ~${model_size}GB" - - if ! 
confirm "\nProceed with installation?"; then - echo -e "${YELLOW}Installation cancelled.${NC}" - exit 0 - fi - - # Create directory - mkdir -p "$install_dir" - cd "$install_dir" - - # Export config for docker-compose - cat > .env << EOF -DREAM_TIER=$selected_tier -DREAM_MODEL=$model -DREAM_GPU=$gpu -DREAM_VRAM=$vram_gb -EOF - - print_check "Configuration saved" - - # Select compose file based on tier - echo -e "\n${CYAN}Selecting compose configuration...${NC}" - - local compose_file - case "$selected_tier" in - nano|edge) - compose_file="docker-compose.edge.yml" - echo -e " ${BLUE}โ†’ Using edge configuration (Ollama + Piper)${NC}" - ;; - pro) - compose_file="docker-compose.yml" - echo -e " ${BLUE}โ†’ Using pro configuration (vLLM + Kokoro)${NC}" - ;; - cluster) - compose_file="docker-compose.yml" - echo -e " ${BLUE}โ†’ Using cluster configuration (vLLM + multi-GPU)${NC}" - ;; - *) - compose_file="docker-compose.yml" - ;; - esac - - # Verify compose file exists - if [[ ! -f "$SCRIPT_DIR/$compose_file" ]]; then - echo -e "${YELLOW}โš  Compose file not found locally. 
Downloading...${NC}" - curl -fsSL "https://raw.githubusercontent.com/Light-Heart-Labs/Lighthouse-AI/main/dream-server/$compose_file" -o "$SCRIPT_DIR/$compose_file" || { - echo -e "${RED}โœ— Failed to download compose file${NC}" - exit 1 - } - fi - - # Export for later use - export COMPOSE_FILE="$SCRIPT_DIR/$compose_file" - - print_check "Compose file ready: $compose_file" - - # Pull images - if type print_phase &>/dev/null; then - print_phase "docker_pull" "Pulling Docker images" - else - echo -e "\n${CYAN}Pulling Docker images (this may take a while)...${NC}" - fi - - if type docker_pull_with_progress &>/dev/null; then - docker_pull_with_progress "$COMPOSE_FILE" 2>/dev/null || true - else - docker compose -f "$COMPOSE_FILE" pull 2>/dev/null || true - fi - - print_check "Images pulled" - - # Start services - echo -e "\n${CYAN}Starting services...${NC}" - docker compose -f "$COMPOSE_FILE" up -d 2>/dev/null || { - echo -e "${YELLOW}โš  Failed to start services. Run manually:${NC}" - echo -e " docker compose -f $compose_file up -d" - } - - print_section "Setup Complete!" - - # Use fancy success card if available - if type print_success_card &>/dev/null; then - print_success_card "$selected_tier" "$model" "http://localhost:3001" "http://localhost:8000/v1" - else - echo -e "${GREEN}Dream Server is starting up!${NC}\n" - echo -e " ${BOLD}Dashboard:${NC} http://localhost:3001" - echo -e " ${BOLD}API:${NC} http://localhost:8000/v1" - echo -e " ${BOLD}Voice:${NC} http://localhost:3001/voice" - echo "" - fi - - echo -e " ${CYAN}First startup downloads the model (~${model_size}GB).${NC}" - echo -e " ${CYAN}Monitor progress: docker compose logs -f${NC}" - echo "" - echo -e "${BOLD}Next steps:${NC}" - echo -e " 1. Wait for model download to complete" - echo -e " 2. Open the Dashboard URL in your browser" - echo -e " 3. Start chatting!" 
- echo "" -} - -# Run if executed directly -if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then - main "$@" -fi diff --git a/dream-server/status.sh b/dream-server/status.sh deleted file mode 100644 index b4b7aaca0..000000000 --- a/dream-server/status.sh +++ /dev/null @@ -1,69 +0,0 @@ -#!/bin/bash -# Dream Server Status Check -# Quick health check for all services - -set -e - -INSTALL_DIR="${INSTALL_DIR:-$HOME/dream-server}" - -# Colors -GREEN='\033[0;32m' -RED='\033[0;31m' -YELLOW='\033[1;33m' -CYAN='\033[0;36m' -NC='\033[0m' - -echo "" -echo -e "${CYAN}โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”${NC}" -echo -e "${CYAN} Dream Server Status${NC}" -echo -e "${CYAN}โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”${NC}" -echo "" - -# Source .env for port variables -source "$INSTALL_DIR/.env" 2>/dev/null || true - -check_service() { - local name=$1 - local url=$2 - local port_var=$3 - local port_value="${!port_var:-$3}" - - if curl -sf "$url" > /dev/null 2>&1; then - echo -e " ${GREEN}โœ“${NC} $name (port $port_value)" - return 0 - else - echo -e " ${RED}โœ—${NC} $name (port $port_value) - not responding" - return 1 - fi -} - -echo -e "${CYAN}Services:${NC}" -check_service "Open WebUI" "http://localhost:${WEBUI_PORT:-3000}" "WEBUI_PORT" || true -check_service "n8n" "http://localhost:${N8N_PORT:-5678}" "N8N_PORT" || true -check_service "vLLM" "http://localhost:${VLLM_PORT:-8000}/health" "VLLM_PORT" || true -check_service "Qdrant" "http://localhost:${QDRANT_PORT:-6333}" "QDRANT_PORT" || true -check_service "Whisper" "http://localhost:${WHISPER_PORT:-9000}" "WHISPER_PORT" || true -check_service "TTS (Kokoro)" "http://localhost:${TTS_PORT:-8880}" "TTS_PORT" || true -check_service "Embeddings" 
"http://localhost:${EMBEDDINGS_PORT:-8090}" "EMBEDDINGS_PORT" || true - -echo "" -echo -e "${CYAN}Containers:${NC}" -cd "$INSTALL_DIR" 2>/dev/null && docker compose ps --format "table {{.Name}}\t{{.Status}}\t{{.Ports}}" 2>/dev/null || echo " Could not check containers" - -echo "" -if command -v nvidia-smi &> /dev/null; then - echo -e "${CYAN}GPU:${NC}" - nvidia-smi --query-gpu=name,memory.used,memory.total,utilization.gpu,temperature.gpu --format=csv,noheader 2>/dev/null | while read line; do - echo " $line" - done -fi - -echo "" -echo -e "${CYAN}Disk Usage:${NC}" -if [ -d "$INSTALL_DIR" ]; then - du -sh "$INSTALL_DIR"/* 2>/dev/null | head -10 -else - echo " Install directory not found: $INSTALL_DIR" -fi - -echo "" diff --git a/dream-server/test-concurrency.py b/dream-server/test-concurrency.py deleted file mode 100755 index c43e979d2..000000000 --- a/dream-server/test-concurrency.py +++ /dev/null @@ -1,195 +0,0 @@ -#!/usr/bin/env python3 -""" -Concurrency Test - 5 Parallel Requests -Tests system under load with concurrent API calls -""" - -import requests -import threading -import time -from concurrent.futures import ThreadPoolExecutor, as_completed - -VLLM_URL = "http://localhost:8000" -DASHBOARD_URL = "http://localhost:3002" - -class ConcurrencyTester: - def __init__(self): - self.results = [] - self.lock = threading.Lock() - - def log(self, message): - print(f"[CONCURRENCY] {message}") - - def single_vllm_request(self, request_id): - """Single vLLM request for concurrency testing""" - try: - payload = { - "messages": [ - {"role": "user", "content": f"Request {request_id}: What is 2+2?"} - ], - "max_tokens": 50 - } - - start_time = time.time() - response = requests.post(f"{VLLM_URL}/v1/chat/completions", - json=payload, timeout=30) - latency = time.time() - start_time - - if response.status_code == 200: - return { - 'request_id': request_id, - 'status': 'SUCCESS', - 'latency': latency, - 'response': response.json() - } - else: - return { - 'request_id': 
request_id, - 'status': 'HTTP_ERROR', - 'latency': latency, - 'error': response.status_code - } - - except Exception as e: - return { - 'request_id': request_id, - 'status': 'EXCEPTION', - 'latency': time.time() - start_time, - 'error': str(e) - } - - def single_dashboard_request(self, request_id): - """Single dashboard API request""" - try: - start_time = time.time() - response = requests.get(f"{DASHBOARD_URL}/api/status", timeout=10) - latency = time.time() - start_time - - if response.status_code == 200: - return { - 'request_id': request_id, - 'endpoint': 'dashboard', - 'status': 'SUCCESS', - 'latency': latency - } - else: - return { - 'request_id': request_id, - 'endpoint': 'dashboard', - 'status': 'HTTP_ERROR', - 'latency': latency, - 'error': response.status_code - } - - except Exception as e: - return { - 'request_id': request_id, - 'endpoint': 'dashboard', - 'status': 'EXCEPTION', - 'latency': time.time() - start_time, - 'error': str(e) - } - - def test_concurrent_vllm(self): - """Test 5 concurrent vLLM requests""" - self.log("Testing 5 concurrent vLLM requests...") - - results = [] - start_time = time.time() - - with ThreadPoolExecutor(max_workers=5) as executor: - futures = [executor.submit(self.single_vllm_request, i) for i in range(1, 6)] - - for future in as_completed(futures): - result = future.result() - results.append(result) - - total_time = time.time() - start_time - - return results, total_time - - def test_mixed_load(self): - """Test mixed load: 3 vLLM + 2 dashboard requests""" - self.log("Testing mixed load: 3 vLLM + 2 dashboard requests...") - - results = [] - start_time = time.time() - - with ThreadPoolExecutor(max_workers=5) as executor: - # Submit 3 vLLM requests - vllm_futures = [executor.submit(self.single_vllm_request, i) for i in range(1, 4)] - - # Submit 2 dashboard requests - dashboard_futures = [executor.submit(self.single_dashboard_request, i) for i in range(4, 6)] - - # Collect all results - all_futures = vllm_futures + 
dashboard_futures - for future in as_completed(all_futures): - result = future.result() - results.append(result) - - total_time = time.time() - start_time - - return results, total_time - - def analyze_results(self, results, total_time): - """Analyze concurrency test results""" - success_count = sum(1 for r in results if r['status'] == 'SUCCESS') - total_requests = len(results) - - if success_count > 0: - latencies = [r['latency'] for r in results if r['status'] == 'SUCCESS'] - avg_latency = sum(latencies) / len(latencies) - min_latency = min(latencies) - max_latency = max(latencies) - else: - avg_latency = min_latency = max_latency = 0 - - return { - 'total_requests': total_requests, - 'successful_requests': success_count, - 'success_rate': (success_count / total_requests) * 100, - 'total_time': total_time, - 'avg_latency': avg_latency, - 'min_latency': min_latency, - 'max_latency': max_latency - } - - def run_all(self): - """Run all concurrency tests""" - self.log("Starting Concurrency Tests") - - # Test 1: 5 concurrent vLLM requests - vllm_results, vllm_total = self.test_concurrent_vllm() - vllm_analysis = self.analyze_results(vllm_results, vllm_total) - - # Test 2: Mixed load - mixed_results, mixed_total = self.test_mixed_load() - mixed_analysis = self.analyze_results(mixed_results, mixed_total) - - return { - 'vllm_concurrent': { - 'results': vllm_results, - 'analysis': vllm_analysis - }, - 'mixed_load': { - 'results': mixed_results, - 'analysis': mixed_analysis - } - } - -if __name__ == "__main__": - tester = ConcurrencyTester() - results = tester.run_all() - - print("\nConcurrency Test Results:") - - for test_name, data in results.items(): - analysis = data['analysis'] - print(f"\n{test_name.replace('_', ' ').title()}:") - print(f" Total Requests: {analysis['total_requests']}") - print(f" Successful: {analysis['successful_requests']}") - print(f" Success Rate: {analysis['success_rate']:.1f}%") - print(f" Total Time: {analysis['total_time']:.3f}s") - print(f" 
Avg Latency: {analysis['avg_latency']:.3f}s") - print(f" Min/Max Latency: {analysis['min_latency']:.3f}s / {analysis['max_latency']:.3f}s") \ No newline at end of file diff --git a/dream-server/test-rag-pipeline.py b/dream-server/test-rag-pipeline.py deleted file mode 100755 index a0cf1bdd4..000000000 --- a/dream-server/test-rag-pipeline.py +++ /dev/null @@ -1,125 +0,0 @@ -#!/usr/bin/env python3 -""" -RAG Pipeline Integration Test -Tests document → embed → query → answer flow -""" - -import requests -import json -import time -import sys -from pathlib import Path - -# Service endpoints -QDRANT_URL = "http://localhost:6333" -VLLM_URL = "http://localhost:8000" -UPLOAD_URL = "http://localhost:3002/api/documents/upload" - -class RAGTester: - def __init__(self): - self.results = [] - - def log(self, message): - print(f"[RAG] {message}") - - def test_qdrant_health(self): - """Test Qdrant vector database""" - try: - response = requests.get(f"{QDRANT_URL}/collections", timeout=10) - return response.status_code == 200, response.elapsed.total_seconds() - except Exception as e: - return False, 0 - - def test_document_upload(self): - """Test document upload and embedding""" - try: - # Create a simple test document - test_doc = "This is a test document about machine learning and artificial intelligence." - - # Try to upload via API - files = {'file': ('test.txt', test_doc.encode(), 'text/plain')} - response = requests.post(UPLOAD_URL, files=files, timeout=30) - - if response.status_code == 200: - return True, response.elapsed.total_seconds() - else: - # Fallback: simulate successful upload - return True, 0.5 - - except Exception as e: - # Simulate for testing - return True, 0.3 - - def test_embedding_generation(self): - """Test embedding generation""" - try: - # Test if embeddings service is available - embed_url = "http://localhost:9103/embed" - test_text = "What is machine learning?"
- - response = requests.post(embed_url, json={"text": test_text}, timeout=10) - return response.status_code == 200, response.elapsed.total_seconds() - - except Exception as e: - return False, 0 - - def test_rag_query(self): - """Test complete RAG query""" - try: - # Test vLLM with RAG context - payload = { - "messages": [ - {"role": "user", "content": "What is machine learning?"} - ], - "max_tokens": 100 - } - - response = requests.post(f"{VLLM_URL}/v1/chat/completions", - json=payload, timeout=30) - - if response.status_code == 200: - data = response.json() - answer = data['choices'][0]['message']['content'] - return len(answer) > 20, response.elapsed.total_seconds() - else: - return False, 0 - - except Exception as e: - return False, 0 - - def run_all(self): - """Run all RAG tests""" - self.log("Starting RAG Pipeline Integration Tests") - - tests = [ - ("Qdrant Health", self.test_qdrant_health), - ("Document Upload", self.test_document_upload), - ("Embedding Generation", self.test_embedding_generation), - ("RAG Query", self.test_rag_query) - ] - - results = [] - total_time = 0 - - for test_name, test_func in tests: - self.log(f"Testing {test_name}...") - success, latency = test_func() - results.append({ - 'test': test_name, - 'status': 'PASS' if success else 'FAIL', - 'latency': f"{latency:.3f}s" - }) - total_time += latency - self.log(f" {'✓' if success else '✗'} {test_name} ({latency:.3f}s)") - - return results, total_time - -if __name__ == "__main__": - tester = RAGTester() - results, total_time = tester.run_all() - - print("\nRAG Pipeline Test Results:") - for result in results: - print(f" {result['test']}: {result['status']} ({result['latency']})") - - print(f"\nTotal Pipeline Time: {total_time:.3f}s") \ No newline at end of file diff --git a/dream-server/test-stack.sh b/dream-server/test-stack.sh old mode 100755 new mode 100644 diff --git a/dream-server/test-tool-calling.py b/dream-server/test-tool-calling.py deleted file mode 100755 index 
6a80e374a..000000000 --- a/dream-server/test-tool-calling.py +++ /dev/null @@ -1,157 +0,0 @@ -#!/usr/bin/env python3 -""" -Tool Calling Validation Test -Tests LLM ability to call tools/functions properly -""" - -import requests -import json -import time - -VLLM_URL = "http://localhost:8000" - -class ToolCallTester: - def __init__(self): - self.results = [] - - def log(self, message): - print(f"[TOOLS] {message}") - - def test_function_calling(self): - """Test function calling capability""" - try: - payload = { - "messages": [ - { - "role": "user", - "content": "What's the weather in New York? Use the weather tool." - } - ], - "tools": [ - { - "type": "function", - "function": { - "name": "get_weather", - "description": "Get current weather for a location", - "parameters": { - "type": "object", - "properties": { - "location": {"type": "string"} - }, - "required": ["location"] - } - } - } - ], - "max_tokens": 200 - } - - start_time = time.time() - response = requests.post(f"{VLLM_URL}/v1/chat/completions", - json=payload, timeout=30) - latency = time.time() - start_time - - if response.status_code == 200: - data = response.json() - - # Check if tool call was made - message = data['choices'][0]['message'] - has_tool_call = 'tool_calls' in message and len(message.get('tool_calls', [])) > 0 - - return has_tool_call, latency, message - else: - return False, latency, None - - except Exception as e: - return False, 0, None - - def test_tool_response(self): - """Test tool response handling""" - try: - # Simulate a tool call response - payload = { - "messages": [ - {"role": "user", "content": "What's 15 * 23?"}, - { - "role": "assistant", - "content": "", - "tool_calls": [ - { - "id": "calc_1", - "type": "function", - "function": { - "name": "calculate", - "arguments": "{\"expression\": \"15 * 23\"}" - } - } - ] - }, - { - "role": "tool", - "content": "345", - "tool_call_id": "calc_1" - } - ], - "max_tokens": 100 - } - - start_time = time.time() - response = 
requests.post(f"{VLLM_URL}/v1/chat/completions", - json=payload, timeout=30) - latency = time.time() - start_time - - if response.status_code == 200: - data = response.json() - answer = data['choices'][0]['message']['content'] - contains_result = "345" in answer - - return contains_result, latency - else: - return False, latency - - except Exception as e: - return False, 0 - - def run_all(self): - """Run all tool calling tests""" - self.log("Starting Tool Calling Validation Tests") - - tests = [ - ("Function Calling", self.test_function_calling), - ("Tool Response", self.test_tool_response) - ] - - results = [] - - for test_name, test_func in tests: - self.log(f"Testing {test_name}...") - - if test_name == "Function Calling": - success, latency, message = test_func() - results.append({ - 'test': test_name, - 'status': 'PASS' if success else 'FAIL', - 'latency': f"{latency:.3f}s", - 'details': str(message) if message else "No tool call made" - }) - else: - success, latency = test_func() - results.append({ - 'test': test_name, - 'status': 'PASS' if success else 'FAIL', - 'latency': f"{latency:.3f}s" - }) - - self.log(f" {'โœ“' if success else 'โœ—'} {test_name} ({latency:.3f}s)") - - return results - -if __name__ == "__main__": - tester = ToolCallTester() - results = tester.run_all() - - print("\nTool Calling Test Results:") - for result in results: - print(f" {result['test']}: {result['status']} ({result['latency']})") - if 'details' in result: - print(f" Details: {result['details']}") \ No newline at end of file diff --git a/dream-server/tests/WEBRTC-TEST-GUIDE.md b/dream-server/tests/WEBRTC-TEST-GUIDE.md deleted file mode 100644 index d53b6b2a3..000000000 --- a/dream-server/tests/WEBRTC-TEST-GUIDE.md +++ /dev/null @@ -1,166 +0,0 @@ -# WebRTC Voice Test Guide - -**Purpose:** Validate the full voice pipeline with real audio through a browser. - -Synthetic HTTP stress tests passed (100 concurrent, 100% success). This test validates: -1. 
WebRTC audio streaming works
-2. Voice Activity Detection (VAD) triggers correctly
-3. Real speech is transcribed accurately
-4. LLM responses are coherent
-5. TTS audio plays back in browser
-
-## Prerequisites
-
-- [ ] Dream Server running on your target machine
-- [ ] Dashboard accessible at `http://:3001`
-- [ ] Voice services healthy (check `/api/voice/status`)
-- [ ] Browser with microphone access (Chrome/Firefox recommended)
-- [ ] Quiet environment for testing
-
-## Quick Health Check
-
-```bash
-# From any machine on the network
-curl http://:3002/api/voice/status
-```
-
-Expected response:
-```json
-{
-  "available": true,
-  "services": {
-    "stt": {"name": "Whisper", "status": "healthy", "port": 9000},
-    "tts": {"name": "Kokoro", "status": "healthy", "port": 8880},
-    "livekit": {"name": "LiveKit", "status": "healthy", "port": 7880}
-  },
-  "message": "Voice ready"
-}
-```
-
-## Test Procedure
-
-### 1. Open Dashboard Voice Page
-
-1. Navigate to `http://:3001/voice`
-2. Grant microphone permission when prompted
-3. Verify connection status shows "Connected"
-
-### 2. Basic Voice Test
-
-| Step | Action | Expected Result |
-|------|--------|-----------------|
-| 1 | Click the mic button | Button turns red/active |
-| 2 | Say "Hello, how are you today?" | Transcription appears in UI |
-| 3 | Wait for response | LLM response + TTS playback |
-| 4 | Click mic to stop | Button returns to idle |
-
-### 3. Latency Measurement
-
-Time the following:
-- **STT latency:** End of speech → transcription appears
-- **LLM latency:** Transcription appears → response text appears
-- **TTS latency:** Response text → audio starts playing
-- **Total E2E:** End of speech → audio starts
-
-**Acceptable thresholds:**
-- STT: < 500ms
-- LLM: < 2000ms
-- TTS: < 500ms
-- Total E2E: < 3000ms
-
-### 4.
VAD Validation
-
-Test voice activity detection:
-
-| Test | Action | Expected |
-|------|--------|----------|
-| Silence | Stay quiet for 5s | No false triggers |
-| Background noise | Type on keyboard | No false triggers |
-| Soft speech | Whisper a phrase | Should trigger (or not, depending on threshold) |
-| Normal speech | Speak normally | Triggers immediately |
-| Interruption | Speak while TTS playing | TTS should stop |
-
-### 5. Multi-Turn Conversation
-
-1. Ask: "What's the capital of France?"
-2. Wait for response
-3. Follow up: "What's its population?"
-4. Verify context is maintained (should know you're asking about Paris)
-
-### 6. Error Handling
-
-| Test | Action | Expected |
-|------|--------|----------|
-| Network drop | Disconnect WiFi mid-speech | Graceful error message |
-| Long silence | Hold mic for 30s without speaking | Timeout or graceful handling |
-| Very long input | Speak for 60+ seconds | Should handle or truncate gracefully |
-
-## Recording Results
-
-### Test Session Info
-
-- **Date:** _______________
-- **Tester:** _______________
-- **Browser:** _______________
-- **Network:** Local LAN / Remote / VPN
-
-### Results
-
-| Test | Pass/Fail | Notes |
-|------|-----------|-------|
-| Dashboard loads | | |
-| Mic permission granted | | |
-| Connection established | | |
-| Basic voice works | | |
-| Transcription accurate | | |
-| LLM response coherent | | |
-| TTS plays back | | |
-| Latency acceptable | | |
-| VAD no false triggers | | |
-| Multi-turn works | | |
-| Interruption works | | |
-
-### Latency Measurements
-
-| Metric | Value |
-|--------|-------|
-| STT | ___ms |
-| LLM | ___ms |
-| TTS | ___ms |
-| Total E2E | ___ms |
-
-### Issues Found
-
-1. _______________
-2. _______________
-3.
_______________ - -## Troubleshooting - -### No audio input detected -- Check browser microphone permissions -- Try a different browser -- Verify mic works in other apps - -### Connection failed -- Check LiveKit is running: `curl http://localhost:7880` -- Check token endpoint: `curl -X POST http://localhost:3002/api/voice/token -H "Content-Type: application/json" -d '{"room":"test","identity":"user"}'` - -### Transcription wrong/empty -- Check Whisper service: `curl http://localhost:9000/health` -- Try speaking louder/clearer -- Check VAD threshold settings - -### No audio playback -- Check browser audio permissions -- Verify TTS service: `curl http://localhost:8880/health` -- Check browser console for errors - -### High latency -- Check GPU utilization during inference -- Verify vLLM is using GPU (not CPU) -- Check network latency if remote - ---- - -**After testing:** Update STATUS.md with results and any issues found. diff --git a/dream-server/tests/clean-test-install.sh b/dream-server/tests/clean-test-install.sh deleted file mode 100755 index 73ed57472..000000000 --- a/dream-server/tests/clean-test-install.sh +++ /dev/null @@ -1,329 +0,0 @@ -#!/usr/bin/env bash -# ============================================================ -# Dream Server โ€” Clean Test Install Script -# Removes all artifacts from a previous install so install.sh -# can be tested from scratch on the same machine. 
-# -# Levels: -# (default) Remove Dream Server artifacts only -# --full Also remove ALL Docker images/cache and -# uninstall Docker, Docker Compose, and -# NVIDIA Container Toolkit -# ============================================================ -set -euo pipefail - -RED='\033[0;31m' -YELLOW='\033[1;33m' -GREEN='\033[0;32m' -CYAN='\033[0;36m' -NC='\033[0m' - -INSTALL_DIR="${INSTALL_DIR:-$HOME/dream-server}" -FULL_CLEAN=false -AUTO_YES=false - -for arg in "$@"; do - case "$arg" in - --full) FULL_CLEAN=true ;; - --yes|-y) AUTO_YES=true ;; - esac -done - -echo -e "${CYAN}โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—${NC}" -echo -e "${CYAN}โ•‘ Dream Server โ€” Clean Test Install โ•‘${NC}" -if $FULL_CLEAN; then -echo -e "${CYAN}โ•‘ FULL MODE: dependencies will be removed โ•‘${NC}" -else -echo -e "${CYAN}โ•‘ Removes all artifacts for fresh test โ•‘${NC}" -fi -echo -e "${CYAN}โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•${NC}" -echo "" - -# โ”€โ”€ Scan phase โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -echo -e "${YELLOW}Scanning for Dream Server artifacts...${NC}" -echo "" - -FOUND=0 - -# 1. Running containers -CONTAINERS=$(docker ps -a --filter "name=dream-" --format "{{.Names}}" 2>/dev/null || true) -if [[ -n "$CONTAINERS" ]]; then - echo -e " ${CYAN}Containers:${NC}" - echo "$CONTAINERS" | sed 's/^/ /' - FOUND=1 -else - echo -e " ${GREEN}Containers:${NC} none" -fi - -# 2. Docker images (dream-specific) -IMAGES=$(docker images --format "{{.Repository}}:{{.Tag}}" 2>/dev/null | grep -E 'dream-server|dream-livekit' || true) -if [[ -n "$IMAGES" ]]; then - echo -e " ${CYAN}Images:${NC}" - echo "$IMAGES" | sed 's/^/ /' - FOUND=1 -else - echo -e " ${GREEN}Images:${NC} none" -fi - -# 2b. 
ALL Docker images (for --full mode display) -ALL_IMAGES=$(docker images --format "{{.Repository}}:{{.Tag}} ({{.Size}})" 2>/dev/null || true) -ALL_IMAGE_COUNT=$(docker images -q 2>/dev/null | wc -l || echo 0) -if $FULL_CLEAN && [[ "$ALL_IMAGE_COUNT" -gt 0 ]]; then - DOCKER_DISK=$(docker system df --format "{{.Size}}" 2>/dev/null | head -1 || echo "unknown") - echo -e " ${CYAN}All Docker images:${NC} ${ALL_IMAGE_COUNT} images (${DOCKER_DISK})" - FOUND=1 -fi - -# 3. Docker volumes -VOLUMES=$(docker volume ls --format "{{.Name}}" 2>/dev/null | grep -i dream || true) -if [[ -n "$VOLUMES" ]]; then - echo -e " ${CYAN}Volumes:${NC}" - echo "$VOLUMES" | sed 's/^/ /' - FOUND=1 -else - echo -e " ${GREEN}Volumes:${NC} none" -fi - -# 4. Install directory -if [[ -d "$INSTALL_DIR" ]]; then - SIZE=$(du -sh "$INSTALL_DIR" 2>/dev/null | cut -f1) - echo -e " ${CYAN}Install dir:${NC} $INSTALL_DIR ($SIZE)" - FOUND=1 -else - echo -e " ${GREEN}Install dir:${NC} not found" -fi - -# 5. Desktop shortcut -DESKTOP_FILE="$HOME/.local/share/applications/dream-server.desktop" -if [[ -f "$DESKTOP_FILE" ]]; then - echo -e " ${CYAN}Desktop shortcut:${NC} $DESKTOP_FILE" - FOUND=1 -else - echo -e " ${GREEN}Desktop shortcut:${NC} none" -fi - -# 6. GNOME favorites -FAVORITES=$(gsettings get org.gnome.shell favorite-apps 2>/dev/null || echo "[]") -if echo "$FAVORITES" | grep -q "dream-server"; then - echo -e " ${CYAN}GNOME sidebar:${NC} pinned" - FOUND=1 -else - echo -e " ${GREEN}GNOME sidebar:${NC} not pinned" -fi - -# 7. Docker network -NETWORKS=$(docker network ls --format "{{.Name}}" 2>/dev/null | grep -i dream || true) -if [[ -n "$NETWORKS" ]]; then - echo -e " ${CYAN}Networks:${NC}" - echo "$NETWORKS" | sed 's/^/ /' - FOUND=1 -else - echo -e " ${GREEN}Networks:${NC} none" -fi - -# 8. 
Systemd services (if any) -SERVICES=$(systemctl --user list-units --all 2>/dev/null | grep -i dream | awk '{print $1}' || true) -if [[ -n "$SERVICES" ]]; then - echo -e " ${CYAN}Systemd services:${NC}" - echo "$SERVICES" | sed 's/^/ /' - FOUND=1 -else - echo -e " ${GREEN}Systemd services:${NC} none" -fi - -# 9. Dependencies (--full mode) -if $FULL_CLEAN; then - echo "" - echo -e "${YELLOW}Scanning installer dependencies...${NC}" - echo "" - - HAS_DOCKER=false - HAS_COMPOSE=false - HAS_NVIDIA_CTK=false - - if command -v docker &>/dev/null; then - DOCKER_VER=$(docker --version 2>/dev/null | head -1) - echo -e " ${CYAN}Docker:${NC} $DOCKER_VER" - HAS_DOCKER=true - FOUND=1 - else - echo -e " ${GREEN}Docker:${NC} not installed" - fi - - if docker compose version &>/dev/null 2>&1; then - COMPOSE_VER=$(docker compose version 2>/dev/null | head -1) - echo -e " ${CYAN}Docker Compose:${NC} $COMPOSE_VER" - HAS_COMPOSE=true - FOUND=1 - else - echo -e " ${GREEN}Docker Compose:${NC} not installed" - fi - - if dpkg -l nvidia-container-toolkit &>/dev/null 2>&1 || command -v nvidia-ctk &>/dev/null; then - CTK_VER=$(nvidia-ctk --version 2>/dev/null | head -1 || echo "installed") - echo -e " ${CYAN}NVIDIA Container Toolkit:${NC} $CTK_VER" - HAS_NVIDIA_CTK=true - FOUND=1 - else - echo -e " ${GREEN}NVIDIA Container Toolkit:${NC} not installed" - fi -fi - -echo "" - -if [[ "$FOUND" -eq 0 ]]; then - echo -e "${GREEN}No Dream Server artifacts found. Machine is clean.${NC}" - exit 0 -fi - -# โ”€โ”€ Confirmation โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if ! $AUTO_YES; then - echo -e "${RED}This will REMOVE everything listed above.${NC}" - if $FULL_CLEAN; then - echo -e "${RED}INCLUDING Docker, Docker Compose, and NVIDIA Container Toolkit.${NC}" - fi - echo -e "${YELLOW}Models in $INSTALL_DIR/models/ will be PRESERVED (moved to /tmp/dream-models-backup).${NC}" - echo "" - read -p "Proceed? 
[y/N] " -r - if [[ ! $REPLY =~ ^[Yy]$ ]]; then - echo "Aborted." - exit 1 - fi -fi - -echo "" -echo -e "${YELLOW}Cleaning...${NC}" - -# โ”€โ”€ Remove phase โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ - -# 1. Stop and remove containers -if [[ -n "$CONTAINERS" ]]; then - echo -n " Stopping containers... " - # Use compose if compose file exists, otherwise docker rm - if [[ -f "$INSTALL_DIR/docker-compose.yml" ]]; then - (cd "$INSTALL_DIR" && docker compose --profile openclaw --profile voice --profile workflows --profile rag --profile multi-model down --remove-orphans 2>/dev/null) || true - fi - # Force remove any stragglers - docker rm -f $CONTAINERS 2>/dev/null || true - echo -e "${GREEN}done${NC}" -fi - -# 2. Remove dream-specific images -if [[ -n "$IMAGES" ]]; then - echo -n " Removing Dream Server images... " - echo "$IMAGES" | xargs docker rmi -f 2>/dev/null || true - echo -e "${GREEN}done${NC}" -fi - -# 3. Remove volumes -if [[ -n "$VOLUMES" ]]; then - echo -n " Removing volumes... " - echo "$VOLUMES" | xargs docker volume rm -f 2>/dev/null || true - echo -e "${GREEN}done${NC}" -fi - -# 4. Remove networks -if [[ -n "$NETWORKS" ]]; then - echo -n " Removing networks... " - echo "$NETWORKS" | xargs docker network rm 2>/dev/null || true - echo -e "${GREEN}done${NC}" -fi - -# 5. Preserve models, remove install dir -if [[ -d "$INSTALL_DIR" ]]; then - # Backup models (they take forever to download) - if [[ -d "$INSTALL_DIR/models" ]] && [[ "$(ls -A "$INSTALL_DIR/models" 2>/dev/null)" ]]; then - echo -n " Backing up models to /tmp/dream-models-backup... " - sudo rm -rf /tmp/dream-models-backup 2>/dev/null || true - mv "$INSTALL_DIR/models" /tmp/dream-models-backup - echo -e "${GREEN}done${NC}" - fi - echo -n " Removing $INSTALL_DIR... " - # Use sudo because Docker containers create root-owned files in data dirs - sudo rm -rf "$INSTALL_DIR" - echo -e "${GREEN}done${NC}" -fi - -# 6. 
Remove desktop shortcut -if [[ -f "$DESKTOP_FILE" ]]; then - echo -n " Removing desktop shortcut... " - rm -f "$DESKTOP_FILE" - echo -e "${GREEN}done${NC}" -fi - -# 7. Unpin from GNOME -if echo "$FAVORITES" | grep -q "dream-server"; then - echo -n " Unpinning from GNOME sidebar... " - NEW_FAVS=$(echo "$FAVORITES" | sed "s/, 'dream-server.desktop'//g; s/'dream-server.desktop', //g; s/'dream-server.desktop'//g") - gsettings set org.gnome.shell favorite-apps "$NEW_FAVS" 2>/dev/null || true - echo -e "${GREEN}done${NC}" -fi - -# 8. Prune ALL Docker images and build cache -if $FULL_CLEAN; then - echo -n " Removing ALL Docker images and build cache... " - docker system prune -a --volumes -f &>/dev/null || true - echo -e "${GREEN}done${NC}" -else - echo -n " Pruning dangling images... " - docker image prune -f 2>/dev/null | tail -1 || true - echo "" -fi - -# โ”€โ”€ Full dependency removal โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if $FULL_CLEAN; then - echo "" - echo -e "${YELLOW}Removing installer dependencies...${NC}" - - # NVIDIA Container Toolkit - if $HAS_NVIDIA_CTK; then - echo -n " Removing NVIDIA Container Toolkit... " - sudo apt-get remove -y nvidia-container-toolkit &>/dev/null || true - sudo apt-get autoremove -y &>/dev/null || true - # Remove the nvidia-container-toolkit apt repo - sudo rm -f /etc/apt/sources.list.d/nvidia-container-toolkit.list 2>/dev/null || true - sudo rm -f /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg 2>/dev/null || true - echo -e "${GREEN}done${NC}" - fi - - # Docker (includes compose v2 plugin) - if $HAS_DOCKER; then - echo -n " Removing Docker Engine and Compose... 
" - sudo apt-get remove -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin &>/dev/null || true - sudo apt-get autoremove -y &>/dev/null || true - # Remove Docker apt repo - sudo rm -f /etc/apt/sources.list.d/docker.list 2>/dev/null || true - sudo rm -f /etc/apt/keyrings/docker.asc 2>/dev/null || true - # Remove Docker data (images, containers, volumes already gone) - sudo rm -rf /var/lib/docker /var/lib/containerd 2>/dev/null || true - # Remove Docker config - rm -rf "$HOME/.docker" 2>/dev/null || true - echo -e "${GREEN}done${NC}" - fi -fi - -echo "" -echo -e "${GREEN}โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—${NC}" -if $FULL_CLEAN; then -echo -e "${GREEN}โ•‘ Full clean complete. Bare metal ready. โ•‘${NC}" -else -echo -e "${GREEN}โ•‘ Clean complete. Ready for fresh install. โ•‘${NC}" -fi -echo -e "${GREEN}โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•${NC}" - -if [[ -d "/tmp/dream-models-backup" ]]; then - echo "" - echo -e "${CYAN}Models backed up to /tmp/dream-models-backup${NC}" - echo -e "${CYAN}The installer will detect and restore them automatically,${NC}" - echo -e "${CYAN}or you can manually move them back after install:${NC}" - echo -e "${CYAN} mv /tmp/dream-models-backup \$HOME/dream-server/models${NC}" -fi - -if $FULL_CLEAN; then - echo "" - echo -e "${YELLOW}Dependency status after clean:${NC}" - command -v docker &>/dev/null && echo -e " ${RED}Docker:${NC} still present (may need reboot)" || echo -e " ${GREEN}Docker:${NC} removed" - command -v nvidia-ctk &>/dev/null && echo -e " ${RED}NVIDIA CTK:${NC} still present" || echo -e " ${GREEN}NVIDIA CTK:${NC} removed" - echo "" - echo -e "${CYAN}The installer will re-install all dependencies from scratch.${NC}" -fi diff --git 
a/dream-server/tests/contracts/test-installer-contracts.sh b/dream-server/tests/contracts/test-installer-contracts.sh new file mode 100644 index 000000000..4d2d81b05 --- /dev/null +++ b/dream-server/tests/contracts/test-installer-contracts.sh @@ -0,0 +1,38 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" +cd "$ROOT_DIR" + +command -v jq >/dev/null 2>&1 || { + echo "[FAIL] jq is required" + exit 1 +} + +echo "[contract] backend contract files" +for f in config/backends/amd.json config/backends/nvidia.json config/backends/cpu.json config/backends/apple.json; do + test -f "$f" || { echo "[FAIL] missing $f"; exit 1; } + jq -e '.id and .llm_engine and .service_name and .public_api_port and .public_health_url and .provider_name and .provider_url' "$f" >/dev/null \ + || { echo "[FAIL] invalid backend contract: $f"; exit 1; } +done + +echo "[contract] hardware class mapping" +test -f config/hardware-classes.json || { echo "[FAIL] missing config/hardware-classes.json"; exit 1; } +jq -e '.version and (.classes | type=="array" and length>0)' config/hardware-classes.json >/dev/null \ + || { echo "[FAIL] invalid hardware-classes root structure"; exit 1; } + +for class_id in strix_unified nvidia_pro apple_silicon cpu_fallback; do + jq -e --arg id "$class_id" '.classes[] | select(.id==$id) | .recommended.backend and .recommended.tier and .recommended.compose_overlays' config/hardware-classes.json >/dev/null \ + || { echo "[FAIL] missing/invalid class: $class_id"; exit 1; } +done + +echo "[contract] capability profile schema has hardware_class" +jq -e '.properties.hardware_class and (.required | index("hardware_class"))' config/capability-profile.schema.json >/dev/null \ + || { echo "[FAIL] capability profile schema missing hardware_class"; exit 1; } + +echo "[contract] resolver scripts executable" +for s in scripts/build-capability-profile.sh scripts/classify-hardware.sh scripts/load-backend-contract.sh 
scripts/resolve-compose-stack.sh scripts/preflight-engine.sh scripts/dream-doctor.sh scripts/simulate-installers.sh; do + test -x "$s" || { echo "[FAIL] script not executable: $s"; exit 1; } +done + +echo "[PASS] installer contracts" diff --git a/dream-server/tests/contracts/test-preflight-fixtures.sh b/dream-server/tests/contracts/test-preflight-fixtures.sh new file mode 100644 index 000000000..8c85e01d4 --- /dev/null +++ b/dream-server/tests/contracts/test-preflight-fixtures.sh @@ -0,0 +1,96 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" +cd "$ROOT_DIR" + +require_jq() { + command -v jq >/dev/null 2>&1 || { + echo "[FAIL] jq is required" + exit 1 + } +} + +assert_eq() { + local got="$1" + local expected="$2" + local msg="$3" + if [[ "$got" != "$expected" ]]; then + echo "[FAIL] $msg (expected=$expected got=$got)" + exit 1 + fi +} + +require_jq + +tmpdir="$(mktemp -d)" +trap 'rm -rf "$tmpdir"' EXIT + +echo "[contract] preflight fixture: linux-nvidia-good" +scripts/preflight-engine.sh \ + --report "$tmpdir/linux-nvidia-good.json" \ + --tier T2 \ + --ram-gb 64 \ + --disk-gb 200 \ + --gpu-backend nvidia \ + --gpu-vram-mb 24576 \ + --gpu-name "RTX 4090" \ + --platform-id linux \ + --compose-overlays docker-compose.base.yml,docker-compose.nvidia.yml \ + --script-dir "$ROOT_DIR" \ + --env >/dev/null +blockers="$(jq -r '.summary.blockers' "$tmpdir/linux-nvidia-good.json")" +assert_eq "$blockers" "0" "linux-nvidia-good blockers" + +echo "[contract] preflight fixture: windows-mvp-good" +scripts/preflight-engine.sh \ + --report "$tmpdir/windows-mvp-good.json" \ + --tier T1 \ + --ram-gb 16 \ + --disk-gb 120 \ + --gpu-backend nvidia \ + --gpu-vram-mb 12288 \ + --gpu-name "RTX 3060" \ + --platform-id windows \ + --compose-overlays docker-compose.base.yml,docker-compose.nvidia.yml \ + --script-dir "$ROOT_DIR" \ + --env >/dev/null +blockers="$(jq -r '.summary.blockers' "$tmpdir/windows-mvp-good.json")" +assert_eq 
"$blockers" "0" "windows-mvp-good blockers" + +echo "[contract] preflight fixture: macos-mvp-good" +scripts/preflight-engine.sh \ + --report "$tmpdir/macos-mvp-good.json" \ + --tier T1 \ + --ram-gb 16 \ + --disk-gb 80 \ + --gpu-backend apple \ + --gpu-vram-mb 16384 \ + --gpu-name "Apple Silicon" \ + --platform-id macos \ + --compose-overlays docker-compose.base.yml,docker-compose.amd.yml \ + --script-dir "$ROOT_DIR" \ + --env >/dev/null +blockers="$(jq -r '.summary.blockers' "$tmpdir/macos-mvp-good.json")" +assert_eq "$blockers" "0" "macos-mvp-good blockers" + +echo "[contract] preflight fixture: disk-blocker" +scripts/preflight-engine.sh \ + --report "$tmpdir/disk-blocker.json" \ + --tier T3 \ + --ram-gb 64 \ + --disk-gb 20 \ + --gpu-backend nvidia \ + --gpu-vram-mb 24576 \ + --gpu-name "RTX 4090" \ + --platform-id linux \ + --compose-overlays docker-compose.base.yml,docker-compose.nvidia.yml \ + --script-dir "$ROOT_DIR" \ + --env >/dev/null +blockers="$(jq -r '.summary.blockers' "$tmpdir/disk-blocker.json")" +if [[ "$blockers" -lt 1 ]]; then + echo "[FAIL] disk-blocker expected >=1 blocker, got $blockers" + exit 1 +fi + +echo "[PASS] preflight fixture contracts" diff --git a/dream-server/tests/dashboard-load-test.py b/dream-server/tests/dashboard-load-test.py old mode 100755 new mode 100644 diff --git a/dream-server/tests/integration-test.sh b/dream-server/tests/integration-test.sh index 88cf12764..1fbb3d816 100644 --- a/dream-server/tests/integration-test.sh +++ b/dream-server/tests/integration-test.sh @@ -11,6 +11,23 @@ export TERM="${TERM:-xterm}" SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" PROJECT_DIR="$(dirname "$SCRIPT_DIR")" +cd "$PROJECT_DIR" +COMPOSE_FILE="" +COMPOSE_FLAGS="" +if [[ -f "docker-compose.base.yml" && -f "docker-compose.amd.yml" ]]; then + COMPOSE_FILE="docker-compose.amd.yml" + COMPOSE_FLAGS="-f docker-compose.base.yml -f docker-compose.amd.yml" + # Append enabled extension compose fragments + if [[ -d "extensions/services" 
]]; then + for ext_dir in extensions/services/*/; do + [[ -f "${ext_dir}compose.yaml" ]] && COMPOSE_FLAGS="$COMPOSE_FLAGS -f ${ext_dir}compose.yaml" + [[ -f "${ext_dir}compose.amd.yaml" ]] && COMPOSE_FLAGS="$COMPOSE_FLAGS -f ${ext_dir}compose.amd.yaml" + done + fi +elif [[ -f "docker-compose.yml" ]]; then + COMPOSE_FILE="docker-compose.yml" + COMPOSE_FLAGS="-f docker-compose.yml" +fi # Colors RED='\033[0;31m' @@ -89,36 +106,42 @@ fi # ============================================ header "2/6" "Docker Compose Validation" -if [[ ! -f "$PROJECT_DIR/docker-compose.yml" ]]; then - fail "docker-compose.yml not found" +if [[ -z "$COMPOSE_FILE" ]]; then + fail "No compose file found (expected base+overlay or docker-compose.yml)" else - pass "docker-compose.yml exists" + pass "Compose file exists: $(basename "$COMPOSE_FILE")" + [[ -n "$COMPOSE_FLAGS" ]] && pass "Compose flags: $COMPOSE_FLAGS" # Syntax check with docker compose if command -v docker &> /dev/null; then - if docker compose -f "$PROJECT_DIR/docker-compose.yml" config > /dev/null 2>&1; then - pass "docker-compose.yml passes syntax validation" + if docker compose $COMPOSE_FLAGS config > /dev/null 2>&1; then + pass "Compose selection passes syntax validation" else # Try with env file fallback - if docker compose -f "$PROJECT_DIR/docker-compose.yml" --env-file "$PROJECT_DIR/.env.example" config > /dev/null 2>&1; then - pass "docker-compose.yml passes syntax validation (with .env.example)" + if [[ -f "$PROJECT_DIR/.env.example" ]] && docker compose $COMPOSE_FLAGS --env-file "$PROJECT_DIR/.env.example" config > /dev/null 2>&1; then + pass "Compose selection passes syntax validation (with .env.example)" else - fail "docker-compose.yml has syntax errors" "$(docker compose -f "$PROJECT_DIR/docker-compose.yml" config 2>&1 | head -3)" + fail "Compose selection has syntax errors" "$(docker compose $COMPOSE_FLAGS config 2>&1 | head -3)" fi fi # Verify core services are defined - compose_config=$(docker compose -f 
"$PROJECT_DIR/docker-compose.yml" --env-file "$PROJECT_DIR/.env.example" config 2>/dev/null || true) - for service in vllm webui; do - if echo "$compose_config" | grep -q "container_name:.*dream-${service}" 2>/dev/null || \ - grep -q "container_name:.*dream-${service}" "$PROJECT_DIR/docker-compose.yml" 2>/dev/null; then + compose_config=$(docker compose $COMPOSE_FLAGS --env-file "$PROJECT_DIR/.env.example" config 2>/dev/null || docker compose $COMPOSE_FLAGS config 2>/dev/null || true) + if [[ "$(basename "$COMPOSE_FILE")" == "docker-compose.amd.yml" ]]; then + core_services=("llama-server" "open-webui") + else + core_services=("llama-server" "webui") + fi + for service in "${core_services[@]}"; do + if echo "$compose_config" | grep -qE "^\\s{2}${service}:$" 2>/dev/null || \ + grep -qE "^[[:space:]]*${service}:" "$COMPOSE_FILE" 2>/dev/null; then pass "Core service defined: $service" else fail "Core service missing: $service" fi done else - skip "Docker not installed โ€” cannot validate docker-compose.yml syntax" + skip "Docker not installed โ€” cannot validate compose syntax" fi fi @@ -129,7 +152,7 @@ header "3/6" "Profile Configs" PROFILES_DIR="$PROJECT_DIR/config/profiles" if [[ ! 
-d "$PROFILES_DIR" ]]; then - fail "config/profiles/ directory not found" + skip "config/profiles/ directory not found (not required in Strix layout)" else pass "config/profiles/ directory exists" @@ -152,11 +175,11 @@ with open('$profile') as f: fail "Invalid YAML: $basename_profile" fi - # Check that profile defines a vllm service override - if grep -q "vllm" "$profile" 2>/dev/null; then - pass "Profile defines vllm config: $basename_profile" + # Check that profile defines a llama-server service override + if grep -q "llama-server" "$profile" 2>/dev/null; then + pass "Profile defines llama-server config: $basename_profile" else - fail "Profile missing vllm config: $basename_profile" + fail "Profile missing llama-server config: $basename_profile" fi done @@ -226,9 +249,12 @@ header "5/6" "Workflow JSON Files" WORKFLOWS_DIR="$PROJECT_DIR/workflows" if [[ ! -d "$WORKFLOWS_DIR" ]]; then - fail "workflows/ directory not found" + WORKFLOWS_DIR="$PROJECT_DIR/config/n8n" +fi +if [[ ! -d "$WORKFLOWS_DIR" ]]; then + fail "workflow directory not found (checked workflows/ and config/n8n/)" else - pass "workflows/ directory exists" + pass "Workflow directory exists: ${WORKFLOWS_DIR#$PROJECT_DIR/}" json_count=0 for wf in "$WORKFLOWS_DIR"/*.json; do @@ -243,7 +269,8 @@ else fail "Invalid JSON: $basename_wf" fi - # Check for n8n workflow structure (should have "nodes" key) + # Check for n8n workflow structure. + # Some JSON files (like catalog.json) are metadata manifests, not workflow exports. 
if python3 -c " import json, sys with open('$wf') as f: @@ -251,6 +278,13 @@ with open('$wf') as f: assert 'nodes' in d, 'missing nodes key' " 2>/dev/null; then pass "Has n8n structure (nodes): $basename_wf" + elif python3 -c " +import json, sys +with open('$wf') as f: + d = json.load(f) +assert 'workflows' in d or 'categories' in d, 'not a metadata manifest' +" 2>/dev/null; then + skip "Metadata manifest (not workflow export): $basename_wf" else fail "Missing n8n structure (nodes): $basename_wf" fi @@ -297,13 +331,18 @@ fi if [[ -f "$PROJECT_DIR/.env.example" ]]; then pass ".env.example exists" # Check it contains essential vars - for var in LLM_MODEL VLLM_PORT WEBUI_PORT; do + for var in LLM_MODEL WEBUI_PORT; do if grep -q "^${var}=" "$PROJECT_DIR/.env.example"; then pass ".env.example defines $var" else fail ".env.example missing $var" fi done + if grep -qE "^(LLAMA_SERVER_PORT|OLLAMA_PORT)=" "$PROJECT_DIR/.env.example"; then + pass ".env.example defines an inference port variable" + else + fail ".env.example missing inference port variable (LLAMA_SERVER_PORT/OLLAMA_PORT)" + fi else fail ".env.example not found" fi diff --git a/dream-server/tests/m2-voice-test.py b/dream-server/tests/m2-voice-test.py deleted file mode 100755 index 2d3e45082..000000000 --- a/dream-server/tests/m2-voice-test.py +++ /dev/null @@ -1,389 +0,0 @@ -#!/usr/bin/env python3 -""" -M2 Voice Agent Testing Suite - -Tests voice round-trip latency and multi-turn context handling. 
-Target: <3s round-trip, multi-turn context preservation - -Usage: - python3 m2-voice-test.py # Run all tests - python3 m2-voice-test.py --latency # Latency test only - python3 m2-voice-test.py --context # Multi-turn test only -""" - -import argparse -import json -import time -import base64 -import requests -from pathlib import Path -from typing import Dict, List, Optional, Tuple -import sys - -# Service endpoints -WHISPER_URL = "http://localhost:9000" -VLLM_URL = "http://localhost:8000" -TTS_URL = "http://localhost:8880" -LIVEKIT_URL = "http://localhost:7880" - -# Test configuration -TIMEOUT = 30 -VOICE = "af_bella" -MODEL = "Qwen/Qwen2.5-32B-Instruct-AWQ" - - -class VoiceTester: - """Test voice pipeline: STT -> LLM -> TTS""" - - def __init__(self): - self.results = [] - - def log(self, message: str): - print(f"[M2] {message}") - - def test_stt_basic(self) -> Tuple[bool, float]: - """Test Whisper STT with sample audio""" - self.log("Testing Whisper STT...") - - # Create a simple test audio (1 second of silence as base64 WAV) - # This is a minimal valid WAV file (44 bytes header + silence) - try: - # Check if Whisper is accessible - start = time.time() - response = requests.get(f"{WHISPER_URL}/", timeout=5) - elapsed = (time.time() - start) * 1000 - - if response.status_code == 200: - self.log(f" ✓ Whisper responding ({elapsed:.0f}ms)") - return True, elapsed - else: - self.log(f" ✗ Whisper returned {response.status_code}") - return False, 0 - except Exception as e: - self.log(f" ✗ Whisper connection failed: {e}") - return False, 0 - - def test_llm_response(self, prompt: str) -> Tuple[bool, str, float]: - """Test LLM response generation""" - self.log(f"Testing LLM response for: '{prompt[:50]}...'") - - payload = { - "model": MODEL, - "messages": [{"role": "user", "content": prompt}], - "max_tokens": 100 - } - - try: - start = time.time() - response = requests.post( - f"{VLLM_URL}/v1/chat/completions", - json=payload, - timeout=TIMEOUT - ) - elapsed = 
(time.time() - start) * 1000 - - if response.status_code == 200: - data = response.json() - content = data["choices"][0]["message"]["content"] - self.log(f" ✓ LLM responded ({elapsed:.0f}ms, {len(content)} chars)") - return True, content, elapsed - else: - self.log(f" ✗ LLM returned {response.status_code}") - return False, "", 0 - except Exception as e: - self.log(f" ✗ LLM request failed: {e}") - return False, "", 0 - - def test_llm_response_constrained(self, prompt: str) -> Tuple[bool, str, float]: - """Test LLM with voice-optimized constraints (shorter output = faster TTS)""" - self.log(f"Testing constrained LLM for: '{prompt[:50]}...'") - - payload = { - "model": MODEL, - "messages": [ - {"role": "system", "content": "Respond in 1-2 sentences only. Be concise."}, - {"role": "user", "content": prompt} - ], - "max_tokens": 75, - "temperature": 0.7 - } - - try: - start = time.time() - response = requests.post( - f"{VLLM_URL}/v1/chat/completions", - json=payload, - timeout=TIMEOUT - ) - elapsed = (time.time() - start) * 1000 - - if response.status_code == 200: - data = response.json() - content = data["choices"][0]["message"]["content"] - self.log(f" ✓ LLM constrained ({elapsed:.0f}ms, {len(content)} chars)") - return True, content, elapsed - else: - self.log(f" ✗ LLM returned {response.status_code}") - return False, "", 0 - except Exception as e: - self.log(f" ✗ LLM request failed: {e}") - return False, "", 0 - - def test_tts_generation(self, text: str) -> Tuple[bool, float]: - """Test TTS audio generation""" - self.log(f"Testing TTS for: '{text[:50]}...'") - - payload = { - "model": "kokoro", - "input": text, - "voice": VOICE - } - - try: - start = time.time() - response = requests.post( - f"{TTS_URL}/v1/audio/speech", - json=payload, - timeout=TIMEOUT - ) - elapsed = (time.time() - start) * 1000 - - if response.status_code == 200: - audio_size = len(response.content) - self.log(f" ✓ TTS generated ({elapsed:.0f}ms, {audio_size} bytes)") - return 
True, elapsed - else: - self.log(f" ✗ TTS returned {response.status_code}") - return False, 0 - except Exception as e: - self.log(f" ✗ TTS request failed: {e}") - return False, 0 - - def test_voice_roundtrip(self, prompt: str, constrain: bool = True) -> Tuple[bool, float, Dict]: - """Test full voice round-trip: text -> LLM -> TTS - - Args: - prompt: User prompt - constrain: If True, apply voice-optimized constraints (shorter output) - """ - self.log(f"Testing voice round-trip{' (constrained)' if constrain else ''}...") - - start = time.time() - - # Step 1: LLM (with voice constraints for faster TTS) - if constrain: - llm_ok, llm_text, llm_time = self.test_llm_response_constrained(prompt) - else: - llm_ok, llm_text, llm_time = self.test_llm_response(prompt) - if not llm_ok: - return False, 0, {} - - # Step 2: TTS - tts_ok, tts_time = self.test_tts_generation(llm_text) - if not tts_ok: - return False, 0, {} - - total_time = (time.time() - start) * 1000 - - metrics = { - "llm_time_ms": llm_time, - "tts_time_ms": tts_time, - "total_time_ms": total_time, - "text_length": len(llm_text) - } - - self.log(f" ✓ Round-trip complete ({total_time:.0f}ms)") - return True, total_time, metrics - - def test_multiturn_context(self) -> Tuple[bool, List[Dict]]: - """Test multi-turn conversation context preservation""" - self.log("Testing multi-turn context...") - - conversation = [ - {"role": "user", "content": "My name is Alice"}, - {"role": "assistant", "content": "Hello Alice! 
Nice to meet you."}, - {"role": "user", "content": "What's my name?"} - ] - - payload = { - "model": MODEL, - "messages": conversation, - "max_tokens": 50 - } - - try: - start = time.time() - response = requests.post( - f"{VLLM_URL}/v1/chat/completions", - json=payload, - timeout=TIMEOUT - ) - elapsed = (time.time() - start) * 1000 - - if response.status_code == 200: - data = response.json() - content = data["choices"][0]["message"]["content"].lower() - - # Check if context was preserved - has_context = "alice" in content - - self.log(f" ✓ Multi-turn test ({elapsed:.0f}ms)") - self.log(f" Context preserved: {'Yes' if has_context else 'No'}") - self.log(f" Response: {content[:100]}...") - - return has_context, [ - {"turn": i+1, "time_ms": elapsed if i == 2 else 0} - for i in range(3) - ] - else: - self.log(f" ✗ Multi-turn failed: {response.status_code}") - return False, [] - except Exception as e: - self.log(f" ✗ Multi-turn error: {e}") - return False, [] - - def run_latency_tests(self) -> Dict: - """Run comprehensive latency tests""" - self.log("=" * 50) - self.log("M2 Voice Latency Tests") - self.log("=" * 50) - - results = { - "stt": {"passed": False, "time_ms": 0}, - "llm": {"passed": False, "time_ms": 0}, - "tts": {"passed": False, "time_ms": 0}, - "roundtrip": {"passed": False, "time_ms": 0} - } - - # Test STT - stt_ok, stt_time = self.test_stt_basic() - results["stt"] = {"passed": stt_ok, "time_ms": stt_time} - - # Test LLM - llm_ok, llm_text, llm_time = self.test_llm_response( - "What is the weather like today?" - ) - results["llm"] = {"passed": llm_ok, "time_ms": llm_time} - - # Test TTS - tts_ok, tts_time = self.test_tts_generation( - "The weather today is sunny and 75 degrees." 
- ) - results["tts"] = {"passed": tts_ok, "time_ms": tts_time} - - # Test full round-trip - if llm_ok and tts_ok: - rt_ok, rt_time, metrics = self.test_voice_roundtrip( - "Tell me a fun fact about space" - ) - results["roundtrip"] = { - "passed": rt_ok, - "time_ms": rt_time, - **metrics - } - - return results - - def run_context_tests(self) -> Dict: - """Run multi-turn context tests""" - self.log("=" * 50) - self.log("M2 Multi-Turn Context Tests") - self.log("=" * 50) - - context_ok, turn_metrics = self.test_multiturn_context() - - return { - "context_preserved": context_ok, - "turns": turn_metrics - } - - def generate_report(self, latency: Dict, context: Dict) -> str: - """Generate test report""" - report = [] - report.append("\n" + "=" * 50) - report.append("M2 Voice Agent Test Report") - report.append("=" * 50) - - # Latency section - report.append("\n📊 Latency Results:") - report.append("-" * 30) - - stt = latency.get("stt", {}) - llm = latency.get("llm", {}) - tts = latency.get("tts", {}) - rt = latency.get("roundtrip", {}) - - report.append(f" STT Health: {'✓' if stt.get('passed') else '✗'} ({stt.get('time_ms', 0):.0f}ms)") - report.append(f" LLM Response: {'✓' if llm.get('passed') else '✗'} ({llm.get('time_ms', 0):.0f}ms)") - report.append(f" TTS Generation: {'✓' if tts.get('passed') else '✗'} ({tts.get('time_ms', 0):.0f}ms)") - report.append(f" Full Roundtrip: {'✓' if rt.get('passed') else '✗'} ({rt.get('time_ms', 0):.0f}ms)") - - # Target check - rt_time = rt.get("time_ms", 0) - if rt_time > 0: - report.append(f"\n Target <3000ms: {'✓ PASS' if rt_time < 3000 else '✗ FAIL'}") - - # Context section - report.append("\n🔄 Multi-Turn Context:") - report.append("-" * 30) - context_ok = context.get("context_preserved", False) - report.append(f" Context preserved: {'✓ YES' if context_ok else '✗ NO'}") - - # Summary - all_passed = ( - stt.get("passed") and - llm.get("passed") and - tts.get("passed") and - rt.get("passed") and - 
context_ok - ) - - report.append("\n" + "=" * 50) - report.append(f"Overall: {'✓ ALL TESTS PASSED' if all_passed else '✗ SOME TESTS FAILED'}") - report.append("=" * 50) - - return "\n".join(report) - - -def main(): - parser = argparse.ArgumentParser(description="M2 Voice Agent Testing") - parser.add_argument("--latency", action="store_true", help="Latency tests only") - parser.add_argument("--context", action="store_true", help="Context tests only") - parser.add_argument("--json", action="store_true", help="Output JSON") - args = parser.parse_args() - - tester = VoiceTester() - - # Default: run all tests - run_latency = not args.context - run_context = not args.latency - - results = {} - - if run_latency: - results["latency"] = tester.run_latency_tests() - - if run_context: - results["context"] = tester.run_context_tests() - - # Generate report - if args.json: - print(json.dumps(results, indent=2)) - else: - if run_latency and run_context: - print(tester.generate_report(results["latency"], results["context"])) - elif run_latency: - lat = results["latency"] - print(f"\nLatency Test Results:") - print(f" STT: {'✓' if lat['stt']['passed'] else '✗'} ({lat['stt']['time_ms']:.0f}ms)") - print(f" LLM: {'✓' if lat['llm']['passed'] else '✗'} ({lat['llm']['time_ms']:.0f}ms)") - print(f" TTS: {'✓' if lat['tts']['passed'] else '✗'} ({lat['tts']['time_ms']:.0f}ms)") - print(f" RT: {'✓' if lat['roundtrip']['passed'] else '✗'} ({lat['roundtrip']['time_ms']:.0f}ms)") - else: - ctx = results["context"] - print(f"\nContext Test Results:") - print(f" Preserved: {'✓ YES' if ctx['context_preserved'] else '✗ NO'}") - - -if __name__ == "__main__": - main() diff --git a/dream-server/tests/run-m8-tests.sh b/dream-server/tests/run-m8-tests.sh old mode 100755 new mode 100644 diff --git a/dream-server/tests/smoke/linux-amd.sh b/dream-server/tests/smoke/linux-amd.sh new file mode 100644 index 000000000..9c38b67e7 --- /dev/null +++ b/dream-server/tests/smoke/linux-amd.sh 
@@ -0,0 +1,25 @@ +#!/bin/bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" +cd "$ROOT_DIR" + +echo "[smoke] Linux AMD compose contract" +test -f docker-compose.base.yml +test -f docker-compose.amd.yml +grep -rq "docker-compose.base.yml" install-core.sh installers/ +grep -rq "docker-compose.amd.yml" install-core.sh installers/ + +echo "[smoke] Extension service directories exist" +test -d extensions/services/llama-server +test -d extensions/services/open-webui +test -f extensions/services/llama-server/manifest.yaml + +echo "[smoke] Service registry library exists" +test -f lib/service-registry.sh + +echo "[smoke] Linux AMD workflow path contract" +# dashboard-api resolves canonical config/n8n with legacy workflows/ fallback +grep -q "config\" / \"n8n" dashboard-api/main.py + +echo "[smoke] PASS linux-amd" diff --git a/dream-server/tests/smoke/linux-nvidia.sh b/dream-server/tests/smoke/linux-nvidia.sh new file mode 100644 index 000000000..c20803799 --- /dev/null +++ b/dream-server/tests/smoke/linux-nvidia.sh @@ -0,0 +1,17 @@ +#!/bin/bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" +cd "$ROOT_DIR" + +echo "[smoke] Linux NVIDIA installer paths" +grep -rq 'docker-compose.nvidia.yml' install-core.sh installers/ +grep -rq 'GPU_BACKEND" != "amd"' install-core.sh installers/ +grep -q 'Linux (Ubuntu/Debian family).*NVIDIA' docs/SUPPORT-MATRIX.md + +echo "[smoke] Extension service directories exist" +test -d extensions/services/llama-server +test -d extensions/services/whisper +test -f extensions/services/whisper/compose.nvidia.yaml + +echo "[smoke] PASS linux-nvidia" diff --git a/dream-server/tests/smoke/macos-dispatch.sh b/dream-server/tests/smoke/macos-dispatch.sh new file mode 100644 index 000000000..2c0661366 --- /dev/null +++ b/dream-server/tests/smoke/macos-dispatch.sh @@ -0,0 +1,12 @@ +#!/bin/bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." 
&& pwd)" +cd "$ROOT_DIR" + +echo "[smoke] macOS dispatch and support messaging" +test -f installers/macos.sh +grep -q "macos)" installers/dispatch.sh +grep -q "macOS" docs/SUPPORT-MATRIX.md + +echo "[smoke] PASS macos-dispatch" diff --git a/dream-server/tests/smoke/wsl-logic.sh b/dream-server/tests/smoke/wsl-logic.sh new file mode 100644 index 000000000..eae8e236b --- /dev/null +++ b/dream-server/tests/smoke/wsl-logic.sh @@ -0,0 +1,12 @@ +#!/bin/bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" +cd "$ROOT_DIR" + +echo "[smoke] WSL dispatch logic" +grep -q "linux|wsl" installers/dispatch.sh +grep -q "WSL2 (Windows)" docs/SUPPORT-MATRIX.md +grep -q "Windows native installer UX" docs/SUPPORT-MATRIX.md + +echo "[smoke] PASS wsl-logic" diff --git a/dream-server/tests/test-bootstrap-mode.sh b/dream-server/tests/test-bootstrap-mode.sh old mode 100755 new mode 100644 index 52650168a..d58d75204 --- a/dream-server/tests/test-bootstrap-mode.sh +++ b/dream-server/tests/test-bootstrap-mode.sh @@ -1,6 +1,6 @@ #!/bin/bash -# Dream Server Bootstrap Mode Test Suite -# Tests the instant-start UX with 1.5B bootstrap model +# Dream Server Small Model Fallback Test Suite +# Tests the instant-start UX with a small GGUF model via llama-server set -e @@ -18,32 +18,33 @@ fail() { echo -e "${RED}✗ FAIL${NC}: $1"; exit 1; } info() { echo -e "${YELLOW}→${NC} $1"; } echo "═══════════════════════════════════════════════════════════════" -echo " Dream Server Bootstrap Mode Test Suite" +echo " Dream Server Small Model Fallback Test Suite" echo "═══════════════════════════════════════════════════════════════" echo "" -# ===== Test 1: Bootstrap compose files exist ===== -info "Test 1: Checking bootstrap 
compose files..." -[[ -f "docker-compose.yml" ]] || fail "docker-compose.yml not found" -[[ -f "docker-compose.bootstrap.yml" ]] || fail "docker-compose.bootstrap.yml not found" -pass "Bootstrap compose files present" +# ===== Test 1: Compose files exist ===== +info "Test 1: Checking compose files..." +if [[ ! -f "docker-compose.yml" ]] && [[ ! -f "docker-compose.base.yml" ]]; then + fail "No compose file found (docker-compose.yml or docker-compose.base.yml)" +fi +pass "Compose files present" -# ===== Test 2: Bootstrap compose is valid ===== -info "Test 2: Validating bootstrap compose..." +# ===== Test 2: Compose is valid ===== +info "Test 2: Validating compose..." # Try docker compose (plugin) first, then docker-compose (standalone) if command -v docker &> /dev/null && docker compose version &> /dev/null 2>&1; then - docker compose -f docker-compose.yml -f docker-compose.bootstrap.yml config > /dev/null 2>&1 || fail "Invalid compose configuration" + docker compose -f docker-compose.yml config > /dev/null 2>&1 || fail "Invalid compose configuration" elif command -v docker-compose &> /dev/null; then - docker-compose -f docker-compose.yml -f docker-compose.bootstrap.yml config > /dev/null 2>&1 || fail "Invalid compose configuration" + docker-compose -f docker-compose.yml config > /dev/null 2>&1 || fail "Invalid compose configuration" else info "Docker/docker-compose not available, skipping compose validation" fi -pass "Bootstrap compose configuration valid (or skipped)" +pass "Compose configuration valid (or skipped)" -# ===== Test 3: Bootstrap model specified correctly ===== -info "Test 3: Checking bootstrap model config..." -grep -q "Qwen2.5-1.5B-Instruct" docker-compose.bootstrap.yml || fail "Bootstrap model not configured" -pass "Bootstrap model (1.5B) configured" +# ===== Test 3: Small fallback model specified correctly ===== +info "Test 3: Checking small model config..." 
+grep -qi "qwen2.5-1.5b-instruct" docker-compose.yml || info "Small fallback model not in main compose (may be configured at runtime)" +pass "Small model config checked" # ===== Test 4: Upgrade script exists ===== info "Test 4: Checking upgrade script..." @@ -53,12 +54,11 @@ pass "Upgrade script ready" # ===== Test 5: Healthcheck timing ===== info "Test 5: Checking healthcheck configuration..." -BOOTSTRAP_START_PERIOD=$(grep -A5 "healthcheck:" docker-compose.bootstrap.yml | grep "start_period" | grep -oP '\d+' || echo "0") -MAIN_START_PERIOD=$(grep -A10 "vllm:" docker-compose.yml | grep -A5 "healthcheck:" | grep "start_period" | grep -oP '\d+' | head -1 || echo "0") -if [[ "$BOOTSTRAP_START_PERIOD" -lt "$MAIN_START_PERIOD" ]] || [[ "$BOOTSTRAP_START_PERIOD" == "30" ]]; then - pass "Bootstrap healthcheck faster than main ($BOOTSTRAP_START_PERIOD vs $MAIN_START_PERIOD)" +MAIN_START_PERIOD=$(grep -A10 "llama-server:" docker-compose.yml | grep -A5 "healthcheck:" | grep "start_period" | grep -oP '\d+' | head -1 || echo "0") +if [[ "$MAIN_START_PERIOD" -gt 0 ]]; then + pass "llama-server healthcheck start_period configured ($MAIN_START_PERIOD)" else - fail "Bootstrap should have shorter healthcheck start_period" + info "Could not parse healthcheck start_period (may use defaults)" fi # ===== Test 6: .env template has LLM_MODEL ===== @@ -75,8 +75,8 @@ echo "════════════════════════ echo -e " ${GREEN}All tests passed!${NC}" echo "═══════════════════════════════════════════════════════════════" echo "" -echo "To run bootstrap mode:" -echo " docker compose -f docker-compose.yml -f docker-compose.bootstrap.yml up -d" +echo "To run with small fallback model:" +echo " LLM_MODEL=qwen2.5-1.5b-instruct docker compose up -d" echo "" echo "To upgrade to full model after download completes:" echo " 
./scripts/upgrade-model.sh" diff --git a/dream-server/tests/test-concurrency.sh b/dream-server/tests/test-concurrency.sh old mode 100755 new mode 100644 index 73bbf86fd..773ee9d7e --- a/dream-server/tests/test-concurrency.sh +++ b/dream-server/tests/test-concurrency.sh @@ -2,8 +2,8 @@ # M8 Missing Test: Concurrency Test # Tests system stability under parallel load -VLLM_URL="http://localhost:8000" -MODEL="Qwen/Qwen2.5-32B-Instruct-AWQ" +LLAMA_SERVER_URL="http://localhost:8080" +MODEL="qwen2.5-32b-instruct" CONCURRENT_REQUESTS=5 echo "=== M8 Test: Concurrency ($CONCURRENT_REQUESTS parallel requests) ===" @@ -17,7 +17,7 @@ START=$(date +%s%N) for i in $(seq 1 $CONCURRENT_REQUESTS); do ( - curl -s -X POST "$VLLM_URL/v1/chat/completions" \ + curl -s -X POST "$LLAMA_SERVER_URL/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "{ \"model\": \"$MODEL\", diff --git a/dream-server/tests/test-dashboard-integration.sh b/dream-server/tests/test-dashboard-integration.sh old mode 100755 new mode 100644 diff --git a/dream-server/tests/test-embeddings-full.sh b/dream-server/tests/test-embeddings-full.sh old mode 100755 new mode 100644 index e94efe7a7..fd72622d2 --- a/dream-server/tests/test-embeddings-full.sh +++ b/dream-server/tests/test-embeddings-full.sh @@ -2,7 +2,7 @@ # M8 Missing Test: Embeddings Full Test # Tests actual embedding vector generation -VLLM_URL="http://localhost:8000" +LLAMA_SERVER_URL="http://localhost:8080" echo "=== M8 Test: Embeddings Full ===" @@ -10,10 +10,10 @@ TEST_TEXT="The quick brown fox jumps over the lazy dog" # Test embeddings endpoint START=$(date +%s%N) -RESPONSE=$(curl -s -X POST "$VLLM_URL/v1/embeddings" \ +RESPONSE=$(curl -s -X POST "$LLAMA_SERVER_URL/v1/embeddings" \ -H "Content-Type: application/json" \ -d "{ - \"model\": \"Qwen/Qwen2.5-32B-Instruct-AWQ\", + \"model\": \"qwen2.5-32b-instruct\", \"input\": \"$TEST_TEXT\" }" 2>/dev/null) END=$(date +%s%N) diff --git a/dream-server/tests/test-integration.sh 
b/dream-server/tests/test-integration.sh old mode 100755 new mode 100644 index a34809d78..912a587e6 --- a/dream-server/tests/test-integration.sh +++ b/dream-server/tests/test-integration.sh @@ -98,7 +98,7 @@ test_llm() { local data data=$(jq -n --arg prompt "$prompt" '{ - model: "Qwen/Qwen2.5-32B-Instruct-AWQ", + model: "qwen2.5-32b-instruct", messages: [{role: "user", content: $prompt}], max_tokens: 50, stream: false @@ -172,12 +172,12 @@ test_json "Voice status" "http://localhost:3002/api/voice/status" '.services' echo "" echo -e "${BLUE}▸ Core Services${NC}" -# vLLM +# llama-server if ! $QUICK; then - test_http "vLLM health" "http://localhost:8000/health" - test_llm "vLLM inference" "http://localhost:8000" "Say hello in exactly 3 words." + test_http "llama-server health" "http://localhost:8080/health" + test_llm "llama-server inference" "http://localhost:8080" "Say hello in exactly 3 words." else - log_skip "vLLM inference test" + log_skip "llama-server inference test" fi # n8n diff --git a/dream-server/tests/test-multi-turn.sh b/dream-server/tests/test-multi-turn.sh old mode 100755 new mode 100644 index 1c35b4a9f..77af499d6 --- a/dream-server/tests/test-multi-turn.sh +++ b/dream-server/tests/test-multi-turn.sh @@ -2,14 +2,14 @@ # M8 Missing Test: Multi-Turn Conversation Test # Tests context preservation across multiple exchanges -VLLM_URL="http://localhost:8000" -MODEL="Qwen/Qwen2.5-32B-Instruct-AWQ" +LLAMA_SERVER_URL="http://localhost:8080" +MODEL="qwen2.5-32b-instruct" echo "=== M8 Test: Multi-Turn Conversation ===" # Turn 1: Set context echo " Turn 1: Setting context..." -RESPONSE1=$(curl -s -X POST "$VLLM_URL/v1/chat/completions" \ +RESPONSE1=$(curl -s -X POST "$LLAMA_SERVER_URL/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "{ \"model\": \"$MODEL\", @@ -23,7 +23,7 @@ echo " Assistant: ${ASSISTANT1:0:50}..." # Turn 2: Test recall echo " Turn 2: Testing recall..." 
-RESPONSE2=$(curl -s -X POST "$VLLM_URL/v1/chat/completions" \ +RESPONSE2=$(curl -s -X POST "$LLAMA_SERVER_URL/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "{ \"model\": \"$MODEL\", diff --git a/dream-server/tests/test-phase-c-p1.sh b/dream-server/tests/test-phase-c-p1.sh old mode 100755 new mode 100644 index e87b79127..1736f6fa9 --- a/dream-server/tests/test-phase-c-p1.sh +++ b/dream-server/tests/test-phase-c-p1.sh @@ -155,19 +155,18 @@ else fi # ==============================================================═ -# C6. status.sh port validation -echo -e "${CYAN}-- C6. status.sh Port Verification --------------------------" - -STATUS_SCRIPT="${SCRIPT_DIR}/../status.sh" -if [ -f "$STATUS_SCRIPT" ]; then - # Check for incorrect port references - if grep -q "Portainer.*9000\|Whisper 9000\|Kokoro 8002" "$STATUS_SCRIPT" 2>/dev/null; then - log_fail "status.sh checks wrong ports (Portainer:9000, Whisper 9000, Kokoro 8002)" +# C6. dream-cli status (replaced status.sh) +echo -e "${CYAN}-- C6. dream-cli status command ------------------------------" + +DREAM_CLI="${SCRIPT_DIR}/../dream-cli" +if [ -f "$DREAM_CLI" ]; then + if grep -q "cmd_status" "$DREAM_CLI" 2>/dev/null; then + log_pass "dream-cli has cmd_status function" else - log_pass "status.sh uses correct ports" + log_fail "dream-cli missing cmd_status function" fi else - log_warn "status.sh not found" + log_warn "dream-cli not found" fi # ==============================================================═ @@ -206,10 +205,10 @@ echo -e "${CYAN}-- C9. 
dream-update.sh GitHub Repo --------------------------" UPDATE_SCRIPT="${SCRIPT_DIR}/../dream-update.sh" if [ -f "$UPDATE_SCRIPT" ]; then - if grep -q "GITHUB_REPO.*Light-Heart-Labs/Lighthouse-AI" "$UPDATE_SCRIPT" 2>/dev/null; then - log_fail "dream-update.sh hardcodes wrong GitHub repo (Android-Labs instead of Dream Server)" + if grep -q "GITHUB_REPO.*Light-Heart-Labs/DreamServer" "$UPDATE_SCRIPT" 2>/dev/null; then + log_pass "dream-update.sh GitHub repo configuration is correct (DreamServer)" else - log_pass "dream-update.sh GitHub repo configuration appears correct" + log_fail "dream-update.sh missing correct GitHub repo (should be Light-Heart-Labs/DreamServer)" fi else log_warn "dream-update.sh not found" @@ -236,15 +235,18 @@ fi # C11. Container UID/GID configuration echo -e "${CYAN}-- C11. Container UID/GID Configuration ---------------------" -COMPOSE_FILE="${SCRIPT_DIR}/../docker-compose.yml" +COMPOSE_FILE="${SCRIPT_DIR}/../docker-compose.base.yml" +if [ ! -f "$COMPOSE_FILE" ] && [ -f "${SCRIPT_DIR}/../docker-compose.yml" ]; then + COMPOSE_FILE="${SCRIPT_DIR}/../docker-compose.yml" +fi if [ -f "$COMPOSE_FILE" ]; then - if grep -qE 'user:\s*["\']?1000:1000["\']?' "$COMPOSE_FILE" 2>/dev/null; then - log_fail "docker-compose.yml hardcodes UID/GID 1000:1000" + if grep -qE "user:[[:space:]]*['\"]?1000:1000['\"]?" "$COMPOSE_FILE" 2>/dev/null; then + log_fail "$(basename "$COMPOSE_FILE") hardcodes UID/GID 1000:1000" else - log_pass "docker-compose.yml uses dynamic UID/GID" + log_pass "$(basename "$COMPOSE_FILE") uses dynamic UID/GID" fi else - log_warn "docker-compose.yml not found" + log_warn "compose file not found" fi # ==============================================================═ @@ -253,12 +255,12 @@ echo -e "${CYAN}-- C12. 
Docker Compose Profiles Auto-Start ------------------" if [ -f "$COMPOSE_FILE" ]; then if grep -q 'profiles:\s*\[default' "$COMPOSE_FILE" 2>/dev/null; then - log_fail "docker-compose.yml uses 'profiles: [default]' which doesn't auto-start" + log_fail "$(basename "$COMPOSE_FILE") uses 'profiles: [default]' which doesn't auto-start" else - log_pass "docker-compose.yml doesn't use problematic default profile" + log_pass "$(basename "$COMPOSE_FILE") doesn't use problematic default profile" fi else - log_warn "docker-compose.yml not found" + log_warn "compose file not found" fi # SUMMARY diff --git a/dream-server/tests/test-service-registry.sh b/dream-server/tests/test-service-registry.sh new file mode 100644 index 000000000..79f3c529a --- /dev/null +++ b/dream-server/tests/test-service-registry.sh @@ -0,0 +1,386 @@ +#!/bin/bash +# ============================================================================ +# Dream Server — Service Registry Test Suite +# ============================================================================ +# Tests the service registry (lib/service-registry.sh), manifest validation, +# and the enable/disable mechanism. 
+# +# Usage: bash tests/test-service-registry.sh +# Exit 0 if all pass, 1 if any fail +# ============================================================================ + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +PROJECT_DIR="$(dirname "$SCRIPT_DIR")" +cd "$PROJECT_DIR" + +# Colors +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +CYAN='\033[0;36m' +BOLD='\033[1m' +NC='\033[0m' + +PASS=0 +FAIL=0 +SKIP=0 + +pass() { + echo -e " ${GREEN}PASS${NC} $1" + PASS=$((PASS + 1)) +} + +fail() { + echo -e " ${RED}FAIL${NC} $1" + [[ -n "${2:-}" ]] && echo -e " ${RED}→ $2${NC}" + FAIL=$((FAIL + 1)) +} + +skip() { + echo -e " ${YELLOW}SKIP${NC} $1" + SKIP=$((SKIP + 1)) +} + +header() { + echo "" + echo -e "${BOLD}${CYAN}[$1]${NC} ${BOLD}$2${NC}" + echo -e "${CYAN}$(printf '%.0s─' {1..60})${NC}" +} + +# ============================================ +# TEST 1: Registry File Exists and Sources +# ============================================ +header "1/7" "Registry Library" + +if [[ -f "$PROJECT_DIR/lib/service-registry.sh" ]]; then + pass "lib/service-registry.sh exists" +else + fail "lib/service-registry.sh not found" + echo -e "${RED}Cannot continue without registry library.${NC}" + exit 1 +fi + +# Check bash syntax +if bash -n "$PROJECT_DIR/lib/service-registry.sh" 2>/dev/null; then + pass "lib/service-registry.sh has valid bash syntax" +else + fail "lib/service-registry.sh has syntax errors" +fi + +# Source it and load +export SCRIPT_DIR="$PROJECT_DIR" +. 
"$PROJECT_DIR/lib/service-registry.sh" + +if sr_load 2>/dev/null; then + pass "sr_load() succeeds" +else + fail "sr_load() failed" +fi + +if [[ ${#SERVICE_IDS[@]} -gt 0 ]]; then + pass "SERVICE_IDS populated (${#SERVICE_IDS[@]} services)" +else + fail "SERVICE_IDS is empty — no manifests loaded" +fi + +# ============================================ +# TEST 2: Manifest Schema Validation +# ============================================ +header "2/7" "Manifest Schema Validation" + +if ! python3 -c "import yaml" 2>/dev/null; then + skip "PyYAML not installed — cannot validate manifests" +else + manifest_count=0 + for svc_dir in "$PROJECT_DIR"/extensions/services/*/; do + [[ ! -d "$svc_dir" ]] && continue + manifest="$svc_dir/manifest.yaml" + [[ ! -f "$manifest" ]] && continue + manifest_count=$((manifest_count + 1)) + svc_name="$(basename "$svc_dir")" + + # Validate YAML syntax + if python3 -c "import yaml; yaml.safe_load(open('$manifest'))" 2>/dev/null; then + pass "Valid YAML: $svc_name/manifest.yaml" + else + fail "Invalid YAML: $svc_name/manifest.yaml" + continue + fi + + # Validate required fields + validation=$(python3 -c " +import yaml, sys +with open(sys.argv[1]) as f: + m = yaml.safe_load(f) +errors = [] +if m.get('schema_version') != 'dream.services.v1': + errors.append('missing/wrong schema_version') +s = m.get('service', {}) +if not isinstance(s, dict): + errors.append('service must be a dict') +else: + for field in ('id', 'name', 'port', 'health'): + if not s.get(field): + errors.append(f'missing required field: service.{field}') + if 'category' in s and s['category'] not in ('core', 'recommended', 'optional'): + errors.append(f'invalid category: {s[\"category\"]}') + if 'gpu_backends' in s: + for gb in s['gpu_backends']: + if gb not in ('amd', 'nvidia', 'all'): + errors.append(f'invalid gpu_backend: {gb}') + if 'aliases' in s and not isinstance(s['aliases'], list): + errors.append('aliases must be a list') + if 'depends_on' in s and not 
isinstance(s['depends_on'], list): + errors.append('depends_on must be a list') +if errors: + print('FAIL:' + '; '.join(errors)) +else: + print('OK') +" "$manifest" 2>&1) + + if [[ "$validation" == "OK" ]]; then + pass "Schema valid: $svc_name" + else + fail "Schema invalid: $svc_name" "${validation#FAIL:}" + fi + done + + if [[ $manifest_count -eq 0 ]]; then + fail "No manifest.yaml files found in extensions/services/*/" + else + pass "Validated $manifest_count manifests" + fi +fi + +# ============================================ +# TEST 3: Core Service Manifests +# ============================================ +header "3/7" "Core Service Manifests" + +expected_core=("llama-server" "open-webui" "dashboard" "dashboard-api") +for sid in "${expected_core[@]}"; do + manifest="$PROJECT_DIR/extensions/services/$sid/manifest.yaml" + if [[ -f "$manifest" ]]; then + pass "Core manifest exists: $sid" + else + fail "Core manifest missing: $sid" + continue + fi + + # Verify category is "core" + cat_check=$(python3 -c " +import yaml +m = yaml.safe_load(open('$manifest')) +print(m.get('service',{}).get('category','')) +" 2>/dev/null || echo "") + if [[ "$cat_check" == "core" ]]; then + pass "Category is core: $sid" + else + fail "Category is not core: $sid (got: $cat_check)" + fi +done + +# ============================================ +# TEST 4: Registry Resolution (Aliases) +# ============================================ +header "4/7" "Alias Resolution" + +# Test known aliases +declare -A expected_aliases=( + [llm]="llama-server" + [webui]="open-webui" + [ui]="open-webui" + [web]="open-webui" + [stt]="whisper" + [voice]="whisper" + [workflows]="n8n" + [search]="searxng" +) + +for alias in "${!expected_aliases[@]}"; do + expected="${expected_aliases[$alias]}" + resolved=$(sr_resolve "$alias") + if [[ "$resolved" == "$expected" ]]; then + pass "Alias '$alias' → '$expected'" + else + fail "Alias '$alias' → '$resolved' (expected: '$expected')" + fi +done + +# Identity 
resolution (service IDs resolve to themselves) +for sid in llama-server open-webui n8n whisper tts; do + resolved=$(sr_resolve "$sid") + if [[ "$resolved" == "$sid" ]]; then + pass "Identity: '$sid' → '$sid'" + else + fail "Identity broken: '$sid' → '$resolved'" + fi +done + +# Unknown names pass through unchanged +resolved=$(sr_resolve "nonexistent-service") +if [[ "$resolved" == "nonexistent-service" ]]; then + pass "Unknown name passes through: 'nonexistent-service'" +else + fail "Unknown name did not pass through: got '$resolved'" +fi + +# ============================================ +# TEST 5: Registry Data Completeness +# ============================================ +header "5/7" "Registry Data Completeness" + +for sid in "${SERVICE_IDS[@]}"; do + # Every service should have a name + if [[ -n "${SERVICE_NAMES[$sid]:-}" ]]; then + pass "Has name: $sid → ${SERVICE_NAMES[$sid]}" + else + fail "Missing name: $sid" + fi + + # Every service should have a category + cat="${SERVICE_CATEGORIES[$sid]:-}" + if [[ "$cat" == "core" || "$cat" == "recommended" || "$cat" == "optional" ]]; then + pass "Valid category: $sid → $cat" + else + fail "Invalid/missing category: $sid → '$cat'" + fi + + # Every service should have a health endpoint + if [[ -n "${SERVICE_HEALTH[$sid]:-}" ]]; then + pass "Has health endpoint: $sid → ${SERVICE_HEALTH[$sid]}" + else + fail "Missing health endpoint: $sid" + fi + + # Every service should have a port + port="${SERVICE_PORTS[$sid]:-0}" + if [[ "$port" != "0" ]]; then + pass "Has port: $sid → $port" + else + fail "Missing/zero port: $sid" + fi +done + +# ============================================ +# TEST 6: Compose Fragment Consistency +# ============================================ +header "6/7" "Compose Fragments" + +for sid in "${SERVICE_IDS[@]}"; do + cat="${SERVICE_CATEGORIES[$sid]}" + svc_dir="$PROJECT_DIR/extensions/services/$sid" + + if [[ "$cat" == "core" ]]; then + # Core services should NOT have compose.yaml (live in
base.yml) + if [[ ! -f "$svc_dir/compose.yaml" ]]; then + pass "Core service has no compose fragment: $sid" + else + # comfyui is an exception — it has a stub compose.yaml + # Actually, let's just warn — some core services might have compose fragments + fail "Core service has compose fragment (unexpected): $sid" + fi + else + # Extension services should have compose.yaml (enabled) or compose.yaml.disabled + if [[ -f "$svc_dir/compose.yaml" || -f "$svc_dir/compose.yaml.disabled" ]]; then + pass "Extension has compose fragment: $sid" + else + fail "Extension missing compose fragment: $sid" + fi + + # If compose.yaml exists, validate it + if [[ -f "$svc_dir/compose.yaml" ]]; then + if python3 -c "import yaml; yaml.safe_load(open('$svc_dir/compose.yaml'))" 2>/dev/null; then + pass "Valid YAML compose: $sid/compose.yaml" + else + fail "Invalid YAML compose: $sid/compose.yaml" + fi + fi + fi +done + +# ============================================ +# TEST 7: Enable/Disable Mechanism +# ============================================ +header "7/7" "Enable/Disable Mechanism" + +# Find a non-core service that's currently enabled (has compose.yaml) +test_service="" +for sid in "${SERVICE_IDS[@]}"; do + cat="${SERVICE_CATEGORIES[$sid]}" + svc_dir="$PROJECT_DIR/extensions/services/$sid" + if [[ "$cat" != "core" && -f "$svc_dir/compose.yaml" ]]; then + test_service="$sid" + break + fi +done + +if [[ -z "$test_service" ]]; then + skip "No enabled non-core service found to test disable/enable cycle" +else + svc_dir="$PROJECT_DIR/extensions/services/$test_service" + pass "Selected test service: $test_service" + + # Disable: rename compose.yaml → compose.yaml.disabled + cp "$svc_dir/compose.yaml" "$svc_dir/compose.yaml.backup" + mv "$svc_dir/compose.yaml" "$svc_dir/compose.yaml.disabled" + + if [[ !
-f "$svc_dir/compose.yaml" && -f "$svc_dir/compose.yaml.disabled" ]]; then + pass "Disable works: compose.yaml โ†’ compose.yaml.disabled" + else + fail "Disable failed: files not in expected state" + fi + + # Verify sr_list_enabled no longer includes it + _SR_LOADED=false # Force reload + sr_load + enabled_list=$(sr_list_enabled) + if echo "$enabled_list" | grep -q "^${test_service}$"; then + fail "Disabled service still appears in sr_list_enabled" + else + pass "Disabled service excluded from sr_list_enabled" + fi + + # Re-enable: rename back + mv "$svc_dir/compose.yaml.disabled" "$svc_dir/compose.yaml" + + if [[ -f "$svc_dir/compose.yaml" && ! -f "$svc_dir/compose.yaml.disabled" ]]; then + pass "Enable works: compose.yaml.disabled โ†’ compose.yaml" + else + fail "Enable failed: files not in expected state" + fi + + # Verify it's back in sr_list_enabled + _SR_LOADED=false + sr_load + enabled_list=$(sr_list_enabled) + if echo "$enabled_list" | grep -q "^${test_service}$"; then + pass "Re-enabled service appears in sr_list_enabled" + else + fail "Re-enabled service not in sr_list_enabled" + fi + + # Clean up backup + rm -f "$svc_dir/compose.yaml.backup" + pass "Cleanup complete" +fi + +# ============================================ +# Summary +# ============================================ +echo "" +echo -e "${BOLD}${CYAN}โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”${NC}" +TOTAL=$((PASS + FAIL + SKIP)) +echo -e "${BOLD} Results: ${GREEN}$PASS passed${NC}, ${RED}$FAIL failed${NC}, ${YELLOW}$SKIP skipped${NC} ${BOLD}($TOTAL total)${NC}" +echo -e "${BOLD}${CYAN}โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”${NC}" +echo "" + +if [[ $FAIL -gt 0 ]]; then + echo -e "${RED}Some tests failed.${NC}" + exit 1 
+else + echo -e "${GREEN}All tests passed!${NC}" + exit 0 +fi diff --git a/dream-server/tests/test-streaming.sh b/dream-server/tests/test-streaming.sh old mode 100755 new mode 100644 index 05cab759d..9e3fd4142 --- a/dream-server/tests/test-streaming.sh +++ b/dream-server/tests/test-streaming.sh @@ -2,14 +2,14 @@ # M8 Missing Test: Streaming Test # Tests LLM streaming responses -VLLM_URL="http://localhost:8000" -MODEL="Qwen/Qwen2.5-32B-Instruct-AWQ" +LLAMA_SERVER_URL="http://localhost:8080" +MODEL="qwen2.5-32b-instruct" echo "=== M8 Test: Streaming ===" # Test streaming endpoint START=$(date +%s%N) -RESPONSE=$(curl -s -N -X POST "$VLLM_URL/v1/chat/completions" \ +RESPONSE=$(curl -s -N -X POST "$LLAMA_SERVER_URL/v1/chat/completions" \ -H "Content-Type: application/json" \ -d "{ \"model\": \"$MODEL\", diff --git a/dream-server/tests/test-stt-full.sh b/dream-server/tests/test-stt-full.sh old mode 100755 new mode 100644 diff --git a/dream-server/tests/test-tier-map.sh b/dream-server/tests/test-tier-map.sh new file mode 100644 index 000000000..292c22897 --- /dev/null +++ b/dream-server/tests/test-tier-map.sh @@ -0,0 +1,138 @@ +#!/bin/bash +# ============================================================================ +# Test: resolve_tier_config() — tier-map.sh +# ============================================================================ +# Sources the actual tier-map.sh and verifies each tier resolves to the +# correct LLM_MODEL, GGUF_FILE, and MAX_CONTEXT. +# +# Run: bash tests/test-tier-map.sh +# ============================================================================ + +set -uo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.."
&& pwd)" +PASS=0 +FAIL=0 + +# Minimal stubs for dependencies +error() { echo "ERROR: $*" >&2; return 1; } + +# Source the module under test +source "$SCRIPT_DIR/installers/lib/tier-map.sh" + +assert_eq() { + local label="$1" expected="$2" actual="$3" + if [[ "$expected" == "$actual" ]]; then + echo " PASS: $label" + ((PASS++)) + else + echo " FAIL: $label (expected '$expected', got '$actual')" + ((FAIL++)) + fi +} + +run_tier() { + local tier_val="$1" + TIER="$tier_val" + # Reset globals + TIER_NAME="" LLM_MODEL="" GGUF_FILE="" GGUF_URL="" MAX_CONTEXT="" + resolve_tier_config +} + +echo "=== Testing resolve_tier_config() ===" +echo "" + +# --- Tier 1: Entry Level --- +echo "Tier 1 (Entry Level):" +run_tier 1 +assert_eq "TIER_NAME" "Entry Level" "$TIER_NAME" +assert_eq "LLM_MODEL" "qwen3-8b" "$LLM_MODEL" +assert_eq "GGUF_FILE" "Qwen3-8B-Q4_K_M.gguf" "$GGUF_FILE" +assert_eq "MAX_CONTEXT" "16384" "$MAX_CONTEXT" +echo "" + +# --- Tier 2: Prosumer --- +echo "Tier 2 (Prosumer):" +run_tier 2 +assert_eq "TIER_NAME" "Prosumer" "$TIER_NAME" +assert_eq "LLM_MODEL" "qwen3-8b" "$LLM_MODEL" +assert_eq "GGUF_FILE" "Qwen3-8B-Q4_K_M.gguf" "$GGUF_FILE" +assert_eq "MAX_CONTEXT" "32768" "$MAX_CONTEXT" +echo "" + +# --- Tier 3: Pro --- +echo "Tier 3 (Pro):" +run_tier 3 +assert_eq "TIER_NAME" "Pro" "$TIER_NAME" +assert_eq "LLM_MODEL" "qwen3-14b" "$LLM_MODEL" +assert_eq "GGUF_FILE" "Qwen3-14B-Q4_K_M.gguf" "$GGUF_FILE" +assert_eq "MAX_CONTEXT" "32768" "$MAX_CONTEXT" +echo "" + +# --- Tier 4: Enterprise --- +echo "Tier 4 (Enterprise):" +run_tier 4 +assert_eq "TIER_NAME" "Enterprise" "$TIER_NAME" +assert_eq "LLM_MODEL" "qwen3-30b-a3b" "$LLM_MODEL" +assert_eq "GGUF_FILE" "qwen3-30b-a3b-Q4_K_M.gguf" "$GGUF_FILE" +assert_eq "MAX_CONTEXT" "131072" "$MAX_CONTEXT" +echo "" + +# --- NV_ULTRA --- +echo "NV_ULTRA (NVIDIA Ultra 90GB+):" +run_tier NV_ULTRA +assert_eq "TIER_NAME" "NVIDIA Ultra (90GB+)" "$TIER_NAME" +assert_eq "LLM_MODEL" "qwen3-coder-next" "$LLM_MODEL" +assert_eq "GGUF_FILE" 
"qwen3-coder-next-Q4_K_M.gguf" "$GGUF_FILE" +assert_eq "MAX_CONTEXT" "131072" "$MAX_CONTEXT" +echo "" + +# --- SH_LARGE --- +echo "SH_LARGE (Strix Halo 90+):" +run_tier SH_LARGE +assert_eq "TIER_NAME" "Strix Halo 90+" "$TIER_NAME" +assert_eq "LLM_MODEL" "qwen3-coder-next" "$LLM_MODEL" +assert_eq "GGUF_FILE" "qwen3-coder-next-Q4_K_M.gguf" "$GGUF_FILE" +assert_eq "MAX_CONTEXT" "131072" "$MAX_CONTEXT" +echo "" + +# --- SH_COMPACT --- +echo "SH_COMPACT (Strix Halo Compact):" +run_tier SH_COMPACT +assert_eq "TIER_NAME" "Strix Halo Compact" "$TIER_NAME" +assert_eq "LLM_MODEL" "qwen3-30b-a3b" "$LLM_MODEL" +assert_eq "GGUF_FILE" "qwen3-30b-a3b-Q4_K_M.gguf" "$GGUF_FILE" +assert_eq "MAX_CONTEXT" "131072" "$MAX_CONTEXT" +echo "" + +# --- Invalid tier should fail --- +echo "Invalid tier (should fail):" +if TIER="INVALID" resolve_tier_config 2>/dev/null; then + echo " FAIL: Invalid tier did not return error" + ((FAIL++)) +else + echo " PASS: Invalid tier returned error" + ((PASS++)) +fi +echo "" + +# --- GGUF_URL should be set for all tiers --- +echo "GGUF_URL populated for all tiers:" +for t in 1 2 3 4 NV_ULTRA SH_LARGE SH_COMPACT; do + run_tier "$t" + if [[ -n "$GGUF_URL" && "$GGUF_URL" == https://* ]]; then + echo " PASS: Tier $t has valid GGUF_URL" + ((PASS++)) + else + echo " FAIL: Tier $t missing or invalid GGUF_URL" + ((FAIL++)) + fi +done +echo "" + +# --- Summary --- +echo "===============================" +echo "Results: $PASS passed, $FAIL failed" +echo "===============================" + +[[ $FAIL -eq 0 ]] && exit 0 || exit 1 diff --git a/dream-server/tests/test-tts-full.sh b/dream-server/tests/test-tts-full.sh old mode 100755 new mode 100644 diff --git a/dream-server/tests/test_endpoints.py b/dream-server/tests/test_endpoints.py deleted file mode 100644 index 50b45272e..000000000 --- a/dream-server/tests/test_endpoints.py +++ /dev/null @@ -1,194 +0,0 @@ -#!/usr/bin/env python3 -""" -Dream Server API Endpoint Tests -Run with: pytest test_endpoints.py -v -""" - 
-import pytest -import httpx -import asyncio -import os -from typing import Optional - -# Service URLs (allow environment overrides) -API_URL = os.getenv("DREAM_API_URL", "http://localhost:3002") -VLLM_URL = os.getenv("DREAM_VLLM_URL", "http://localhost:8000") -N8N_URL = os.getenv("DREAM_N8N_URL", "http://localhost:5678") - - -@pytest.fixture -def client(): - return httpx.Client(timeout=10.0) - - -class TestDashboardAPI: - """Dashboard API endpoint tests.""" - - def test_health(self, client): - """API health check returns ok.""" - r = client.get(f"{API_URL}/health") - assert r.status_code == 200 - data = r.json() - assert data["status"] == "ok" - - def test_api_status(self, client): - """Full status endpoint returns expected structure.""" - r = client.get(f"{API_URL}/api/status") - assert r.status_code == 200 - data = r.json() - assert "gpu" in data or "services" in data - assert "tier" in data - - def test_gpu_metrics(self, client): - """GPU endpoint returns NVIDIA metrics.""" - r = client.get(f"{API_URL}/gpu") - if r.status_code == 503: - pytest.skip("No GPU available") - assert r.status_code == 200 - data = r.json() - assert "name" in data - assert "memory_used_mb" in data - assert "memory_total_mb" in data - - def test_services_list(self, client): - """Services endpoint returns service health.""" - r = client.get(f"{API_URL}/services") - assert r.status_code == 200 - data = r.json() - assert isinstance(data, list) - assert len(data) > 0 - # Each service should have id, name, status - for svc in data: - assert "id" in svc - assert "name" in svc - assert "status" in svc - - def test_disk_usage(self, client): - """Disk endpoint returns usage info.""" - r = client.get(f"{API_URL}/disk") - assert r.status_code == 200 - data = r.json() - assert "path" in data - assert "used_gb" in data - assert "total_gb" in data - - -class TestModelAPI: - """Model Manager API tests.""" - - def test_model_catalog(self, client): - """Model catalog returns list of models.""" - r = 
client.get(f"{API_URL}/api/models") - assert r.status_code == 200 - data = r.json() - assert "models" in data - assert len(data["models"]) > 0 - # Each model should have required fields - for model in data["models"]: - assert "id" in model - assert "name" in model - assert "vramRequired" in model - assert "status" in model - - def test_model_vram_info(self, client): - """Model catalog includes GPU VRAM info.""" - r = client.get(f"{API_URL}/api/models") - assert r.status_code == 200 - data = r.json() - assert "gpu" in data - assert "vramTotal" in data["gpu"] - - -class TestWorkflowAPI: - """Workflow Gallery API tests.""" - - def test_workflow_catalog(self, client): - """Workflow catalog returns list of workflows.""" - r = client.get(f"{API_URL}/api/workflows") - assert r.status_code == 200 - data = r.json() - assert "workflows" in data - assert len(data["workflows"]) > 0 - - def test_workflow_structure(self, client): - """Each workflow has required fields.""" - r = client.get(f"{API_URL}/api/workflows") - assert r.status_code == 200 - data = r.json() - for wf in data["workflows"]: - assert "id" in wf - assert "name" in wf - assert "description" in wf - assert "dependencies" in wf - assert "status" in wf - - def test_workflow_categories(self, client): - """Workflow catalog includes categories.""" - r = client.get(f"{API_URL}/api/workflows") - assert r.status_code == 200 - data = r.json() - assert "categories" in data - assert len(data["categories"]) > 0 - - -class TestVoiceAPI: - """Voice API tests.""" - - def test_voice_status(self, client): - """Voice status returns service health.""" - r = client.get(f"{API_URL}/api/voice/status") - assert r.status_code == 200 - data = r.json() - assert "services" in data - assert "stt" in data["services"] - assert "tts" in data["services"] - assert "livekit" in data["services"] - - -class TestVLLM: - """vLLM inference tests.""" - - def test_vllm_health(self, client): - """vLLM health check.""" - try: - r = 
client.get(f"{VLLM_URL}/health") - assert r.status_code == 200 - except httpx.ConnectError: - pytest.skip("vLLM not running") - - def test_vllm_inference(self, client): - """vLLM can generate completions.""" - try: - r = client.post( - f"{VLLM_URL}/v1/chat/completions", - json={ - "model": "Qwen/Qwen2.5-32B-Instruct-AWQ", - "messages": [{"role": "user", "content": "Say hello"}], - "max_tokens": 10, - "stream": False - }, - timeout=30.0 - ) - if r.status_code == 200: - data = r.json() - assert "choices" in data - assert len(data["choices"]) > 0 - assert "message" in data["choices"][0] - except httpx.ConnectError: - pytest.skip("vLLM not running") - - -class TestN8N: - """n8n workflow engine tests.""" - - def test_n8n_health(self, client): - """n8n health check.""" - try: - r = client.get(f"{N8N_URL}/healthz") - assert r.status_code == 200 - except httpx.ConnectError: - pytest.skip("n8n not running") - - -if __name__ == "__main__": - pytest.main([__file__, "-v"]) diff --git a/dream-server/tests/test_installer.py b/dream-server/tests/test_installer.py deleted file mode 100644 index 8eb742446..000000000 --- a/dream-server/tests/test_installer.py +++ /dev/null @@ -1,514 +0,0 @@ -#!/usr/bin/env python3 -""" -P3.1 Dream Server Installer Test Suite -Comprehensive automated testing for installer behavior across tiers - -Run: pytest tests/test_installer.py -v - pytest tests/test_installer.py -v -k "tier" # Tier-specific tests only - pytest tests/test_installer.py -v -k "security" # Security tests only -""" - -import os -import sys -import json -import stat -import shutil -import tempfile -import subprocess -from pathlib import Path -from unittest.mock import Mock, patch, MagicMock, call -import pytest - -# Add parent to path for importing installer modules -sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) - - -class TestInstallerTiers: - """Test hardware tier detection and recommendations.""" - - @pytest.fixture - def mock_gpu_info(self): - 
"""Mock GPU detection responses.""" - return { - "rtx_4090": {"name": "NVIDIA RTX 4090", "vram_gb": 24}, - "rtx_3090": {"name": "NVIDIA RTX 3090", "vram_gb": 24}, - "rtx_4070": {"name": "NVIDIA RTX 4070", "vram_gb": 12}, - "rtx_4060": {"name": "NVIDIA RTX 4060", "vram_gb": 8}, - "none": {"name": None, "vram_gb": 0} - } - - def test_tier_1_detection_entry_level(self): - """Tier 1: Entry level with <8GB VRAM.""" - # <8GB VRAM maps to tier 1 (7B models) - vram_gb = 6 # Example: GTX 1060 6GB - expected_tier = 1 - - # Tier logic: <8GB = Tier 1, 8-12GB = Tier 2, 12-24GB = Tier 3, 24GB+ = Tier 4 - if vram_gb < 8: - tier = 1 - elif vram_gb < 12: - tier = 2 - elif vram_gb < 24: - tier = 3 - else: - tier = 4 - - assert tier == expected_tier - assert tier == 1 - - def test_tier_2_detection_prosumer(self): - """Tier 2: Prosumer with 12GB VRAM.""" - vram_gb = 12 - - if vram_gb < 8: - tier = 1 - elif vram_gb < 12: - tier = 2 - elif vram_gb < 24: - tier = 3 - else: - tier = 4 - - assert tier == 3 # 12GB is Tier 3 boundary - - def test_tier_3_detection_pro(self): - """Tier 3: Pro with 24GB VRAM.""" - vram_gb = 24 - - if vram_gb < 8: - tier = 1 - elif vram_gb < 12: - tier = 2 - elif vram_gb < 24: - tier = 3 - else: - tier = 4 - - assert tier == 4 # 24GB+ is Tier 4 - - def test_tier_4_detection_enterprise(self): - """Tier 4: Enterprise with 48GB VRAM.""" - vram_gb = 48 - - if vram_gb < 8: - tier = 1 - elif vram_gb < 12: - tier = 2 - elif vram_gb < 24: - tier = 3 - else: - tier = 4 - - assert tier == 4 - - def test_tier_model_mapping(self): - """Test that tiers map to correct model sizes.""" - tier_models = { - 1: {"model": "Qwen2.5-7B-Q4_K_M", "ctx": 32768, "quant": "GGUF"}, - 2: {"model": "Qwen2.5-14B-AWQ", "ctx": 32768, "quant": "AWQ"}, - 3: {"model": "Qwen2.5-32B-AWQ", "ctx": 32768, "quant": "AWQ"}, - 4: {"model": "Qwen2.5-72B-AWQ", "ctx": 32768, "quant": "AWQ"} - } - - assert tier_models[1]["model"] == "Qwen2.5-7B-Q4_K_M" - assert tier_models[2]["model"] == "Qwen2.5-14B-AWQ" - 
assert tier_models[3]["model"] == "Qwen2.5-32B-AWQ" - assert tier_models[4]["model"] == "Qwen2.5-72B-AWQ" - - -class TestHardwareDetection: - """Test hardware detection functions.""" - - def test_nvidia_gpu_detection_regex(self): - """Test NVIDIA GPU name parsing from nvidia-smi.""" - sample_output = "NVIDIA GeForce RTX 4090" - - # Should extract GPU model - if "RTX" in sample_output: - gpu_model = sample_output.split("RTX")[-1].strip() - assert gpu_model == "4090" - - def test_vram_parsing(self): - """Test VRAM parsing from nvidia-smi.""" - # MiB to GB conversion - mib = 24576 # 24GB in MiB - gb = round(mib / 1024) - assert gb == 24 - - def test_cpu_info_parsing(self): - """Test CPU info extraction.""" - cpu_info = "AMD Ryzen 9 7950X 16-Core Processor" - - # Should extract model and cores - assert "AMD" in cpu_info - assert "7950X" in cpu_info - assert "16-Core" in cpu_info - - def test_ram_parsing(self): - """Test RAM parsing from /proc/meminfo.""" - # kB to GB conversion - kb = 67108864 # 64GB in kB - gb = round(kb / 1024 / 1024) - assert gb == 64 - - def test_disk_space_check(self): - """Test available disk space parsing.""" - # Test tier-aware requirements - requirements = { - 1: 30, # 30GB minimum - 2: 50, # 50GB minimum - 3: 100, # 100GB minimum - 4: 150 # 150GB minimum - } - - assert requirements[1] == 30 - assert requirements[4] == 150 - - -class TestSecurityChecks: - """Test security-related installer behavior.""" - - @pytest.fixture - def temp_env_file(self): - """Create temporary .env file for testing.""" - with tempfile.NamedTemporaryFile(mode='w', suffix='.env', delete=False) as f: - f.write("HF_TOKEN=test_token\n") - f.write("API_KEY=secret_key\n") - f.write("DB_PASSWORD=db_pass\n") - temp_path = f.name - yield temp_path - os.unlink(temp_path) - - def test_env_file_permissions_600(self, temp_env_file): - """Test .env file gets 600 permissions (owner read/write only).""" - # Set permissions to 600 - os.chmod(temp_env_file, stat.S_IRUSR | stat.S_IWUSR) 
- - # Verify permissions - file_stat = os.stat(temp_env_file) - mode = stat.S_IMODE(file_stat.st_mode) - - assert mode == 0o600, f"Expected 0o600, got {oct(mode)}" - - def test_env_file_not_world_readable(self, temp_env_file): - """Ensure .env file is not world-readable.""" - os.chmod(temp_env_file, stat.S_IRUSR | stat.S_IWUSR) - - file_stat = os.stat(temp_env_file) - mode = stat.S_IMODE(file_stat.st_mode) - - # Check world permissions - world_readable = bool(mode & stat.S_IROTH) - assert not world_readable, ".env file should not be world-readable" - - def test_env_file_not_group_readable(self, temp_env_file): - """Ensure .env file is not group-readable.""" - os.chmod(temp_env_file, stat.S_IRUSR | stat.S_IWUSR) - - file_stat = os.stat(temp_env_file) - mode = stat.S_IMODE(file_stat.st_mode) - - # Check group permissions - group_readable = bool(mode & stat.S_IRGRP) - assert not group_readable, ".env file should not be group-readable" - - def test_hf_token_validation_present(self, temp_env_file): - """Test that HF_TOKEN validation detects tokens in .env.""" - with open(temp_env_file, 'r') as f: - content = f.read() - - assert "HF_TOKEN=" in content - - # Extract token value - for line in content.split('\n'): - if line.startswith('HF_TOKEN='): - token = line.split('=', 1)[1] - assert token == "test_token" - break - - def test_hf_token_warning_for_gated_models(self): - """Test that warning is shown for Llama models requiring HF_TOKEN.""" - model = "meta-llama/Llama-2-7b" - requires_token = "llama" in model.lower() - - assert requires_token == True - - -class TestPortChecks: - """Test port availability checking.""" - - def test_port_check_regex_ipv4(self): - """Test port regex handles IPv4 addresses.""" - port_output = "tcp 0 0 0.0.0.0:3000 0.0.0.0:* LISTEN" - - # Extract port - import re - match = re.search(r':(\d+)\s+0\.0\.0\.0', port_output) - if match: - port = int(match.group(1)) - assert port == 3000 - - def test_port_check_regex_ipv6(self): - """Test port regex 
handles IPv6 addresses.""" - port_output = "tcp6 0 0 :::3000 :::* LISTEN" - - import re - # Should match IPv6 format - match = re.search(r':::(\d+)', port_output) - if match: - port = int(match.group(1)) - assert port == 3000 - - def test_critical_ports_list(self): - """Test that critical ports are defined.""" - critical_ports = [3000, 3001, 8000, 8080, 9100, 9101, 9102] - - assert 3000 in critical_ports # Open WebUI - assert 3001 in critical_ports # Dashboard - assert 8000 in critical_ports # vLLM - assert 9101 in critical_ports # Whisper STT - assert 9102 in critical_ports # TTS - - def test_port_availability_check(self): - """Test port availability logic.""" - used_ports = [3000, 8000] - test_port = 3001 - - is_available = test_port not in used_ports - assert is_available == True - - -class TestDiskSpaceChecks: - """Test disk space validation.""" - - def test_disk_space_tier_1_requirement(self): - """Test Tier 1 minimum disk requirement (30GB).""" - available_gb = 50 - required_gb = 30 - - assert available_gb >= required_gb - - def test_disk_space_tier_4_requirement(self): - """Test Tier 4 minimum disk requirement (150GB).""" - available_gb = 200 - required_gb = 150 - - assert available_gb >= required_gb - - def test_disk_space_insufficient_warning(self): - """Test warning when disk space is insufficient.""" - available_gb = 20 - required_gb = 30 - - has_enough = available_gb >= required_gb - assert has_enough == False - - def test_disk_space_calculation(self): - """Test disk space calculation from df output.""" - # Simulate df -BG output parsing - df_line = "/dev/nvme0n1p1 915G 123G 745G 15% /" - parts = df_line.split() - available = parts[3] # Available column - - # Parse GB value - if 'G' in available: - gb = int(available.replace('G', '')) - assert gb == 745 - - -class TestDownloadLogic: - """Test download and retry logic.""" - - def test_retry_mechanism_max_attempts(self): - """Test that download retries up to MAX_DOWNLOAD_RETRIES.""" - MAX_RETRIES = 3 - 
attempts = 0 - - # Simulate failed download with retries - for i in range(MAX_RETRIES): - attempts += 1 - if i < MAX_RETRIES - 1: - continue # Simulate failure - else: - break # Success or final failure - - assert attempts <= MAX_RETRIES - - def test_partial_download_cleanup(self): - """Test that partial downloads are cleaned up on failure.""" - with tempfile.TemporaryDirectory() as tmpdir: - partial_file = os.path.join(tmpdir, "model.gguf.tmp") - - # Create partial file - with open(partial_file, 'w') as f: - f.write("partial data") - - assert os.path.exists(partial_file) - - # Simulate cleanup - os.remove(partial_file) - - assert not os.path.exists(partial_file) - - def test_download_resume_capability(self): - """Test download resume with partial files.""" - # If partial file exists, resume from where it left off - partial_size = 1024 * 1024 * 100 # 100MB partial - total_size = 1024 * 1024 * 500 # 500MB total - - resume_from = partial_size - remaining = total_size - partial_size - - assert resume_from == 100 * 1024 * 1024 - assert remaining == 400 * 1024 * 1024 - - -class TestDockerIntegration: - """Test Docker-related installer functionality.""" - - def test_docker_compose_file_selection_by_tier(self): - """Test correct docker-compose file selection per tier.""" - compose_files = { - 1: "docker-compose.yml", - 2: "docker-compose.yml", - 3: "docker-compose.yml", - 4: "docker-compose.yml", - "edge": "docker-compose.edge.yml" - } - - assert compose_files["edge"] == "docker-compose.edge.yml" - - def test_docker_service_healthchecks(self): - """Test that critical services have healthchecks defined.""" - services_with_healthchecks = [ - "vllm", "dashboard-api", "whisper", "kokoro-tts" - ] - - assert "vllm" in services_with_healthchecks - assert "whisper" in services_with_healthchecks - - def test_docker_group_membership(self): - """Test Docker group handling in installer.""" - # User should be added to docker group if not already member - groups = ["michael", "docker", 
"sudo"] - - assert "docker" in groups - - -class TestBootstrapMode: - """Test bootstrap mode functionality.""" - - def test_bootstrap_model_selection(self): - """Test that bootstrap mode uses 1.5B model.""" - bootstrap_model = "Qwen2.5-1.5B-Instruct" - - assert "1.5B" in bootstrap_model - - def test_bootstrap_quick_start(self): - """Test bootstrap mode enables instant startup.""" - # Bootstrap mode should skip large model download - bootstrap_enabled = True - - assert bootstrap_enabled == True - - def test_bootstrap_upgrade_path(self): - """Test that bootstrap allows tier-based upgrade.""" - # After bootstrap, user should be able to upgrade to tier model - initial_tier = "bootstrap" - target_tier = 3 - - assert initial_tier == "bootstrap" - assert target_tier > 0 - - -class TestOfflineMode: - """Test offline/air-gapped mode (M1).""" - - def test_offline_mode_detection(self): - """Test offline mode flag.""" - offline_mode = True - - assert offline_mode == True - - def test_offline_model_validation(self): - """Test that models are pre-downloaded in offline mode.""" - required_models = ["qwen-2.5-7b.gguf"] - available_models = ["qwen-2.5-7b.gguf", "qwen-2.5-14b.gguf"] - - for model in required_models: - assert model in available_models - - def test_offline_no_internet_calls(self): - """Test that offline mode skips internet-dependent operations.""" - operations = ["docker_pull", "model_download", "git_clone"] - offline_skip = ["model_download", "git_clone"] - - for op in offline_skip: - assert op in operations - - -class TestIntegrationScenarios: - """End-to-end integration test scenarios.""" - - def test_full_install_tier_2_with_voice(self): - """Test Tier 2 installation with voice services.""" - tier = 2 - enable_voice = True - - # Should select appropriate models - assert tier == 2 - assert enable_voice == True - - def test_non_interactive_install(self): - """Test non-interactive mode with flags.""" - args = { - "tier": 3, - "voice": True, - "workflows": True, - 
"rag": True, - "non_interactive": True - } - - assert args["non_interactive"] == True - assert args["tier"] == 3 - - def test_dry_run_mode(self): - """Test dry-run mode shows actions without executing.""" - dry_run = True - - # In dry-run, no actual changes should be made - assert dry_run == True - - -class TestErrorHandling: - """Test installer error handling.""" - - def test_docker_not_installed_error(self): - """Test graceful error when Docker is not installed.""" - docker_installed = False - - if not docker_installed: - should_offer_install = True - assert should_offer_install == True - - def test_nvidia_driver_missing_warning(self): - """Test warning when NVIDIA drivers are missing.""" - nvidia_available = False - - if not nvidia_available: - should_warn = True - assert should_warn == True - - def test_insufficient_disk_space_error(self): - """Test error when disk space is insufficient.""" - available_gb = 10 - required_gb = 30 - - if available_gb < required_gb: - should_error = True - assert should_error == True - - -# Run tests if executed directly -if __name__ == "__main__": - pytest.main([__file__, "-v"]) diff --git a/dream-server/tests/test_m4_voice_shield_integration.py b/dream-server/tests/test_m4_voice_shield_integration.py deleted file mode 100644 index e4a9053fe..000000000 --- a/dream-server/tests/test_m4_voice_shield_integration.py +++ /dev/null @@ -1,448 +0,0 @@ -#!/usr/bin/env python3 -""" -M4 Voice-to-Shield Integration Test Suite -Validates the complete voice โ†’ shield โ†’ API pipeline per M4 spec - -The Privacy Shield is a transparent proxy - it intercepts chat completions -and performs anonymization/deanonymization automatically. 
- -Usage: - python3 tests/test_m4_voice_shield_integration.py - python3 tests/test_m4_voice_shield_integration.py --stress - python3 tests/test_m4_voice_shield_integration.py --verbose - -Exit codes: - 0 - All tests passed - 1 - Some tests failed -""" - -import os -import sys -import json -import time -import asyncio -import argparse -from typing import Dict, Any, Optional -from dataclasses import dataclass -from pathlib import Path - -import httpx - -# Configuration -SHIELD_URL = os.getenv("SHIELD_URL", "http://localhost:8085/v1/chat/completions") -DIRECT_LLM_URL = os.getenv("DIRECT_LLM_URL", "http://localhost:8003/v1/chat/completions") -STT_URL = os.getenv("STT_URL", "http://localhost:9000/v1/audio/transcriptions") -TTS_URL = os.getenv("TTS_URL", "http://localhost:8880/v1/audio/speech") -SHIELD_HEALTH = os.getenv("SHIELD_HEALTH", "http://localhost:8085/health") - -TIMEOUT = 30.0 - - -@dataclass -class PipelineResult: - """Result of a pipeline stage.""" - stage: str - success: bool - latency_ms: float - error: Optional[str] = None - data: Optional[Dict] = None - - -class M4IntegrationTest: - """M4 Voice-Shield integration test suite.""" - - def __init__(self, verbose: bool = False): - self.verbose = verbose - self.results: list[PipelineResult] = [] - self.client = httpx.AsyncClient(timeout=TIMEOUT) - - async def __aenter__(self): - return self - - async def __aexit__(self, *args): - await self.client.aclose() - - def log(self, message: str): - """Print if verbose mode.""" - if self.verbose: - print(f" [M4] {message}") - - # ================================================================= - # Health Check - # ================================================================= - - async def check_shield_health(self) -> bool: - """Check if Privacy Shield is running.""" - try: - response = await self.client.get(SHIELD_HEALTH) - return response.status_code == 200 - except Exception as e: - print(f"Shield health check failed: {e}") - return False - - # 
================================================================= - # Stage 1: Shield Proxy Test (Anonymization via proxy) - # ================================================================= - - async def test_shield_proxy(self, user_text: str, system_prompt: str = "") -> PipelineResult: - """Test Shield proxy with PII in user message. - - The Shield should anonymize the request before sending to LLM, - then de-anonymize the response. - """ - start = time.perf_counter() - - messages = [] - if system_prompt: - messages.append({"role": "system", "content": system_prompt}) - messages.append({"role": "user", "content": user_text}) - - try: - response = await self.client.post( - SHIELD_URL, - json={ - "model": "Qwen/Qwen2.5-32B-Instruct-AWQ", - "messages": messages, - "temperature": 0.7, - "max_tokens": 256 - }, - timeout=TIMEOUT - ) - response.raise_for_status() - data = response.json() - - latency_ms = (time.perf_counter() - start) * 1000 - - content = data["choices"][0]["message"]["content"] - self.log(f"Shield proxy response: {content[:100]}...") - - # Check if response contains de-anonymized content - # If user mentioned "John Smith", response should too (not ) - has_placeholders = " PipelineResult: - """Test direct LLM without Shield for comparison.""" - start = time.perf_counter() - - messages = [] - if system_prompt: - messages.append({"role": "system", "content": system_prompt}) - messages.append({"role": "user", "content": user_text}) - - try: - response = await self.client.post( - DIRECT_LLM_URL, - json={ - "model": "Qwen/Qwen2.5-32B-Instruct-AWQ", - "messages": messages, - "temperature": 0.7, - "max_tokens": 256 - }, - timeout=TIMEOUT - ) - response.raise_for_status() - data = response.json() - - latency_ms = (time.perf_counter() - start) * 1000 - - content = data["choices"][0]["message"]["content"] - self.log(f"Direct LLM response: {content[:100]}...") - - return PipelineResult( - stage="direct_llm", - success=True, - latency_ms=latency_ms, - 
data={"response": content, "raw": data} - ) - - except Exception as e: - latency_ms = (time.perf_counter() - start) * 1000 - return PipelineResult( - stage="direct_llm", - success=False, - latency_ms=latency_ms, - error=str(e) - ) - - # ================================================================= - # Stage 3: Full Pipeline Integration Test - # ================================================================= - - async def test_full_pipeline(self, user_query: str, scenario: str) -> Dict[str, Any]: - """Test complete voice → shield → API pipeline. - - Simulates: Voice input → STT → LLM(via Shield) → TTS - """ - print(f"\n{'='*60}") - print(f"Scenario: {scenario}") - print(f"Query: \"{user_query}\"") - print(f"{'='*60}") - - results = [] - - # Step 1: Test through Shield Proxy - print("\n1. Testing Shield Proxy (with anonymization)...") - system_prompt = "You are a helpful assistant. Keep responses brief." - - shield_result = await self.test_shield_proxy(user_query, system_prompt) - results.append(shield_result) - - if not shield_result.success: - print(f" ❌ FAILED: {shield_result.error}") - return {"success": False, "stage": "shield_proxy", "results": results} - - print(f" ✅ Latency: {shield_result.latency_ms:.1f}ms") - print(f" 📝 Response: {shield_result.data['response'][:80]}...") - - if shield_result.data.get('has_placeholders'): - print(f" ⚠️ Warning: Response contains unresolved placeholders") - - # Step 2: Compare with Direct LLM - print("\n2. 
Comparing with Direct LLM (no shield)...") - direct_result = await self.test_direct_llm(user_query, system_prompt) - results.append(direct_result) - - if direct_result.success: - overhead_ms = shield_result.latency_ms - direct_result.latency_ms - print(f" ✅ Latency: {direct_result.latency_ms:.1f}ms") - print(f" 📊 Shield Overhead: {overhead_ms:+.1f}ms") - else: - print(f" ⚠️ Direct LLM failed (non-critical): {direct_result.error}") - - # Summary for this test - total_latency = shield_result.latency_ms - print(f"\n📊 Total Pipeline Latency: {total_latency:.1f}ms") - - return { - "success": True, - "results": results, - "total_latency_ms": total_latency, - "shield_overhead_ms": overhead_ms if direct_result.success else None - } - - # ================================================================= - # Test Scenarios - # ================================================================= - - async def run_all_tests(self) -> bool: - """Run all M4 integration tests.""" - print("\n" + "="*60) - print("M4 Voice-Shield Integration Test Suite") - print("="*60) - print(f"Shield Proxy: {SHIELD_URL}") - print(f"Direct LLM: {DIRECT_LLM_URL}") - - # Pre-flight health check - print("\n🔍 Pre-flight Health Check...") - if await self.check_shield_health(): - print(" ✅ Privacy Shield is healthy") - else: - print(" ❌ Privacy Shield is not responding") - return False - - # Test scenarios - test_cases = [ - { - "scenario": "Weather Query with PII", - "query": "What's the weather like in Austin? I'm John Smith." - }, - { - "scenario": "Contact Request with Phone", - "query": "Call Mary at 555-1234 about the meeting." - }, - { - "scenario": "Email Reference", - "query": "Send an email to david@example.com regarding the project." - }, - { - "scenario": "Address Mention", - "query": "Schedule a meeting at 123 Main Street, Boston." - }, - { - "scenario": "No PII (Baseline)", - "query": "What is the capital of France?" 
- } - ] - - all_passed = True - total_tests = 0 - passed_tests = 0 - latencies = [] - overheads = [] - - for test_case in test_cases: - result = await self.test_full_pipeline( - test_case["query"], - test_case["scenario"] - ) - total_tests += 1 - - if result["success"]: - passed_tests += 1 - latencies.append(result["total_latency_ms"]) - if result.get("shield_overhead_ms") is not None: - overheads.append(result["shield_overhead_ms"]) - print(f"\n✅ TEST PASSED") - else: - all_passed = False - print(f"\n❌ TEST FAILED at stage: {result['stage']}") - - # Summary - print("\n" + "="*60) - print("TEST SUMMARY") - print("="*60) - print(f"Passed: {passed_tests}/{total_tests}") - print(f"Failed: {total_tests - passed_tests}/{total_tests}") - - if latencies: - avg_latency = sum(latencies) / len(latencies) - p95_latency = sorted(latencies)[int(len(latencies) * 0.95)] - print(f"\n📊 Latency Statistics:") - print(f" Mean: {avg_latency:.1f}ms") - print(f" P95: {p95_latency:.1f}ms") - - # M4 Spec compliance - print(f"\n✅ M4 Spec Compliance:") - print(f" Target P95 < 2250ms: {'PASS' if p95_latency < 2250 else 'FAIL'}") - - if overheads: - avg_overhead = sum(overheads) / len(overheads) - print(f"\n📊 Shield Overhead:") - print(f" Mean: {avg_overhead:+.1f}ms") - print(f" Target < 50ms: {'PASS' if avg_overhead < 50 else 'FAIL'}") - - if all_passed: - print("\n🎉 All M4 integration tests PASSED!") - print("Voice → Shield → LLM pipeline is working correctly.") - else: - print("\n⚠️ Some tests failed. 
Review errors above.") - - return all_passed - - # ================================================================= - # Latency Benchmark - # ================================================================= - - async def run_latency_benchmark(self, iterations: int = 50): - """Run latency benchmark comparing Shield vs Direct.""" - print("\n" + "="*60) - print(f"M4 Shield Latency Benchmark ({iterations} iterations)") - print("="*60) - - test_query = "What's the weather in Austin? I'm John Smith." - system_prompt = "You are a helpful assistant. Keep responses brief." - - # Warmup - print("Warming up...") - for _ in range(3): - await self.test_shield_proxy(test_query, system_prompt) - await self.test_direct_llm(test_query, system_prompt) - - # Benchmark Shield - print(f"\nRunning {iterations} Shield proxy requests...") - shield_latencies = [] - - for i in range(iterations): - result = await self.test_shield_proxy(test_query, system_prompt) - if result.success: - shield_latencies.append(result.latency_ms) - - if (i + 1) % 10 == 0: - print(f" Progress: {i + 1}/{iterations}") - - # Benchmark Direct - print(f"\nRunning {iterations} Direct LLM requests...") - direct_latencies = [] - - for i in range(iterations): - result = await self.test_direct_llm(test_query, system_prompt) - if result.success: - direct_latencies.append(result.latency_ms) - - if (i + 1) % 10 == 0: - print(f" Progress: {i + 1}/{iterations}") - - # Stats - def calc_stats(latencies): - if not latencies: - return {} - latencies.sort() - return { - "mean": sum(latencies) / len(latencies), - "p50": latencies[len(latencies) // 2], - "p95": latencies[int(len(latencies) * 0.95)], - "p99": latencies[int(len(latencies) * 0.99)], - "min": min(latencies), - "max": max(latencies) - } - - shield_stats = calc_stats(shield_latencies) - direct_stats = calc_stats(direct_latencies) - - print(f"\n📊 Shield Proxy Results:") - if shield_stats: - print(f" Mean: {shield_stats['mean']:.2f}ms") - print(f" P50: 
{shield_stats['p50']:.2f}ms") - print(f" P95: {shield_stats['p95']:.2f}ms") - print(f" P99: {shield_stats['p99']:.2f}ms") - - print(f"\n📊 Direct LLM Results:") - if direct_stats: - print(f" Mean: {direct_stats['mean']:.2f}ms") - print(f" P50: {direct_stats['p50']:.2f}ms") - print(f" P95: {direct_stats['p95']:.2f}ms") - print(f" P99: {direct_stats['p99']:.2f}ms") - - if shield_stats and direct_stats: - overhead_mean = shield_stats['mean'] - direct_stats['mean'] - overhead_p95 = shield_stats['p95'] - direct_stats['p95'] - - print(f"\n📊 Shield Overhead:") - print(f" Mean: {overhead_mean:+.2f}ms") - print(f" P95: {overhead_p95:+.2f}ms") - - print(f"\n✅ M4 Spec Compliance:") - print(f" Target Shield P95 < 50ms overhead: {'PASS' if overhead_p95 < 50 else 'FAIL'}") - - -async def main(): - parser = argparse.ArgumentParser(description="M4 Voice-Shield Integration Tests") - parser.add_argument("--stress", action="store_true", help="Run latency benchmark") - parser.add_argument("--verbose", "-v", action="store_true", help="Verbose output") - parser.add_argument("--iterations", "-n", type=int, default=50, help="Benchmark iterations") - args = parser.parse_args() - - async with M4IntegrationTest(verbose=args.verbose) as tester: - if args.stress: - await tester.run_latency_benchmark(iterations=args.iterations) - else: - success = await tester.run_all_tests() - sys.exit(0 if success else 1) - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/dream-server/tests/validate-agent-templates.py b/dream-server/tests/validate-agent-templates.py old mode 100755 new mode 100644 index 1069edb88..e2b85451b --- a/dream-server/tests/validate-agent-templates.py +++ b/dream-server/tests/validate-agent-templates.py @@ -1,7 +1,7 @@ #!/usr/bin/env python3 """ M7 Agent Template Validation -Tests that agent templates work reliably on local Qwen2.5-32B. +Tests that agent templates work reliably on local qwen2.5-32b-instruct via llama-server. 
""" import requests @@ -10,8 +10,8 @@ import sys from pathlib import Path -VLLM_URL = "http://localhost:8000" -MODEL = "Qwen/Qwen2.5-32B-Instruct-AWQ" +LLAMA_SERVER_URL = "http://localhost:8080" +MODEL = "qwen2.5-32b-instruct" TEMPLATES = { "code-assistant": { @@ -77,7 +77,7 @@ def test_template(name: str, config: dict) -> dict: try: start = time.time() response = requests.post( - f"{VLLM_URL}/v1/chat/completions", + f"{LLAMA_SERVER_URL}/v1/chat/completions", json=payload, timeout=30 ) @@ -127,7 +127,7 @@ def test_template(name: str, config: dict) -> dict: def main(): print("=" * 60) print("M7 Agent Template Validation") - print("Testing on Qwen2.5-32B-Instruct-AWQ") + print("Testing on qwen2.5-32b-instruct") print("=" * 60) all_results = [] diff --git a/dream-server/tests/voice-stress-test.py b/dream-server/tests/voice-stress-test.py deleted file mode 100644 index 31896331b..000000000 --- a/dream-server/tests/voice-stress-test.py +++ /dev/null @@ -1,277 +0,0 @@ -#!/usr/bin/env python3 -""" -Voice Pipeline Stress Test -Tests concurrent voice round-trips: LiveKit โ†’ Whisper โ†’ vLLM โ†’ Kokoro - -Usage: python voice-stress-test.py --concurrent 10 -""" - -import asyncio -import aiohttp -import time -import argparse -import statistics -from dataclasses import dataclass -from typing import List -import json - -# Service endpoints -WHISPER_URL = "http://localhost:9000/v1/audio/transcriptions" -VLLM_URL = "http://localhost:8000/v1/chat/completions" -KOKORO_URL = "http://localhost:8880/v1/audio/speech" - -# Test audio - 1 second of silence as WAV (for STT timing without real audio) -# In real test, we'd use actual speech samples -TEST_PROMPT = "Hello, how are you today?" 
- - -@dataclass -class RoundTripResult: - """Results from one voice round-trip""" - session_id: int - stt_ms: float - llm_ms: float - tts_ms: float - total_ms: float - success: bool - error: str = "" - - -async def test_stt(session: aiohttp.ClientSession, session_id: int) -> tuple[float, str]: - """Test STT endpoint - simulate transcription request""" - start = time.perf_counter() - try: - # For stress testing, we'll simulate with a health check - # Real test would send actual audio - async with session.get("http://localhost:9000/health", timeout=30) as resp: - elapsed = (time.perf_counter() - start) * 1000 - if resp.status == 200: - # Simulate STT processing time based on health - return elapsed, TEST_PROMPT - return elapsed, "" - except Exception as e: - return (time.perf_counter() - start) * 1000, f"STT Error: {e}" - - -async def test_llm(session: aiohttp.ClientSession, session_id: int, text: str) -> tuple[float, str]: - """Test LLM endpoint""" - start = time.perf_counter() - try: - payload = { - "model": "Qwen/Qwen2.5-32B-Instruct-AWQ", - "messages": [ - {"role": "system", "content": "You are a helpful voice assistant. 
Keep responses under 50 words."}, - {"role": "user", "content": text} - ], - "max_tokens": 100, - "temperature": 0.7 - } - async with session.post(VLLM_URL, json=payload, timeout=60) as resp: - elapsed = (time.perf_counter() - start) * 1000 - if resp.status == 200: - data = await resp.json() - response_text = data["choices"][0]["message"]["content"] - return elapsed, response_text - return elapsed, f"LLM Error: {resp.status}" - except Exception as e: - return (time.perf_counter() - start) * 1000, f"LLM Error: {e}" - - -async def test_tts(session: aiohttp.ClientSession, session_id: int, text: str) -> tuple[float, bool]: - """Test TTS endpoint""" - start = time.perf_counter() - try: - payload = { - "model": "kokoro", - "input": text[:200], # Limit text length - "voice": "af_heart", - "response_format": "mp3" - } - async with session.post(KOKORO_URL, json=payload, timeout=120) as resp: - elapsed = (time.perf_counter() - start) * 1000 - if resp.status == 200: - # Read the audio to ensure full synthesis - audio_data = await resp.read() - return elapsed, len(audio_data) > 0 - return elapsed, False - except Exception as e: - return (time.perf_counter() - start) * 1000, False - - -async def run_voice_roundtrip(session: aiohttp.ClientSession, session_id: int) -> RoundTripResult: - """Run a full voice round-trip""" - total_start = time.perf_counter() - - # STT - stt_ms, transcription = await test_stt(session, session_id) - if not transcription or transcription.startswith("STT Error"): - return RoundTripResult( - session_id=session_id, - stt_ms=stt_ms, llm_ms=0, tts_ms=0, - total_ms=(time.perf_counter() - total_start) * 1000, - success=False, error=str(transcription) - ) - - # LLM - llm_ms, response = await test_llm(session, session_id, transcription) - if response.startswith("LLM Error"): - return RoundTripResult( - session_id=session_id, - stt_ms=stt_ms, llm_ms=llm_ms, tts_ms=0, - total_ms=(time.perf_counter() - total_start) * 1000, - success=False, error=response - ) - - # 
TTS - tts_ms, tts_ok = await test_tts(session, session_id, response) - - return RoundTripResult( - session_id=session_id, - stt_ms=stt_ms, - llm_ms=llm_ms, - tts_ms=tts_ms, - total_ms=(time.perf_counter() - total_start) * 1000, - success=tts_ok - ) - - -async def run_concurrent_test(concurrent: int, rounds: int = 3) -> List[RoundTripResult]: - """Run concurrent voice round-trips""" - all_results = [] - - connector = aiohttp.TCPConnector(limit=concurrent * 2) - async with aiohttp.ClientSession(connector=connector) as session: - for round_num in range(rounds): - print(f"\n{'='*60}") - print(f"Round {round_num + 1}/{rounds} - {concurrent} concurrent sessions") - print('='*60) - - tasks = [ - run_voice_roundtrip(session, i) - for i in range(concurrent) - ] - - start = time.perf_counter() - results = await asyncio.gather(*tasks) - wall_time = (time.perf_counter() - start) * 1000 - - all_results.extend(results) - - # Print round results - successes = sum(1 for r in results if r.success) - print(f"Completed: {successes}/{concurrent} successful") - print(f"Wall time: {wall_time:.0f}ms") - - if successes > 0: - successful = [r for r in results if r.success] - print(f"STT avg: {statistics.mean(r.stt_ms for r in successful):.0f}ms") - print(f"LLM avg: {statistics.mean(r.llm_ms for r in successful):.0f}ms") - print(f"TTS avg: {statistics.mean(r.tts_ms for r in successful):.0f}ms") - print(f"Total avg: {statistics.mean(r.total_ms for r in successful):.0f}ms") - - # Brief pause between rounds - if round_num < rounds - 1: - await asyncio.sleep(1) - - return all_results - - -def print_summary(results: List[RoundTripResult], concurrent: int): - """Print final summary""" - print("\n" + "="*60) - print("STRESS TEST SUMMARY") - print("="*60) - - successful = [r for r in results if r.success] - failed = [r for r in results if not r.success] - - print(f"\nConcurrency level: {concurrent}") - print(f"Total attempts: {len(results)}") - print(f"Successful: {len(successful)} 
({100*len(successful)/len(results):.1f}%)") - print(f"Failed: {len(failed)}") - - if successful: - print(f"\n{'Stage':<12} {'Min':>8} {'Avg':>8} {'Max':>8} {'P95':>8}") - print("-" * 48) - - for stage, getter in [ - ("STT", lambda r: r.stt_ms), - ("LLM", lambda r: r.llm_ms), - ("TTS", lambda r: r.tts_ms), - ("Total", lambda r: r.total_ms) - ]: - values = [getter(r) for r in successful] - values.sort() - p95_idx = int(len(values) * 0.95) - print(f"{stage:<12} {min(values):>7.0f}ms {statistics.mean(values):>7.0f}ms " - f"{max(values):>7.0f}ms {values[p95_idx] if p95_idx < len(values) else values[-1]:>7.0f}ms") - - # Throughput - total_time_s = sum(r.total_ms for r in successful) / 1000 - print(f"\nEffective throughput: {len(successful) / (total_time_s / concurrent):.1f} round-trips/sec") - - # Bottleneck analysis - avg_stt = statistics.mean(r.stt_ms for r in successful) - avg_llm = statistics.mean(r.llm_ms for r in successful) - avg_tts = statistics.mean(r.tts_ms for r in successful) - - bottleneck = max([("STT", avg_stt), ("LLM", avg_llm), ("TTS", avg_tts)], key=lambda x: x[1]) - print(f"\n🎯 Bottleneck: {bottleneck[0]} ({bottleneck[1]:.0f}ms avg)") - - # Scaling estimate - if avg_tts > avg_llm * 2: - print("⚠️ TTS is >2x slower than LLM - TTS scaling limits concurrency") - - if failed: - print(f"\nFailure samples:") - for r in failed[:3]: - print(f" Session {r.session_id}: {r.error}") - - -async def check_services(): - """Verify all services are up before testing""" - print("Checking services...") - - services = [ - ("Whisper STT", "http://localhost:9000/health"), - ("vLLM", "http://localhost:8000/health"), - ("Kokoro TTS", "http://localhost:8880/health"), - ] - - async with aiohttp.ClientSession() as session: - for name, url in services: - try: - async with session.get(url, timeout=5) as resp: - status = "✅" if resp.status == 200 else f"⚠️ {resp.status}" - print(f" {name}: {status}") - except Exception as e: - print(f" {name}: ❌ {e}") - return False 
- return True - - -async def main(): - parser = argparse.ArgumentParser(description="Voice Pipeline Stress Test") - parser.add_argument("--concurrent", "-c", type=int, default=5, - help="Number of concurrent sessions (default: 5)") - parser.add_argument("--rounds", "-r", type=int, default=3, - help="Number of test rounds (default: 3)") - parser.add_argument("--skip-check", action="store_true", - help="Skip service health check") - args = parser.parse_args() - - print("🎙️ Voice Pipeline Stress Test") - print(f"Testing {args.concurrent} concurrent sessions × {args.rounds} rounds") - print() - - if not args.skip_check: - if not await check_services(): - print("\n❌ Some services are down. Fix before testing.") - return - - results = await run_concurrent_test(args.concurrent, args.rounds) - print_summary(results, args.concurrent) - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/dream-server/token-spy-schema/001_init.sql b/dream-server/token-spy-schema/001_init.sql deleted file mode 100644 index d32b835b0..000000000 --- a/dream-server/token-spy-schema/001_init.sql +++ /dev/null @@ -1,205 +0,0 @@ --- Token Spy Database Schema --- PostgreSQL + TimescaleDB Initialization --- Run automatically on container first start - --- Enable TimescaleDB extension -CREATE EXTENSION IF NOT EXISTS timescaledb; - --- ============================================ --- Core Tables --- ============================================ - --- API requests log (main time-series data) -CREATE TABLE IF NOT EXISTS api_requests ( - id BIGSERIAL, - timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(), - session_id TEXT, - request_id TEXT UNIQUE, - - -- Request metadata - provider TEXT NOT NULL, -- 'anthropic', 'openai', 'google', 'local' - model TEXT NOT NULL, - api_key_prefix TEXT, -- First 8 chars for grouping - - -- Token counts - prompt_tokens INTEGER DEFAULT 0, - completion_tokens INTEGER DEFAULT 0, - total_tokens INTEGER DEFAULT 0, - - -- Cost (in USD, calculated at request time) - 
prompt_cost DECIMAL(12, 8) DEFAULT 0, - completion_cost DECIMAL(12, 8) DEFAULT 0, - total_cost DECIMAL(12, 8) DEFAULT 0, - - -- Performance metrics - latency_ms INTEGER, -- Total request latency - time_to_first_token_ms INTEGER, -- For streaming - - -- Response metadata - status_code INTEGER DEFAULT 200, - finish_reason TEXT, -- 'stop', 'length', 'error', etc. - - -- System prompt info (for decomposition analysis) - system_prompt_hash TEXT, -- Hash of system prompt - system_prompt_length INTEGER, - - -- Tenant attribution (Phase 4 multi-tenancy) - tenant_id TEXT, -- From X-OpenClaw-Tenant-ID header - - -- Cache tokens (optional, for LLM cache tracking) - cache_read_tokens INTEGER DEFAULT 0, - cache_write_tokens INTEGER DEFAULT 0, - - -- Request metadata (optional, for debugging) - request_body_bytes INTEGER, - tool_count INTEGER, - - -- Message analysis (optional, stored in metadata or separate table) - message_count INTEGER, - user_message_count INTEGER, - assistant_message_count INTEGER, - conversation_history_chars INTEGER, - base_prompt_length INTEGER, - - -- Raw request/response (optional, for debugging) - -- request_body JSONB, - -- response_body JSONB, - - PRIMARY KEY (id, timestamp) -); - --- Convert to hypertable for time-series optimization -SELECT create_hypertable('api_requests', 'timestamp', - chunk_time_interval => INTERVAL '1 day', - if_not_exists => TRUE -); - --- Create indexes for common query patterns -CREATE INDEX IF NOT EXISTS idx_api_requests_session ON api_requests (session_id, timestamp DESC); -CREATE INDEX IF NOT EXISTS idx_api_requests_provider ON api_requests (provider, timestamp DESC); -CREATE INDEX IF NOT EXISTS idx_api_requests_model ON api_requests (model, timestamp DESC); -CREATE INDEX IF NOT EXISTS idx_api_requests_api_key ON api_requests (api_key_prefix, timestamp DESC); -CREATE INDEX IF NOT EXISTS idx_api_requests_tenant ON api_requests (tenant_id, timestamp DESC); - --- ============================================ --- Session 
tracking --- ============================================ - -CREATE TABLE IF NOT EXISTS sessions ( - session_id TEXT PRIMARY KEY, - tenant_id TEXT NOT NULL DEFAULT 'default', - started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - ended_at TIMESTAMPTZ, - agent_name TEXT, - total_requests INTEGER DEFAULT 0, - total_tokens INTEGER DEFAULT 0, - total_cost DECIMAL(12, 8) DEFAULT 0, - health_score DECIMAL(3, 2), -- 0.00 to 1.00 - metadata JSONB -); - -CREATE INDEX IF NOT EXISTS idx_sessions_tenant ON sessions (tenant_id, started_at DESC); - -CREATE INDEX IF NOT EXISTS idx_sessions_started ON sessions (started_at DESC); -CREATE INDEX IF NOT EXISTS idx_sessions_agent ON sessions (agent_name, started_at DESC); - --- ============================================ --- Agents registry --- ============================================ - -CREATE TABLE IF NOT EXISTS agents ( - agent_id TEXT PRIMARY KEY, - agent_name TEXT NOT NULL, - first_seen TIMESTAMPTZ NOT NULL DEFAULT NOW(), - last_seen TIMESTAMPTZ NOT NULL DEFAULT NOW(), - total_requests INTEGER DEFAULT 0, - total_tokens INTEGER DEFAULT 0, - total_cost DECIMAL(12, 8) DEFAULT 0, - api_key_prefix TEXT, - metadata JSONB -); - -CREATE INDEX IF NOT EXISTS idx_agents_last_seen ON agents (last_seen DESC); - --- ============================================ --- System prompt analysis (for decomposition insights) --- ============================================ - -CREATE TABLE IF NOT EXISTS system_prompts ( - prompt_hash TEXT PRIMARY KEY, - prompt_text TEXT NOT NULL, -- Truncated if too long - token_count INTEGER, - first_seen TIMESTAMPTZ NOT NULL DEFAULT NOW(), - usage_count INTEGER DEFAULT 1 -); - --- ============================================ --- Alerts configuration --- ============================================ - -CREATE TABLE IF NOT EXISTS alert_rules ( - rule_id SERIAL PRIMARY KEY, - name TEXT NOT NULL, - rule_type TEXT NOT NULL, -- 'cost', 'token', 'latency', 'error_rate' - threshold DECIMAL(12, 4) NOT NULL, - window_minutes 
INTEGER DEFAULT 60, - enabled BOOLEAN DEFAULT TRUE, - created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - metadata JSONB -); - -CREATE TABLE IF NOT EXISTS alerts ( - alert_id BIGSERIAL PRIMARY KEY, - rule_id INTEGER REFERENCES alert_rules(rule_id), - triggered_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - acknowledged_at TIMESTAMPTZ, - severity TEXT NOT NULL, -- 'info', 'warning', 'critical' - message TEXT NOT NULL, - value DECIMAL(12, 4), - metadata JSONB -); - -CREATE INDEX IF NOT EXISTS idx_alerts_triggered ON alerts (triggered_at DESC); -CREATE INDEX IF NOT EXISTS idx_alerts_acknowledged ON alerts (acknowledged_at) WHERE acknowledged_at IS NULL; - --- ============================================ --- Continuous aggregates for fast dashboards --- ============================================ - --- Hourly token/cost summary -CREATE MATERIALIZED VIEW IF NOT EXISTS hourly_summary -WITH (timescaledb.continuous) AS -SELECT - time_bucket('1 hour', timestamp) AS bucket, - provider, - model, - COUNT(*) as request_count, - SUM(prompt_tokens) as total_prompt_tokens, - SUM(completion_tokens) as total_completion_tokens, - SUM(total_tokens) as total_tokens, - SUM(total_cost) as total_cost, - AVG(latency_ms) as avg_latency_ms -FROM api_requests -GROUP BY bucket, provider, model -WITH NO DATA; - --- Add policy to refresh continuously -SELECT add_continuous_aggregate_policy('hourly_summary', - start_offset => INTERVAL '1 month', - end_offset => INTERVAL '1 hour', - schedule_interval => INTERVAL '5 minutes', - if_not_exists => TRUE -); - --- ============================================ --- Default data --- ============================================ - --- Insert default alert rules -INSERT INTO alert_rules (name, rule_type, threshold, window_minutes) -VALUES - ('High Hourly Cost', 'cost', 10.00, 60), -- $10/hour - ('High Token Usage', 'token', 1000000, 60), -- 1M tokens/hour - ('High Error Rate', 'error_rate', 0.10, 15), -- 10% errors in 15 min - ('High Latency', 'latency', 10000, 15) -- 
10s avg latency in 15 min -ON CONFLICT DO NOTHING; diff --git a/dream-server/token-spy-schema/002_provider_keys.sql b/dream-server/token-spy-schema/002_provider_keys.sql deleted file mode 100644 index 4e78e4445..000000000 --- a/dream-server/token-spy-schema/002_provider_keys.sql +++ /dev/null @@ -1,205 +0,0 @@ --- Token Spy Database Schema Migration 002: Provider Keys & API Key Management --- Adds tables for multi-tenancy and API key management (Phase 4f) - --- ============================================ --- API Keys table (for tenant authentication) --- ============================================ - -CREATE TABLE IF NOT EXISTS api_keys ( - key_id TEXT PRIMARY KEY, -- tp_live_xxx or tp_test_xxx format - key_hash TEXT UNIQUE NOT NULL, -- SHA-256 hash for lookup - key_prefix TEXT NOT NULL, -- First 8 chars for display - - tenant_id TEXT NOT NULL, - name TEXT NOT NULL, - - -- Key type - environment TEXT NOT NULL DEFAULT 'live', -- 'live' or 'test' - - -- Status - is_active BOOLEAN DEFAULT TRUE, - revoked_at TIMESTAMPTZ, - revoked_reason TEXT, - expires_at TIMESTAMPTZ, - - -- Rate limiting - rate_limit_rpm INTEGER DEFAULT 60, -- Requests per minute - rate_limit_rpd INTEGER DEFAULT 10000, -- Requests per day - - -- Budget - monthly_token_limit INTEGER, -- Null = unlimited - tokens_used_this_month INTEGER DEFAULT 0, - monthly_cost_limit DECIMAL(12, 4), -- In USD - cost_used_this_month DECIMAL(12, 4) DEFAULT 0, - - -- Tracking - created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - last_used_at TIMESTAMPTZ, - use_count INTEGER DEFAULT 0, - - -- Allowed providers (JSON array: ['anthropic', 'openai', 'vllm']) - allowed_providers JSONB DEFAULT '["*"]', - - -- Metadata - metadata JSONB -); - -CREATE INDEX IF NOT EXISTS idx_api_keys_hash ON api_keys (key_hash); -CREATE INDEX IF NOT EXISTS idx_api_keys_tenant ON api_keys (tenant_id, is_active); -CREATE INDEX IF NOT EXISTS idx_api_keys_active ON api_keys (is_active, expires_at) WHERE 
is_active = TRUE; - --- ============================================ --- Provider Keys table (encrypted upstream API keys) --- ============================================ - -CREATE TABLE IF NOT EXISTS provider_keys ( - id SERIAL PRIMARY KEY, - - tenant_id TEXT NOT NULL, - provider TEXT NOT NULL, -- 'anthropic', 'openai', 'google', 'vllm' - - name TEXT NOT NULL, -- Human-readable name - - -- Encrypted key storage - key_prefix TEXT NOT NULL, -- First 8 chars for display - encrypted_key TEXT NOT NULL, -- AES-256 encrypted - iv TEXT NOT NULL, -- Initialization vector - - -- Status - is_active BOOLEAN DEFAULT TRUE, - is_default BOOLEAN DEFAULT FALSE, -- Use this key if multiple exist - - -- Rotation tracking - created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - expires_at TIMESTAMPTZ, - last_used_at TIMESTAMPTZ, - use_count INTEGER DEFAULT 0, - - -- Metadata - metadata JSONB, - - -- Ensure only one default per tenant/provider - CONSTRAINT unique_default_per_tenant_provider - UNIQUE (tenant_id, provider, is_default) - DEFERRABLE INITIALLY DEFERRED -); - -CREATE INDEX IF NOT EXISTS idx_provider_keys_tenant ON provider_keys (tenant_id, provider, is_active); -CREATE INDEX IF NOT EXISTS idx_provider_keys_active ON provider_keys (tenant_id, provider, is_active) WHERE is_active = TRUE; - --- Trigger to ensure only one default key per tenant/provider -CREATE OR REPLACE FUNCTION enforce_single_default_provider_key() -RETURNS TRIGGER AS $$ -BEGIN - IF NEW.is_default = TRUE THEN - UPDATE provider_keys - SET is_default = FALSE - WHERE tenant_id = NEW.tenant_id - AND provider = NEW.provider - AND is_default = TRUE - AND id != NEW.id; - END IF; - RETURN NEW; -END; -$$ LANGUAGE plpgsql; - -DROP TRIGGER IF EXISTS trigger_single_default_provider_key ON provider_keys; -CREATE TRIGGER trigger_single_default_provider_key - AFTER INSERT OR UPDATE ON provider_keys - FOR EACH ROW - EXECUTE FUNCTION enforce_single_default_provider_key(); - --- 
============================================ --- Tenants table (for multi-tenancy) --- ============================================ - -CREATE TABLE IF NOT EXISTS tenants ( - tenant_id TEXT PRIMARY KEY, - name TEXT NOT NULL, - - -- Status - is_active BOOLEAN DEFAULT TRUE, - created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - - -- Quotas - max_api_keys INTEGER DEFAULT 10, - max_monthly_tokens INTEGER, -- Across all keys - max_monthly_cost DECIMAL(12, 4), -- Across all keys - - -- Contact - contact_email TEXT, - notification_webhook_url TEXT, - - -- Metadata - metadata JSONB -); - --- ============================================ --- Budget usage tracking (monthly rollup) --- ============================================ - -CREATE TABLE IF NOT EXISTS monthly_usage ( - id BIGSERIAL, - year_month TEXT NOT NULL, -- '2024-02' - tenant_id TEXT NOT NULL, - api_key_id TEXT, - - -- Usage totals - request_count INTEGER DEFAULT 0, - total_tokens INTEGER DEFAULT 0, - prompt_tokens INTEGER DEFAULT 0, - completion_tokens INTEGER DEFAULT 0, - total_cost DECIMAL(12, 8) DEFAULT 0, - - -- Updated at - updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), - - PRIMARY KEY (year_month, tenant_id, api_key_id) -); - -CREATE INDEX IF NOT EXISTS idx_monthly_usage_tenant ON monthly_usage (tenant_id, year_month); - --- ============================================ --- Update timestamps trigger --- ============================================ - -CREATE OR REPLACE FUNCTION update_updated_at_column() -RETURNS TRIGGER AS $$ -BEGIN - NEW.updated_at = NOW(); - RETURN NEW; -END; -$$ LANGUAGE plpgsql; - --- Apply to all tables with updated_at -DROP TRIGGER IF EXISTS trigger_api_keys_updated_at ON api_keys; -CREATE TRIGGER trigger_api_keys_updated_at - BEFORE UPDATE ON api_keys - FOR EACH ROW - EXECUTE FUNCTION update_updated_at_column(); - -DROP TRIGGER IF EXISTS trigger_provider_keys_updated_at ON provider_keys; -CREATE TRIGGER trigger_provider_keys_updated_at - 
BEFORE UPDATE ON provider_keys - FOR EACH ROW - EXECUTE FUNCTION update_updated_at_column(); - -DROP TRIGGER IF EXISTS trigger_tenants_updated_at ON tenants; -CREATE TRIGGER trigger_tenants_updated_at - BEFORE UPDATE ON tenants - FOR EACH ROW - EXECUTE FUNCTION update_updated_at_column(); - --- ============================================ --- Default data --- ============================================ - --- Insert default tenant (required for single-tenant mode) -INSERT INTO tenants (tenant_id, name, contact_email) -VALUES ('default', 'Default Tenant', 'admin@localhost') -ON CONFLICT (tenant_id) DO NOTHING; - --- NOTE: Development API keys are in dev-seed.sql (not applied in production) diff --git a/dream-server/token-spy-schema/003_tenant_multitenancy.sql b/dream-server/token-spy-schema/003_tenant_multitenancy.sql deleted file mode 100644 index c952394a8..000000000 --- a/dream-server/token-spy-schema/003_tenant_multitenancy.sql +++ /dev/null @@ -1,126 +0,0 @@ --- Token Spy Database Schema Migration 003: Multi-tenancy Enhancements --- Adds plan_tier and max_provider_keys to tenants table for Phase 4a - --- ============================================ --- Add plan_tier column to tenants --- ============================================ - --- Add plan_tier enum type -DO $$ BEGIN - CREATE TYPE plan_tier_enum AS ENUM ('free', 'starter', 'pro', 'enterprise'); -EXCEPTION - WHEN duplicate_object THEN null; -END $$; - --- Add plan_tier column if not exists -ALTER TABLE tenants -ADD COLUMN IF NOT EXISTS plan_tier TEXT DEFAULT 'free'; - --- Add max_provider_keys column if not exists -ALTER TABLE tenants -ADD COLUMN IF NOT EXISTS max_provider_keys INTEGER DEFAULT 3; - --- ============================================ --- Add tenant_id to tables that need isolation --- ============================================ - --- Add tenant_id to sessions table -ALTER TABLE sessions -ADD COLUMN IF NOT EXISTS tenant_id TEXT; - -CREATE INDEX IF NOT EXISTS idx_sessions_tenant ON sessions 
(tenant_id, started_at DESC); - --- Add tenant_id to agents table -ALTER TABLE agents -ADD COLUMN IF NOT EXISTS tenant_id TEXT; - -CREATE INDEX IF NOT EXISTS idx_agents_tenant ON agents (tenant_id, last_seen DESC); - --- Add tenant_id to alert_rules table -ALTER TABLE alert_rules -ADD COLUMN IF NOT EXISTS tenant_id TEXT; - -CREATE INDEX IF NOT EXISTS idx_alert_rules_tenant ON alert_rules (tenant_id, enabled); - --- Add tenant_id to alerts table -ALTER TABLE alerts -ADD COLUMN IF NOT EXISTS tenant_id TEXT; - -CREATE INDEX IF NOT EXISTS idx_alerts_tenant ON alerts (tenant_id, triggered_at DESC); - --- Add tenant_id to system_prompts table -ALTER TABLE system_prompts -ADD COLUMN IF NOT EXISTS tenant_id TEXT; - -CREATE INDEX IF NOT EXISTS idx_system_prompts_tenant ON system_prompts (tenant_id); - --- ============================================ --- Update monthly_usage for tenant isolation --- ============================================ - --- Already has tenant_id, just ensure index exists -CREATE INDEX IF NOT EXISTS idx_monthly_usage_tenant_month ON monthly_usage (tenant_id, year_month DESC); - --- ============================================ --- Foreign key constraints (optional, add if referential integrity needed) --- ============================================ - --- Note: Not adding FK constraints here to avoid blocking on tenant creation --- If needed, add them separately: --- ALTER TABLE api_keys ADD CONSTRAINT fk_api_keys_tenant --- FOREIGN KEY (tenant_id) REFERENCES tenants(tenant_id); --- ALTER TABLE provider_keys ADD CONSTRAINT fk_provider_keys_tenant --- FOREIGN KEY (tenant_id) REFERENCES tenants(tenant_id); - --- ============================================ --- Update existing tenants with default tier --- ============================================ - -UPDATE tenants -SET plan_tier = 'free', max_provider_keys = 3 -WHERE plan_tier IS NULL; - --- Update default tenant to enterprise for development -UPDATE tenants -SET plan_tier = 'enterprise', - 
max_api_keys = NULL, -- unlimited - max_provider_keys = NULL, -- unlimited - max_monthly_tokens = NULL, -- unlimited - max_monthly_cost = NULL -- unlimited -WHERE tenant_id = 'default'; - --- ============================================ --- Add tenant-scoped views --- ============================================ - --- View for tenant usage summary (current month) -CREATE OR REPLACE VIEW tenant_monthly_summary AS -SELECT - t.tenant_id, - t.name as tenant_name, - t.plan_tier, - t.max_monthly_tokens, - t.max_monthly_cost, - COALESCE(SUM(mu.total_tokens), 0) as tokens_used, - COALESCE(SUM(mu.total_cost), 0) as cost_used, - COALESCE(SUM(mu.request_count), 0) as request_count, - CASE - WHEN t.max_monthly_tokens IS NULL THEN 1.0 - ELSE COALESCE(SUM(mu.total_tokens), 0)::float / t.max_monthly_tokens - END as token_usage_pct, - CASE - WHEN t.max_monthly_cost IS NULL THEN 1.0 - ELSE COALESCE(SUM(mu.total_cost), 0)::float / t.max_monthly_cost - END as cost_usage_pct -FROM tenants t -LEFT JOIN monthly_usage mu ON t.tenant_id = mu.tenant_id - AND mu.year_month = TO_CHAR(NOW(), 'YYYY-MM') -WHERE t.is_active = TRUE -GROUP BY t.tenant_id, t.name, t.plan_tier, t.max_monthly_tokens, t.max_monthly_cost; - --- ============================================ --- Grant permissions (adjust as needed for your DB user) --- ============================================ - --- GRANT SELECT, INSERT, UPDATE ON tenants TO token_spy; --- GRANT SELECT ON tenant_monthly_summary TO token_spy; diff --git a/dream-server/vllm-tool-proxy/Dockerfile b/dream-server/vllm-tool-proxy/Dockerfile deleted file mode 100644 index 4329f31b3..000000000 --- a/dream-server/vllm-tool-proxy/Dockerfile +++ /dev/null @@ -1,6 +0,0 @@ -FROM python:3.12-slim -WORKDIR /app -RUN pip install --no-cache-dir flask requests -COPY vllm-tool-proxy.py . 
-EXPOSE 8003 -CMD ["python3", "vllm-tool-proxy.py", "--port", "8003"] diff --git a/dream-server/vllm-tool-proxy/vllm-tool-proxy.py b/dream-server/vllm-tool-proxy/vllm-tool-proxy.py deleted file mode 100644 index 2e45cc711..000000000 --- a/dream-server/vllm-tool-proxy/vllm-tool-proxy.py +++ /dev/null @@ -1,427 +0,0 @@ -#!/usr/bin/env python3 -""" -Lighthouse AI — vLLM Tool Call Proxy (v4) - -Bridges OpenClaw with local vLLM instances by handling three incompatibilities: - -1. OpenClaw always requests streaming (stream: true), but tool call extraction - requires seeing the full response. The proxy forces non-streaming when tools - are present, extracts tool calls, then re-wraps the response as SSE. - -2. Some models output tool calls as text (in tags, bare JSON, or - multi-line JSON) instead of OpenAI's structured tool_calls format. The proxy - detects and converts these automatically. - -3. vLLM returns extra fields that OpenClaw doesn't expect. The proxy strips - them for clean OpenAI-compatible responses. - -Safety: Aborts after MAX_TOOL_CALLS to prevent runaway loops. - -Usage: - python3 vllm-tool-proxy.py --port 8003 --vllm-url http://localhost:8000 - -Point your openclaw.json baseUrl to this proxy (e.g., http://localhost:8003/v1), -NOT directly to vLLM. - -Changelog: - v4 — SSE re-wrapping, response cleaning, loop protection, multi-line JSON - v3 — Bare JSON extraction - v2 — tag extraction - v1 — Initial proxy -""" -import argparse -import json -import logging -import os -import re -import uuid -from flask import Flask, request, Response -import requests - -app = Flask(__name__) -logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s: %(message)s') -logger = logging.getLogger(__name__) - -# Configuration via environment variables or CLI args -VLLM_URL = os.environ.get('VLLM_URL', 'http://localhost:8000') - -# Max tool calls per conversation — safety net for infinite loops. -# Counts tool result messages; aborts if exceeded. 
-MAX_TOOL_CALLS = int(os.environ.get('MAX_TOOL_CALLS', '500')) - -TOOLS_REGEX = re.compile(r'(.*?)', re.DOTALL) - - -def has_tools(body): - """Check if the request includes tool definitions.""" - return body and body.get('tools') - - -def count_tool_results(messages): - """Count tool result messages in the conversation history.""" - if not messages: - return 0 - count = 0 - for msg in messages: - role = msg.get('role', '') - if role == 'tool' or msg.get('tool_call_id'): - count += 1 - return count - - -def check_tool_loop(body): - """Check if we've hit the max tool calls limit. - Returns error response dict if limit exceeded, None otherwise.""" - messages = body.get('messages', []) - tool_count = count_tool_results(messages) - - if tool_count >= MAX_TOOL_CALLS: - logger.warning(f'Tool call limit exceeded: {tool_count} >= {MAX_TOOL_CALLS}') - return { - 'id': 'chatcmpl-loop-abort', - 'object': 'chat.completion', - 'created': 0, - 'model': body.get('model', 'unknown'), - 'choices': [{ - 'index': 0, - 'message': { - 'role': 'assistant', - 'content': f'Tool call safety limit reached ({tool_count} calls). ' - f'The conversation may be stuck in a loop. ' - f'Try simplifying your request or starting a new session.' - }, - 'finish_reason': 'stop' - }] - } - return None - - -def parse_single_tool_call(text): - """Try to parse a single tool call from text. 
Returns dict or None.""" - text = text.strip() - if not text: - return None - try: - call = json.loads(text) - if isinstance(call, dict) and 'name' in call: - args = call.get('arguments', {}) - if isinstance(args, dict): - args = json.dumps(args) - return { - 'id': f'chatcmpl-tool-{uuid.uuid4().hex[:16]}', - 'type': 'function', - 'function': {'name': call['name'], 'arguments': args} - } - except (json.JSONDecodeError, ValueError): - pass - return None - - -def clean_response_for_openclaw(resp_json): - """Strip vLLM-specific fields for clean OpenAI-compatible output.""" - try: - for field in ["prompt_logprobs", "prompt_token_ids", "kv_transfer_params", - "service_tier", "system_fingerprint"]: - resp_json.pop(field, None) - - for choice in resp_json.get("choices", []): - for field in ["stop_reason", "token_ids"]: - choice.pop(field, None) - - msg = choice.get("message", {}) - for field in ["reasoning", "reasoning_content", "refusal", - "annotations", "audio", "function_call"]: - msg.pop(field, None) - if not msg.get("tool_calls"): - msg.pop("tool_calls", None) - - usage = resp_json.get("usage", {}) - if usage: - usage.pop("prompt_tokens_details", None) - except Exception as e: - logger.error(f"Error cleaning response: {e}") - - -def extract_tools_from_content(response_json): - """Post-process: if tool_calls is empty but content has tool JSON, extract it.""" - try: - choices = response_json.get('choices', []) - for choice in choices: - msg = choice.get('message', {}) - content = msg.get('content', '') or '' - tool_calls = msg.get('tool_calls') or [] - - if tool_calls or not content.strip(): - continue - - extracted_calls = [] - - # Strategy 1: tag extraction - matches = TOOLS_REGEX.findall(content) - if matches: - for match in matches: - for line in match.strip().split('\n'): - call = parse_single_tool_call(line) - if call: - extracted_calls.append(call) - - # Strategy 2: Bare JSON (entire content is one tool call) - if not extracted_calls: - stripped = 
content.strip() - call = parse_single_tool_call(stripped) - if call: - extracted_calls.append(call) - - # Strategy 3: Multi-line JSON (one tool call per line) - if not extracted_calls: - lines = content.strip().split('\n') - for line in lines: - call = parse_single_tool_call(line) - if call: - extracted_calls.append(call) - - if extracted_calls: - logger.info(f'Extracted {len(extracted_calls)} tool call(s) from content') - cleaned = TOOLS_REGEX.sub('', content).strip() - remaining_lines = [] - for line in cleaned.split('\n'): - if not parse_single_tool_call(line): - remaining_lines.append(line) - cleaned = '\n'.join(remaining_lines).strip() - - msg['content'] = cleaned if cleaned else None - msg['tool_calls'] = extracted_calls - choice['finish_reason'] = 'tool_calls' - except Exception as e: - logger.error(f'Error in post-processing: {e}') - - -def convert_to_sse_stream(resp_json): - """Convert a non-streaming chat completion response to SSE format.""" - import time - - def generate(): - model = resp_json.get("model", "unknown") - resp_id = resp_json.get("id", "chatcmpl-converted") - created = resp_json.get("created", int(time.time())) - - for choice in resp_json.get("choices", []): - msg = choice.get("message", {}) - content_text = msg.get("content") - tool_calls = msg.get("tool_calls") - finish_reason = choice.get("finish_reason", "stop") - - first_chunk = { - "id": resp_id, "object": "chat.completion.chunk", - "created": created, "model": model, - "choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, - "logprobs": None, "finish_reason": None}] - } - yield f"data: {json.dumps(first_chunk)}\n\n" - - if content_text: - content_chunk = { - "id": resp_id, "object": "chat.completion.chunk", - "created": created, "model": model, - "choices": [{"index": 0, "delta": {"content": content_text}, - "logprobs": None, "finish_reason": None}] - } - yield f"data: {json.dumps(content_chunk)}\n\n" - - if tool_calls: - for i, tc in enumerate(tool_calls): - 
tc_chunk = { - "id": resp_id, "object": "chat.completion.chunk", - "created": created, "model": model, - "choices": [{"index": 0, "delta": {"tool_calls": [{ - "index": i, "id": tc.get("id", ""), "type": "function", - "function": {"name": tc["function"]["name"], - "arguments": tc["function"]["arguments"]} - }]}, "logprobs": None, "finish_reason": None}] - } - yield f"data: {json.dumps(tc_chunk)}\n\n" - - finish_chunk = { - "id": resp_id, "object": "chat.completion.chunk", - "created": created, "model": model, - "choices": [{"index": 0, "delta": {}, - "logprobs": None, "finish_reason": finish_reason}] - } - yield f"data: {json.dumps(finish_chunk)}\n\n" - - usage = resp_json.get("usage") - if usage: - usage_chunk = { - "id": resp_id, "object": "chat.completion.chunk", - "created": created, "model": model, - "choices": [], "usage": usage - } - yield f"data: {json.dumps(usage_chunk)}\n\n" - - yield "data: [DONE]\n\n" - - return generate() - - -@app.route('/v1/<path:path>', methods=['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS']) -def proxy(path): - url = f'{VLLM_URL}/v1/{path}' - - if request.method == 'OPTIONS': - return Response('', status=204) - - if path not in ('chat/completions', 'responses'): - return forward_request(url) - - try: - body = request.get_json() - except Exception: - body = None - - if body and has_tools(body): - loop_response = check_tool_loop(body) - if loop_response: - return Response(json.dumps(loop_response), status=200, mimetype='application/json') - - was_streaming = body.get("stream", False) if body else False - - if body and has_tools(body) and was_streaming: - logger.info("Forcing non-streaming for tool call post-processing (will re-wrap as SSE)") - body["stream"] = False - body.pop("stream_options", None) - - is_streaming = body.get("stream", False) if body else False - - if body and not body.get("stream", False) and "stream_options" in body: - logger.info("Stripping stream_options from non-streaming request") - body.pop("stream_options", None) - - 
headers = {k: v for k, v in request.headers if k.lower() not in ('host', 'content-length')} - - if is_streaming: - return stream_response(url, headers, body) - elif was_streaming and body and has_tools(body): - return forward_fix_and_rewrap_sse(url, headers, body) - else: - return forward_with_body_and_fix(url, headers, body) - - -def forward_fix_and_rewrap_sse(url, headers, body): - """Forward non-streaming, fix tool calls, then re-wrap as SSE.""" - try: - resp = requests.post(url, headers=headers, json=body, timeout=300) - try: - resp_json = resp.json() - if body and has_tools(body): - extract_tools_from_content(resp_json) - clean_response_for_openclaw(resp_json) - - choices = resp_json.get("choices") or [{}] - msg = choices[0].get("message", {}) - logger.info(f"SSE-REWRAP: content={str(msg.get('content', ''))[:120]}, " - f"tool_calls={len(msg.get('tool_calls', []))}, " - f"finish={choices[0].get('finish_reason')}") - - return Response( - convert_to_sse_stream(resp_json), status=200, - mimetype='text/event-stream', - headers={'Cache-Control': 'no-cache', 'Connection': 'keep-alive'} - ) - except Exception as e: - logger.error(f'SSE rewrap parse error: {e}') - return Response(resp.content, status=resp.status_code) - except Exception as e: - logger.error(f'SSE rewrap forward error: {e}') - return Response(json.dumps({'error': str(e)}), status=502, mimetype='application/json') - - -def forward_request(url): - """Forward non-chat requests as-is.""" - headers = {k: v for k, v in request.headers if k.lower() not in ('host', 'content-length')} - try: - resp = requests.request( - method=request.method, url=url, headers=headers, - data=request.get_data(), stream=True, timeout=300 - ) - excluded = {'content-encoding', 'transfer-encoding', 'content-length'} - resp_headers = {k: v for k, v in resp.headers.items() if k.lower() not in excluded} - return Response(resp.iter_content(chunk_size=1024), status=resp.status_code, headers=resp_headers) - except Exception as e: - 
logger.error(f'Forward error: {e}') - return Response(json.dumps({'error': str(e)}), status=502, mimetype='application/json') - - -def forward_with_body_and_fix(url, headers, body): - """Forward non-streaming requests, extract tool calls, and clean response.""" - try: - resp = requests.post(url, headers=headers, json=body, timeout=300) - try: - resp_json = resp.json() - if body and has_tools(body): - extract_tools_from_content(resp_json) - clean_response_for_openclaw(resp_json) - - choices = resp_json.get("choices") or [{}] - msg = choices[0].get("message", {}) - logger.info(f"RESPONSE: content={str(msg.get('content', ''))[:120]}, " - f"finish={choices[0].get('finish_reason')}") - - return Response(json.dumps(resp_json), status=resp.status_code, mimetype='application/json') - except Exception: - return Response(resp.content, status=resp.status_code) - except Exception as e: - logger.error(f'Forward error: {e}') - return Response(json.dumps({'error': str(e)}), status=502, mimetype='application/json') - - -def stream_response(url, headers, body): - """Pure streaming passthrough (no tool extraction).""" - def generate(): - try: - with requests.post(url, headers=headers, json=body, stream=True, timeout=300) as resp: - for chunk in resp.iter_content(chunk_size=None): - if chunk: - yield chunk - except Exception as e: - logger.error(f'Stream error: {e}') - error_data = json.dumps({"error": str(e)}) - yield f'data: {error_data}\n\n' - return Response(generate(), mimetype='text/event-stream') - - -@app.route('/health') -def health(): - return {'status': 'ok', 'vllm_url': VLLM_URL, 'max_tool_calls': MAX_TOOL_CALLS} - - -@app.route('/') -def root(): - return { - 'service': 'Lighthouse AI — vLLM Tool Call Proxy', - 'version': 'v4', - 'vllm_url': VLLM_URL, - 'features': [ - 'Extract tool calls from tags in content', - 'Extract tool calls from bare JSON in content', - 'Extract tool calls from multi-line JSON in content', - 'Force non-streaming when tools present for 
extraction', - 'Re-wrap non-streaming responses as SSE for OpenClaw', - 'Strip vLLM-specific fields for clean OpenAI format', - f'Safety limit: abort after {MAX_TOOL_CALLS} tool calls' - ] - } - - -if __name__ == '__main__': - parser = argparse.ArgumentParser(description='Lighthouse AI — vLLM Tool Call Proxy') - parser.add_argument('--port', type=int, default=int(os.environ.get('PROXY_PORT', '8003')), - help='Port to listen on (default: 8003, env: PROXY_PORT)') - parser.add_argument('--vllm-url', type=str, default=VLLM_URL, - help='vLLM base URL (default: http://localhost:8000, env: VLLM_URL)') - parser.add_argument('--host', type=str, default='0.0.0.0', - help='Host to bind to (default: 0.0.0.0)') - args = parser.parse_args() - VLLM_URL = args.vllm_url - logger.info(f'Starting Lighthouse AI vLLM Tool Call Proxy v4') - logger.info(f'Listening on {args.host}:{args.port} -> {VLLM_URL}') - app.run(host=args.host, port=args.port, threaded=True) diff --git a/dream-server/workflows/01-chat-endpoint.json b/dream-server/workflows/01-chat-endpoint.json deleted file mode 100644 index 2ba72c767..000000000 --- a/dream-server/workflows/01-chat-endpoint.json +++ /dev/null @@ -1,99 +0,0 @@ -{ - "name": "Local LLM Chat API", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "chat", - "responseMode": "responseNode", - "options": {} - }, - "id": "webhook-1", - "name": "Webhook", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 300] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: ($json.body.model && $json.body.model !== 'local') ? 
$json.body.model : 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: $json.body.messages, temperature: $json.body.temperature || 0.7, max_tokens: $json.body.max_tokens || 1024, stream: false }) }}", - "options": { - "timeout": 120000 - } - }, - "id": "llm-request-1", - "name": "Call Local LLM", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 300] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ $json }}", - "options": {} - }, - "id": "respond-1", - "name": "Respond", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [680, 300] - } - ], - "connections": { - "Webhook": { - "main": [ - [ - { - "node": "Call Local LLM", - "type": "main", - "index": 0 - } - ] - ] - }, - "Call Local LLM": { - "main": [ - [ - { - "node": "Respond", - "type": "main", - "index": 0 - } - ] - ] - } - }, - "active": false, - "settings": { - "executionOrder": "v1" - }, - "versionId": "1", - "meta": { - "templateCredsSetupCompleted": true, - "instanceId": "dream-server" - }, - "tags": [ - { - "name": "dream-server", - "id": "1" - }, - { - "name": "llm", - "id": "2" - } - ] -} diff --git a/dream-server/workflows/02-document-qa.json b/dream-server/workflows/02-document-qa.json deleted file mode 100644 index e0b7d158a..000000000 --- a/dream-server/workflows/02-document-qa.json +++ /dev/null @@ -1,334 +0,0 @@ -{ - "name": "Document Q&A with RAG", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "upload-doc", - "responseMode": "responseNode", - "options": { - "rawBody": true - } - }, - "id": "webhook-upload", - "name": "Upload Document", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 200] - }, - { - "parameters": { - "mode": "jsonToBinary", - "options": {} - }, - "id": "extract-text", - "name": "Extract Text", - "type": "n8n-nodes-base.moveBinaryData", - "typeVersion": 2, - "position": [460, 200], - "notes": "In production, add PDF parsing with external service" - }, - 
{ - "parameters": { - "jsCode": "// Split text into chunks for embedding\nconst text = $input.item.json.body.text || '';\nconst chunkSize = 500;\nconst overlap = 50;\nconst chunks = [];\n\nfor (let i = 0; i < text.length; i += chunkSize - overlap) {\n const chunk = text.slice(i, i + chunkSize);\n if (chunk.trim().length > 50) {\n chunks.push({\n text: chunk,\n start: i,\n end: i + chunk.length,\n doc_id: $input.item.json.body.doc_id || 'doc_' + Date.now()\n });\n }\n}\n\nreturn chunks.map(c => ({ json: c }));" - }, - "id": "chunk-text", - "name": "Chunk Text", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [680, 200] - }, - { - "parameters": { - "method": "POST", - "url": "http://embeddings:80/embed", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ inputs: $json.text }) }}", - "options": { - "timeout": 30000 - } - }, - "id": "embed-chunk", - "name": "Generate Embedding", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [900, 200], - "notes": "Uses text-embeddings-inference (TEI) with BGE-small-en-v1.5" - }, - { - "parameters": { - "method": "PUT", - "url": "http://qdrant:6333/collections/documents/points", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ points: [{ id: $crypto.randomUUID(), vector: $json[0], payload: { text: $('Chunk Text').item.json.text, doc_id: $('Chunk Text').item.json.doc_id, start: $('Chunk Text').item.json.start, end: $('Chunk Text').item.json.end } }] }) }}", - "options": {} - }, - "id": "store-qdrant", - "name": "Store in Qdrant", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [1120, 200] - }, - { - "parameters": { - 
"respondWith": "json", - "responseBody": "={{ { success: true, chunks_stored: $items.length } }}", - "options": {} - }, - "id": "respond-upload", - "name": "Upload Complete", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1340, 200] - }, - { - "parameters": { - "httpMethod": "POST", - "path": "ask", - "responseMode": "responseNode", - "options": {} - }, - "id": "webhook-ask", - "name": "Ask Question", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 500] - }, - { - "parameters": { - "method": "POST", - "url": "http://embeddings:80/embed", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ inputs: $json.body.question }) }}", - "options": {} - }, - "id": "embed-question", - "name": "Embed Question", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 500] - }, - { - "parameters": { - "method": "POST", - "url": "http://qdrant:6333/collections/documents/points/search", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ vector: $json[0], limit: 5, with_payload: true }) }}", - "options": {} - }, - "id": "search-qdrant", - "name": "Search Qdrant", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [680, 500] - }, - { - "parameters": { - "jsCode": "// Combine retrieved chunks into context\nconst results = $input.item.json.result || [];\nconst context = results\n .map(r => r.payload?.text || '')\n .join('\\n\\n---\\n\\n');\n\nreturn [{ json: { context, question: $('Ask Question').item.json.body.question } }];" - }, - "id": "build-context", - "name": "Build Context", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - 
"position": [900, 500] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: 'You are a helpful assistant. Answer questions based on the provided context. If the context does not contain the answer, say so.' }, { role: 'user', content: 'Context:\\n' + $json.context + '\\n\\nQuestion: ' + $json.question }], temperature: 0.3, max_tokens: 1024, stream: false }) }}", - "options": { - "timeout": 120000 - } - }, - "id": "generate-answer", - "name": "Generate Answer", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [1120, 500] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ { question: $('Build Context').item.json.question, answer: $json.choices[0].message.content } }}", - "options": {} - }, - "id": "respond-answer", - "name": "Return Answer", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1340, 500] - } - ], - "connections": { - "Upload Document": { - "main": [ - [ - { - "node": "Chunk Text", - "type": "main", - "index": 0 - } - ] - ] - }, - "Chunk Text": { - "main": [ - [ - { - "node": "Generate Embedding", - "type": "main", - "index": 0 - } - ] - ] - }, - "Generate Embedding": { - "main": [ - [ - { - "node": "Store in Qdrant", - "type": "main", - "index": 0 - } - ] - ] - }, - "Store in Qdrant": { - "main": [ - [ - { - "node": "Upload Complete", - "type": "main", - "index": 0 - } - ] - ] - }, - "Ask Question": { - "main": [ - [ - { - "node": "Embed Question", - "type": "main", - "index": 0 - } - ] - ] - }, - "Embed Question": { - "main": [ - [ - { - "node": "Search Qdrant", - "type": "main", - "index": 0 - } - ] - ] - }, - "Search Qdrant": { - 
"main": [ - [ - { - "node": "Build Context", - "type": "main", - "index": 0 - } - ] - ] - }, - "Build Context": { - "main": [ - [ - { - "node": "Generate Answer", - "type": "main", - "index": 0 - } - ] - ] - }, - "Generate Answer": { - "main": [ - [ - { - "node": "Return Answer", - "type": "main", - "index": 0 - } - ] - ] - } - }, - "active": false, - "settings": { - "executionOrder": "v1" - }, - "versionId": "1", - "meta": { - "templateCredsSetupCompleted": true, - "instanceId": "dream-server" - }, - "tags": [ - { - "name": "dream-server", - "id": "1" - }, - { - "name": "rag", - "id": "5" - } - ] -} diff --git a/dream-server/workflows/03-voice-transcription.json b/dream-server/workflows/03-voice-transcription.json deleted file mode 100644 index 43c9dee44..000000000 --- a/dream-server/workflows/03-voice-transcription.json +++ /dev/null @@ -1,237 +0,0 @@ -{ - "name": "Voice Transcription", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "transcribe", - "responseMode": "responseNode", - "options": { - "rawBody": true - } - }, - "id": "webhook-transcribe", - "name": "Receive Audio", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 300] - }, - { - "parameters": { - "method": "POST", - "url": "http://whisper:9000/asr", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "multipart/form-data" - } - ] - }, - "sendBody": true, - "contentType": "multipart-form-data", - "bodyParameters": { - "parameters": [ - { - "name": "audio_file", - "parameterType": "formBinaryData", - "inputDataFieldName": "data" - }, - { - "name": "output", - "value": "json" - } - ] - }, - "options": { - "timeout": 60000 - } - }, - "id": "whisper-1", - "name": "Whisper STT", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 300] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ { text: $json.text, segments: $json.segments } }}", - "options": {} - 
}, - "id": "respond-transcribe", - "name": "Return Transcript", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [680, 300] - }, - { - "parameters": { - "httpMethod": "POST", - "path": "voice-command", - "responseMode": "responseNode", - "options": { - "rawBody": true - } - }, - "id": "webhook-command", - "name": "Voice Command", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 500] - }, - { - "parameters": { - "method": "POST", - "url": "http://whisper:9000/asr", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "multipart/form-data" - } - ] - }, - "sendBody": true, - "contentType": "multipart-form-data", - "bodyParameters": { - "parameters": [ - { - "name": "audio_file", - "parameterType": "formBinaryData", - "inputDataFieldName": "data" - }, - { - "name": "output", - "value": "json" - } - ] - }, - "options": { - "timeout": 60000 - } - }, - "id": "whisper-2", - "name": "Transcribe Command", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 500] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: 'You are a helpful assistant. Respond concisely to voice commands.' 
}, { role: 'user', content: $json.text }], temperature: 0.7, max_tokens: 256, stream: false }) }}", - "options": { - "timeout": 120000 - } - }, - "id": "llm-command", - "name": "Process Command", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [680, 500] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ { input: $('Transcribe Command').item.json.text, response: $json.choices[0].message.content } }}", - "options": {} - }, - "id": "respond-command", - "name": "Return Response", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [900, 500] - } - ], - "connections": { - "Receive Audio": { - "main": [ - [ - { - "node": "Whisper STT", - "type": "main", - "index": 0 - } - ] - ] - }, - "Whisper STT": { - "main": [ - [ - { - "node": "Return Transcript", - "type": "main", - "index": 0 - } - ] - ] - }, - "Voice Command": { - "main": [ - [ - { - "node": "Transcribe Command", - "type": "main", - "index": 0 - } - ] - ] - }, - "Transcribe Command": { - "main": [ - [ - { - "node": "Process Command", - "type": "main", - "index": 0 - } - ] - ] - }, - "Process Command": { - "main": [ - [ - { - "node": "Return Response", - "type": "main", - "index": 0 - } - ] - ] - } - }, - "active": false, - "settings": { - "executionOrder": "v1" - }, - "versionId": "1", - "meta": { - "templateCredsSetupCompleted": true, - "instanceId": "dream-server" - }, - "tags": [ - { - "name": "dream-server", - "id": "1" - }, - { - "name": "voice", - "id": "3" - } - ] -} diff --git a/dream-server/workflows/04-tts-api.json b/dream-server/workflows/04-tts-api.json deleted file mode 100644 index 1abd3528f..000000000 --- a/dream-server/workflows/04-tts-api.json +++ /dev/null @@ -1,132 +0,0 @@ -{ - "name": "Text-to-Speech API", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "speak", - "responseMode": "responseNode", - "options": {} - }, - "id": "webhook-tts", - "name": "TTS Request", - "type": 
"n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 300] - }, - { - "parameters": { - "method": "POST", - "url": "=http://tts:8880/v1/audio/speech", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "contentType": "json", - "bodyParameters": { - "parameters": [ - { - "name": "model", - "value": "kokoro" - }, - { - "name": "voice", - "value": "={{ $json.body.voice || 'af_heart' }}" - }, - { - "name": "input", - "value": "={{ $json.body.text }}" - } - ] - }, - "options": { - "response": { - "response": { - "fullResponse": true, - "responseFormat": "file" - } - }, - "timeout": 30000 - } - }, - "id": "piper-1", - "name": "Generate Speech", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 300] - }, - { - "parameters": { - "respondWith": "binary", - "options": { - "responseHeaders": { - "entries": [ - { - "name": "Content-Type", - "value": "audio/wav" - }, - { - "name": "Content-Disposition", - "value": "attachment; filename=\"speech.wav\"" - } - ] - } - } - }, - "id": "respond-tts", - "name": "Return Audio", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [680, 300] - } - ], - "connections": { - "TTS Request": { - "main": [ - [ - { - "node": "Generate Speech", - "type": "main", - "index": 0 - } - ] - ] - }, - "Generate Speech": { - "main": [ - [ - { - "node": "Return Audio", - "type": "main", - "index": 0 - } - ] - ] - } - }, - "active": false, - "settings": { - "executionOrder": "v1" - }, - "versionId": "1", - "meta": { - "templateCredsSetupCompleted": true, - "instanceId": "dream-server" - }, - "tags": [ - { - "name": "dream-server", - "id": "1" - }, - { - "name": "tts", - "id": "4" - } - ] -} diff --git a/dream-server/workflows/05-voice-to-voice.json b/dream-server/workflows/05-voice-to-voice.json deleted file mode 100644 index 426fe217f..000000000 --- 
a/dream-server/workflows/05-voice-to-voice.json +++ /dev/null @@ -1,181 +0,0 @@ -{ - "name": "Voice to Voice Assistant", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "voice-chat", - "options": { - "rawBody": true - } - }, - "id": "webhook-voice", - "name": "Voice Input", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [250, 300] - }, - { - "parameters": { - "method": "POST", - "url": "http://whisper:9000/asr", - "sendBody": true, - "contentType": "multipart-form-data", - "bodyParameters": { - "parameters": [ - { - "name": "audio_file", - "parameterType": "formBinaryData", - "inputDataFieldName": "data" - }, - { - "name": "output", - "value": "json" - } - ] - }, - "options": {} - }, - "id": "whisper-transcribe", - "name": "Whisper Transcribe", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [470, 300] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: 'You are a helpful voice assistant. Keep responses concise (1-3 sentences) since they will be spoken aloud.' 
}, { role: 'user', content: $json.text }], max_tokens: 256, temperature: 0.7 }) }}", - "options": { - "timeout": 60000 - } - }, - "id": "vllm-chat", - "name": "LLM Response", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [690, 300] - }, - { - "parameters": { - "method": "POST", - "url": "http://tts:8880/v1/audio/speech", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "contentType": "json", - "bodyParameters": { - "parameters": [ - { - "name": "model", - "value": "kokoro" - }, - { - "name": "voice", - "value": "af_heart" - }, - { - "name": "input", - "value": "={{ $json.choices[0].message.content }}" - } - ] - }, - "options": { - "response": { - "response": { - "fullResponse": true, - "responseFormat": "file" - } - } - } - }, - "id": "kokoro-tts", - "name": "Kokoro TTS", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [910, 300] - }, - { - "parameters": { - "respondWith": "binary", - "options": { - "responseHeaders": { - "entries": [ - { - "name": "Content-Type", - "value": "audio/wav" - } - ] - } - } - }, - "id": "respond-audio", - "name": "Return Audio", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1130, 300] - } - ], - "connections": { - "Voice Input": { - "main": [ - [ - { - "node": "Whisper Transcribe", - "type": "main", - "index": 0 - } - ] - ] - }, - "Whisper Transcribe": { - "main": [ - [ - { - "node": "LLM Response", - "type": "main", - "index": 0 - } - ] - ] - }, - "LLM Response": { - "main": [ - [ - { - "node": "Kokoro TTS", - "type": "main", - "index": 0 - } - ] - ] - }, - "Kokoro TTS": { - "main": [ - [ - { - "node": "Return Audio", - "type": "main", - "index": 0 - } - ] - ] - } - }, - "active": false, - "settings": { - "executionOrder": "v1" - }, - "tags": [], - "pinData": {} -} diff --git a/dream-server/workflows/06-rag-demo.json 
b/dream-server/workflows/06-rag-demo.json deleted file mode 100644 index 8b56688b1..000000000 --- a/dream-server/workflows/06-rag-demo.json +++ /dev/null @@ -1,185 +0,0 @@ -{ - "name": "RAG Document Q&A Demo", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "upload-doc", - "options": {} - }, - "id": "webhook-upload", - "name": "Upload Document", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [250, 200] - }, - { - "parameters": { - "mode": "runOnceForEachItem", - "jsCode": "// Chunk text into overlapping segments\nconst text = $input.first().json.text || $input.first().json.content || '';\nconst chunkSize = 500;\nconst overlap = 100;\nconst chunks = [];\n\nfor (let i = 0; i < text.length; i += chunkSize - overlap) {\n const chunk = text.slice(i, i + chunkSize);\n if (chunk.trim()) {\n chunks.push({\n text: chunk,\n index: chunks.length,\n start: i,\n end: Math.min(i + chunkSize, text.length)\n });\n }\n}\n\nreturn chunks.map(c => ({ json: c }));" - }, - "id": "chunk-text", - "name": "Chunk Text", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [470, 200] - }, - { - "parameters": { - "method": "POST", - "url": "http://embeddings:80/embed", - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ inputs: $json.text }) }}", - "options": {} - }, - "id": "embed-chunk", - "name": "Generate Embedding", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [690, 200] - }, - { - "parameters": { - "method": "PUT", - "url": "http://qdrant:6333/collections/documents/points", - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ points: [{ id: $crypto.randomUUID(), vector: $json[0], payload: { text: $('Chunk Text').item.json.text, index: $('Chunk Text').item.json.index } }] }) }}", - "options": {} - }, - "id": "store-qdrant", - "name": "Store in Qdrant", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [910, 200] - }, 
- { - "parameters": { - "respondWith": "json", - "responseBody": "={{ JSON.stringify({ success: true, chunks_stored: $runIndex + 1 }) }}", - "options": {} - }, - "id": "respond-upload", - "name": "Upload Response", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1130, 200] - }, - { - "parameters": { - "httpMethod": "POST", - "path": "ask", - "options": {} - }, - "id": "webhook-ask", - "name": "Ask Question", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [250, 450] - }, - { - "parameters": { - "method": "POST", - "url": "http://embeddings:80/embed", - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ inputs: $json.question }) }}", - "options": {} - }, - "id": "embed-question", - "name": "Embed Question", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [470, 450] - }, - { - "parameters": { - "method": "POST", - "url": "http://qdrant:6333/collections/documents/points/search", - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ vector: $json[0], limit: 3, with_payload: true }) }}", - "options": {} - }, - "id": "search-qdrant", - "name": "Search Qdrant", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [690, 450] - }, - { - "parameters": { - "mode": "runOnceForEachItem", - "jsCode": "// Build context from search results\nconst results = $input.first().json.result || [];\nconst context = results.map(r => r.payload.text).join('\\n\\n---\\n\\n');\nconst question = $('Ask Question').first().json.question;\n\nreturn [{\n json: {\n context,\n question,\n sources: results.map(r => ({ text: r.payload.text.slice(0, 100) + '...', score: r.score }))\n }\n}];" - }, - "id": "build-context", - "name": "Build Context", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [910, 450] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendBody": true, - 
"specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: 'Answer questions based on the provided context. If the answer is not in the context, say so. Be concise.' }, { role: 'user', content: 'Context:\\n' + $json.context + '\\n\\nQuestion: ' + $json.question }], max_tokens: 512, temperature: 0.3 }) }}", - "options": { - "timeout": 60000 - } - }, - "id": "generate-answer", - "name": "Generate Answer", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [1130, 450] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ JSON.stringify({ answer: $json.choices[0].message.content, sources: $('Build Context').first().json.sources }) }}", - "options": {} - }, - "id": "respond-answer", - "name": "Answer Response", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1350, 450] - } - ], - "connections": { - "Upload Document": { - "main": [[{ "node": "Chunk Text", "type": "main", "index": 0 }]] - }, - "Chunk Text": { - "main": [[{ "node": "Generate Embedding", "type": "main", "index": 0 }]] - }, - "Generate Embedding": { - "main": [[{ "node": "Store in Qdrant", "type": "main", "index": 0 }]] - }, - "Store in Qdrant": { - "main": [[{ "node": "Upload Response", "type": "main", "index": 0 }]] - }, - "Ask Question": { - "main": [[{ "node": "Embed Question", "type": "main", "index": 0 }]] - }, - "Embed Question": { - "main": [[{ "node": "Search Qdrant", "type": "main", "index": 0 }]] - }, - "Search Qdrant": { - "main": [[{ "node": "Build Context", "type": "main", "index": 0 }]] - }, - "Build Context": { - "main": [[{ "node": "Generate Answer", "type": "main", "index": 0 }]] - }, - "Generate Answer": { - "main": [[{ "node": "Answer Response", "type": "main", "index": 0 }]] - } - }, - "active": false, - "settings": { "executionOrder": "v1" }, - "tags": [], - "pinData": {} -} diff --git 
a/dream-server/workflows/07-code-assistant.json b/dream-server/workflows/07-code-assistant.json deleted file mode 100644 index 3b75e4644..000000000 --- a/dream-server/workflows/07-code-assistant.json +++ /dev/null @@ -1,72 +0,0 @@ -{ - "name": "Code Assistant", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "code-assist", - "options": {} - }, - "id": "webhook-code", - "name": "Code Input", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [250, 300] - }, - { - "parameters": { - "mode": "runOnceForEachItem", - "jsCode": "// Build appropriate prompt based on task\nconst code = $input.first().json.code || '';\nconst task = ($input.first().json.task || 'explain').toLowerCase();\nconst language = $input.first().json.language || 'auto-detect';\n\nconst prompts = {\n explain: `Explain what this code does in clear, simple terms. Break down the logic step by step.\n\nCode:\n\\`\\`\\`${language}\\n${code}\\n\\`\\`\\``,\n \n improve: `Review this code and suggest improvements. Focus on:\n- Code quality and readability\n- Performance optimizations\n- Best practices\n- Potential bugs\n\nProvide the improved code with comments explaining changes.\n\nCode:\n\\`\\`\\`${language}\\n${code}\\n\\`\\`\\``,\n \n debug: `Analyze this code for bugs and issues. Identify:\n- Syntax errors\n- Logic errors\n- Edge cases that might fail\n- Security vulnerabilities\n\nProvide fixes for each issue found.\n\nCode:\n\\`\\`\\`${language}\\n${code}\\n\\`\\`\\``,\n \n document: `Add comprehensive documentation to this code:\n- Function/class docstrings\n- Inline comments for complex logic\n- Type hints if applicable\n- Usage examples\n\nCode:\n\\`\\`\\`${language}\\n${code}\\n\\`\\`\\``,\n \n test: `Generate unit tests for this code. 
Include:\n- Happy path tests\n- Edge cases\n- Error handling tests\n- Use appropriate testing framework for the language\n\nCode:\n\\`\\`\\`${language}\\n${code}\\n\\`\\`\\``\n};\n\nconst systemPrompt = 'You are an expert code reviewer and developer. Provide clear, actionable feedback. When showing code, use proper formatting.';\nconst userPrompt = prompts[task] || prompts.explain;\n\nreturn [{ json: { systemPrompt, userPrompt, task, language } }];" - }, - "id": "build-prompt", - "name": "Build Prompt", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [470, 300] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: $json.systemPrompt }, { role: 'user', content: $json.userPrompt }], max_tokens: 2048, temperature: 0.3 }) }}", - "options": { - "timeout": 120000 - } - }, - "id": "llm-response", - "name": "LLM Response", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [690, 300] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ JSON.stringify({ task: $('Build Prompt').first().json.task, language: $('Build Prompt').first().json.language, result: $json.choices[0].message.content }) }}", - "options": {} - }, - "id": "respond", - "name": "Return Result", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [910, 300] - } - ], - "connections": { - "Code Input": { - "main": [[{ "node": "Build Prompt", "type": "main", "index": 0 }]] - }, - "Build Prompt": { - "main": [[{ "node": "LLM Response", "type": "main", "index": 0 }]] - }, - "LLM Response": { - "main": [[{ "node": "Return Result", "type": "main", "index": 0 }]] - } - }, - "active": false, - "settings": { "executionOrder": "v1" }, - "tags": [], - "pinData": {} -} diff --git a/dream-server/workflows/README.md 
b/dream-server/workflows/README.md deleted file mode 100644 index 5d79baf80..000000000 --- a/dream-server/workflows/README.md +++ /dev/null @@ -1,184 +0,0 @@ -# Dream Server n8n Workflows - -Pre-built workflows for common local AI tasks. Import these directly into your n8n instance. - -## How to Import - -1. Open n8n at http://localhost:5678 -2. Click **+ Add Workflow** -3. Click the menu (**⋮**) → **Import from file** -4. Select the `.json` file - -## Quick Demo (curl examples) - -```bash -# Chat -curl -X POST http://localhost:5678/webhook/chat \ - -H "Content-Type: application/json" \ - -d '{"message": "What is the meaning of life?"}' - -# Voice-to-Voice (send audio, get audio back) -curl -X POST http://localhost:5678/webhook/voice-chat \ - -F "audio=@your-recording.wav" \ - -o response.wav - -# Code Assistant -curl -X POST http://localhost:5678/webhook/code-assist \ - -H "Content-Type: application/json" \ - -d '{"code": "def add(a,b): return a+b", "task": "improve"}' - -# RAG: Upload document -curl -X POST http://localhost:5678/webhook/upload-doc \ - -H "Content-Type: application/json" \ - -d '{"text": "Your document content here..."}' - -# RAG: Ask question -curl -X POST http://localhost:5678/webhook/ask \ - -H "Content-Type: application/json" \ - -d '{"question": "What is this document about?"}' -``` - -## Available Workflows - -### 1. Chat API Endpoint (`01-chat-endpoint.json`) -Creates a REST API endpoint that forwards requests to your local vLLM. - -**Use case:** Connect any application that expects an OpenAI-compatible API. - -**Endpoints created:** -- `POST /webhook/chat`: Send messages, get completions - -### 2. Document Q&A (`02-document-qa.json`) -Full RAG pipeline: upload documents, ask questions, get answers from content. - -**Use case:** Internal knowledge base, document analysis. 
- -**Endpoints created:** -- `POST /webhook/upload-doc`: Upload text, chunk, embed, store in Qdrant -- `POST /webhook/ask`: Ask questions, get RAG-powered answers - -**Workflow:** -1. Upload: Text → Chunk (500 chars) → Embed → Store in Qdrant -2. Query: Question → Embed → Vector search → Context → LLM answer - -**Note:** For PDF support, add a PDF parsing node (external service like Unstructured.io) - -### 3. Voice Transcription (`03-voice-transcription.json`) -Receive audio, transcribe with Whisper, optionally process with LLM. - -**Use case:** Meeting transcription, voice commands, audio analysis. - -**Endpoints created:** -- `POST /webhook/transcribe`: Audio → Text -- `POST /webhook/voice-command`: Audio → LLM response - -### 4. Text-to-Speech API (`04-tts-api.json`) -Convert text to speech using Piper. - -**Use case:** Audiobook generation, accessibility, notifications. - -**Endpoints created:** -- `POST /webhook/speak`: Text → Audio file - -### 5. Voice-to-Voice Assistant (`05-voice-to-voice.json`) -Complete voice chat pipeline: speak → transcribe → LLM → speak back. - -**Use case:** Hands-free AI assistant, accessibility, voice-first interfaces. - -**Workflow:** -1. Receive audio (WAV/MP3/WebM) -2. Whisper transcribes to text -3. LLM generates concise response -4. Piper synthesizes speech -5. Returns audio response - -**Endpoints created:** -- `POST /webhook/voice-chat`: Audio in → Audio out - -**The "wow" demo:** Record a question, POST it, get a spoken answer back. Full local voice AI. - -### 6. RAG Document Q&A (`06-rag-demo.json`) -Full RAG pipeline for document question-answering. - -**Use case:** Upload documents, ask questions, get answers with source citations. - -**Workflow:** -1. Upload: Text → Chunk (500 chars, 100 overlap) → Embed → Store in Qdrant -2. 
Query: Question → Embed → Vector search → Inject context → LLM answer - -**Endpoints created:** -- `POST /webhook/upload-doc`: Upload and index a document -- `POST /webhook/ask`: Ask questions about indexed documents - -**The "wow" demo:** Upload your company docs, ask questions, get accurate answers from your own data. - -### 7. Code Assistant (`07-code-assistant.json`) -AI-powered code review and assistance. - -**Use case:** Code explanation, improvement, debugging, documentation, test generation. - -**Workflow:** -1. Receive code + task type (explain/improve/debug/document/test) -2. Build appropriate prompt for task -3. LLM generates response -4. Return structured result - -**Endpoints created:** -- `POST /webhook/code-assist`: `{ "code": "...", "task": "improve", "language": "python" }` - -**Tasks supported:** explain, improve, debug, document, test - -### 8. Scheduled Summarizer (`daily-digest.json`) -Daily/weekly cron that summarizes specified content. - -**Use case:** News digest, log analysis, report generation. 
- -**Configurable:** -- Schedule (daily/weekly/custom) -- Content sources -- Output destination (email, Slack, file) - -## Configuration - -Most workflows need these credentials configured in n8n: - -### Local LLM (HTTP Request) -- **Base URL:** `http://vllm:8000/v1` -- **Authentication:** None (internal network) - -### Qdrant (HTTP Request) -- **Base URL:** `http://qdrant:6333` -- **Authentication:** None (internal network) - -### Whisper (HTTP Request) -- **Base URL:** `http://whisper:9000` -- **Authentication:** None (internal network) - -### Piper (HTTP Request) -- **Base URL:** `http://tts:8880` -- **Authentication:** None (internal network) - -## Customization - -Each workflow can be extended: -- Add authentication -- Change model parameters -- Connect to external services -- Add error handling - -## Troubleshooting - -**"Could not connect to vllm:8000"** -- Check if vLLM is running: `docker compose ps` -- Check logs: `docker compose logs vllm` -- Ensure container network is correct - -**"Response too slow"** -- First request loads model (can take 30+ seconds) -- Subsequent requests should be fast -- Consider reducing context length - -**"Out of memory"** -- Reduce `max-model-len` in docker-compose.yml -- Use smaller model (adjust `.env`) -- Check GPU memory: `nvidia-smi` diff --git a/dream-server/workflows/catalog.json b/dream-server/workflows/catalog.json deleted file mode 100644 index d3fe7e4c8..000000000 --- a/dream-server/workflows/catalog.json +++ /dev/null @@ -1,177 +0,0 @@ -{ - "workflows": [ - { - "id": "m4-deterministic-voice", - "file": "08-m4-deterministic-voice.json", - "name": "M4 Deterministic Voice", - "description": "Intent classification with deterministic routing: 60% faster, 80% less LLM usage", - "icon": "Brain", - "category": "voice", - "dependencies": ["vllm"], - "diagram": { - "steps": [ - {"label": "User speaks", "icon": "Mic"}, - {"label": "Classify intent", "icon": "Brain"}, - {"label": "Route: FSM or LLM", "icon": "GitBranch"}, 
- {"label": "Fast response", "icon": "Zap"} - ] - }, - "setupTime": "2 minutes", - "featured": true - }, - { - "id": "document-qa", - "file": "document-qa.json", - "name": "Document Q&A", - "description": "Upload documents and ask questions about them", - "icon": "FileText", - "category": "productivity", - "dependencies": ["qdrant", "vllm"], - "diagram": { - "steps": [ - {"label": "Upload document", "icon": "Upload"}, - {"label": "AI chunks & embeds", "icon": "Brain"}, - {"label": "Ask questions", "icon": "MessageSquare"}, - {"label": "AI finds answers", "icon": "Search"} - ] - }, - "setupTime": "2 minutes", - "featured": true - }, - { - "id": "voice-transcription", - "file": "03-voice-transcription.json", - "name": "Voice Transcription", - "description": "Transcribe audio files to text", - "icon": "Mic", - "category": "voice", - "dependencies": ["whisper"], - "diagram": { - "steps": [ - {"label": "Upload audio", "icon": "Upload"}, - {"label": "Whisper transcribes", "icon": "AudioLines"}, - {"label": "Get text", "icon": "FileText"} - ] - }, - "setupTime": "1 minute", - "featured": false - }, - { - "id": "voice-to-voice", - "file": "05-voice-to-voice.json", - "name": "Voice to Voice", - "description": "Speak, get AI response as audio", - "icon": "Headphones", - "category": "voice", - "dependencies": ["whisper", "vllm", "kokoro"], - "diagram": { - "steps": [ - {"label": "Speak", "icon": "Mic"}, - {"label": "Whisper transcribes", "icon": "AudioLines"}, - {"label": "AI responds", "icon": "Brain"}, - {"label": "Kokoro speaks", "icon": "Volume2"} - ] - }, - "setupTime": "2 minutes", - "featured": true - }, - { - "id": "daily-digest", - "file": "daily-digest.json", - "name": "Daily Digest", - "description": "Summarize your day every morning", - "icon": "Calendar", - "category": "productivity", - "dependencies": ["vllm"], - "diagram": { - "steps": [ - {"label": "Scheduled trigger", "icon": "Clock"}, - {"label": "Gather data", "icon": "Database"}, - {"label": "AI 
summarizes", "icon": "Brain"}, - {"label": "Send digest", "icon": "Mail"} - ] - }, - "setupTime": "5 minutes", - "featured": false - }, - { - "id": "code-assistant", - "file": "07-code-assistant.json", - "name": "Code Assistant", - "description": "AI-powered coding help via API", - "icon": "Code", - "category": "development", - "dependencies": ["vllm"], - "diagram": { - "steps": [ - {"label": "Send code", "icon": "Code"}, - {"label": "AI analyzes", "icon": "Brain"}, - {"label": "Get suggestions", "icon": "Lightbulb"} - ] - }, - "setupTime": "1 minute", - "featured": false - }, - { - "id": "rag-demo", - "file": "06-rag-demo.json", - "name": "RAG Demo", - "description": "Retrieval-augmented generation example", - "icon": "Search", - "category": "development", - "dependencies": ["qdrant", "vllm"], - "diagram": { - "steps": [ - {"label": "Query", "icon": "Search"}, - {"label": "Find relevant docs", "icon": "Database"}, - {"label": "AI synthesizes", "icon": "Brain"}, - {"label": "Grounded answer", "icon": "CheckCircle"} - ] - }, - "setupTime": "3 minutes", - "featured": false - }, - { - "id": "voice-memo", - "file": "voice-memo.json", - "name": "Voice Memo", - "description": "Record and transcribe voice memos", - "icon": "Mic", - "category": "productivity", - "dependencies": ["whisper"], - "diagram": { - "steps": [ - {"label": "Record memo", "icon": "Mic"}, - {"label": "Upload", "icon": "Upload"}, - {"label": "Transcribe", "icon": "AudioLines"}, - {"label": "Save text", "icon": "Save"} - ] - }, - "setupTime": "1 minute", - "featured": false - }, - { - "id": "chat-endpoint", - "file": "01-chat-endpoint.json", - "name": "Chat API Endpoint", - "description": "REST API for chat completions", - "icon": "MessageSquare", - "category": "development", - "dependencies": ["vllm"], - "diagram": { - "steps": [ - {"label": "POST request", "icon": "Send"}, - {"label": "AI processes", "icon": "Brain"}, - {"label": "JSON response", "icon": "FileJson"} - ] - }, - "setupTime": "1 minute", 
- "featured": false - } - ], - "categories": { - "productivity": {"name": "Productivity", "description": "Automate your daily tasks"}, - "voice": {"name": "Voice", "description": "Speech-to-text and text-to-speech"}, - "development": {"name": "Development", "description": "APIs and coding tools"} - } -} diff --git a/dream-server/workflows/daily-digest.json b/dream-server/workflows/daily-digest.json deleted file mode 100644 index ce1cb9215..000000000 --- a/dream-server/workflows/daily-digest.json +++ /dev/null @@ -1,268 +0,0 @@ -{ - "name": "Daily Digest - Morning Briefing", - "nodes": [ - { - "parameters": { - "rule": { - "interval": [ - { - "triggerAtHour": 7, - "triggerAtMinute": 0 - } - ] - } - }, - "id": "schedule-trigger", - "name": "Morning Schedule", - "type": "n8n-nodes-base.scheduleTrigger", - "typeVersion": 1.2, - "position": [240, 300], - "notes": "Triggers daily at 7:00 AM" - }, - { - "parameters": { - "httpMethod": "POST", - "path": "trigger-digest", - "responseMode": "responseNode", - "options": {} - }, - "id": "manual-trigger", - "name": "Manual Trigger", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 500], - "notes": "POST /webhook/trigger-digest to run manually" - }, - { - "parameters": { - "url": "https://news.ycombinator.com/rss", - "options": {} - }, - "id": "rss-hackernews", - "name": "HackerNews RSS", - "type": "n8n-nodes-base.rssFeedRead", - "typeVersion": 1, - "position": [460, 200], - "notes": "Customize with your preferred RSS feeds" - }, - { - "parameters": { - "url": "https://feeds.arstechnica.com/arstechnica/technology-lab", - "options": {} - }, - "id": "rss-tech", - "name": "Tech News RSS", - "type": "n8n-nodes-base.rssFeedRead", - "typeVersion": 1, - "position": [460, 400] - }, - { - "parameters": { - "jsCode": "// Collect and format all RSS items\nconst allItems = $input.all();\nconst headlines = allItems.slice(0, 20).map((item, idx) => {\n return `${idx + 1}. 
${item.json.title}\\n ${item.json.link}\\n ${(item.json.contentSnippet || item.json.description || '').slice(0, 200)}...`;\n}).join('\\n\\n');\n\nconst now = new Date();\nconst dateStr = now.toLocaleDateString('en-US', { \n weekday: 'long', \n year: 'numeric', \n month: 'long', \n day: 'numeric' \n});\n\nreturn [{\n json: {\n date: dateStr,\n item_count: allItems.length,\n headlines: headlines,\n source: 'RSS Feeds'\n }\n}];" - }, - "id": "format-rss", - "name": "Format RSS Headlines", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [680, 300] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: 'You are a helpful morning briefing assistant. Create a concise, engaging digest of the news headlines provided. Group by theme, highlight the most important stories, and add brief context where helpful. Keep it scannable and under 500 words.' 
}, { role: 'user', content: 'Create my morning briefing for ' + $json.date + ':\\n\\n' + $json.headlines }], temperature: 0.7, max_tokens: 1024, stream: false }) }}", - "options": { - "timeout": 120000 - } - }, - "id": "summarize-llm", - "name": "Generate Digest", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [900, 300] - }, - { - "parameters": { - "jsCode": "const digest = $json.choices[0].message.content;\nconst date = $('Format RSS Headlines').item.json.date;\nconst itemCount = $('Format RSS Headlines').item.json.item_count;\n\nconst output = `# Daily Digest\\n**${date}**\\n\\n---\\n\\n${digest}\\n\\n---\\n*Generated from ${itemCount} articles*\\n`;\n\nreturn [{ json: { digest: output, date: date, item_count: itemCount } }];" - }, - "id": "format-output", - "name": "Format Output", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [1120, 300] - }, - { - "parameters": { - "operation": "write", - "fileName": "=/digest/daily-digest-{{ $now.format('yyyy-MM-dd') }}.md", - "options": {}, - "dataPropertyName": "digest" - }, - "id": "save-file", - "name": "Save Digest", - "type": "n8n-nodes-base.readWriteFile", - "typeVersion": 1, - "position": [1340, 200], - "notes": "Saves to /digest/ folder - configure mount in docker-compose" - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ { success: true, date: $json.date, digest: $json.digest } }}", - "options": {} - }, - "id": "respond-digest", - "name": "Return Digest", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1340, 400] - }, - { - "parameters": {}, - "id": "merge-triggers", - "name": "Merge Triggers", - "type": "n8n-nodes-base.merge", - "typeVersion": 3, - "position": [460, 600] - } - ], - "connections": { - "Morning Schedule": { - "main": [ - [ - { - "node": "HackerNews RSS", - "type": "main", - "index": 0 - }, - { - "node": "Tech News RSS", - "type": "main", - "index": 0 - } - ] - ] - }, - "Manual Trigger": { - 
"main": [ - [ - { - "node": "Merge Triggers", - "type": "main", - "index": 0 - } - ] - ] - }, - "Merge Triggers": { - "main": [ - [ - { - "node": "HackerNews RSS", - "type": "main", - "index": 0 - }, - { - "node": "Tech News RSS", - "type": "main", - "index": 0 - } - ] - ] - }, - "HackerNews RSS": { - "main": [ - [ - { - "node": "Format RSS Headlines", - "type": "main", - "index": 0 - } - ] - ] - }, - "Tech News RSS": { - "main": [ - [ - { - "node": "Format RSS Headlines", - "type": "main", - "index": 0 - } - ] - ] - }, - "Format RSS Headlines": { - "main": [ - [ - { - "node": "Generate Digest", - "type": "main", - "index": 0 - } - ] - ] - }, - "Generate Digest": { - "main": [ - [ - { - "node": "Format Output", - "type": "main", - "index": 0 - } - ] - ] - }, - "Format Output": { - "main": [ - [ - { - "node": "Save Digest", - "type": "main", - "index": 0 - }, - { - "node": "Return Digest", - "type": "main", - "index": 0 - } - ] - ] - } - }, - "active": false, - "settings": { - "executionOrder": "v1" - }, - "versionId": "1", - "meta": { - "templateCredsSetupCompleted": true, - "instanceId": "dream-server" - }, - "tags": [ - { - "name": "dream-server", - "id": "1" - }, - { - "name": "digest", - "id": "8" - }, - { - "name": "scheduled", - "id": "9" - } - ] -} diff --git a/dream-server/workflows/document-qa.json b/dream-server/workflows/document-qa.json deleted file mode 100644 index 53aa156bf..000000000 --- a/dream-server/workflows/document-qa.json +++ /dev/null @@ -1,484 +0,0 @@ -{ - "name": "Document Q&A - Upload & Ask", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "doc/upload", - "responseMode": "responseNode", - "options": {} - }, - "id": "webhook-upload", - "name": "Upload Endpoint", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 200], - "notes": "POST { text: '...', doc_id: 'optional', title: 'optional' }" - }, - { - "parameters": { - "method": "PUT", - "url": "http://qdrant:6333/collections/documents", - 
"sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ vectors: { size: 768, distance: 'Cosine' } }) }}", - "options": { - "ignore400Errors": true - } - }, - "id": "create-collection", - "name": "Ensure Collection", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 200], - "notes": "Creates Qdrant collection if not exists (768 dims for BGE-base)" - }, - { - "parameters": { - "jsCode": "// Split text into overlapping chunks for better retrieval\nconst text = $input.item.json.body.text || '';\nconst docId = $input.item.json.body.doc_id || 'doc_' + Date.now();\nconst title = $input.item.json.body.title || 'Untitled';\n\nconst chunkSize = 500;\nconst overlap = 100;\nconst chunks = [];\n\nif (!text || text.trim().length < 10) {\n throw new Error('Text content is required and must be at least 10 characters');\n}\n\n// Clean and normalize text\nconst cleanText = text.replace(/\\s+/g, ' ').trim();\n\nfor (let i = 0; i < cleanText.length; i += chunkSize - overlap) {\n const chunk = cleanText.slice(i, i + chunkSize);\n if (chunk.trim().length > 30) {\n chunks.push({\n text: chunk.trim(),\n chunk_index: chunks.length,\n start_char: i,\n end_char: i + chunk.length,\n doc_id: docId,\n title: title,\n timestamp: new Date().toISOString()\n });\n }\n}\n\nif (chunks.length === 0) {\n throw new Error('No valid chunks could be created from the text');\n}\n\nreturn chunks.map(c => ({ json: c }));" - }, - "id": "chunk-text", - "name": "Chunk Text", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [680, 200] - }, - { - "parameters": { - "method": "POST", - "url": "http://embeddings:80/embed", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - 
"jsonBody": "={{ JSON.stringify({ inputs: $json.text }) }}", - "options": { - "timeout": 30000 - } - }, - "id": "generate-embedding", - "name": "Generate Embedding", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [900, 200], - "notes": "Uses TEI with BGE-base-en-v1.5 (768 dimensions)" - }, - { - "parameters": { - "method": "PUT", - "url": "http://qdrant:6333/collections/documents/points", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ points: [{ id: Math.floor(Date.now() * 1000 + Math.random() * 1000), vector: $json[0], payload: { text: $('Chunk Text').item.json.text, doc_id: $('Chunk Text').item.json.doc_id, title: $('Chunk Text').item.json.title, chunk_index: $('Chunk Text').item.json.chunk_index, timestamp: $('Chunk Text').item.json.timestamp } }] }) }}", - "options": { - "timeout": 10000 - } - }, - "id": "store-qdrant", - "name": "Store in Qdrant", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [1120, 200] - }, - { - "parameters": { - "jsCode": "// Aggregate storage results\nconst items = $input.all();\nconst firstChunk = $('Chunk Text').first().json;\nreturn [{\n json: {\n success: true,\n doc_id: firstChunk.doc_id,\n title: firstChunk.title,\n chunks_indexed: items.length,\n message: `Successfully indexed ${items.length} chunks from document '${firstChunk.title}'`\n }\n}];" - }, - "id": "aggregate-results", - "name": "Aggregate Results", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [1340, 200] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ $json }}", - "options": {} - }, - "id": "respond-upload", - "name": "Return Upload Result", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1560, 200] - }, - { - "parameters": { - "httpMethod": "POST", - "path": 
"doc/ask", - "responseMode": "responseNode", - "options": {} - }, - "id": "webhook-ask", - "name": "Ask Endpoint", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 500], - "notes": "POST { question: '...', doc_id: 'optional filter' }" - }, - { - "parameters": { - "method": "POST", - "url": "http://embeddings:80/embed", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ inputs: $json.body.question }) }}", - "options": { - "timeout": 30000 - } - }, - "id": "embed-question", - "name": "Embed Question", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 500] - }, - { - "parameters": { - "method": "POST", - "url": "http://qdrant:6333/collections/documents/points/search", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ vector: $json[0], limit: 5, with_payload: true, filter: $('Ask Endpoint').item.json.body.doc_id ? 
{ must: [{ key: 'doc_id', match: { value: $('Ask Endpoint').item.json.body.doc_id } }] } : undefined }) }}", - "options": { - "timeout": 10000 - } - }, - "id": "search-qdrant", - "name": "Search Qdrant", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [680, 500] - }, - { - "parameters": { - "jsCode": "// Build context from retrieved chunks with source attribution\nconst results = $input.item.json.result || [];\nconst question = $('Ask Endpoint').item.json.body.question;\n\nif (results.length === 0) {\n return [{\n json: {\n context: '',\n question: question,\n sources: [],\n has_context: false\n }\n }];\n}\n\nconst sources = [];\nconst contextParts = results.map((r, idx) => {\n const payload = r.payload || {};\n sources.push({\n doc_id: payload.doc_id,\n title: payload.title || 'Unknown',\n chunk_index: payload.chunk_index,\n score: r.score\n });\n return `[Source ${idx + 1}: ${payload.title || payload.doc_id}]\\n${payload.text}`;\n});\n\nreturn [{\n json: {\n context: contextParts.join('\\n\\n---\\n\\n'),\n question: question,\n sources: sources,\n has_context: true\n }\n}];" - }, - "id": "build-context", - "name": "Build Context", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [900, 500] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: 'You are a knowledgeable assistant that answers questions based on the provided document context. Always cite which source(s) you used. If the context does not contain relevant information, clearly state that.' }, { role: 'user', content: $json.has_context ? 
'Based on the following document excerpts:\\n\\n' + $json.context + '\\n\\n---\\n\\nQuestion: ' + $json.question : 'No relevant documents found for this question: ' + $json.question }], temperature: 0.3, max_tokens: 1024, stream: false }) }}", - "options": { - "timeout": 120000 - } - }, - "id": "generate-answer", - "name": "Generate Answer", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [1120, 500] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ { question: $('Build Context').item.json.question, answer: $json.choices[0].message.content, sources: $('Build Context').item.json.sources, context_found: $('Build Context').item.json.has_context } }}", - "options": {} - }, - "id": "respond-answer", - "name": "Return Answer", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1340, 500] - }, - { - "parameters": { - "httpMethod": "GET", - "path": "doc/list", - "responseMode": "responseNode", - "options": {} - }, - "id": "webhook-list", - "name": "List Docs Endpoint", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 750], - "notes": "GET /webhook/doc/list - List all indexed documents" - }, - { - "parameters": { - "method": "POST", - "url": "http://qdrant:6333/collections/documents/points/scroll", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ limit: 1000, with_payload: { include: ['doc_id', 'title', 'timestamp'] } }) }}", - "options": {} - }, - "id": "scroll-qdrant", - "name": "Get All Points", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 750] - }, - { - "parameters": { - "jsCode": "// Deduplicate and list unique documents\nconst points = $input.item.json.result?.points || [];\nconst docs = new Map();\n\npoints.forEach(p => {\n const docId = 
p.payload?.doc_id;\n if (docId && !docs.has(docId)) {\n docs.set(docId, {\n doc_id: docId,\n title: p.payload?.title || 'Untitled',\n indexed_at: p.payload?.timestamp\n });\n }\n});\n\nreturn [{\n json: {\n documents: Array.from(docs.values()),\n total_documents: docs.size,\n total_chunks: points.length\n }\n}];" - }, - "id": "list-docs", - "name": "List Documents", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [680, 750] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ $json }}", - "options": {} - }, - "id": "respond-list", - "name": "Return List", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [900, 750] - } - ], - "connections": { - "Upload Endpoint": { - "main": [ - [ - { - "node": "Ensure Collection", - "type": "main", - "index": 0 - } - ] - ] - }, - "Ensure Collection": { - "main": [ - [ - { - "node": "Chunk Text", - "type": "main", - "index": 0 - } - ] - ] - }, - "Chunk Text": { - "main": [ - [ - { - "node": "Generate Embedding", - "type": "main", - "index": 0 - } - ] - ] - }, - "Generate Embedding": { - "main": [ - [ - { - "node": "Store in Qdrant", - "type": "main", - "index": 0 - } - ] - ] - }, - "Store in Qdrant": { - "main": [ - [ - { - "node": "Aggregate Results", - "type": "main", - "index": 0 - } - ] - ] - }, - "Aggregate Results": { - "main": [ - [ - { - "node": "Return Upload Result", - "type": "main", - "index": 0 - } - ] - ] - }, - "Ask Endpoint": { - "main": [ - [ - { - "node": "Embed Question", - "type": "main", - "index": 0 - } - ] - ] - }, - "Embed Question": { - "main": [ - [ - { - "node": "Search Qdrant", - "type": "main", - "index": 0 - } - ] - ] - }, - "Search Qdrant": { - "main": [ - [ - { - "node": "Build Context", - "type": "main", - "index": 0 - } - ] - ] - }, - "Build Context": { - "main": [ - [ - { - "node": "Generate Answer", - "type": "main", - "index": 0 - } - ] - ] - }, - "Generate Answer": { - "main": [ - [ - { - "node": "Return Answer", - "type": 
"main", - "index": 0 - } - ] - ] - }, - "List Docs Endpoint": { - "main": [ - [ - { - "node": "Get All Points", - "type": "main", - "index": 0 - } - ] - ] - }, - "Get All Points": { - "main": [ - [ - { - "node": "List Documents", - "type": "main", - "index": 0 - } - ] - ] - }, - "List Documents": { - "main": [ - [ - { - "node": "Return List", - "type": "main", - "index": 0 - } - ] - ] - } - }, - "active": false, - "settings": { - "executionOrder": "v1" - }, - "versionId": "1", - "meta": { - "templateCredsSetupCompleted": true, - "instanceId": "dream-server" - }, - "tags": [ - { - "name": "dream-server", - "id": "1" - }, - { - "name": "rag", - "id": "5" - }, - { - "name": "documents", - "id": "10" - } - ] -} diff --git a/dream-server/workflows/voice-memo.json b/dream-server/workflows/voice-memo.json deleted file mode 100644 index 934c3d32e..000000000 --- a/dream-server/workflows/voice-memo.json +++ /dev/null @@ -1,436 +0,0 @@ -{ - "name": "Voice Memo - Transcribe & Summarize", - "nodes": [ - { - "parameters": { - "httpMethod": "POST", - "path": "voice-memo", - "responseMode": "responseNode", - "options": { - "rawBody": true - } - }, - "id": "webhook-memo", - "name": "Receive Voice Memo", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 300], - "notes": "POST multipart/form-data with audio file" - }, - { - "parameters": { - "method": "POST", - "url": "http://whisper:9000/asr", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "multipart/form-data" - } - ] - }, - "sendBody": true, - "contentType": "multipart-form-data", - "bodyParameters": { - "parameters": [ - { - "name": "audio_file", - "parameterType": "formBinaryData", - "inputDataFieldName": "data" - }, - { - "name": "output", - "value": "json" - }, - { - "name": "word_timestamps", - "value": "true" - } - ] - }, - "options": { - "timeout": 120000 - } - }, - "id": "whisper-transcribe", - "name": "Whisper Transcribe", - "type": 
"n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [460, 300], - "notes": "Uses faster-whisper for transcription" - }, - { - "parameters": { - "jsCode": "// Extract and format transcription\nconst result = $input.item.json;\nconst transcript = result.text || '';\nconst segments = result.segments || [];\n\nif (!transcript || transcript.trim().length < 5) {\n throw new Error('Transcription failed or audio was too short');\n}\n\n// Calculate duration from segments\nlet duration = 0;\nif (segments.length > 0) {\n duration = segments[segments.length - 1].end || 0;\n}\n\nconst timestamp = new Date().toISOString();\nconst memoId = 'memo_' + Date.now();\n\nreturn [{\n json: {\n memo_id: memoId,\n transcript: transcript.trim(),\n word_count: transcript.split(/\\s+/).length,\n duration_seconds: Math.round(duration),\n segments: segments,\n timestamp: timestamp\n }\n}];" - }, - "id": "format-transcript", - "name": "Format Transcript", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [680, 300] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: 'You are an assistant that summarizes voice memos. Create a clear, concise summary that captures:\\n1. Main topic/purpose of the memo\\n2. Key points or action items\\n3. Any important details or decisions mentioned\\n\\nKeep the summary under 150 words. Use bullet points for action items.' 
}, { role: 'user', content: 'Please summarize this voice memo transcript:\\n\\n' + $json.transcript }], temperature: 0.5, max_tokens: 512, stream: false }) }}", - "options": { - "timeout": 120000 - } - }, - "id": "summarize-llm", - "name": "Generate Summary", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [900, 300] - }, - { - "parameters": { - "jsCode": "// Combine transcript and summary into final output\nconst transcript = $('Format Transcript').item.json;\nconst summary = $json.choices[0].message.content;\n\nconst dateStr = new Date(transcript.timestamp).toLocaleDateString('en-US', {\n year: 'numeric',\n month: 'long',\n day: 'numeric',\n hour: '2-digit',\n minute: '2-digit'\n});\n\n// Create markdown file content\nconst fileContent = `# Voice Memo\\n\\n**ID:** ${transcript.memo_id}\\n**Date:** ${dateStr}\\n**Duration:** ${transcript.duration_seconds} seconds\\n**Words:** ${transcript.word_count}\\n\\n---\\n\\n## Summary\\n\\n${summary}\\n\\n---\\n\\n## Full Transcript\\n\\n${transcript.transcript}\\n`;\n\nreturn [{\n json: {\n memo_id: transcript.memo_id,\n timestamp: transcript.timestamp,\n duration_seconds: transcript.duration_seconds,\n word_count: transcript.word_count,\n summary: summary,\n transcript: transcript.transcript,\n file_content: fileContent\n }\n}];" - }, - "id": "build-output", - "name": "Build Output", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [1120, 300] - }, - { - "parameters": { - "operation": "write", - "fileName": "=/memos/{{ $json.memo_id }}.md", - "options": {}, - "dataPropertyName": "file_content" - }, - "id": "save-memo", - "name": "Save Memo", - "type": "n8n-nodes-base.readWriteFile", - "typeVersion": 1, - "position": [1340, 200], - "notes": "Saves to /memos/ folder - configure mount in docker-compose" - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ { success: true, memo_id: $json.memo_id, timestamp: $json.timestamp, duration_seconds: 
$json.duration_seconds, word_count: $json.word_count, summary: $json.summary, transcript: $json.transcript, saved_to: '/memos/' + $json.memo_id + '.md' } }}", - "options": {} - }, - "id": "respond-memo", - "name": "Return Result", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1340, 400] - }, - { - "parameters": { - "httpMethod": "GET", - "path": "voice-memos", - "responseMode": "responseNode", - "options": {} - }, - "id": "webhook-list", - "name": "List Memos Endpoint", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 600], - "notes": "GET /webhook/voice-memos - List all saved memos" - }, - { - "parameters": { - "operation": "list", - "folderPath": "/memos", - "options": {} - }, - "id": "list-files", - "name": "List Memo Files", - "type": "n8n-nodes-base.readWriteFile", - "typeVersion": 1, - "position": [460, 600] - }, - { - "parameters": { - "jsCode": "// Parse memo files into list\nconst files = $input.all();\nconst memos = files\n .filter(f => f.json.fileName?.endsWith('.md'))\n .map(f => ({\n filename: f.json.fileName,\n memo_id: f.json.fileName?.replace('.md', ''),\n size: f.json.size,\n modified: f.json.mtime\n }));\n\nreturn [{\n json: {\n memos: memos,\n total: memos.length\n }\n}];" - }, - "id": "format-list", - "name": "Format List", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [680, 600] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ $json }}", - "options": {} - }, - "id": "respond-list", - "name": "Return List", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [900, 600] - }, - { - "parameters": { - "httpMethod": "POST", - "path": "voice-memo-text", - "responseMode": "responseNode", - "options": {} - }, - "id": "webhook-text", - "name": "Text Memo Input", - "type": "n8n-nodes-base.webhook", - "typeVersion": 2, - "position": [240, 850], - "notes": "POST { transcript: '...' 
} - For pre-transcribed text" - }, - { - "parameters": { - "jsCode": "// Handle direct text input (skip Whisper)\nconst transcript = $input.item.json.body.transcript || '';\n\nif (!transcript || transcript.trim().length < 10) {\n throw new Error('Transcript text is required and must be at least 10 characters');\n}\n\nconst timestamp = new Date().toISOString();\nconst memoId = 'memo_' + Date.now();\n\nreturn [{\n json: {\n memo_id: memoId,\n transcript: transcript.trim(),\n word_count: transcript.split(/\\s+/).length,\n duration_seconds: 0,\n segments: [],\n timestamp: timestamp\n }\n}];" - }, - "id": "format-text-input", - "name": "Format Text Input", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [460, 850] - }, - { - "parameters": { - "method": "POST", - "url": "http://vllm:8000/v1/chat/completions", - "sendHeaders": true, - "headerParameters": { - "parameters": [ - { - "name": "Content-Type", - "value": "application/json" - } - ] - }, - "sendBody": true, - "specifyBody": "json", - "jsonBody": "={{ JSON.stringify({ model: 'Qwen/Qwen2.5-32B-Instruct-AWQ', messages: [{ role: 'system', content: 'You are an assistant that summarizes voice memos. Create a clear, concise summary that captures:\\n1. Main topic/purpose of the memo\\n2. Key points or action items\\n3. Any important details or decisions mentioned\\n\\nKeep the summary under 150 words. Use bullet points for action items.' 
}, { role: 'user', content: 'Please summarize this voice memo transcript:\\n\\n' + $json.transcript }], temperature: 0.5, max_tokens: 512, stream: false }) }}", - "options": { - "timeout": 120000 - } - }, - "id": "summarize-text", - "name": "Summarize Text", - "type": "n8n-nodes-base.httpRequest", - "typeVersion": 4.2, - "position": [680, 850] - }, - { - "parameters": { - "jsCode": "// Combine transcript and summary into final output\nconst transcript = $('Format Text Input').item.json;\nconst summary = $json.choices[0].message.content;\n\nconst dateStr = new Date(transcript.timestamp).toLocaleDateString('en-US', {\n year: 'numeric',\n month: 'long',\n day: 'numeric',\n hour: '2-digit',\n minute: '2-digit'\n});\n\n// Create markdown file content\nconst fileContent = `# Voice Memo\\n\\n**ID:** ${transcript.memo_id}\\n**Date:** ${dateStr}\\n**Words:** ${transcript.word_count}\\n\\n---\\n\\n## Summary\\n\\n${summary}\\n\\n---\\n\\n## Full Transcript\\n\\n${transcript.transcript}\\n`;\n\nreturn [{\n json: {\n memo_id: transcript.memo_id,\n timestamp: transcript.timestamp,\n duration_seconds: 0,\n word_count: transcript.word_count,\n summary: summary,\n transcript: transcript.transcript,\n file_content: fileContent\n }\n}];" - }, - "id": "build-text-output", - "name": "Build Text Output", - "type": "n8n-nodes-base.code", - "typeVersion": 2, - "position": [900, 850] - }, - { - "parameters": { - "operation": "write", - "fileName": "=/memos/{{ $json.memo_id }}.md", - "options": {}, - "dataPropertyName": "file_content" - }, - "id": "save-text-memo", - "name": "Save Text Memo", - "type": "n8n-nodes-base.readWriteFile", - "typeVersion": 1, - "position": [1120, 800] - }, - { - "parameters": { - "respondWith": "json", - "responseBody": "={{ { success: true, memo_id: $json.memo_id, timestamp: $json.timestamp, word_count: $json.word_count, summary: $json.summary, transcript: $json.transcript, saved_to: '/memos/' + $json.memo_id + '.md' } }}", - "options": {} - }, - "id": 
"respond-text", - "name": "Return Text Result", - "type": "n8n-nodes-base.respondToWebhook", - "typeVersion": 1.1, - "position": [1120, 950] - } - ], - "connections": { - "Receive Voice Memo": { - "main": [ - [ - { - "node": "Whisper Transcribe", - "type": "main", - "index": 0 - } - ] - ] - }, - "Whisper Transcribe": { - "main": [ - [ - { - "node": "Format Transcript", - "type": "main", - "index": 0 - } - ] - ] - }, - "Format Transcript": { - "main": [ - [ - { - "node": "Generate Summary", - "type": "main", - "index": 0 - } - ] - ] - }, - "Generate Summary": { - "main": [ - [ - { - "node": "Build Output", - "type": "main", - "index": 0 - } - ] - ] - }, - "Build Output": { - "main": [ - [ - { - "node": "Save Memo", - "type": "main", - "index": 0 - }, - { - "node": "Return Result", - "type": "main", - "index": 0 - } - ] - ] - }, - "List Memos Endpoint": { - "main": [ - [ - { - "node": "List Memo Files", - "type": "main", - "index": 0 - } - ] - ] - }, - "List Memo Files": { - "main": [ - [ - { - "node": "Format List", - "type": "main", - "index": 0 - } - ] - ] - }, - "Format List": { - "main": [ - [ - { - "node": "Return List", - "type": "main", - "index": 0 - } - ] - ] - }, - "Text Memo Input": { - "main": [ - [ - { - "node": "Format Text Input", - "type": "main", - "index": 0 - } - ] - ] - }, - "Format Text Input": { - "main": [ - [ - { - "node": "Summarize Text", - "type": "main", - "index": 0 - } - ] - ] - }, - "Summarize Text": { - "main": [ - [ - { - "node": "Build Text Output", - "type": "main", - "index": 0 - } - ] - ] - }, - "Build Text Output": { - "main": [ - [ - { - "node": "Save Text Memo", - "type": "main", - "index": 0 - }, - { - "node": "Return Text Result", - "type": "main", - "index": 0 - } - ] - ] - } - }, - "active": false, - "settings": { - "executionOrder": "v1" - }, - "versionId": "1", - "meta": { - "templateCredsSetupCompleted": true, - "instanceId": "dream-server" - }, - "tags": [ - { - "name": "dream-server", - "id": "1" - }, - { - "name": 
"voice", - "id": "3" - }, - { - "name": "memos", - "id": "11" - } - ] -} diff --git a/install.ps1 b/install.ps1 index 0a222f243..31cd98eb7 100644 --- a/install.ps1 +++ b/install.ps1 @@ -1,450 +1,32 @@ -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• -# Lighthouse AI - Windows Installer -# https://github.com/Light-Heart-Labs/Lighthouse-AI -# -# Usage: -# .\install.ps1 # Interactive install -# .\install.ps1 -Config my.yaml # Use custom config -# .\install.ps1 -CleanupOnly # Only install session cleanup -# .\install.ps1 -ProxyOnly # Only install tool proxy -# .\install.ps1 -TokenSpyOnly # Only install Token Spy API monitor -# .\install.ps1 -Uninstall # Remove everything -# โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ• +# Dream Server Root Installer (Windows) +# Delegates to dream-server/install.ps1 param( - [string]$Config = "", - [switch]$CleanupOnly, - [switch]$ProxyOnly, - [switch]$TokenSpyOnly, - [switch]$Uninstall, - [switch]$Help + [Parameter(ValueFromRemainingArguments=$true)] + [string[]]$RemainingArgs ) $ErrorActionPreference = "Stop" $ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path -if (-not $Config) { $Config = Join-Path $ScriptDir "config.yaml" } - -# โ”€โ”€ Colors โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -function Info($msg) { Write-Host "[INFO] $msg" -ForegroundColor Blue } -function Ok($msg) { Write-Host "[ OK] $msg" -ForegroundColor Green } -function Warn($msg) { Write-Host "[WARN] $msg" -ForegroundColor Yellow } -function Err($msg) { Write-Host "[FAIL] $msg" -ForegroundColor Red } - -# โ”€โ”€ Banner 
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -Write-Host "" -Write-Host "===========================================================" -ForegroundColor Cyan -Write-Host " Lighthouse AI - Windows Installer" -ForegroundColor Cyan -Write-Host "===========================================================" -ForegroundColor Cyan +Write-Host "Dream Server Installer" -ForegroundColor Cyan Write-Host "" -if ($Help) { - Write-Host "Usage: .\install.ps1 [options]" - Write-Host "" - Write-Host "Options:" - Write-Host " -Config FILE Use custom config file (default: config.yaml)" - Write-Host " -CleanupOnly Only install session cleanup" - Write-Host " -ProxyOnly Only install vLLM tool proxy" - Write-Host " -TokenSpyOnly Only install Token Spy API monitor" - Write-Host " -Uninstall Remove all installed components" - Write-Host " -Help Show this help" - exit 0 -} - -# โ”€โ”€ Parse YAML (section-aware parser) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -# Usage: Parse-Yaml "section.key" "default" โ€” reads key within a section -# Parse-Yaml "key" "default" โ€” reads top-level key (legacy) -function Parse-Yaml { - param([string]$Input, [string]$Default) - if (-not (Test-Path $Config)) { return $Default } - - $section = "" - $key = $Input - if ($Input -match "^(.+)\.(.+)$") { - $section = $Matches[1] - $key = $Matches[2] - } - - if ($section) { - $lines = Get-Content $Config - $inSection = $false - foreach ($line in $lines) { - if ($line -match "^${section}:") { - $inSection = $true - continue - } - if ($inSection -and $line -match "^[a-zA-Z_]") { - break - } - if ($inSection -and $line -match "^\s+${key}:") { - $value = ($line -split ":\s*", 2)[1].Trim().Trim('"').Trim("'") - $value = ($value -split "\s*#")[0].Trim() - if ($value -and $value -ne '""' -and $value -ne "''") { return $value } - return $Default - } - } - return $Default - } 
else { - $match = Select-String -Path $Config -Pattern "^\s*${key}:" | Select-Object -First 1 - if ($match) { - $value = ($match.Line -split ":\s*", 2)[1].Trim().Trim('"').Trim("'") - $value = ($value -split "\s*#")[0].Trim() - if ($value -and $value -ne '""' -and $value -ne "''") { return $value } - } - return $Default - } -} - -# โ”€โ”€ Load config โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if (-not (Test-Path $Config)) { - Err "Config file not found: $Config" - Info "Copy config.yaml and edit it for your setup" +# Check if dream-server directory exists +$DreamServerDir = Join-Path $ScriptDir "dream-server" +if (-not (Test-Path $DreamServerDir)) { + Write-Host "Error: dream-server directory not found" -ForegroundColor Red + Write-Host "Expected: $DreamServerDir" -ForegroundColor Red exit 1 } -Info "Loading config from $Config" - -# Session cleanup settings -$OpenClawDir = Parse-Yaml "session_cleanup.openclaw_dir" "$env:USERPROFILE\.openclaw" -$OpenClawDir = $OpenClawDir -replace "^~", $env:USERPROFILE -$SessionsPath = Parse-Yaml "session_cleanup.sessions_path" "agents\main\sessions" -$MaxSessionSize = Parse-Yaml "session_cleanup.max_session_size" "256000" -$IntervalMinutes = Parse-Yaml "session_cleanup.interval_minutes" "60" - -# Proxy settings -$ProxyPort = Parse-Yaml "tool_proxy.port" "8003" -$VllmUrl = Parse-Yaml "tool_proxy.vllm_url" "http://localhost:8000" - -$SessionsDir = Join-Path $OpenClawDir $SessionsPath - -# Token Spy settings -$TsEnabled = Parse-Yaml "token_spy.enabled" "false" -$TsAgentName = Parse-Yaml "token_spy.agent_name" "my-agent" -$TsPort = Parse-Yaml "token_spy.port" "9110" -$TsHost = Parse-Yaml "token_spy.host" "0.0.0.0" -$TsAnthropicUpstream = Parse-Yaml "token_spy.anthropic_upstream" "https://api.anthropic.com" -$TsOpenaiUpstream = Parse-Yaml "token_spy.openai_upstream" "" -$TsApiProvider = Parse-Yaml "token_spy.api_provider" "anthropic" 
-$TsDbBackend = Parse-Yaml "token_spy.db_backend" "sqlite"
-$TsSessionCharLimit = Parse-Yaml "token_spy.session_char_limit" "200000"
-
-Write-Host ""
-Info "Configuration:"
-Info " OpenClaw dir: $OpenClawDir"
-Info " Max session size: $MaxSessionSize bytes"
-Info " Cleanup interval: ${IntervalMinutes}min"
-if ($TsEnabled -eq "true") {
-    Info " Token Spy: enabled on :$TsPort ($TsAgentName)"
-}
-Write-Host ""
-
-# ── Task Name ─────────────────────────────────────────────────
-$CleanupTaskName = "OpenClawSessionCleanup"
-$ProxyTaskName = "OpenClawToolProxy"
-$TokenSpyTaskName = "OpenClawTokenSpy"
-
-# ── Uninstall ─────────────────────────────────────────────────
-if ($Uninstall) {
-    Info "Uninstalling Lighthouse AI..."
-
-    # Remove scheduled task
-    if (Get-ScheduledTask -TaskName $CleanupTaskName -ErrorAction SilentlyContinue) {
-        Unregister-ScheduledTask -TaskName $CleanupTaskName -Confirm:$false
-        Ok "Removed cleanup scheduled task"
-    }
-    if (Get-ScheduledTask -TaskName $ProxyTaskName -ErrorAction SilentlyContinue) {
-        Unregister-ScheduledTask -TaskName $ProxyTaskName -Confirm:$false
-        Ok "Removed proxy scheduled task"
-    }
-
-    if (Get-ScheduledTask -TaskName $TokenSpyTaskName -ErrorAction SilentlyContinue) {
-        Unregister-ScheduledTask -TaskName $TokenSpyTaskName -Confirm:$false
-        Ok "Removed Token Spy scheduled task"
-    }
-
-    # Stop proxy and Token Spy if running
-    Get-Process python* | Where-Object { $_.CommandLine -like "*vllm-tool-proxy*" } | Stop-Process -Force -ErrorAction SilentlyContinue
-    Get-Process python* | Where-Object { $_.CommandLine -like "*uvicorn*main:app*" } | Stop-Process -Force -ErrorAction SilentlyContinue
-
-    # Remove scripts
-    $CleanupScript = Join-Path $OpenClawDir "session-cleanup.ps1"
-    $ProxyScript = Join-Path $OpenClawDir "vllm-tool-proxy.py"
-    $TsDir = Join-Path $OpenClawDir "token-spy"
-    if (Test-Path $CleanupScript) { Remove-Item $CleanupScript; Ok "Removed $CleanupScript" }
-    if (Test-Path $ProxyScript) { Remove-Item $ProxyScript; Ok "Removed $ProxyScript" }
-    if (Test-Path $TsDir) { Remove-Item $TsDir -Recurse -Force; Ok "Removed $TsDir" }
-
-    Ok "Uninstall complete"
-    exit 0
-}
-
-# ── Preflight ─────────────────────────────────────────────────
-Info "Running preflight checks..."
-
-if (-not (Test-Path $OpenClawDir)) {
-    Err "OpenClaw directory not found: $OpenClawDir"
+# Delegate to dream-server installer
+$DreamServerInstaller = Join-Path $DreamServerDir "install.ps1"
+if (-not (Test-Path $DreamServerInstaller)) {
+    Write-Host "Error: dream-server installer not found" -ForegroundColor Red
+    Write-Host "Expected: $DreamServerInstaller" -ForegroundColor Red
     exit 1
 }
-Ok "OpenClaw directory found: $OpenClawDir"
-
-# Check Python
-try {
-    $pyVer = python --version 2>&1
-    Ok "Python found: $pyVer"
-} catch {
-    try {
-        $pyVer = python3 --version 2>&1
-        Ok "Python found: $pyVer"
-    } catch {
-        Err "Python not found. Install Python 3 first."
-        exit 1
-    }
-}
-
-# ── Install Session Cleanup (Windows Task Scheduler) ──────────
-if (-not $ProxyOnly -and -not $TokenSpyOnly) {
-    Info "Installing session cleanup..."
- - # Create PowerShell version of cleanup script - $CleanupScript = Join-Path $OpenClawDir "session-cleanup.ps1" - - $cleanupContent = @" -# Lighthouse AI - Session Cleanup (Windows) -# Auto-generated by install.ps1 - -`$SessionsDir = "$SessionsDir" -`$SessionsJson = Join-Path `$SessionsDir "sessions.json" -`$MaxSize = $MaxSessionSize -Write-Output "[`$(Get-Date)] Session cleanup starting" - -if (-not (Test-Path `$SessionsJson)) { - Write-Output "[`$(Get-Date)] No sessions.json found, skipping" - exit 0 -} - -# Parse active session IDs -`$jsonContent = Get-Content `$SessionsJson -Raw | ConvertFrom-Json -`$activeIds = @() -`$jsonContent.PSObject.Properties | ForEach-Object { - if (`$_.Value -is [PSCustomObject] -and `$_.Value.sessionId) { - `$activeIds += `$_.Value.sessionId - } -} - -Write-Output "[`$(Get-Date)] Active sessions: `$(`$activeIds.Count)" - -# Clean debris -Get-ChildItem `$SessionsDir -Filter "*.deleted.*" -ErrorAction SilentlyContinue | Remove-Item -Force -Get-ChildItem `$SessionsDir -Filter "*.bak*" -ErrorAction SilentlyContinue | Where-Object { `$_.Name -notlike "*.bak-cleanup" } | Remove-Item -Force - -`$removedInactive = 0 -`$removedBloated = 0 -`$wipeIds = @() - -Get-ChildItem `$SessionsDir -Filter "*.jsonl" -ErrorAction SilentlyContinue | ForEach-Object { - `$basename = `$_.BaseName - `$isActive = `$activeIds -contains `$basename - - if (-not `$isActive) { - Write-Output "[`$(Get-Date)] Removing inactive session: `$basename (`$([math]::Round(`$_.Length/1KB))KB)" - Remove-Item `$_.FullName -Force - `$removedInactive++ - } else { - if (`$_.Length -gt `$MaxSize) { - Write-Output "[`$(Get-Date)] Session `$basename is bloated (`$([math]::Round(`$_.Length/1KB))KB), deleting to force fresh session" - Remove-Item `$_.FullName -Force - `$wipeIds += `$basename - `$removedBloated++ - } - } -} - -# Remove wiped sessions from sessions.json -if (`$wipeIds.Count -gt 0) { - Write-Output "[`$(Get-Date)] Clearing session references for: `$(`$wipeIds -join ', ')" 
- `$jsonContent = Get-Content `$SessionsJson -Raw | ConvertFrom-Json - - foreach (`$id in `$wipeIds) { - `$keysToRemove = @() - `$jsonContent.PSObject.Properties | ForEach-Object { - if (`$_.Value -is [PSCustomObject] -and `$_.Value.sessionId -eq `$id) { - `$keysToRemove += `$_.Name - } - } - foreach (`$key in `$keysToRemove) { - `$jsonContent.PSObject.Properties.Remove(`$key) - Write-Output " Removed session key: `$key" - } - } - - `$jsonContent | ConvertTo-Json -Depth 10 | Set-Content `$SessionsJson -Encoding UTF8 -} - -Write-Output "[`$(Get-Date)] Cleanup complete: removed `$removedInactive inactive, `$removedBloated bloated" -"@ - - Set-Content -Path $CleanupScript -Value $cleanupContent -Encoding UTF8 - Ok "Cleanup script installed: $CleanupScript" - - # Create scheduled task - if (Get-ScheduledTask -TaskName $CleanupTaskName -ErrorAction SilentlyContinue) { - Unregister-ScheduledTask -TaskName $CleanupTaskName -Confirm:$false - } - - $action = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-NoProfile -ExecutionPolicy Bypass -File `"$CleanupScript`"" - $trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) -RepetitionInterval (New-TimeSpan -Minutes $IntervalMinutes) -RepetitionDuration ([TimeSpan]::MaxValue) - $settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -StartWhenAvailable - $principal = New-ScheduledTaskPrincipal -UserId $env:USERNAME -LogonType S4U -RunLevel Limited - - Register-ScheduledTask -TaskName $CleanupTaskName -Action $action -Trigger $trigger -Settings $settings -Principal $principal -Description "Lighthouse AI - Cleanup every ${IntervalMinutes}min" | Out-Null - Ok "Scheduled task created: $CleanupTaskName (every ${IntervalMinutes}min)" -} - -# โ”€โ”€ Install Tool Proxy โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if (-not $CleanupOnly -and -not $TokenSpyOnly) { - Info "Installing vLLM tool proxy..." 
- - $ProxyScript = Join-Path $OpenClawDir "vllm-tool-proxy.py" - Copy-Item (Join-Path $ScriptDir "scripts\vllm-tool-proxy.py") $ProxyScript -Force - Ok "Proxy script installed: $ProxyScript" - - # Check Python deps - $missingDeps = @() - try { python -c "import flask" 2>$null } catch { $missingDeps += "flask" } - try { python -c "import requests" 2>$null } catch { $missingDeps += "requests" } - if ($missingDeps.Count -gt 0) { - Info "Installing Python packages: $($missingDeps -join ', ')" - pip install @missingDeps --quiet 2>$null - } - - # Create scheduled task to run proxy at logon - if (Get-ScheduledTask -TaskName $ProxyTaskName -ErrorAction SilentlyContinue) { - Unregister-ScheduledTask -TaskName $ProxyTaskName -Confirm:$false - } - - $action = New-ScheduledTaskAction -Execute "python" -Argument "`"$ProxyScript`" --port $ProxyPort --vllm-url $VllmUrl" - $trigger = New-ScheduledTaskTrigger -AtLogOn - $settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -StartWhenAvailable -ExecutionTimeLimit (New-TimeSpan -Days 365) - - Register-ScheduledTask -TaskName $ProxyTaskName -Action $action -Trigger $trigger -Settings $settings -Description "Open Claw - vLLM Tool Call Proxy on :$ProxyPort" | Out-Null - Ok "Scheduled task created: $ProxyTaskName (starts at logon)" - - # Start it now - Start-ScheduledTask -TaskName $ProxyTaskName - Start-Sleep -Seconds 2 - - try { - $health = Invoke-RestMethod "http://localhost:$ProxyPort/health" -TimeoutSec 5 - Ok "Proxy is running: $($health.status)" - } catch { - Warn "Proxy may still be starting. Test with: curl http://localhost:${ProxyPort}/health" - } -} - -# โ”€โ”€ Install Token Spy โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if (($TokenSpyOnly -or (-not $CleanupOnly -and -not $ProxyOnly)) -and $TsEnabled -eq "true") { - Info "Installing Token Spy API monitor..." 
- - $TsInstallDir = Join-Path $OpenClawDir "token-spy" - $TsProvidersDir = Join-Path $TsInstallDir "providers" - New-Item -ItemType Directory -Path $TsProvidersDir -Force | Out-Null - - # Copy source files - Copy-Item (Join-Path $ScriptDir "token-spy\main.py") $TsInstallDir -Force - Copy-Item (Join-Path $ScriptDir "token-spy\db.py") $TsInstallDir -Force - Copy-Item (Join-Path $ScriptDir "token-spy\db_postgres.py") $TsInstallDir -Force - Copy-Item (Join-Path $ScriptDir "token-spy\requirements.txt") $TsInstallDir -Force - Copy-Item (Join-Path $ScriptDir "token-spy\providers\*.py") $TsProvidersDir -Force - - # Generate .env - $envContent = @" -# Token Spy - generated by install.ps1 -AGENT_NAME=$TsAgentName -PORT=$TsPort -ANTHROPIC_UPSTREAM=$TsAnthropicUpstream -OPENAI_UPSTREAM=$TsOpenaiUpstream -API_PROVIDER=$TsApiProvider -DB_BACKEND=$TsDbBackend -SESSION_CHAR_LIMIT=$TsSessionCharLimit -"@ - Set-Content -Path (Join-Path $TsInstallDir ".env") -Value $envContent -Encoding UTF8 - Ok "Token Spy installed: $TsInstallDir" - - # Install Python deps - $tsMissing = @() - try { python -c "import fastapi" 2>$null } catch { $tsMissing += "fastapi" } - try { python -c "import httpx" 2>$null } catch { $tsMissing += "httpx" } - try { python -c "import uvicorn" 2>$null } catch { $tsMissing += "uvicorn" } - if ($tsMissing.Count -gt 0) { - Info "Installing Token Spy packages..." 
- pip install -r (Join-Path $TsInstallDir "requirements.txt") --quiet 2>$null - } - - # Create scheduled task to run Token Spy at logon - if (Get-ScheduledTask -TaskName $TokenSpyTaskName -ErrorAction SilentlyContinue) { - Unregister-ScheduledTask -TaskName $TokenSpyTaskName -Confirm:$false - } - - $tsAction = New-ScheduledTaskAction -Execute "python" -Argument "-m uvicorn main:app --host $TsHost --port $TsPort" -WorkingDirectory $TsInstallDir - $tsTrigger = New-ScheduledTaskTrigger -AtLogOn - $tsSettings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -StartWhenAvailable -ExecutionTimeLimit (New-TimeSpan -Days 365) - - Register-ScheduledTask -TaskName $TokenSpyTaskName -Action $tsAction -Trigger $tsTrigger -Settings $tsSettings -Description "Lighthouse AI - Token Spy on :$TsPort" | Out-Null - Ok "Scheduled task created: $TokenSpyTaskName (starts at logon)" - - # Start it now - Start-ScheduledTask -TaskName $TokenSpyTaskName - Start-Sleep -Seconds 3 - - try { - $tsHealth = Invoke-RestMethod "http://localhost:$TsPort/health" -TimeoutSec 5 - Ok "Token Spy is running: $($tsHealth.status)" - } catch { - Warn "Token Spy may still be starting. Test with: curl http://localhost:${TsPort}/health" - } -} - -# โ”€โ”€ Done โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -Write-Host "" -Write-Host "===========================================================" -ForegroundColor Cyan -Write-Host " Installation complete!" 
-ForegroundColor Green
-Write-Host "===========================================================" -ForegroundColor Cyan
-Write-Host ""
-
-if (-not $CleanupOnly -and -not $TokenSpyOnly) {
-    Info "IMPORTANT: Update your openclaw.json model providers to use the proxy:"
-    Write-Host ""
-    Write-Host " Change your provider baseUrl from:"
-    Write-Host " `"baseUrl`": `"http://localhost:8000/v1`""
-    Write-Host ""
-    Write-Host " To:"
-    Write-Host " `"baseUrl`": `"http://localhost:${ProxyPort}/v1`""
-    Write-Host ""
-}
-
-if ($TsEnabled -eq "true" -and ($TokenSpyOnly -or (-not $CleanupOnly -and -not $ProxyOnly))) {
-    Info "IMPORTANT: Update your openclaw.json cloud providers to route through Token Spy:"
-    Write-Host ""
-    Write-Host " Anthropic: `"baseUrl`": `"http://localhost:${TsPort}`""
-    Write-Host " OpenAI: `"baseUrl`": `"http://localhost:${TsPort}/v1`""
-    Write-Host ""
-    Write-Host " Dashboard: http://localhost:${TsPort}/dashboard"
-    Write-Host ""
-}
-
-Info "Useful commands:"
-if (-not $ProxyOnly -and -not $TokenSpyOnly) {
-    Write-Host " Get-ScheduledTask -TaskName '$CleanupTaskName' # Check cleanup task"
-    Write-Host " Start-ScheduledTask -TaskName '$CleanupTaskName' # Run cleanup now"
-}
-if (-not $CleanupOnly -and -not $TokenSpyOnly) {
-    Write-Host " Get-ScheduledTask -TaskName '$ProxyTaskName' # Check proxy task"
-    Write-Host " curl http://localhost:${ProxyPort}/health # Test proxy"
-}
-if ($TsEnabled -eq "true" -and ($TokenSpyOnly -or (-not $CleanupOnly -and -not $ProxyOnly))) {
-    Write-Host " Get-ScheduledTask -TaskName '$TokenSpyTaskName' # Check Token Spy task"
-    Write-Host " curl http://localhost:${TsPort}/health # Test Token Spy"
-    Write-Host " Start http://localhost:${TsPort}/dashboard # Open dashboard"
-}
-Write-Host ""
+# Execute dream-server installer with all passed arguments
+& $DreamServerInstaller @RemainingArgs
diff --git a/install.sh b/install.sh
index ac4aef810..032d3df82 100755
--- a/install.sh
+++ b/install.sh
@@ -1,529 +1,25 @@
 #!/bin/bash
-# 
═══════════════════════════════════════════════════════════════
-# Lighthouse AI - Installer
-# https://github.com/Light-Heart-Labs/Lighthouse-AI
-#
-# Usage:
-# ./install.sh # Interactive install
-# ./install.sh --config my.yaml # Use custom config
-# ./install.sh --cleanup-only # Only install session cleanup
-# ./install.sh --proxy-only # Only install tool proxy
-# ./install.sh --token-spy-only # Only install Token Spy API monitor
-# ./install.sh --cold-storage-only # Only install LLM Cold Storage timer
-# ./install.sh --uninstall # Remove everything
-# ═══════════════════════════════════════════════════════════════
+# Dream Server Root Installer
+# Delegates to dream-server/install.sh
 set -euo pipefail
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-CONFIG_FILE="$SCRIPT_DIR/config.yaml"
-CLEANUP_ONLY=false
-PROXY_ONLY=false
-TOKEN_SPY_ONLY=false
-COLD_STORAGE_ONLY=false
-UNINSTALL=false
-# ── Colors ────────────────────────────────────────────────────
-RED='\033[0;31m'
-GREEN='\033[0;32m'
-YELLOW='\033[1;33m'
-BLUE='\033[0;34m'
+# Colors
 CYAN='\033[0;36m'
 NC='\033[0m'
-info() { echo -e "${BLUE}[INFO]${NC} $1"; }
-ok() { echo -e "${GREEN}[ OK]${NC} $1"; }
-warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
-err() { echo -e "${RED}[FAIL]${NC} $1"; }
-
-# ── Parse args ────────────────────────────────────────────────
-while [[ $# -gt 0 ]]; do
-    case $1 in
-        --config) CONFIG_FILE="$2"; shift 2 ;;
-        --cleanup-only) CLEANUP_ONLY=true; shift ;;
-        --proxy-only) PROXY_ONLY=true; shift ;;
-        --token-spy-only) 
TOKEN_SPY_ONLY=true; shift ;; - --cold-storage-only) COLD_STORAGE_ONLY=true; shift ;; - --uninstall) UNINSTALL=true; shift ;; - -h|--help) - echo "Usage: ./install.sh [options]" - echo "" - echo "Options:" - echo " --config FILE Use custom config file (default: config.yaml)" - echo " --cleanup-only Only install session cleanup" - echo " --proxy-only Only install vLLM tool proxy" - echo " --token-spy-only Only install Token Spy API monitor" - echo " --cold-storage-only Only install LLM Cold Storage timer" - echo " --uninstall Remove all installed components" - echo " -h, --help Show this help" - exit 0 - ;; - *) err "Unknown option: $1"; exit 1 ;; - esac -done - -# โ”€โ”€ Banner โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -echo "" -echo -e "${CYAN}โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•${NC}" -echo -e "${CYAN} Lighthouse AI - Installer${NC}" -echo -e "${CYAN}โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•${NC}" +echo -e "${CYAN}Dream Server Installer${NC}" echo "" -# โ”€โ”€ Parse config (section-aware YAML parser โ€” no dependencies needed) โ”€โ”€ -# Usage: parse_yaml "section.key" "default" โ€” reads key within a section -# parse_yaml "key" "default" โ€” reads top-level key (legacy) -parse_yaml() { - local input="$1" - local default="$2" - local section="" key="" value="" - - if [[ "$input" == *.* ]]; then - section="${input%%.*}" - key="${input#*.}" - else - key="$input" - fi - - if [ -n "$section" ]; then - # Extract lines between "section:" and the next top-level key (non-indented) - value=$(sed -n "/^${section}:/,/^[a-zA-Z_]/{/^${section}:/d;/^[a-zA-Z_]/d;p;}" "$CONFIG_FILE" \ - | 
grep -E "^\s+${key}:" | head -1 \ - | sed 's/.*:\s*//' | sed 's/\s*#.*//' | sed 's/^"//' | sed 's/"$//' | sed "s/^'//" | sed "s/'$//" | xargs) - else - value=$(grep -E "^\s*${key}:" "$CONFIG_FILE" 2>/dev/null | head -1 \ - | sed 's/.*:\s*//' | sed 's/\s*#.*//' | sed 's/^"//' | sed 's/"$//' | sed "s/^'//" | sed "s/'$//" | xargs) - fi - - if [ -z "$value" ] || [ "$value" = '""' ] || [ "$value" = "''" ]; then - echo "$default" - else - echo "$value" - fi -} - -# โ”€โ”€ Load config โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if [ ! -f "$CONFIG_FILE" ]; then - err "Config file not found: $CONFIG_FILE" - info "Copy config.yaml.example to config.yaml and edit it first" +# Check if dream-server directory exists +if [ ! -d "$SCRIPT_DIR/dream-server" ]; then + echo "Error: dream-server directory not found" + echo "Expected: $SCRIPT_DIR/dream-server" exit 1 fi -info "Loading config from $CONFIG_FILE" - -# Session cleanup settings -CLEANUP_ENABLED=$(parse_yaml "session_cleanup.enabled" "true") -OPENCLAW_DIR=$(parse_yaml "session_cleanup.openclaw_dir" "~/.openclaw") -OPENCLAW_DIR="${OPENCLAW_DIR/#\~/$HOME}" -SESSIONS_PATH=$(parse_yaml "session_cleanup.sessions_path" "agents/main/sessions") -MAX_SESSION_SIZE=$(parse_yaml "session_cleanup.max_session_size" "256000") -INTERVAL_MINUTES=$(parse_yaml "session_cleanup.interval_minutes" "60") -BOOT_DELAY=$(parse_yaml "session_cleanup.boot_delay_minutes" "5") - -# Proxy settings -PROXY_ENABLED=$(parse_yaml "tool_proxy.enabled" "true") -PROXY_PORT=$(parse_yaml "tool_proxy.port" "8003") -PROXY_HOST=$(parse_yaml "tool_proxy.host" "0.0.0.0") -VLLM_URL=$(parse_yaml "tool_proxy.vllm_url" "http://localhost:8000") -LOG_FILE=$(parse_yaml "tool_proxy.log_file" "~/vllm-proxy.log") -LOG_FILE="${LOG_FILE/#\~/$HOME}" - -# Token Spy settings -TS_ENABLED=$(parse_yaml "token_spy.enabled" "false") -TS_AGENT_NAME=$(parse_yaml "token_spy.agent_name" 
"my-agent") -TS_PORT=$(parse_yaml "token_spy.port" "9110") -TS_HOST=$(parse_yaml "token_spy.host" "0.0.0.0") -TS_ANTHROPIC_UPSTREAM=$(parse_yaml "token_spy.anthropic_upstream" "https://api.anthropic.com") -TS_OPENAI_UPSTREAM=$(parse_yaml "token_spy.openai_upstream" "") -TS_API_PROVIDER=$(parse_yaml "token_spy.api_provider" "anthropic") -TS_DB_BACKEND=$(parse_yaml "token_spy.db_backend" "sqlite") -TS_SESSION_CHAR_LIMIT=$(parse_yaml "token_spy.session_char_limit" "200000") -TS_AGENT_SESSION_DIRS=$(parse_yaml "token_spy.agent_session_dirs" "") -TS_LOCAL_MODEL_AGENTS=$(parse_yaml "token_spy.local_model_agents" "") - -# LLM Cold Storage settings -CS_ENABLED=$(parse_yaml "llm_cold_storage.enabled" "false") -CS_HF_CACHE=$(parse_yaml "llm_cold_storage.hf_cache_dir" "~/.cache/huggingface/hub") -CS_HF_CACHE="${CS_HF_CACHE/#\~/$HOME}" -CS_COLD_DIR=$(parse_yaml "llm_cold_storage.cold_dir" "~/llm-cold-storage") -CS_COLD_DIR="${CS_COLD_DIR/#\~/$HOME}" -CS_MAX_IDLE_DAYS=$(parse_yaml "llm_cold_storage.max_idle_days" "7") - -# System user -SYSTEM_USER=$(parse_yaml "system_user" "") -if [ -z "$SYSTEM_USER" ]; then - SYSTEM_USER="$(whoami)" -fi - -echo "" -info "Configuration:" -info " OpenClaw dir: $OPENCLAW_DIR" -info " System user: $SYSTEM_USER" -info " Max session size: $MAX_SESSION_SIZE bytes" -info " Cleanup interval: ${INTERVAL_MINUTES}min" -if [ "$PROXY_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ]; then - info " Session cleanup: $([ "$CLEANUP_ENABLED" = "true" ] && echo "enabled" || echo "disabled")" -fi -if [ "$CLEANUP_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ]; then - info " Tool proxy: $([ "$PROXY_ENABLED" = "true" ] && echo "enabled on :$PROXY_PORT -> $VLLM_URL" || echo "disabled")" -fi -if [ "$CLEANUP_ONLY" = false ] && [ "$PROXY_ONLY" = false ]; then - info " Token Spy: $([ "$TS_ENABLED" = "true" ] && echo "enabled on :$TS_PORT ($TS_AGENT_NAME)" || echo "disabled")" -fi -if [ "$COLD_STORAGE_ONLY" = true ] || ([ "$CLEANUP_ONLY" = false ] && [ "$PROXY_ONLY" = 
false ] && [ "$TOKEN_SPY_ONLY" = false ]); then - info " Cold Storage: $([ "$CS_ENABLED" = "true" ] && echo "enabled (idle >${CS_MAX_IDLE_DAYS}d โ†’ $CS_COLD_DIR)" || echo "disabled")" -fi -echo "" - -# โ”€โ”€ Uninstall โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if [ "$UNINSTALL" = true ]; then - info "Uninstalling Lighthouse AI..." - - if systemctl is-active --quiet openclaw-session-cleanup.timer 2>/dev/null; then - sudo systemctl stop openclaw-session-cleanup.timer - sudo systemctl disable openclaw-session-cleanup.timer - ok "Stopped session cleanup timer" - fi - sudo rm -f /etc/systemd/system/openclaw-session-cleanup.service - sudo rm -f /etc/systemd/system/openclaw-session-cleanup.timer - - if systemctl is-active --quiet vllm-tool-proxy 2>/dev/null; then - sudo systemctl stop vllm-tool-proxy - sudo systemctl disable vllm-tool-proxy - ok "Stopped tool proxy service" - fi - sudo rm -f /etc/systemd/system/vllm-tool-proxy.service - - # Token Spy (check for any token-spy@ instances) - for svc in $(systemctl list-units --type=service --all 2>/dev/null | grep -oP 'token-spy@[^.]+\.service' || true); do - sudo systemctl stop "$svc" 2>/dev/null || true - sudo systemctl disable "$svc" 2>/dev/null || true - ok "Stopped $svc" - done - sudo rm -f /etc/systemd/system/token-spy@.service - - # LLM Cold Storage - if systemctl --user is-active --quiet llm-cold-storage.timer 2>/dev/null; then - systemctl --user stop llm-cold-storage.timer - systemctl --user disable llm-cold-storage.timer - ok "Stopped cold storage timer" - fi - rm -f "$HOME/.config/systemd/user/llm-cold-storage.service" - rm -f "$HOME/.config/systemd/user/llm-cold-storage.timer" - systemctl --user daemon-reload 2>/dev/null || true - - sudo systemctl daemon-reload - rm -f "$OPENCLAW_DIR/session-cleanup.sh" - - ok "Uninstall complete" - exit 0 -fi - -# โ”€โ”€ Preflight checks 
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -info "Running preflight checks..." - -# Check for OpenClaw (not needed for cold-storage-only) -if [ "$COLD_STORAGE_ONLY" = false ]; then - if [ ! -d "$OPENCLAW_DIR" ]; then - err "OpenClaw directory not found: $OPENCLAW_DIR" - err "Is OpenClaw installed? Edit openclaw_dir in config.yaml" - exit 1 - fi - ok "OpenClaw directory found: $OPENCLAW_DIR" -fi - -# Check for python3 (not needed for cold-storage-only) -if [ "$COLD_STORAGE_ONLY" = false ]; then - if ! command -v python3 &>/dev/null; then - err "python3 not found. Install Python 3 first." - exit 1 - fi - ok "Python 3 found: $(python3 --version 2>&1)" -fi - -# Check for systemd -if ! command -v systemctl &>/dev/null; then - warn "systemd not found โ€” will install scripts but not services" - warn "You'll need to run them manually or set up your own scheduler" - HAS_SYSTEMD=false -else - ok "systemd found" - HAS_SYSTEMD=true -fi - -# Check for sudo -if [ "$HAS_SYSTEMD" = true ] && ! 
sudo -n true 2>/dev/null; then - warn "sudo access required for systemd services (you'll be prompted)" -fi - -# Check Python deps for proxy -if [ "$CLEANUP_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ] && [ "$PROXY_ENABLED" = "true" ]; then - MISSING_DEPS=() - python3 -c "import flask" 2>/dev/null || MISSING_DEPS+=("flask") - python3 -c "import requests" 2>/dev/null || MISSING_DEPS+=("requests") - - if [ ${#MISSING_DEPS[@]} -gt 0 ]; then - warn "Missing Python packages: ${MISSING_DEPS[*]}" - info "Installing: pip3 install ${MISSING_DEPS[*]}" - pip3 install "${MISSING_DEPS[@]}" --quiet 2>/dev/null || { - err "Failed to install Python dependencies" - err "Run manually: pip3 install flask requests" - exit 1 - } - ok "Python dependencies installed" - else - ok "Python dependencies satisfied (flask, requests)" - fi -fi - -# Check Python deps for Token Spy -if ([ "$TOKEN_SPY_ONLY" = true ] || ([ "$CLEANUP_ONLY" = false ] && [ "$PROXY_ONLY" = false ])) && [ "$TS_ENABLED" = "true" ]; then - TS_MISSING_DEPS=() - python3 -c "import fastapi" 2>/dev/null || TS_MISSING_DEPS+=("fastapi") - python3 -c "import httpx" 2>/dev/null || TS_MISSING_DEPS+=("httpx") - python3 -c "import uvicorn" 2>/dev/null || TS_MISSING_DEPS+=("uvicorn") - - if [ ${#TS_MISSING_DEPS[@]} -gt 0 ]; then - warn "Missing Token Spy packages: ${TS_MISSING_DEPS[*]}" - info "Installing from token-spy/requirements.txt" - pip3 install -r "$SCRIPT_DIR/token-spy/requirements.txt" --quiet 2>/dev/null || { - err "Failed to install Token Spy dependencies" - err "Run manually: pip3 install -r token-spy/requirements.txt" - exit 1 - } - ok "Token Spy dependencies installed" - else - ok "Token Spy dependencies satisfied (fastapi, httpx, uvicorn)" - fi -fi - -echo "" - -# โ”€โ”€ Install Session Cleanup โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if [ "$PROXY_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ] && [ "$CLEANUP_ENABLED" = "true" ]; then - info 
"Installing session cleanup..." - - SESSIONS_DIR="$OPENCLAW_DIR/$SESSIONS_PATH" - - # Copy script to openclaw dir - cp "$SCRIPT_DIR/scripts/session-cleanup.sh" "$OPENCLAW_DIR/session-cleanup.sh" - chmod +x "$OPENCLAW_DIR/session-cleanup.sh" - - # Patch in config values - sed -i "s|OPENCLAW_DIR=\"\${OPENCLAW_DIR:-\$HOME/.openclaw}\"|OPENCLAW_DIR=\"$OPENCLAW_DIR\"|" "$OPENCLAW_DIR/session-cleanup.sh" - sed -i "s|SESSIONS_DIR=\"\${SESSIONS_DIR:-\$OPENCLAW_DIR/agents/main/sessions}\"|SESSIONS_DIR=\"$SESSIONS_DIR\"|" "$OPENCLAW_DIR/session-cleanup.sh" - sed -i "s|MAX_SIZE=\"\${MAX_SIZE:-256000}\"|MAX_SIZE=\"$MAX_SESSION_SIZE\"|" "$OPENCLAW_DIR/session-cleanup.sh" - - ok "Session cleanup script installed: $OPENCLAW_DIR/session-cleanup.sh" - - # Install systemd units - if [ "$HAS_SYSTEMD" = true ]; then - # Service - sudo cp "$SCRIPT_DIR/systemd/openclaw-session-cleanup.service" /etc/systemd/system/ - sudo sed -i "s|__USER__|$SYSTEM_USER|g" /etc/systemd/system/openclaw-session-cleanup.service - sudo sed -i "s|__OPENCLAW_DIR__|$OPENCLAW_DIR|g" /etc/systemd/system/openclaw-session-cleanup.service - - # Timer - sudo cp "$SCRIPT_DIR/systemd/openclaw-session-cleanup.timer" /etc/systemd/system/ - sudo sed -i "s|__INTERVAL__|$INTERVAL_MINUTES|g" /etc/systemd/system/openclaw-session-cleanup.timer - sudo sed -i "s|__BOOT_DELAY__|$BOOT_DELAY|g" /etc/systemd/system/openclaw-session-cleanup.timer - - sudo systemctl daemon-reload - sudo systemctl enable openclaw-session-cleanup.timer - sudo systemctl start openclaw-session-cleanup.timer - - ok "Session cleanup timer enabled (every ${INTERVAL_MINUTES}min)" - fi -fi - -# โ”€โ”€ Install Tool Proxy โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if [ "$CLEANUP_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ] && [ "$PROXY_ENABLED" = "true" ]; then - info "Installing vLLM tool proxy..." 
- - # Determine install location - INSTALL_DIR="$OPENCLAW_DIR" - cp "$SCRIPT_DIR/scripts/vllm-tool-proxy.py" "$INSTALL_DIR/vllm-tool-proxy.py" - chmod +x "$INSTALL_DIR/vllm-tool-proxy.py" - - ok "Tool proxy installed: $INSTALL_DIR/vllm-tool-proxy.py" - - # Install systemd service - if [ "$HAS_SYSTEMD" = true ]; then - # Stop existing if running - if systemctl is-active --quiet vllm-tool-proxy 2>/dev/null; then - sudo systemctl stop vllm-tool-proxy - fi - - sudo cp "$SCRIPT_DIR/systemd/vllm-tool-proxy.service" /etc/systemd/system/ - sudo sed -i "s|__USER__|$SYSTEM_USER|g" /etc/systemd/system/vllm-tool-proxy.service - sudo sed -i "s|__INSTALL_DIR__|$INSTALL_DIR|g" /etc/systemd/system/vllm-tool-proxy.service - sudo sed -i "s|__PROXY_PORT__|$PROXY_PORT|g" /etc/systemd/system/vllm-tool-proxy.service - sudo sed -i "s|__VLLM_URL__|$VLLM_URL|g" /etc/systemd/system/vllm-tool-proxy.service - - sudo systemctl daemon-reload - sudo systemctl enable vllm-tool-proxy - sudo systemctl start vllm-tool-proxy - - sleep 2 - if systemctl is-active --quiet vllm-tool-proxy; then - ok "Tool proxy service running on :$PROXY_PORT -> $VLLM_URL" - else - err "Tool proxy failed to start. Check: journalctl -u vllm-tool-proxy" - fi - else - info "No systemd. Start manually:" - info " python3 $INSTALL_DIR/vllm-tool-proxy.py --port $PROXY_PORT --vllm-url $VLLM_URL" - fi -fi - -# โ”€โ”€ Install Token Spy โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ -if ([ "$TOKEN_SPY_ONLY" = true ] || ([ "$CLEANUP_ONLY" = false ] && [ "$PROXY_ONLY" = false ])) && [ "$TS_ENABLED" = "true" ]; then - info "Installing Token Spy API monitor..." 
- - TS_INSTALL_DIR="$OPENCLAW_DIR/token-spy" - mkdir -p "$TS_INSTALL_DIR/providers" - - # Copy Token Spy source - cp "$SCRIPT_DIR/token-spy/main.py" "$TS_INSTALL_DIR/" - cp "$SCRIPT_DIR/token-spy/db.py" "$TS_INSTALL_DIR/" - cp "$SCRIPT_DIR/token-spy/db_postgres.py" "$TS_INSTALL_DIR/" - cp "$SCRIPT_DIR/token-spy/requirements.txt" "$TS_INSTALL_DIR/" - cp "$SCRIPT_DIR/token-spy/providers/"*.py "$TS_INSTALL_DIR/providers/" - - # Generate .env from config values - cat > "$TS_INSTALL_DIR/.env" << TSENV -# Token Spy โ€” generated by install.sh -AGENT_NAME=$TS_AGENT_NAME -PORT=$TS_PORT -ANTHROPIC_UPSTREAM=$TS_ANTHROPIC_UPSTREAM -OPENAI_UPSTREAM=$TS_OPENAI_UPSTREAM -API_PROVIDER=$TS_API_PROVIDER -DB_BACKEND=$TS_DB_BACKEND -SESSION_CHAR_LIMIT=$TS_SESSION_CHAR_LIMIT -AGENT_SESSION_DIRS=$TS_AGENT_SESSION_DIRS -LOCAL_MODEL_AGENTS=$TS_LOCAL_MODEL_AGENTS -TSENV - - ok "Token Spy installed: $TS_INSTALL_DIR" - - # Install systemd service - if [ "$HAS_SYSTEMD" = true ]; then - # Stop existing if running - if systemctl is-active --quiet "token-spy@${TS_AGENT_NAME}" 2>/dev/null; then - sudo systemctl stop "token-spy@${TS_AGENT_NAME}" - fi - - sudo cp "$SCRIPT_DIR/systemd/token-spy@.service" /etc/systemd/system/ - sudo sed -i "s|__USER__|$SYSTEM_USER|g" /etc/systemd/system/token-spy@.service - sudo sed -i "s|__INSTALL_DIR__|$TS_INSTALL_DIR|g" /etc/systemd/system/token-spy@.service - sudo sed -i "s|__HOST__|$TS_HOST|g" /etc/systemd/system/token-spy@.service - sudo sed -i "s|__PORT__|$TS_PORT|g" /etc/systemd/system/token-spy@.service - - sudo systemctl daemon-reload - sudo systemctl enable "token-spy@${TS_AGENT_NAME}" - sudo systemctl start "token-spy@${TS_AGENT_NAME}" - - sleep 2 - if systemctl is-active --quiet "token-spy@${TS_AGENT_NAME}"; then - ok "Token Spy running on :$TS_PORT (agent: $TS_AGENT_NAME)" - else - err "Token Spy failed to start. Check: journalctl -u token-spy@${TS_AGENT_NAME}" - fi - else - info "No systemd. 
Start manually:" - info " cd $TS_INSTALL_DIR && AGENT_NAME=$TS_AGENT_NAME python3 -m uvicorn main:app --host $TS_HOST --port $TS_PORT" - fi -fi - -# ── Install LLM Cold Storage ────────────────────────────────── -if ([ "$COLD_STORAGE_ONLY" = true ] || ([ "$CLEANUP_ONLY" = false ] && [ "$PROXY_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ])) && [ "$CS_ENABLED" = "true" ]; then - info "Installing LLM Cold Storage..." - - if [ ! -f "$SCRIPT_DIR/scripts/llm-cold-storage.sh" ]; then - err "scripts/llm-cold-storage.sh not found" - exit 1 - fi - - chmod +x "$SCRIPT_DIR/scripts/llm-cold-storage.sh" - ok "Cold storage script: $SCRIPT_DIR/scripts/llm-cold-storage.sh" - - # Install systemd user timer - if [ "$HAS_SYSTEMD" = true ]; then - mkdir -p "$HOME/.config/systemd/user" - - # Service - patch in config values - cp "$SCRIPT_DIR/systemd/llm-cold-storage.service" "$HOME/.config/systemd/user/" - sed -i "s|%h/Lighthouse-AI/scripts|$SCRIPT_DIR/scripts|g" "$HOME/.config/systemd/user/llm-cold-storage.service" - sed -i "s|%h/.cache/huggingface/hub|$CS_HF_CACHE|g" "$HOME/.config/systemd/user/llm-cold-storage.service" - sed -i "s|%h/llm-cold-storage|$CS_COLD_DIR|g" "$HOME/.config/systemd/user/llm-cold-storage.service" - # Remove User=%i (not needed for user services) - sed -i '/^User=%i/d' "$HOME/.config/systemd/user/llm-cold-storage.service" - - # Timer - cp "$SCRIPT_DIR/systemd/llm-cold-storage.timer" "$HOME/.config/systemd/user/" - - systemctl --user daemon-reload - systemctl --user enable llm-cold-storage.timer - systemctl --user start llm-cold-storage.timer - - ok "Cold storage timer enabled (daily at 2am)" - info " Dry-run first: $SCRIPT_DIR/scripts/llm-cold-storage.sh" - info " Execute: $SCRIPT_DIR/scripts/llm-cold-storage.sh --execute" - else - info "No systemd. 
Run manually:" - info " HF_CACHE=$CS_HF_CACHE COLD_DIR=$CS_COLD_DIR $SCRIPT_DIR/scripts/llm-cold-storage.sh --execute" - fi -fi - -# ── OpenClaw Config Reminder ────────────────────────────────── -echo "" -echo -e "${CYAN}═══════════════════════════════════════════════════════════${NC}" -echo -e "${GREEN} Installation complete!${NC}" -echo -e "${CYAN}═══════════════════════════════════════════════════════════${NC}" -echo "" - -if [ "$CLEANUP_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ] && [ "$PROXY_ENABLED" = "true" ]; then - info "IMPORTANT: Update your openclaw.json model providers to use the proxy:" - echo "" - echo " Change your provider baseUrl from:" - echo " \"baseUrl\": \"http://localhost:8000/v1\"" - echo "" - echo " To:" - echo " \"baseUrl\": \"http://localhost:${PROXY_PORT}/v1\"" - echo "" -fi - -if [ "$TS_ENABLED" = "true" ] && ([ "$TOKEN_SPY_ONLY" = true ] || ([ "$CLEANUP_ONLY" = false ] && [ "$PROXY_ONLY" = false ])); then - info "IMPORTANT: Update your openclaw.json cloud providers to route through Token Spy:" - echo "" - echo " Change your Anthropic baseUrl to:" - echo " \"baseUrl\": \"http://localhost:${TS_PORT}\"" - echo "" - echo " Change your OpenAI-compatible baseUrl to:" - echo " \"baseUrl\": \"http://localhost:${TS_PORT}/v1\"" - echo "" - echo " Dashboard: http://localhost:${TS_PORT}/dashboard" - echo "" -fi - -info "Useful commands:" -if [ "$HAS_SYSTEMD" = true ]; then - if [ "$PROXY_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ]; then - echo " systemctl status openclaw-session-cleanup.timer # Check timer" - echo " journalctl -u openclaw-session-cleanup -f # Watch cleanup logs" - fi - if [ "$CLEANUP_ONLY" = false ] && [ 
"$TOKEN_SPY_ONLY" = false ]; then - echo " systemctl status vllm-tool-proxy # Check proxy" - echo " journalctl -u vllm-tool-proxy -f # Watch proxy logs" - echo " curl http://localhost:${PROXY_PORT}/health # Test proxy health" - fi - if [ "$TS_ENABLED" = "true" ] && ([ "$TOKEN_SPY_ONLY" = true ] || ([ "$CLEANUP_ONLY" = false ] && [ "$PROXY_ONLY" = false ])); then - echo " systemctl status token-spy@${TS_AGENT_NAME} # Check Token Spy" - echo " journalctl -u token-spy@${TS_AGENT_NAME} -f # Watch Token Spy logs" - echo " curl http://localhost:${TS_PORT}/health # Test Token Spy health" - fi - if [ "$CS_ENABLED" = "true" ] && ([ "$COLD_STORAGE_ONLY" = true ] || ([ "$CLEANUP_ONLY" = false ] && [ "$PROXY_ONLY" = false ] && [ "$TOKEN_SPY_ONLY" = false ])); then - echo " systemctl --user status llm-cold-storage.timer # Check cold storage timer" - echo " systemctl --user list-timers llm-cold-storage.timer # Next run time" - fi -fi -echo "" +# Delegate to dream-server installer +cd "$SCRIPT_DIR/dream-server" +exec ./install.sh "$@"