feat: add RVC extension for voice conversion/cloning (Wave 1, Light-Heart-Labs#8 priority)
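
Smoke test (a minimal sketch, not part of this commit): the helper names below are hypothetical. They assume the `./data/rvc/` layout and the `RVC_PORT` default of 7865 described in the README and compose.yaml in this diff, and that `wget` is installed on the host.

```bash
#!/bin/sh
# Sketch only: these helpers are illustrative and do not ship with the extension.

# Create the four host directories that compose.yaml bind-mounts.
# The base path defaults to ./data/rvc, as used in the README.
rvc_prepare_dirs() {
    base="${1:-./data/rvc}"
    for sub in dataset weights opt logs; do
        mkdir -p "$base/$sub" || return 1
    done
}

# Print the WebUI URL, falling back to port 7865 when RVC_PORT is unset or empty.
rvc_url() {
    printf 'http://localhost:%s\n' "${RVC_PORT:-7865}"
}

# Probe the WebUI from the host with the same wget flags the compose healthcheck uses.
rvc_probe() {
    wget -q --spider "$(rvc_url)"
}
```

One possible order: run `rvc_prepare_dirs`, then `dream extensions enable rvc`, then `rvc_probe` once the container has started (the healthcheck allows a 120s `start_period`).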

Android-16 · Android-16 · commit 73696b022974 · 2026-03-07T19:21:39.000-05:00
diff --git a/dream-server/extensions/services/rvc/README.md b/dream-server/extensions/services/rvc/README.md
@@ -0,0 +1,76 @@
+# RVC Extension for Dream Server
+
+## Overview
+
+RVC (Retrieval-based Voice Conversion) is a voice cloning and conversion system. Train a voice model with as little as 10 minutes of audio, then convert any input voice to match the trained model.
+
+## Features
+
+- Train voice models with minimal data (≤10 mins)
+- Real-time voice conversion
+- Singing voice conversion
+- Pre-trained model support
+- GPU-accelerated inference
+
+## Usage
+
+### Enable the extension
+
+```bash
+dream extensions enable rvc
+```
+
+### Access the WebUI
+
+```
+http://localhost:${RVC_PORT:-7865}
+```
+
+### Basic Workflow
+
+1. **Prepare dataset**: Place audio files in `./data/rvc/dataset/`
+2. **Process data**: Use the WebUI to preprocess and extract features
+3. **Train model**: Train your voice model (roughly 30 minutes to 4 hours, depending on the amount of training data)
+4. **Convert voice**: Use trained model to convert new audio
+
+## Data Directories
+
+| Path | Purpose |
+|------|---------|
+| `./data/rvc/dataset/` | Training audio files |
+| `./data/rvc/weights/` | Model weights and checkpoints |
+| `./data/rvc/opt/` | Converted audio output |
+| `./data/rvc/logs/` | Training logs, TensorBoard |
+
+## Configuration
+
+| Environment Variable | Default | Description |
+|---------------------|---------|-------------|
+| `RVC_PORT` | 7865 | Host port for the WebUI |
+
+## GPU Memory Requirements
+
+| Quality | VRAM Required |
+|---------|--------------|
+| Low | 4 GB |
+| Medium | 6 GB |
+| High | 8+ GB |
+
+## Integration
+
+RVC works standalone for voice conversion. It can also be chained with other extensions:
+- **Piper TTS** → RVC: synthesize speech with Piper, then convert it to the trained voice
+- **Whisper** → RVC: transcribe → synthesize (via a TTS extension) → convert voice
+
+## Uninstall
+
+```bash
+dream extensions disable rvc
+```
+
+Disabling the extension does not delete anything: models and data in `./data/rvc/` are preserved.
+
+## Documentation
+
+- GitHub: <https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI>
+- Models: <https://huggingface.co/lj1995/VoiceConversionWebUI>
diff --git a/dream-server/extensions/services/rvc/compose.yaml b/dream-server/extensions/services/rvc/compose.yaml
@@ -0,0 +1,37 @@
+name: rvc
+
+services:
+  rvc:
+    image: aladdin1234/rvc-webui:0.1
+    container_name: rvc
+    ports:
+      - "${RVC_PORT:-7865}:7865"
+    volumes:
+      - ./data/rvc/weights:/app/assets/weights
+      - ./data/rvc/opt:/app/opt
+      - ./data/rvc/dataset:/app/dataset
+      - ./data/rvc/logs:/app/logs
+    environment:
+      - LLM_API_URL=${LLM_API_URL:-http://localhost:8000}
+    shm_size: "1gb"
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "wget", "-q", "--spider", "http://localhost:7865"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 120s
+    networks:
+      - dream-network
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
+
+networks:
+  dream-network:
+    external: true
+    name: dream-network
diff --git a/dream-server/extensions/services/rvc/manifest.yaml b/dream-server/extensions/services/rvc/manifest.yaml
@@ -0,0 +1,20 @@
+name: rvc
+version: "1.0.0"
+description: RVC - Retrieval-based Voice Conversion for voice cloning
+category: tool
+authors:
+  - RVC-Project
+website: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
+image: aladdin1234/rvc-webui:0.1
+gpu: [amd, nvidia]
+port: 7865
+data:
+  - ./data/rvc/weights:/app/assets/weights
+  - ./data/rvc/opt:/app/opt
+  - ./data/rvc/dataset:/app/dataset
+healthcheck:
+  - wget -q --spider http://localhost:7865
+env:
+  - LLM_API_URL
+dependencies: []
+tags: [voice, audio, ai, conversion, cloning]