Commit 73696b0

feat: add RVC extension for voice conversion/cloning (Wave 1, Light-Heart-Labs#8 priority)

Android-16 committed
1 parent da931fb

3 files changed

Lines changed: 133 additions & 0 deletions

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
# RVC Extension for Dream Server

## Overview

RVC (Retrieval-based Voice Conversion) is a voice cloning and conversion system. Train a voice model with as little as 10 minutes of audio, then convert any voice to match the trained model.

## Features

- Train voice models with minimal data (≤10 mins)
- Real-time voice conversion
- Singing voice conversion
- Pre-trained model support
- GPU-accelerated inference

## Usage

### Enable the extension

```bash
dream extensions enable rvc
```

### Access the WebUI

```
http://localhost:${RVC_PORT:-7865}
```
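
The `${RVC_PORT:-7865}` notation in the URL is shell parameter expansion: it resolves to the value of `RVC_PORT` when that variable is set and non-empty, and to `7865` otherwise. A minimal sketch of resolving and probing the URL follows. The helper names `rvc_url` and `rvc_is_up` are hypothetical and not part of the `dream` CLI, and the probe assumes `curl` is installed on the host.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the dream CLI): print the WebUI URL,
# falling back to port 7865 when RVC_PORT is unset or empty.
rvc_url() {
  printf 'http://localhost:%s\n' "${RVC_PORT:-7865}"
}

# Hypothetical helper: succeed only if the WebUI answers within 5 seconds.
# Assumes curl is available on the host.
rvc_is_up() {
  curl --silent --fail --output /dev/null --max-time 5 "$(rvc_url)"
}
```

After sourcing the snippet, `rvc_is_up && echo "WebUI reachable at $(rvc_url)"` reports whether the WebUI is answering. The container can take a couple of minutes to start, so a failed probe shortly after enabling the extension is not necessarily an error.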

### Basic Workflow

1. **Prepare dataset**: Place audio files in `./data/rvc/dataset/`.
2. **Process data**: Use the WebUI to preprocess the audio and extract features.
3. **Train model**: Train your voice model (30 minutes to 4 hours, depending on the dataset).
4. **Convert voice**: Use the trained model to convert new audio.
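
Step 1 can be sketched as a small script that creates the data directories and copies recordings into the dataset folder. This is an illustration under stated assumptions: the helper names `rvc_prepare_dirs` and `rvc_stage_audio` are hypothetical, the recordings are assumed to be `.wav` files in a single flat folder, and the directory names are the ones the extension mounts under `./data/rvc/`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: create the RVC data directories under a base directory
# (defaults to the current directory).
rvc_prepare_dirs() {
  local base="${1:-.}"
  local sub
  for sub in dataset weights opt logs; do
    mkdir -p "$base/data/rvc/$sub" || return 1
  done
}

# Hypothetical helper: copy the .wav files found directly inside a source
# folder into the dataset directory. The .wav filter is an assumption.
rvc_stage_audio() {
  local src="$1" base="${2:-.}"
  rvc_prepare_dirs "$base" || return 1
  find "$src" -maxdepth 1 -type f -name '*.wav' \
    -exec cp {} "$base/data/rvc/dataset/" \;
}
```

For example, `rvc_stage_audio ~/recordings` run from the Dream Server directory would stage the recordings for preprocessing in the WebUI. Steps 2 to 4 happen in the WebUI and are not scripted here.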

## Data Directories

| Path | Purpose |
|------|---------|
| `./data/rvc/dataset/` | Training audio files |
| `./data/rvc/weights/` | Model weights and checkpoints |
| `./data/rvc/opt/` | Converted audio output |
| `./data/rvc/logs/` | Training logs and TensorBoard data |

## Configuration

| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `RVC_PORT` | 7865 | Web UI port |

## GPU Memory Requirements

| Quality | VRAM Required |
|---------|--------------|
| Low | 4 GB |
| Medium | 6 GB |
| High | 8+ GB |
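
The table above can be turned into a quick check of which quality tier a given GPU supports. The sketch below is illustrative only: `rvc_quality_for_vram` is a hypothetical helper, and it assumes 1 GB = 1024 MiB when translating the table's thresholds.

```bash
#!/usr/bin/env bash

# Hypothetical helper: map total VRAM in MiB to the quality tier from the
# table above. Thresholds assume 1 GB = 1024 MiB.
rvc_quality_for_vram() {
  local mib="$1"
  if   [ "$mib" -ge 8192 ]; then echo "high"
  elif [ "$mib" -ge 6144 ]; then echo "medium"
  elif [ "$mib" -ge 4096 ]; then echo "low"
  else                           echo "insufficient"
  fi
}
```

On an NVIDIA host, the total can be read with `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`, which prints one MiB value per GPU, and the first value can be passed to the helper. Reported totals are sometimes slightly below the nominal capacity, so a card near a threshold may land one tier lower than expected.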

## Integration

RVC works standalone for voice conversion. It can also be combined with other extensions:
- **Piper TTS** → RVC: TTS output → voice conversion
- **Whisper** → RVC: Transcribe → synthesize → convert voice

## Uninstall

```bash
dream extensions disable rvc
```

Disabling the extension preserves the models and data in `./data/rvc/`.

## Documentation

- GitHub: <https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI>
- Models: <https://huggingface.co/lj1995/VoiceConversionWebUI>
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
name: rvc

services:
  rvc:
    image: aladdin1234/rvc-webui:0.1
    container_name: rvc
    ports:
      - "${RVC_PORT:-7865}:7865"
    volumes:
      - ./data/rvc/weights:/app/assets/weights
      - ./data/rvc/opt:/app/opt
      - ./data/rvc/dataset:/app/dataset
      - ./data/rvc/logs:/app/logs
    environment:
      - LLM_API_URL=${LLM_API_URL:-http://localhost:8000}
    shm_size: "1gb"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:7865"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 120s
    networks:
      - dream-network
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

networks:
  dream-network:
    external: true
    name: dream-network
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
name: rvc
version: "1.0.0"
description: RVC - Retrieval-based Voice Conversion for voice cloning
category: tool
authors:
  - RVC-Project
website: https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
image: aladdin1234/rvc-webui:0.1
gpu: [amd, nvidia]
port: 7865
data:
  - ./data/rvc/weights:/app/assets/weights
  - ./data/rvc/opt:/app/opt
  - ./data/rvc/dataset:/app/dataset
healthcheck:
  - wget -q --spider http://localhost:7865
env:
  - LLM_API_URL
dependencies: []
tags: [voice, audio, ai, conversion, cloning]
