This project provides an OpenAI-compatible text-to-speech server for the OpenVoice project.
OpenVoice is an open-source project that provides a text-to-speech model and related tools.
- text-to-speech
- tone management
- create a tone by uploading a WAV file
- list tones
- OpenAI-compatible '/v1/audio/speech' endpoint
You need to build the Docker image first:
cd src
docker build . -t openvoice-server
The Dockerfile downloads all required models during the build, so the resulting Docker image can run in an isolated environment.
Currently this project only supports CUDA.
To run the server, use the docker image you just built:
docker run --rm -p 18080:8080 --device=nvidia.com/gpu=all openvoice-server:latest
DATABASE_URL
The server stores tone information in a SQLite database. You can control the location of the database by setting the DATABASE_URL environment variable.
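As a rough illustration of setting DATABASE_URL, the sketch below assembles the same docker run command with an extra `-e` option, which is Docker's standard way to pass an environment variable. The URL value `sqlite:///tones.db` is only an assumed placeholder: this README does not state the exact URL format the server expects, so check the server's configuration before relying on it.

```python
import shlex


def build_docker_args(database_url, host_port=18080):
    """Return the argv list for running the server with DATABASE_URL set.

    The image name, port mapping, and GPU device option come from the run
    command in this README. The database_url value is supplied by the caller;
    its format is an assumption and is not confirmed by this README.
    """
    return [
        "docker", "run", "--rm",
        "-p", f"{host_port}:8080",
        "--device=nvidia.com/gpu=all",
        "-e", f"DATABASE_URL={database_url}",  # pass the variable into the container
        "openvoice-server:latest",
    ]


# Hypothetical value, for illustration only.
args = build_docker_args("sqlite:///tones.db")
print(shlex.join(args))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(args)`.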
Get a list of existing tones.
Create a tone by uploading a .wav file.
- Request
  Content-Type: multipart/form-data
  Fields in request:
  - audiofile
  - name
  - desc
- Response
  Content-Type: application/json
See examples/create-tone.sh for the curl command.
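For readers who prefer Python to curl, here is a minimal sketch of the same upload using only the standard library. The field names (audiofile, name, desc) come from the list above. The request path `/tones`, the POST method, the `audio/wav` part type, and the host port 18080 are assumptions for illustration; examples/create-tone.sh remains the authoritative reference.

```python
import uuid
import urllib.request

BASE_URL = "http://localhost:18080"  # assumed: host port from the docker run command


def build_create_tone_request(wav_bytes, name, desc,
                              filename="sample.wav", path="/tones"):
    """Build (but do not send) a multipart/form-data request for a new tone.

    `path` is an assumption; this README does not spell out the route.
    """
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields.
    for field, value in (("name", name), ("desc", desc)):
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{field}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The WAV file itself.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audiofile"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + wav_bytes + b"\r\n")
    # Closing boundary.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        BASE_URL + path,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# To send it against a running server:
#   with open("voice.wav", "rb") as f:
#       req = build_create_tone_request(f.read(), "my-tone", "demo tone")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))  # JSON response
```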
Get the details of a tone by tone_id.
- Response
Content-Type: application/json
Generate voice using the specified tone_id and dialect.
- Request
  Content-Type: application/json
  Fields in request:
  - text
  - speed (optional)
- Response
  Content-Type: audio/x-wav
See examples/generate_speech.sh for the curl command.
Get a list of supported languages and dialects.
The OpenAI compatible endpoint.
See examples/generate_speech_openai.sh for the curl command.
Note:
- Use dialect as model in the request; find dialects from the /cap endpoint.
- Use tone_name as voice in the request; find tones from the /tones endpoint.
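The mapping in the note above can be sketched in Python as follows. The `model`, `voice`, and `input` keys follow the OpenAI speech API that this endpoint is compatible with. The host port 18080 is taken from the docker run command, and the dialect and tone values in the usage comment are hypothetical placeholders; real values should be looked up via /cap and /tones.

```python
import json
import urllib.request

BASE_URL = "http://localhost:18080"  # assumed: host port from the docker run command


def build_speech_request(text, dialect, tone_name):
    """Build (but do not send) a request to the OpenAI-compatible endpoint.

    dialect goes in "model" and tone_name goes in "voice", as the note says.
    """
    payload = {"model": dialect, "voice": tone_name, "input": text}
    return urllib.request.Request(
        BASE_URL + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# To send it against a running server (placeholder dialect and tone name):
#   req = build_speech_request("Hello there.", "EN", "my-tone")
#   with urllib.request.urlopen(req) as resp, open("out.wav", "wb") as out:
#       out.write(resp.read())
```

See examples/generate_speech_openai.sh for the equivalent curl command.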