HTTP microservice built with Axum that generates text-to-speech (TTS) audio from an HTTP request. Supported modes:
- eSpeak - Local TTS, low quality. Returns WAV audio.
- gTTS - Cloud TTS, medium quality. Returns MP3 audio.
- gcloud - Google Cloud TTS, high quality. Returns OPUS audio. Requires a gCloud API key.
- Polly - Amazon Polly TTS, high quality. Returns OggVorbis audio. Requires Amazon Polly credentials.
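A client that saves the returned audio to disk may want to pick a file extension based on the mode. The following is a minimal sketch, assuming the lowercase mode keys below match the names the service reports (they are assumptions, as is the helper name `output_filename`) and that the default format for each mode is the one listed above:

```python
# Default audio format per mode, taken from the mode list above.
# The lowercase keys are assumptions; query GET /modes for the real names.
DEFAULT_EXTENSION = {
    "espeak": ".wav",   # WAV
    "gtts": ".mp3",     # MP3
    "gcloud": ".opus",  # OPUS
    "polly": ".ogg",    # OggVorbis
}


def output_filename(stem: str, mode: str) -> str:
    """Return a file name whose extension matches the mode's default format."""
    try:
        return stem + DEFAULT_EXTENSION[mode.lower()]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
```

If a different format is requested through `preferred_format`, the extension would need to follow that format instead.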
Endpoints:
- `GET /tts?text={CONTENT}&lang={VOICE}&mode={MODE}&speaking_rate={SPEAKING_RATE}&max_length={MAX_LENGTH}&preferred_format={PREFERRED_AUDIO_FORMAT}` - Returns the generated audio.
- `GET /voices?mode={MODE}&raw={BOOL}` - Returns the supported voices for the given mode, either as a JSON array of strings or, with `raw` set to true, in the raw format from the source.
- `GET /modes` - Returns the currently supported modes for TTS as a JSON array of strings.
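Because the text to speak travels in the query string, it has to be URL-encoded. The sketch below builds a `/tts` URL with Python's standard library. The function name `build_tts_url`, the base URL, and the treatment of `speaking_rate`, `max_length`, and `preferred_format` as optional are assumptions made for illustration, not something this README confirms:

```python
from urllib.parse import urlencode


def build_tts_url(base_url, text, lang, mode,
                  speaking_rate=None, max_length=None, preferred_format=None):
    """Build a GET /tts URL, percent-encoding every query parameter."""
    params = {"text": text, "lang": lang, "mode": mode}
    # Assumption: these three parameters may be left out of the request.
    optional = {
        "speaking_rate": speaking_rate,
        "max_length": max_length,
        "preferred_format": preferred_format,
    }
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{base_url.rstrip('/')}/tts?{urlencode(params)}"


# Hypothetical local instance and voice/mode values.
url = build_tts_url("http://localhost:3000", "Hello world!", "en", "espeak")
```

The resulting URL can be fetched with any HTTP client; the response body is the audio itself.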
Non-200 responses will return a JSON object with the following keys:
- The error code, one of:
  - `0` - Unknown error
  - `1` - Unknown voice
  - `2` - Max length exceeded
  - `3` - Speaking rate exceeded limits, see the `display` for more information
  - `4` - `AUTH_KEY` has been set and the `Authorization` header doesn't match the key
- `display` - A human readable message describing the error
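A client can turn such an error body into a readable message. This is a rough sketch only: the key name `code` for the numeric error code is an assumption (the name is not shown here), while `display` is the message key mentioned above. The helper name `describe_error` is hypothetical.

```python
import json

# Error codes as listed above.
ERROR_CODES = {
    0: "Unknown error",
    1: "Unknown voice",
    2: "Max length exceeded",
    3: "Speaking rate exceeded limits",
    4: "Authorization header does not match AUTH_KEY",
}


def describe_error(body: str) -> str:
    """Summarise a non-200 JSON response body as a single line."""
    payload = json.loads(body)
    code = payload.get("code")  # assumed key name for the numeric code
    summary = ERROR_CODES.get(code, "Unrecognised error code")
    display = payload.get("display", "")
    if display:
        return f"[{code}] {summary}: {display}"
    return f"[{code}] {summary}"
```

For code `3`, the `display` text carries the detail about the allowed speaking rate, so it is worth surfacing to the user rather than discarding.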
Environment variables:
- `BIND_ADDR` (default `0.0.0.0:3000`) - The address to bind the web server to
- `LOG_LEVEL` (default `INFO`) - The lowest log level to output to stdout
- `AUTH_KEY` - If set, this key must be sent in the `Authorization` header of each request
- `IPV6_BLOCK` - A block of IPv6 addresses, randomly selected for each gTTS request
- `GOOGLE_APPLICATION_CREDENTIALS` - The file path to the gCloud credentials JSON file
- `AWS_REGION` - The AWS region to use
- `AWS_ACCESS_KEY_ID` - The AWS access key ID
- `AWS_SECRET_ACCESS_KEY` - The AWS secret access key
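When `AUTH_KEY` is set on the service, every request needs a matching `Authorization` header. The sketch below shows one way a client might build that header from its own environment. It assumes the header value is the raw key with no scheme prefix such as `Bearer`, which is not confirmed here, and the helper name `auth_headers` is hypothetical.

```python
import os


def auth_headers(env=None):
    """Return request headers carrying AUTH_KEY, or no headers if it is unset."""
    env = os.environ if env is None else env
    key = env.get("AUTH_KEY")
    # Assumption: the service compares the header against the raw key value.
    return {"Authorization": key} if key else {}
```

A mismatch between the header and the key is reported with error code `4`.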