@@ -75,29 +79,36 @@ The language to use for speech synthesis. Defaults to `en-US`. The following lan
 - Welsh (`cy-GB`)
 
 #### `voice` (String) (optional)
-The voice to use for speech synthesis. Defaults to `Joanna` when `provider` is `aws-polly`, or `alloy` when using the OpenAI provider.
+The voice to use for speech synthesis. Defaults to `Joanna` when `provider` is `aws-polly`, `alloy` when using the OpenAI provider, or `21m00Tcm4TlvDq8ikWAM` when using ElevenLabs.
 
 - **AWS Polly voices:** See the [AWS Polly voice list](https://docs.aws.amazon.com/polly/latest/dg/available-voices.html) for available IDs and languages.
 - **OpenAI voices:** Built-in options include `alloy`, `ash`, `ballad`, `coral`, `echo`, `fable`, `nova`, `onyx`, `sage`, and `shimmer`.
+- **ElevenLabs voices:** Use any ElevenLabs voice ID from your account (for example `21m00Tcm4TlvDq8ikWAM` for the public "Rachel" sample voice).
 
 #### `engine` (String) (optional)
 *AWS Polly only.*
 
 The speech synthesis engine to use. Can be `standard`, `neural`, `long-form`, or `generative`. Defaults to `standard`. Higher-end engines provide better quality but may incur higher usage costs.
 
 #### `provider` (String) (optional)
-Selects which backend performs the synthesis. Use `'aws-polly'` (default) for the existing AWS voices, or `'openai'` to access the GPT-4o mini TTS family.
+Selects which backend performs the synthesis. Use `'aws-polly'` (default) for the existing AWS voices, `'openai'` to access the GPT-4o mini TTS family, or `'elevenlabs'` to use ElevenLabs voices.
 
 #### `model` (String) (optional)
-*OpenAI provider only.*
+Specifies which TTS model to use for the selected provider.
 
-Specifies which OpenAI TTS model to use. Defaults to `gpt-4o-mini-tts`. Other available models include `tts-1` and `tts-1-hd`.
+- *OpenAI:* Defaults to `gpt-4o-mini-tts`. Other available models include `tts-1` and `tts-1-hd`.
+- *ElevenLabs:* Defaults to `eleven_multilingual_v2`. Other available models include `eleven_flash_v2_5`, `eleven_turbo_v2_5`, and `eleven_v3`.
 
 #### `response_format` (String) (optional)
 *OpenAI provider only.*
 
 Controls the output format when using OpenAI. Defaults to `mp3`, but you can request `wav`, `opus`, `aac`, `flac`, or `pcm` for different latency/quality characteristics.
 
+#### `output_format` (String) (optional)
+*ElevenLabs provider only.*
+
+Controls the output format when using ElevenLabs. Defaults to `mp3_44100_128`. See the ElevenLabs docs for supported presets (e.g. `pcm_16000`, `ulaw_8000`).
+
 #### `instructions` (String) (optional)
 *OpenAI provider only.*
 
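
Reviewer sketch (not part of the patch): the per-provider defaults documented in this hunk could be summarized as a small lookup. The helper name `resolveSpeechOptions` and the `SpeechOptions` type are hypothetical and are not taken from the library; only the option names and default values come from the text above.

```typescript
// Hypothetical helper, NOT the library's implementation. It only mirrors
// the defaults stated in the option reference in this diff.
type Provider = "aws-polly" | "openai" | "elevenlabs";

interface SpeechOptions {
  provider?: Provider;
  voice?: string;
  engine?: string;          // AWS Polly only
  model?: string;           // OpenAI and ElevenLabs
  response_format?: string; // OpenAI only
  output_format?: string;   // ElevenLabs only
}

function resolveSpeechOptions(opts: SpeechOptions = {}): SpeechOptions {
  // 'aws-polly' is the documented default provider.
  const provider: Provider = opts.provider ?? "aws-polly";
  switch (provider) {
    case "openai":
      return {
        provider,
        voice: opts.voice ?? "alloy",
        model: opts.model ?? "gpt-4o-mini-tts",
        response_format: opts.response_format ?? "mp3",
      };
    case "elevenlabs":
      return {
        provider,
        voice: opts.voice ?? "21m00Tcm4TlvDq8ikWAM",
        model: opts.model ?? "eleven_multilingual_v2",
        output_format: opts.output_format ?? "mp3_44100_128",
      };
    default:
      return {
        provider,
        voice: opts.voice ?? "Joanna",
        engine: opts.engine ?? "standard",
      };
  }
}
```

Dropping options that do not apply to the selected provider is a choice made for this sketch only. The documentation in the diff does not say whether the real library ignores or rejects them.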
@@ -174,6 +185,31 @@ A `Promise` that resolves to an `HTMLAudioElement`. The element’s `src` points