Commit cdaaca6

Add 11labs support to txt2speech docs (#76)
* Add 11labs support to `txt2speech` docs
* Add 11labs tts example to playground
1 parent 94b1655 commit cdaaca6

3 files changed

+70
-8
lines changed

src/AI/txt2speech.md

Lines changed: 44 additions & 8 deletions
@@ -24,12 +24,16 @@ A string containing the text you want to convert to speech. The text must be les
 An object containing the following optional properties:
 
 - `language` (String): Language code for speech synthesis (AWS Polly only). Defaults to `en-US`.
-- `voice` (String): Voice ID used for synthesis. Defaults to `Joanna` (AWS) or `alloy` (OpenAI).
+- `voice` (String): Voice ID used for synthesis. Defaults to `Joanna` (AWS), `alloy` (OpenAI), or `21m00Tcm4TlvDq8ikWAM` (ElevenLabs sample voice).
 - `engine` (String): AWS Polly engine. Can be `standard`, `neural`, `long-form`, or `generative`. Defaults to `standard`.
-- `provider` (String): TTS provider to use. Supports `'aws-polly'` (default) and `'openai'`.
-- `model` (String): OpenAI text-to-speech model (`gpt-4o-mini-tts`, `tts-1`, `tts-1-hd`, ...). Defaults to `gpt-4o-mini-tts`.
-- `response_format` (String): Desired OpenAI output format (`mp3`, `wav`, `opus`, `aac`, `flac`, `pcm`). Defaults to `mp3`.
+- `provider` (String): TTS provider to use. Supports `'aws-polly'` (default), `'openai'`, and `'elevenlabs'`.
+- `model` (String): Model identifier for the chosen provider. Examples:
+  - OpenAI: `gpt-4o-mini-tts` (default), `tts-1`, `tts-1-hd`
+  - ElevenLabs: `eleven_multilingual_v2` (default), `eleven_flash_v2_5`, `eleven_turbo_v2_5`, `eleven_v3`
+- `response_format` (String): Output format for OpenAI voices (`mp3`, `wav`, `opus`, `aac`, `flac`, `pcm`). Defaults to `mp3`.
+- `output_format` (String): Output format for ElevenLabs voices (e.g. `mp3_44100_128`). Defaults to `mp3_44100_128` when using ElevenLabs.
 - `instructions` (String): Additional guidance for OpenAI voices (tone, pacing, style, etc.).
+- `voice_settings` (Object): ElevenLabs voice tuning options (e.g. stability, similarity boost, speed).
 
 #### `language` (String) (optional)
 *AWS Polly only.*
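
The per-provider defaults listed in this hunk can be summarized in a small lookup. The following is a minimal sketch, not part of Puter.js: `TTS_DEFAULTS` and `withTtsDefaults` are hypothetical names, and the values are copied from the option list above. It assumes that a caller wants to see the effective options before passing them to `puter.ai.txt2speech`; whether the library resolves defaults in the same way is not stated in the diff.

```javascript
// Hypothetical helper (not a Puter.js API): default values per provider,
// taken from the option descriptions in the diff above.
const TTS_DEFAULTS = new Map([
    ["aws-polly", { voice: "Joanna", language: "en-US", engine: "standard" }],
    ["openai", { voice: "alloy", model: "gpt-4o-mini-tts", response_format: "mp3" }],
    ["elevenlabs", {
        voice: "21m00Tcm4TlvDq8ikWAM",
        model: "eleven_multilingual_v2",
        output_format: "mp3_44100_128",
    }],
]);

// Returns a new options object with the documented defaults filled in.
// Values supplied by the caller take precedence over the defaults.
function withTtsDefaults(options = {}) {
    const provider = options.provider ?? "aws-polly"; // documented default provider
    const defaults = TTS_DEFAULTS.get(provider);
    if (defaults === undefined) {
        throw new Error(`Unknown TTS provider: ${provider}`);
    }
    return { ...defaults, ...options, provider };
}
```

For example, `withTtsDefaults({ provider: "elevenlabs" })` yields an object that could be passed as the second argument of `puter.ai.txt2speech`, with the ElevenLabs sample voice, default model, and default output format made explicit.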
@@ -75,29 +79,36 @@ The language to use for speech synthesis. Defaults to `en-US`. The following lan
 - Welsh (`cy-GB`)
 
 #### `voice` (String) (optional)
-The voice to use for speech synthesis. Defaults to `Joanna` when `provider` is `aws-polly`, or `alloy` when using the OpenAI provider.
+The voice to use for speech synthesis. Defaults to `Joanna` when `provider` is `aws-polly`, `alloy` when using the OpenAI provider, or `21m00Tcm4TlvDq8ikWAM` when using ElevenLabs.
 
 - **AWS Polly voices:** See the [AWS Polly voice list](https://docs.aws.amazon.com/polly/latest/dg/available-voices.html) for available IDs and languages.
 - **OpenAI voices:** Built-in options include `alloy`, `ash`, `ballad`, `coral`, `echo`, `fable`, `nova`, `onyx`, `sage`, and `shimmer`.
+- **ElevenLabs voices:** Use any ElevenLabs voice ID from your account (for example `21m00Tcm4TlvDq8ikWAM` for the public "Rachel" sample voice).
 
 #### `engine` (String) (optional)
 *AWS Polly only.*
 
 The speech synthesis engine to use. Can be `standard`, `neural`, `long-form`, or `generative`. Defaults to `standard`. Higher-end engines provide better quality but may incur higher usage costs.
 
 #### `provider` (String) (optional)
-Selects which backend performs the synthesis. Use `'aws-polly'` (default) for the existing AWS voices, or `'openai'` to access the GPT-4o mini TTS family.
+Selects which backend performs the synthesis. Use `'aws-polly'` (default) for the existing AWS voices, `'openai'` to access the GPT-4o mini TTS family, or `'elevenlabs'` to use ElevenLabs voices.
 
 #### `model` (String) (optional)
-*OpenAI provider only.*
+Specifies which TTS model to use for the selected provider.
 
-Specifies which OpenAI TTS model to use. Defaults to `gpt-4o-mini-tts`. Other available models include `tts-1` and `tts-1-hd`.
+- *OpenAI:* Defaults to `gpt-4o-mini-tts`. Other available models include `tts-1` and `tts-1-hd`.
+- *ElevenLabs:* Defaults to `eleven_multilingual_v2`. Other available models include `eleven_flash_v2_5`, `eleven_turbo_v2_5`, and `eleven_v3`.
 
 #### `response_format` (String) (optional)
 *OpenAI provider only.*
 
 Controls the output format when using OpenAI. Defaults to `mp3`, but you can request `wav`, `opus`, `aac`, `flac`, or `pcm` for different latency/quality characteristics.
 
+#### `output_format` (String) (optional)
+*ElevenLabs provider only.*
+
+Controls the output format when using ElevenLabs. Defaults to `mp3_44100_128`. See the ElevenLabs docs for supported presets (e.g. `pcm_16000`, `ulaw_8000`).
+
 #### `instructions` (String) (optional)
 *OpenAI provider only.*
 
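
Because several options in this hunk are marked as applying to a single provider, it is easy to pass, say, `response_format` together with `provider: 'elevenlabs'`. The sketch below flags such combinations before a call is made. `PROVIDER_ONLY_OPTIONS` and `findMismatchedOptions` are hypothetical names, not Puter.js APIs, and the table only mirrors the "only" annotations in the documentation text above. How the library itself treats a mismatched option is not specified in the diff, so the function only reports them.

```javascript
// Hypothetical table: option name -> the provider the docs tie it to.
const PROVIDER_ONLY_OPTIONS = new Map([
    ["language", "aws-polly"],
    ["engine", "aws-polly"],
    ["response_format", "openai"],
    ["instructions", "openai"],
    ["output_format", "elevenlabs"],
    ["voice_settings", "elevenlabs"],
]);

// Returns the names of options that are documented for a provider other
// than the one selected. Shared options (voice, model, provider) and
// unknown keys are never reported.
function findMismatchedOptions(options = {}) {
    const provider = options.provider ?? "aws-polly"; // documented default provider
    return Object.keys(options).filter((key) => {
        const owner = PROVIDER_ONLY_OPTIONS.get(key);
        return owner !== undefined && owner !== provider;
    });
}
```

A caller could log the returned names as a warning during development, for example when `findMismatchedOptions({ provider: "elevenlabs", response_format: "mp3" })` reports `response_format`, and switch to `output_format` instead.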
@@ -174,6 +185,31 @@ A `Promise` that resolves to an `HTMLAudioElement`. The element’s `src` points
 </html>
 ```
 
+<strong class="example-title">Use ElevenLabs voices</strong>
+
+```html;ai-txt2speech-elevenlabs
+<html>
+<body>
+    <script src="https://js.puter.com/v2/"></script>
+    <button id="play">Use ElevenLabs voice</button>
+    <script>
+        document.getElementById('play').addEventListener('click', async ()=>{
+            const audio = await puter.ai.txt2speech(
+                "Hello! This sample uses an ElevenLabs voice.",
+                {
+                    provider: "elevenlabs",
+                    model: "eleven_multilingual_v2",
+                    voice: "21m00Tcm4TlvDq8ikWAM",
+                    output_format: "mp3_44100_128"
+                }
+            );
+            audio.play();
+        });
+    </script>
+</body>
+</html>
+```
+
 <strong class="example-title">Compare different engines</strong>
 
 ```html;ai-txt2speech-engines

src/examples.js

Lines changed: 6 additions & 0 deletions
@@ -151,6 +151,12 @@ const examples = [
         slug: 'ai-txt2speech-openai',
         source: '/playground/examples/ai-txt2speech-openai.html'
     },
+    {
+        title: 'Text to Speech with ElevenLabs',
+        description: 'Generate speech with ElevenLabs voices using Puter.js AI API. Run and experiment with this TTS example in the playground.',
+        slug: 'ai-txt2speech-elevenlabs',
+        source: '/playground/examples/ai-txt2speech-elevenlabs.html'
+    },
     {
         title: 'Text to Video',
         description: 'Generate videos from text with Puter.js AI API. Run and experiment with this text-to-video example in the playground.',
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+<html>
+<body>
+    <script src="https://js.puter.com/v2/"></script>
+    <button id="play">Use ElevenLabs voice</button>
+    <script>
+        document.getElementById('play').addEventListener('click', async ()=>{
+            const audio = await puter.ai.txt2speech(
+                "Hello! This sample uses an ElevenLabs voice.",
+                {
+                    provider: "elevenlabs",
+                    model: "eleven_multilingual_v2",
+                    voice: "21m00Tcm4TlvDq8ikWAM",
+                    output_format: "mp3_44100_128"
+                }
+            );
+            audio.play();
+        });
+    </script>
+</body>
+</html>
