From 824e3381022d6892eb5a3ff684d0b392a5ac8023 Mon Sep 17 00:00:00 2001
From: jelveh
Date: Fri, 21 Nov 2025 16:28:44 -0800
Subject: [PATCH 1/2] Add 11labs support to `txt2speech` docs

---
 src/AI/txt2speech.md | 52 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/src/AI/txt2speech.md b/src/AI/txt2speech.md
index 8e88962..ff4eabc 100755
--- a/src/AI/txt2speech.md
+++ b/src/AI/txt2speech.md
@@ -24,12 +24,16 @@ A string containing the text you want to convert to speech. The text must be les
 An object containing the following optional properties:
 
 - `language` (String): Language code for speech synthesis (AWS Polly only). Defaults to `en-US`.
-- `voice` (String): Voice ID used for synthesis. Defaults to `Joanna` (AWS) or `alloy` (OpenAI).
+- `voice` (String): Voice ID used for synthesis. Defaults to `Joanna` (AWS), `alloy` (OpenAI), or `21m00Tcm4TlvDq8ikWAM` (ElevenLabs sample voice).
 - `engine` (String): AWS Polly engine. Can be `standard`, `neural`, `long-form`, or `generative`. Defaults to `standard`.
-- `provider` (String): TTS provider to use. Supports `'aws-polly'` (default) and `'openai'`.
-- `model` (String): OpenAI text-to-speech model (`gpt-4o-mini-tts`, `tts-1`, `tts-1-hd`, ...). Defaults to `gpt-4o-mini-tts`.
-- `response_format` (String): Desired OpenAI output format (`mp3`, `wav`, `opus`, `aac`, `flac`, `pcm`). Defaults to `mp3`.
+- `provider` (String): TTS provider to use. Supports `'aws-polly'` (default), `'openai'`, and `'elevenlabs'`.
+- `model` (String): Model identifier for the chosen provider. Examples:
+  - OpenAI: `gpt-4o-mini-tts` (default), `tts-1`, `tts-1-hd`
+  - ElevenLabs: `eleven_multilingual_v2` (default), `eleven_flash_v2_5`, `eleven_turbo_v2_5`, `eleven_v3`
+- `response_format` (String): Output format for OpenAI voices (`mp3`, `wav`, `opus`, `aac`, `flac`, `pcm`). Defaults to `mp3`.
+- `output_format` (String): Output format for ElevenLabs voices (e.g. `mp3_44100_128`). Defaults to `mp3_44100_128` when using ElevenLabs.
 - `instructions` (String): Additional guidance for OpenAI voices (tone, pacing, style, etc.).
+- `voice_settings` (Object): ElevenLabs voice tuning options (e.g. stability, similarity boost, speed).
 
 #### `language` (String) (optional)
 *AWS Polly only.*
@@ -75,10 +79,11 @@ The language to use for speech synthesis. Defaults to `en-US`. The following lan
 - Welsh (`cy-GB`)
 
 #### `voice` (String) (optional)
-The voice to use for speech synthesis. Defaults to `Joanna` when `provider` is `aws-polly`, or `alloy` when using the OpenAI provider.
+The voice to use for speech synthesis. Defaults to `Joanna` when `provider` is `aws-polly`, `alloy` when using the OpenAI provider, or `21m00Tcm4TlvDq8ikWAM` when using ElevenLabs.
 
 - **AWS Polly voices:** See the [AWS Polly voice list](https://docs.aws.amazon.com/polly/latest/dg/available-voices.html) for available IDs and languages.
 - **OpenAI voices:** Built-in options include `alloy`, `ash`, `ballad`, `coral`, `echo`, `fable`, `nova`, `onyx`, `sage`, and `shimmer`.
+- **ElevenLabs voices:** Use any ElevenLabs voice ID from your account (for example `21m00Tcm4TlvDq8ikWAM` for the public "Rachel" sample voice).
 
 #### `engine` (String) (optional)
 *AWS Polly only.*
@@ -86,18 +91,24 @@ The voice to use for speech synthesis. Defaults to `Joanna` when `provider` is `
 The speech synthesis engine to use. Can be `standard`, `neural`, `long-form`, or `generative`. Defaults to `standard`. Higher-end engines provide better quality but may incur higher usage costs.
 
 #### `provider` (String) (optional)
-Selects which backend performs the synthesis. Use `'aws-polly'` (default) for the existing AWS voices, or `'openai'` to access the GPT-4o mini TTS family.
+Selects which backend performs the synthesis. Use `'aws-polly'` (default) for the existing AWS voices, `'openai'` to access the GPT-4o mini TTS family, or `'elevenlabs'` to use ElevenLabs voices.
 
 #### `model` (String) (optional)
-*OpenAI provider only.*
+Specifies which TTS model to use for the selected provider.
 
-Specifies which OpenAI TTS model to use. Defaults to `gpt-4o-mini-tts`. Other available models include `tts-1` and `tts-1-hd`.
+- *OpenAI:* Defaults to `gpt-4o-mini-tts`. Other available models include `tts-1` and `tts-1-hd`.
+- *ElevenLabs:* Defaults to `eleven_multilingual_v2`. Other available models include `eleven_flash_v2_5`, `eleven_turbo_v2_5`, and `eleven_v3`.
 
 #### `response_format` (String) (optional)
 *OpenAI provider only.*
 
 Controls the output format when using OpenAI. Defaults to `mp3`, but you can request `wav`, `opus`, `aac`, `flac`, or `pcm` for different latency/quality characteristics.
 
+#### `output_format` (String) (optional)
+*ElevenLabs provider only.*
+
+Controls the output format when using ElevenLabs. Defaults to `mp3_44100_128`. See the ElevenLabs docs for supported presets (e.g. `pcm_16000`, `ulaw_8000`).
+
 #### `instructions` (String) (optional)
 *OpenAI provider only.*
 
@@ -174,6 +185,31 @@ A `Promise` that resolves to an `HTMLAudioElement`. The element’s `src` points
 ```
 
+Use ElevenLabs voices
+
+```html;ai-txt2speech-elevenlabs
+
+
+
+
+
+
+
+```
+
 Compare different engines
 
 ```html;ai-txt2speech-engines

From 6888fbdee0fc8e35e50068bd868d5ba341397f0b Mon Sep 17 00:00:00 2001
From: jelveh
Date: Fri, 21 Nov 2025 20:49:34 -0800
Subject: [PATCH 2/2] Add 11labs tts example to playground

---
 src/examples.js                               |  6 ++++++
 .../examples/ai-txt2speech-elevenlabs.html    | 20 +++++++++++++++++++
 2 files changed, 26 insertions(+)
 create mode 100644 src/playground/examples/ai-txt2speech-elevenlabs.html

diff --git a/src/examples.js b/src/examples.js
index b0afcf3..6794a25 100644
--- a/src/examples.js
+++ b/src/examples.js
@@ -151,6 +151,12 @@ const examples = [
         slug: 'ai-txt2speech-openai',
         source: '/playground/examples/ai-txt2speech-openai.html'
     },
+    {
+        title: 'Text to Speech with ElevenLabs',
+        description: 'Generate speech with ElevenLabs voices using Puter.js AI API. Run and experiment with this TTS example in the playground.',
+        slug: 'ai-txt2speech-elevenlabs',
+        source: '/playground/examples/ai-txt2speech-elevenlabs.html'
+    },
     {
         title: 'Text to Video',
         description: 'Generate videos from text with Puter.js AI API. Run and experiment with this text-to-video example in the playground.',
diff --git a/src/playground/examples/ai-txt2speech-elevenlabs.html b/src/playground/examples/ai-txt2speech-elevenlabs.html
new file mode 100644
index 0000000..4b171a4
--- /dev/null
+++ b/src/playground/examples/ai-txt2speech-elevenlabs.html
@@ -0,0 +1,20 @@
+
+
+
+
+
+
+
\ No newline at end of file
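Note on the series: the body of the ElevenLabs example did not survive in this copy of the patch, so the following is only a minimal sketch of how the options documented in patch 1/2 might be assembled per provider. It is not the example file from patch 2/2. `TTS_DEFAULTS`, `buildTtsOptions`, and `speak` are hypothetical helper names and are not part of Puter.js. The default values are copied from the documentation text above, and `puter.ai.txt2speech(text, options)` is assumed to behave as that page describes (a `Promise` resolving to an `HTMLAudioElement`). The `puter` object is passed in explicitly so the option-building part can be exercised outside a browser.

```javascript
// Hypothetical helper: per-provider defaults, copied from the patched docs.
const TTS_DEFAULTS = {
  'aws-polly': { voice: 'Joanna', engine: 'standard', language: 'en-US' },
  'openai': { voice: 'alloy', model: 'gpt-4o-mini-tts', response_format: 'mp3' },
  'elevenlabs': {
    voice: '21m00Tcm4TlvDq8ikWAM',
    model: 'eleven_multilingual_v2',
    output_format: 'mp3_44100_128',
  },
};

// Merge the documented defaults for a provider with caller overrides.
// Overrides win; an unknown provider is rejected up front.
function buildTtsOptions(provider = 'aws-polly', overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(TTS_DEFAULTS, provider)) {
    throw new Error(`Unsupported TTS provider: ${provider}`);
  }
  return { provider, ...TTS_DEFAULTS[provider], ...overrides };
}

// Assumption: `puter` is the global exposed by the Puter.js script, and
// txt2speech resolves to an audio element with a play() method.
async function speak(puter, text, provider, overrides) {
  const options = buildTtsOptions(provider, overrides);
  const audio = await puter.ai.txt2speech(text, options);
  await audio.play();
  return audio;
}
```

In a page that already loads Puter.js, a call might look like `speak(puter, 'Hello from ElevenLabs', 'elevenlabs', { model: 'eleven_flash_v2_5' })`. The keys accepted inside `voice_settings` are not spelled out in the patch, so they are left out of this sketch.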