Release v4.19.0 - Qwen3-TTS Engine with Voice Designer · diodiogod/TTS-Audio-Suite

🎨 Qwen3-TTS Engine - Create Voices from Text!

Major new engine addition! Qwen3-TTS brings a unique Voice Designer feature that lets you create custom voices from natural language descriptions. Plus three distinct model types for different use cases!

✨ New Features

Qwen3-TTS Engine

🎨 Voice Designer - Create custom voices from text descriptions! "A calm female voice with British accent" → instant voice generation
Three model types with different capabilities:
- CustomVoice: 9 high-quality preset speakers (Vivian, Serena, Dylan, Eric, Ryan, etc.)
- VoiceDesign: Text-to-voice creation - describe your ideal voice and generate it
- Base: Zero-shot voice cloning from audio samples
Support for 10 languages - Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian
Model sizes: 0.6B (low VRAM) and 1.7B (high quality) variants
Character voice switching with [CharacterName] syntax - automatic preset mapping
SRT subtitle timing support with all timing modes (stretch_to_fit, pad_with_silence, etc.)
Inline edit tags - Apply Step Audio EditX post-processing (emotions, styles, paralinguistic effects)
Sage attention support - Improved VRAM efficiency with sageattention backend
Smart caching - Prevents duplicate voice generation, skips model loading for existing voices
Per-segment parameters - Control [seed:42], [temperature:0.8] inline
Auto-download system - All 6 model variants downloaded automatically when needed
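
To illustrate how the inline syntax above might be read, here is a minimal sketch of a tag parser. It is not the suite's actual implementation: the `Segment` class, `parse_segments` function, and the rule that a character tag resets the parameters are assumptions made for illustration. Only the `[CharacterName]`, `[seed:42]`, and `[temperature:0.8]` forms come from the release notes.

```python
import re
from dataclasses import dataclass, field

# Matches one bracketed tag such as [Alice], [seed:42] or [temperature:0.8].
TAG_RE = re.compile(r"\[([^\[\]]+)\]")

# Assumption: only the two parameters shown in the release notes are recognised.
PARAM_CASTS = {"seed": int, "temperature": float}


@dataclass
class Segment:
    """Hypothetical container for one chunk of text and its settings."""
    character: str
    text: str
    params: dict = field(default_factory=dict)


def _append(segments, character, text, params):
    """Store a segment only if there is non-whitespace text to speak."""
    text = text.strip()
    if text:
        segments.append(Segment(character, text, dict(params)))


def parse_segments(script, default_character="narrator"):
    """Split a script into segments, applying each tag to the text after it."""
    segments = []
    character = default_character
    params = {}
    pos = 0
    for match in TAG_RE.finditer(script):
        # Flush the text that precedes this tag with the current settings.
        _append(segments, character, script[pos:match.start()], params)
        pos = match.end()
        body = match.group(1).strip()
        key, sep, value = body.partition(":")
        key = key.strip().lower()
        if sep and key in PARAM_CASTS:
            # Parameter tag: applies to the text that follows.
            params = {**params, key: PARAM_CASTS[key](value.strip())}
        else:
            # Anything else is treated as a character switch (assumption),
            # which also clears the per-segment parameters.
            character = body
            params = {}
    _append(segments, character, script[pos:], params)
    return segments


if __name__ == "__main__":
    demo = "[Alice] Hello there. [seed:42] [temperature:0.8] How are you? [Bob] Fine."
    for seg in parse_segments(demo):
        print(seg)
```

In this sketch a parameter tag stays active until the next character tag. The real nodes may scope parameters differently, so check the suite's documentation before relying on that behaviour.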

🎙️ Voice Designer Node

The standout feature of this release! Create voices without audio samples:

Natural language input - Describe voice characteristics in plain English
Disk caching - Saved voices load instantly without regeneration
Standard format - Works seamlessly with Character Voices system
Unified output - Compatible with all TTS nodes via NARRATOR_VOICE format

Example descriptions:

"A calm female voice with British accent"
"Deep male voice, authoritative and professional"
"Young cheerful woman, slightly high-pitched"
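
The disk caching idea can be sketched as follows: derive a stable key from the description and reuse the stored result when that key already exists. This is a hedged illustration only. The function names, the `.bin` file layout, and the `generate` callback are hypothetical and do not reflect how the Voice Designer node actually stores voices.

```python
import hashlib
import json
from pathlib import Path


def voice_cache_key(description, model_size="1.7B"):
    """Build a stable key from the description and model size (assumed inputs)."""
    payload = json.dumps(
        {
            # Collapse whitespace so trivially different inputs share a key.
            "description": " ".join(description.split()),
            "model_size": model_size,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def get_or_create_voice(description, cache_dir, generate, model_size="1.7B"):
    """Return cached voice bytes, calling `generate` only on a cache miss.

    `generate` is a hypothetical callable that takes the description and
    returns the voice data as bytes.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{voice_cache_key(description, model_size)}.bin"
    if path.exists():
        return path.read_bytes()  # cache hit: no regeneration needed
    data = generate(description)
    path.write_bytes(data)
    return data
```

Keying on the model size as well as the description keeps a voice designed with one variant from being served for another, which seems a reasonable default for a cache of this kind.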

📚 Documentation

YAML-driven engine tables - Auto-generated comparison tables
Condensed engine overview in README
Portuguese accent guidance - Clear documentation of model limitations and workarounds

🎯 Technical Highlights

Official Qwen3-TTS implementation bundled for stability
24kHz mono audio output
Progress bars with real-time token generation tracking
VRAM management with automatic model reload and device checking
Full unified architecture integration
Interrupt handling for cancellation support
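
As a rough illustration of how the 24kHz mono output interacts with the `pad_with_silence` SRT timing mode listed under New Features, the sketch below pads a segment with zeros until it fills its subtitle slot. The function names and the plain-list sample representation are assumptions; the suite's own timing code is not shown in these notes and likely works on tensors.

```python
SAMPLE_RATE = 24_000  # 24 kHz mono, as stated in the technical highlights


def srt_time_to_seconds(stamp):
    """Convert an SRT timestamp such as '00:00:01,500' to seconds."""
    hours, minutes, rest = stamp.split(":")
    seconds, millis = rest.split(",")
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def pad_with_silence(samples, start, end):
    """Append silence so the audio fills the subtitle slot from start to end.

    Audio that is already longer than the slot is returned unchanged;
    shortening it would be the job of a different mode such as stretch_to_fit.
    """
    slot_seconds = srt_time_to_seconds(end) - srt_time_to_seconds(start)
    slot_samples = round(slot_seconds * SAMPLE_RATE)
    missing = max(0, slot_samples - len(samples))
    return list(samples) + [0.0] * missing
```

For example, one second of audio placed in a slot from `00:00:01,000` to `00:00:02,500` would gain half a second of trailing silence under these assumptions.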

Qwen3-TTS brings the suite to a total of 10 TTS engines, each with unique capabilities. Voice Designer is a first-of-its-kind feature in ComfyUI TTS extensions!
