🔊 Geeky Kokoro TTS for ComfyUI - 2025 Edition (Does not work with python 3.13)

The most comprehensive Kokoro TTS implementation for ComfyUI with ALL 54+ voices across 9 languages, voice blending, and advanced voice modification effects.

🌟 What's New in 2025 Edition

Ignore the Advanced Voice Mod node for now, it's an experimental thing currently.

Special Note: Advanced Voice Mod node currently under construction. Will not function as intended at the moment. Japanese voices do not work at the moment either, will require custom wheel.

Complete Rewrite from Ground Up

🌍 54+ Voices: Complete support for all Kokoro-82M voices across 9 languages
🎯 9 Languages: US English, UK English, Japanese (not working yet, needs a custom wheel build), Mandarin Chinese, Spanish, French, Hindi, Italian, Brazilian Portuguese
🔀 Advanced Voice Blending: Mix any two voices with adjustable blend ratios
**🐍 Python 3.12: Fully tested and optimized for the latest Python versions
📦 Modern Architecture: Completely rewritten following 2025 ComfyUI best practices
⚡ Improved Performance: Better memory management and processing speed
🛡️ Enhanced Reliability: Robust error handling and fallback mechanisms

Key Features

✅ ALL 54+ Kokoro-82M voices (nothing left out!)
✅ Voice blending with linear interpolation
✅ NEW: Guided Voice Morphing - Use any audio file to guide voice transformation
✅ NEW: Autotune-style Pitch Correction - Match pitch to reference audio
✅ NEW: Advanced Spectral Morphing - Match tone, timbre, and character
✅ NEW: 18 Voice Profiles - Professional presets for instant transformations
✅ Advanced voice modification effects (pitch, formant, reverb, etc.)
✅ Intelligent text chunking that preserves sentence order
✅ GPU acceleration with automatic CPU fallback
✅ Multi-language support with proper phoneme handling
✅ Professional audio processing pipeline with Dynamic Time Warping
✅ ComfyUI v3.49+ compatibility

🔧 Installation

Prerequisites

ComfyUI v3.49+ (or compatible version)
**Python 3.9, 3.10, 3.11, 3.12, (3.13+ not supported)
PyTorch 2.0+ (usually included with ComfyUI)
4GB+ RAM (8GB recommended for longer texts)
Optional: CUDA-capable GPU for faster processing

Method 1: ComfyUI Manager (Recommended)

Open ComfyUI and navigate to "Manager"
Click "Install Custom Nodes"
Search for "Geeky Kokoro TTS"
Click "Install" and restart ComfyUI
Done! Nodes will appear in the "audio" category

Method 2: Manual Installation (Git Clone)

# Navigate to your ComfyUI custom nodes directory
cd ComfyUI/custom_nodes

# Clone this repository
git clone https://github.com/GeekyGhost/ComfyUI-Geeky-Kokoro-TTS.git

# Navigate into the directory
cd ComfyUI-Geeky-Kokoro-TTS

# Install Python dependencies
pip install -r requirements.txt

# Optional: Run installation verification script
python install.py

Method 3: ComfyUI Portable (Windows)

REM Navigate to custom nodes directory
cd ComfyUI_windows_portable\ComfyUI\custom_nodes

REM Clone repository
git clone https://github.com/GeekyGhost/ComfyUI-Geeky-Kokoro-TTS.git

REM Navigate into directory
cd ComfyUI-Geeky-Kokoro-TTS

REM Install with portable Python
..\..\..\python_embeded\python.exe -m pip install -r requirements.txt

System Dependencies (Optional but Recommended)

For best phoneme processing, install espeak-ng:

Ubuntu/Debian:

sudo apt-get update
sudo apt-get install espeak-ng

macOS:

brew install espeak-ng

Windows: Download and install from: https://github.com/espeak-ng/espeak-ng/releases

🎭 Complete Voice List (54+ Voices)

🇺🇸 US English Voices (20 voices)

Female Voices (11)

Voice Name	Code	Character	Best For
Heart ❤️	`af_heart`	Warm, friendly, natural	Narration, audiobooks, general purpose
Bella 🔥	`af_bella`	Energetic, dynamic, engaging	Marketing, announcements, enthusiastic content
Nicole 🎧	`af_nicole`	Clear, professional, articulate	Training videos, tutorials, instructional content
Aoede 🎵	`af_aoede`	Musical, expressive, artistic	Creative content, storytelling, entertainment
Kore	`af_kore`	Balanced, versatile	General purpose, business content
Sarah	`af_sarah`	Neutral, calm, reliable	Documentation, formal content, reports
Nova ⭐	`af_nova`	Bright, modern, upbeat	Social media, vlogs, casual content
Sky ☁️	`af_sky`	Soft, gentle, soothing	Meditation, relaxation, ASMR
Alloy	`af_alloy`	Professional, authoritative	Corporate, presentations, business
Jessica	`af_jessica`	Friendly, approachable	Customer service, help content, guides
River 🌊	`af_river`	Flowing, natural, smooth	Long-form narration, podcasts

Male Voices (9)

Voice Name	Code	Character	Best For
Michael	`am_michael`	Deep, authoritative, commanding	Documentary, serious content, news
Fenrir 🐺	`am_fenrir`	Strong, bold, powerful	Action content, gaming, intense narration
Puck 🎭	`am_puck`	Playful, character-driven, versatile	Entertainment, comedy, character voices
Echo 🔊	`am_echo`	Clear, resonant, memorable	Announcements, radio-style content
Eric	`am_eric`	Reliable, professional	Business, training, educational content
Liam	`am_liam`	Modern, relatable, friendly	Casual content, social media, vlogs
Onyx 💎	`am_onyx`	Rich, deep, elegant	Premium content, luxury brands, sophistication
Adam	`am_adam`	Classic, versatile, dependable	General purpose, all-around use
Santa 🎅	`am_santa`	Warm, jolly, festive	Holiday content, cheerful narration

🇬🇧 UK English Voices (8 voices)

Female Voices (4)

Voice Name	Code	Character	Best For
Emma	`bf_emma`	Refined, elegant, sophisticated	Formal content, literature, high-end narration
Isabella	`bf_isabella`	Professional, articulate	Business, corporate, presentations
Alice 📚	`bf_alice`	Clear, storytelling, engaging	Children's content, education, books
Lily 🌸	`bf_lily`	Gentle, pleasant, approachable	General content, tutorials, friendly narration

Male Voices (4)

Voice Name	Code	Character	Best For
George	`bm_george`	Authoritative, professional, commanding	Business, education, serious content
Fable 📖	`bm_fable`	Narrative, expressive, storytelling	Audiobooks, tales, creative content
Lewis	`bm_lewis`	Reliable, clear, articulate	Training, documentation, instructional content
Daniel	`bm_daniel`	Modern, professional, versatile	General purpose, business, presentations

🇯🇵 Japanese Voices (5 voices) - Not Working, needs custom wheel

Voice Name	Code	Gender	Character	Best For
Hina ひな	`jf_hina`	Female	Gentle, youthful, sweet	Anime, casual content, friendly narration
Yuki 雪	`jf_yuki`	Female	Cool, elegant, refined	Formal content, professional narration
Sakura 桜	`jf_sakura`	Female	Warm, traditional, pleasant	Cultural content, storytelling
Sora 空	`jf_sora`	Female	Bright, energetic, cheerful	Entertainment, upbeat content
Kaito 海斗	`jm_kaito`	Male	Strong, confident, clear	News, serious content, professional narration

🇨🇳 Mandarin Chinese Voices (8 voices)

Female Voices (4)

Voice Name	Code	Character	Best For
Xiaoxiao 小小	`zf_xiaoxiao`	Gentle, friendly, approachable	General purpose, casual content
Yunxi 云希	`zf_yunxi`	Professional, clear, articulate	Business, news, formal content
Xiaoyi 小艺	`zf_xiaoyi`	Energetic, youthful, lively	Entertainment, social media
Xiaoxuan 小萱	`zf_xiaoxuan`	Warm, expressive, engaging	Storytelling, narration

Male Voices (4)

Voice Name	Code	Character	Best For
Yunyang 云扬	`zm_yunyang`	Strong, authoritative, commanding	News, serious content, professional
Yunfeng 云枫	`zm_yunfeng`	Calm, mature, reliable	Documentation, education
Yunhao 云昊	`zm_yunhao`	Clear, professional, articulate	Business, presentations
Yunxia 云霞	`zm_yunxia`	Versatile, balanced	General purpose content

🇪🇸 Spanish Voices (3 voices)

Voice Name	Code	Gender	Character	Best For
Sofia	`ef_sofia`	Female	Warm, friendly, engaging	General content, narration, education
Diego	`em_diego`	Male	Confident, clear, professional	Business, formal content, news
Carlos	`em_carlos`	Male	Friendly, approachable, versatile	Casual content, tutorials

🇫🇷 French Voice (1 voice)

Voice Name	Code	Gender	Character	Best For
Céline	`ff_celine`	Female	Elegant, refined, sophisticated	All French content, narration, professional

🇮🇳 Hindi Voices (4 voices)

Voice Name	Code	Gender	Character	Best For
Priya	`hf_priya`	Female	Friendly, warm, approachable	General content, education
Anjali	`hf_anjali`	Female	Professional, clear, articulate	Business, formal content
Arjun	`hm_arjun`	Male	Strong, confident, authoritative	News, serious content
Raj	`hm_raj`	Male	Friendly, versatile, engaging	General purpose, casual content

🇮🇹 Italian Voices (2 voices)

Voice Name	Code	Gender	Character	Best For
Giulia	`if_giulia`	Female	Expressive, warm, engaging	Narration, storytelling, general content
Marco	`im_marco`	Male	Confident, professional, clear	Business, formal content, presentations

🇧🇷 Brazilian Portuguese Voices (3 voices)

Voice Name	Code	Gender	Character	Best For
Lúcia	`pf_lucia`	Female	Warm, friendly, natural	General content, education, narration
João	`pm_joao`	Male	Professional, clear, reliable	Business, news, formal content
Pedro	`pm_pedro`	Male	Friendly, approachable, versatile	Casual content, tutorials, general purpose

🚀 Usage Guide

Basic Text-to-Speech

Add the Node: In ComfyUI, add "🔊 Geeky Kokoro TTS (2025)" node to your workflow
Enter Text: Type or paste your text in the multiline text field
Select Voice: Choose from 54+ voices in the dropdown
Adjust Speed: Set speed from 0.5x (slower) to 2.0x (faster)
GPU Option: Enable "use_gpu" if you have a CUDA-capable GPU
Generate: Connect to audio output or preview node

Voice Blending (Creating Unique Voices)

Voice blending allows you to create unique vocal characteristics by mixing two voices:

Enable Blending: Check the "enable_blending" checkbox
Select Second Voice: Choose a second voice from the dropdown
Adjust Blend Ratio:
- 1.0 = 100% primary voice (no blending)
- 0.7 = 70% primary, 30% secondary (subtle blend)
- 0.5 = 50/50 mix (balanced blend)
- 0.3 = 30% primary, 70% secondary (secondary dominant)
- 0.0 = 100% secondary voice

Blending Tips:

Mix voices from the same language for best results
Blend male + female voices for androgynous effects
Try Heart + Bella at 0.6 for energetic yet warm narration
Try Michael + Adam at 0.5 for rich, authoritative voice
Experiment with ratios to find your perfect voice!

🎵 Guided Voice Morphing (NEW!)

The game-changing feature that makes voices sing, match, and transform!

The Advanced Voice node now supports guided voice morphing - using a secondary audio file (like a song or reference voice) to guide the transformation of your TTS output. Perfect for:

Making TTS voices "sing" along to music
Matching tone and style of reference speakers
Creating autotune-style effects
Professional voice-over matching

How to Use Guided Morphing

Connect Guide Audio:
- Load your guide audio (song, reference voice, etc.)
- Connect it to the guide_audio input on the Advanced Voice node
Enable Morphing:
- Check the enable_guided_morph checkbox
Adjust Morph Parameters (0.0 to 1.0):
- Pitch Morph: Match pitch contour to guide audio (autotune effect)
- Formant Morph: Match vocal character and tone
- Spectral Morph: Match overall timbre and frequency balance
- Amplitude Morph: Match dynamics and volume envelope

Morphing Parameters Explained

Pitch Morph (0.0 - 1.0)

0.0: No pitch change (original TTS pitch)
0.3-0.5: Subtle pitch guidance (natural autotune)
0.7-0.9: Strong pitch matching (follows melody closely)
1.0: Complete pitch matching (perfect autotune)

Use Cases:

Music: 0.7-1.0 to make voice follow melody
Speech matching: 0.3-0.5 for natural intonation
Character voice: 0.0 (use manual pitch shift instead)

Formant Morph (0.0 - 1.0)

Matches the vocal tract characteristics
Affects perceived age, gender, and character
0.0: Original voice character
0.5: Blend of both voices
1.0: Fully matched character

Use Cases:

Voice cloning: 0.6-0.9
Gender transformation: 0.5-0.7
Age adjustment: 0.4-0.6

Spectral Morph (0.0 - 1.0)

Matches overall frequency spectrum and timbre
Affects "brightness", "warmth", and tonal quality
Most subtle but powerful for natural matching

Use Cases:

Microphone matching: 0.5-0.7
Tone matching: 0.6-0.8
Style transfer: 0.4-0.6

Amplitude Morph (0.0 - 1.0)

Matches volume dynamics and expression
Follows the energy and intensity patterns
Great for emotional expression

Use Cases:

Dynamic speech: 0.5-0.7
Singing expression: 0.6-0.8
Whisper/shout: 0.4-0.6

Guided Morphing Examples

Example 1: Make TTS Voice Sing

Setup:
1. Generate TTS with lyrics text
2. Load instrumental or vocal track as guide_audio
3. Enable guided morph
4. Set: pitch_morph=0.8, formant_morph=0.3, spectral_morph=0.4

Result: Voice follows melody while maintaining TTS character

Example 2: Clone Speaking Style

Setup:
1. Generate TTS with script
2. Load reference speaker audio as guide_audio
3. Enable guided morph
4. Set: pitch_morph=0.4, formant_morph=0.7, spectral_morph=0.6

Result: TTS matches speaking style and voice character

Example 3: Autotune Effect

Setup:
1. Generate TTS with any text
2. Load musical scale or melody as guide_audio
3. Enable guided morph
4. Set: pitch_morph=1.0, formant_morph=0.0, spectral_morph=0.2

Result: Perfect pitch-corrected robotic singing effect

Advanced Voice Effects

Connect the TTS output to "🎛️ Geeky Kokoro Advanced Voice (2025)" node for effects:

Preset Profiles (18 Total):

Original Profiles:

Cinematic: Deep, movie-trailer style (-3 semitones, reverb, compression)
Monster: Growling creature voice (-6 semitones, formant shift, distortion)
Robot: Mechanical, synthesized voice (band-pass filter, modulation)
Child: Young character voice (+3 semitones, formant shift)
Darth Vader: Deep, breathing villain voice (-4 semitones, echo, modulation)
Singer: Optimized for vocal content (compression, EQ, reverb)

NEW Profiles:

Alien: Otherworldly voice (-8 semitones, extreme formant shift, modulation)
Deep Voice: Professional bass voice (-5 semitones, bass boost)
Chipmunk: High-pitched cartoon voice (+6 semitones, formant shift up)
Telephone: Classic phone quality (300-3400Hz bandpass, compression)
Radio: Broadcast radio sound (100-5000Hz, compression, EQ)
Cathedral: Large reverberant space (heavy reverb, echo)
Cave: Echo chamber effect (reverb, echo with feedback)
Metallic: Robotic metallic sound (ring modulation, bandpass)
Whisper: Quiet breathy voice (noise, reduced bass)
Shout: Loud emphasized voice (compression, distortion, mid boost)
Custom: Full manual control of all parameters

Manual Controls:

Pitch Shift: ±12 semitones (0.1 step precision)
Formant Shift: Vocal tract size adjustment (-5 to +5)
Time Stretch: Speed without pitch change (0.5x to 2.0x)
Reverb: Room ambiance with room size control
Echo: Discrete repeats with adjustable feedback
Distortion: Harmonic saturation (0.0 to 1.0)
Compression: Dynamic range control
3-Band EQ: Bass, Mid, Treble (-1.0 to +1.0)
Brightness: High-frequency emphasis (-1.0 to +1.0)
Warmth: Low-frequency emphasis (-1.0 to +1.0)
Effect Blend: Mix with original audio (0.0 to 1.0)
Output Volume: -60dB to +60dB

⚙️ Technical Details

Model Information

Model: Kokoro-82M v0.19
Parameters: 82 million
Architecture: Decoder-only based on StyleTTS 2 + ISTFTNet
Sample Rate: 24kHz
License: Apache 2.0
Repository: hexgrad/Kokoro-82M

Performance Benchmarks

Processing Speed (Python 3.12, CUDA GPU):

Short text (< 200 chars): ~2-3 seconds
Medium text (200-800 chars): ~5-10 seconds
Long text (800+ chars): ~15-30 seconds
Voice blending: +20% processing time
Voice effects: +5-15% processing time
Guided morphing: +30-50% processing time (feature extraction + morphing)

Memory Usage:

Base model: ~2GB VRAM/RAM
With GPU acceleration: ~3GB VRAM
Voice effects processing: +500MB
Voice blending: +200MB temporary
Guided morphing: +800MB-1.5GB (feature extraction + DTW alignment)

Guided Morphing Technology

Feature Extraction:

Pitch Tracking: PYIN algorithm with autocorrelation fallback
Formant Analysis: LPC (Linear Predictive Coding) with Levinson-Durbin recursion
Spectral Envelope: Cepstral smoothing with liftering
Amplitude Envelope: RMS energy tracking
MFCC: 13-coefficient mel-frequency cepstral analysis

Morphing Algorithms:

Dynamic Time Warping (DTW): Aligns feature sequences between source and guide
Phase Vocoder: Time-varying pitch shifting with STFT
Spectral Transfer: Magnitude envelope morphing with phase preservation
Volume Matching: RMS-based amplitude envelope transfer

Supported Guide Audio:

Any sample rate (auto-resampling to 24kHz)
Mono or stereo (auto-converted to mono)
WAV, MP3, FLAC, OGG formats
Recommended: 16kHz+ sample rate for best results

Text Processing

Intelligent Chunking: Automatically splits long texts while preserving sentence order
Chunk Size: 350 characters (configurable)
Gap Insertion: 150ms natural pauses between chunks
Paragraph Awareness: Respects paragraph breaks and structure
Punctuation Handling: Proper sentence boundary detection

Supported Languages & Codes

a - American English
b - British English
j - Japanese
z - Mandarin Chinese
e - Spanish
f - French
h - Hindi
i - Italian
p - Brazilian Portuguese

🔧 Troubleshooting

Common Issues

"Kokoro import error"

Solution:

pip install --upgrade kokoro>=0.9.4

Voice not loading

Solution:

Restart ComfyUI completely
Check console for specific error messages
Ensure all dependencies are installed
Try reinstalling: pip install --force-reinstall kokoro

GPU out of memory

Solutions:

Disable "use_gpu" option
Reduce text length
Close other GPU-intensive applications
Use CPU mode for very long texts

Audio sounds distorted

Solutions:

Reduce "output_volume" in Voice Mod node
Lower "effect_blend" ratio (start at 0.3-0.5)
Reduce distortion and compression amounts
Check that input audio isn't already clipping

Python version issues

Solution:

python --version  # Check your version
# Must be 3.9, 3.10, 3.11, 3.12, or 3.13
pip install --upgrade pip
pip install -r requirements.txt --force-reinstall

espeak-ng not found

Solution:

Ubuntu/Debian: sudo apt-get install espeak-ng
macOS: brew install espeak-ng
Windows: Download from espeak-ng GitHub releases

Performance Tips

For long texts: Enable GPU acceleration
For short texts: CPU mode is often faster
Memory management: Process texts in batches if needed
Effect intensity: Start low (30-50%) and increase gradually
Voice blending: Keep both voices in the same language family

🤝 Contributing

Contributions are welcome! Areas where help is appreciated:

Additional voice profile presets
Performance optimizations
Bug reports and fixes
Documentation improvements
Testing on different platforms

📄 License & Credits

This Project

License: MIT License
Author: GeekyGhost
Repository: https://github.com/GeekyGhost/ComfyUI-Geeky-Kokoro-TTS

Kokoro TTS Model

License: Apache 2.0
Author: hexgrad
Model: https://huggingface.co/hexgrad/Kokoro-82M

Dependencies

librosa: Audio processing (ISC License)
scipy: Scientific computing (BSD License)
PyTorch: Deep learning framework (BSD License)
soundfile: Audio I/O (BSD License)

Special Thanks

hexgrad for the incredible Kokoro-82M model
ComfyUI Team for the amazing framework
Community testers and contributors
Audio processing library developers

📚 Research & Resources

Useful Links

Kokoro Model Page: https://huggingface.co/hexgrad/Kokoro-82M
ComfyUI Documentation: https://docs.comfy.org
Issue Tracker: https://github.com/GeekyGhost/ComfyUI-Geeky-Kokoro-TTS/issues
Discussions: https://github.com/GeekyGhost/ComfyUI-Geeky-Kokoro-TTS/discussions

Research Papers & References

StyleTTS 2 architecture
ISTFTNet vocoder
Phase vocoder techniques
Voice morphing and blending

🌟 Quick Start Examples

Example 1: Basic Narration

Node: Geeky Kokoro TTS (2025)
Text: "Welcome to my tutorial on advanced AI techniques."
Voice: 🇺🇸 🚺 Nicole 🎧
Speed: 1.0
GPU: true

Example 2: Character Voice with Effects

Node 1: Geeky Kokoro TTS (2025)
Voice: 🇺🇸 🚹 Puck 🎭
Text: "The villain laughed menacingly."

Node 2: Geeky Kokoro Advanced Voice
Profile: Monster
Intensity: 0.7

Example 3: Blended Voice for Unique Sound

Node: Geeky Kokoro TTS (2025)
Voice: 🇺🇸 🚺 Heart ❤️
Enable Blending: true
Second Voice: 🇺🇸 🚺 Bella 🔥
Blend Ratio: 0.6
Text: "This creates a warm yet energetic voice perfect for marketing."

Made with ❤️ for the ComfyUI community

Enjoy natural, high-quality text-to-speech with 54+ voices and unlimited creative possibilities! 🎉

Name		Name	Last commit message	Last commit date
Latest commit History 55 Commits
.github/workflows		.github/workflows
examples		examples
models		models
GeekyKokoroVoiceModNode.py		GeekyKokoroVoiceModNode.py
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
audio_feature_extraction.py		audio_feature_extraction.py
audio_utils.py		audio_utils.py
guided_voice_morph.py		guided_voice_morph.py
install.py		install.py
node.py		node.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
voice_profiles_utils.py		voice_profiles_utils.py

License

GeekyGhost/ComfyUI-Geeky-Kokoro-TTS

Folders and files

Latest commit

History

Repository files navigation

🔊 Geeky Kokoro TTS for ComfyUI - 2025 Edition (Does not work with python 3.13)

🌟 What's New in 2025 Edition

Complete Rewrite from Ground Up

Key Features

📋 Table of Contents

🔧 Installation

Prerequisites

Method 1: ComfyUI Manager (Recommended)

Method 2: Manual Installation (Git Clone)

Method 3: ComfyUI Portable (Windows)

System Dependencies (Optional but Recommended)

🎭 Complete Voice List (54+ Voices)

🇺🇸 US English Voices (20 voices)

Female Voices (11)

Male Voices (9)

🇬🇧 UK English Voices (8 voices)

Female Voices (4)

Male Voices (4)

🇯🇵 Japanese Voices (5 voices) - Not Working, needs custom wheel

🇨🇳 Mandarin Chinese Voices (8 voices)

Female Voices (4)

Male Voices (4)

🇪🇸 Spanish Voices (3 voices)

🇫🇷 French Voice (1 voice)

🇮🇳 Hindi Voices (4 voices)

🇮🇹 Italian Voices (2 voices)

🇧🇷 Brazilian Portuguese Voices (3 voices)

🚀 Usage Guide

Basic Text-to-Speech

Voice Blending (Creating Unique Voices)

🎵 Guided Voice Morphing (NEW!)

How to Use Guided Morphing

Morphing Parameters Explained

Pitch Morph (0.0 - 1.0)

Formant Morph (0.0 - 1.0)

Spectral Morph (0.0 - 1.0)

Amplitude Morph (0.0 - 1.0)

Guided Morphing Examples

Example 1: Make TTS Voice Sing

Example 2: Clone Speaking Style

Example 3: Autotune Effect

Advanced Voice Effects

Preset Profiles (18 Total):

Manual Controls:

⚙️ Technical Details

Model Information

Performance Benchmarks

Guided Morphing Technology

Text Processing

Supported Languages & Codes

🔧 Troubleshooting

Common Issues

"Kokoro import error"

Voice not loading

GPU out of memory

Audio sounds distorted

Python version issues

espeak-ng not found

Performance Tips

🤝 Contributing

📄 License & Credits

This Project

Kokoro TTS Model

Dependencies

Special Thanks

📚 Research & Resources

Useful Links

Research Papers & References

🌟 Quick Start Examples

Example 1: Basic Narration

Example 2: Character Voice with Effects

Example 3: Blended Voice for Unique Sound

About

Resources

Packages