semyenov/silero-tts-generator

🎙️ Silero TTS Text-to-Speech Generator

🌟 Overview

A Python text-to-speech generator built on Silero TTS models, for local and remote speech synthesis. Features:

  • 🌐 Multiple language support (Russian, English, German)
  • 📝 Advanced SSML text processing
  • 🔊 Intelligent noise reduction
  • 💻 GPU/CPU compatibility
  • 🎭 Flexible speaker selection
  • 🌪️ Tornado-based API server for remote TTS generation

📂 Project Structure

  • __main__.py: 🚀 Example script demonstrating local TTS usage
  • silero_tts_processor.py: 🧠 Core TTS processor class
  • tts_server.py: 🌐 Tornado-based API server for remote TTS generation
  • test_request.sh: 🧪 Bash script for testing the TTS API
  • requirements.txt: 📦 Project dependencies

🛠️ Prerequisites

  • 🐍 Python 3.8+
  • 🚀 CUDA (optional, for GPU acceleration)
  • 🌐 curl for API testing (optional)
  • 📊 jq for JSON parsing (optional)

🚀 Installation

  1. Clone the repository:
git clone https://github.com/semyenov/silero-tts-generator.git
cd silero-tts-generator
  2. Create a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
  3. Install dependencies:
pip install -r requirements.txt

🎬 Local Usage

from silero_tts_processor import SileroTTSProcessor

# Create TTS processor
tts = SileroTTSProcessor(
    language_id="ru",
    model_id="v4_ru",
)

# Generate speech
audio = tts.generate_speech(
    "<speak>Привет, мир</speak>",
    speaker_id="xenia",
    enhance_noise=True,
    output_filename="output.wav"
)
tts.play_audio(audio)
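
The example above passes SSML wrapped in <speak> tags. When the text comes from user input, characters such as & and < should be escaped before wrapping. A minimal sketch using only the standard library; to_ssml is a hypothetical helper, not part of this repository, and whether the Silero SSML parser accepts the escaped entity forms is an assumption:

```python
from xml.sax.saxutils import escape


def to_ssml(text: str) -> str:
    """Wrap plain text in <speak> tags, escaping XML special characters.

    Hypothetical helper for illustration; not part of silero_tts_processor.
    """
    stripped = text.strip()
    # Leave the input alone if the caller already supplied a <speak> document.
    if stripped.startswith("<speak>") and stripped.endswith("</speak>"):
        return stripped
    # escape() replaces &, < and > with their XML entities.
    return f"<speak>{escape(stripped)}</speak>"


if __name__ == "__main__":
    print(to_ssml("Привет, мир"))
    print(to_ssml("Tom & Jerry"))
```

The result can be passed as the first argument to generate_speech in place of a hand-written SSML string.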

🌐 API Server Usage

Start the Tornado API server:

python tts_server.py

🧪 API Testing with Bash Script

The Bash script test_request.sh sends test requests to the TTS API:

# Basic usage
./test_request.sh -t "Привет, мир"

# Advanced usage with custom parameters
./test_request.sh \
    -t "<speak>Привет, мир</speak>" \
    -s xenia

Script options:

  • -t: Text to convert to speech (required)
  • -s: Speaker (default: xenia)
  • -h: Show help message

📄 API Endpoints

🎙️ Generate TTS

POST /tts

Request Body:

{
  "text": "<speak>Текст для синтеза речи</speak>",
  "speaker": "xenia",
  "enhance_noise": true
}

Response:

{
  "success": true,
  "filename": "generated_audio_file.wav"
}
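
The request and response shapes above can also be exercised from Python. A minimal sketch using only the standard library. The base URL (including port 8888) is an assumption, so adjust it to the address tts_server.py actually listens on; the helper names are hypothetical, and treating a false "success" value as an error is also an assumption about the server's failure behavior:

```python
import json
import urllib.request

# Assumption: replace with the host and port tts_server.py listens on.
BASE_URL = "http://localhost:8888"


def build_tts_request(text, speaker="xenia", enhance_noise=True, base_url=BASE_URL):
    """Build a POST /tts request with the JSON body documented above."""
    payload = {"text": text, "speaker": speaker, "enhance_noise": enhance_noise}
    return urllib.request.Request(
        f"{base_url}/tts",
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_tts_response(body: bytes) -> str:
    """Return the generated filename, or raise if "success" is not true."""
    result = json.loads(body.decode("utf-8"))
    if not result.get("success"):
        raise RuntimeError(f"TTS request failed: {result}")
    return result["filename"]


def request_tts(text, **kwargs):
    """Send the request to a running server and return the filename."""
    with urllib.request.urlopen(build_tts_request(text, **kwargs)) as resp:
        return parse_tts_response(resp.read())
```

With the server running, request_tts("<speak>Привет, мир</speak>") would return the filename reported in the response.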

🔍 Retrieve Audio File

GET /audio/{filename}

Returns the audio file named by the "filename" field of the POST /tts response.
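
Downloading a generated file from this endpoint might look like the following sketch. It uses only the standard library; the base URL (including port 8888) is an assumption, and audio_url and download_audio are hypothetical helper names:

```python
import shutil
import urllib.parse
import urllib.request

# Assumption: replace with the host and port tts_server.py listens on.
BASE_URL = "http://localhost:8888"


def audio_url(filename, base_url=BASE_URL):
    """Build the GET /audio/{filename} URL, percent-encoding the filename."""
    return f"{base_url}/audio/{urllib.parse.quote(filename, safe='')}"


def download_audio(filename, dest=None, base_url=BASE_URL):
    """Stream the audio file to disk and return the local path."""
    dest = dest or filename
    with urllib.request.urlopen(audio_url(filename, base_url)) as resp, \
            open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest
```

Percent-encoding the filename keeps spaces and slashes from changing the request path.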

🌐 Supported Languages

  • Russian
  • English
  • German
  • ...

Full list of supported languages and models can be found here.

🔧 Troubleshooting

  • Ensure you have the latest version of PyTorch
  • Check CUDA compatibility if using GPU
  • Verify audio device settings
  • Make sure curl and jq are installed for API testing
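
Several of the checks above can be scripted. A rough sketch, assuming a hypothetical check_environment helper (not part of this repository); it reports whether a tool or package is present, not whether its version is the latest:

```python
import importlib.util
import shutil
import sys


def check_environment():
    """Collect basic facts about the local setup for troubleshooting."""
    report = {
        "python_ok": sys.version_info >= (3, 8),  # README asks for Python 3.8+
        "curl": shutil.which("curl") is not None,  # needed only for API testing
        "jq": shutil.which("jq") is not None,      # needed only for API testing
        "torch": None,  # filled in below when PyTorch is installed
        "cuda": None,
    }
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = torch.__version__
        report["cuda"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```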

📄 License

MIT License

🤝 Contributing

Pull requests are welcome. For major changes, please open an issue first.
