aman-tugnawat/Local_TTS

🎙️ Local TTS

A local text-to-speech application powered by Qwen3-TTS with a premium React web UI.

Generate natural, expressive speech locally — custom voices, voice design from natural language descriptions, and voice cloning from audio samples.


✨ Features

| Feature | Description |
| --- | --- |
| 🎤 Custom Voice | 10+ built-in speakers with 10 languages and emotional control |
| 🎨 Voice Design | Describe your ideal voice in plain English; the AI creates it |
| 🔄 Clone Speaker | Save a custom voice and use it to generate new speech |
| 📚 Voice Library | Save, manage, and reuse your custom voice profiles |
| Ultra Low Latency | Streaming generation with end-to-end latency as low as 97ms |
| 🌍 10 Languages | Chinese, English, Japanese, Korean, German, French, and more |
| 🖥️ Premium Web UI | Dark-themed React interface with glassmorphism and animations |
| 🍎 Apple Silicon | Drop-in Metal GPU acceleration (MPS) for Mac users |

📂 Project Structure

This project is organized into two independent modules:

| Module | Description | Guide |
| --- | --- | --- |
| 🔧 backend/ | FastAPI server + Qwen3-TTS engine + CLI | Setup Guide → |
| 🎨 frontend/ | React web UI (Vite + Zustand) | Setup Guide → |

Each module can be developed and tested independently. See their respective READMEs for setup instructions.


🚀 Quick Start

Prerequisites: npm (installed with Node.js; see Requirements below)

Development (Two Terminals)

Run the backend and frontend separately to get live reloading on both ends.

# Terminal 1: Start backend
cd backend
make install
make serve

# Terminal 2: Start frontend dev server
cd frontend
npm install
npm run dev

Open http://localhost:5173 — the frontend automatically proxies API requests to the backend.

Production (Single Process)

Run both frontend and backend together via the root Makefile for an easy one-click setup.

On the first run, install the npm dependencies before starting the server:

npm install
make serve

Open http://localhost:8765 to view the application in production mode.


📁 Storage Locations

All data is stored under ~/.local-tts/ by default. Override with the LOCAL_TTS_DATA_DIR environment variable or the --data-dir CLI flag.
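
The override behaviour described above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the function names `resolve_data_dir` and `storage_layout` are hypothetical, and the precedence order (CLI flag first, then environment variable, then default) is an assumption, since the README does not say which override wins.

```python
import os
from pathlib import Path

DEFAULT_DATA_DIR = Path.home() / ".local-tts"


def resolve_data_dir(cli_data_dir=None, environ=None):
    """Return the data directory to use.

    Assumed precedence (not confirmed by the README):
    --data-dir flag, then LOCAL_TTS_DATA_DIR, then ~/.local-tts.
    """
    environ = os.environ if environ is None else environ
    if cli_data_dir:
        return Path(cli_data_dir).expanduser()
    env_value = environ.get("LOCAL_TTS_DATA_DIR")
    if env_value:
        return Path(env_value).expanduser()
    return DEFAULT_DATA_DIR


def storage_layout(data_dir):
    """Map the storage locations listed in this section onto a data directory."""
    data_dir = Path(data_dir)
    return {
        "models": data_dir / "models",
        "voices": data_dir / "voices",
        "history": data_dir / "history",
        "config": data_dir / "config.json",
    }
```

Calling `storage_layout(resolve_data_dir())` with no overrides yields the default paths under ~/.local-tts/.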

| Path | Contents |
| --- | --- |
| ~/.local-tts/models/ | Downloaded model weights |
| ~/.local-tts/voices/ | Saved voice profiles & cloned voice samples |
| ~/.local-tts/history/ | Generated audio files (up to 100, then auto-pruned) |
| ~/.local-tts/config.json | Persisted server configuration |
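
To illustrate the 100-file limit listed for ~/.local-tts/history/, here is one way such pruning could work. This is a sketch under assumptions: the README does not say which files are removed first, so deleting the oldest by modification time is a guess, and `prune_history` is a hypothetical name rather than a function from the project.

```python
from pathlib import Path

MAX_HISTORY_FILES = 100  # limit stated for the history folder


def prune_history(history_dir, max_files=MAX_HISTORY_FILES):
    """Delete the oldest files so at most max_files remain.

    Returns the list of deleted paths. "Oldest" is judged by
    modification time, which is an assumption.
    """
    files = [p for p in Path(history_dir).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime)  # oldest first
    excess = files[: max(0, len(files) - max_files)]
    for path in excess:
        path.unlink()
    return excess
```

Running it on a folder that is already within the limit deletes nothing and returns an empty list.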

Disk tip: Models are large (2.5–6.5 GB each). Delete unwanted model folders from ~/.local-tts/models/ to reclaim space.
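
Before deleting anything, it can help to see how much space each model folder takes. The script below is a small standalone sketch using only the Python standard library; `folder_sizes_gb` is a hypothetical helper, not part of the project, and it assumes each model lives in its own subfolder of ~/.local-tts/models/.

```python
from pathlib import Path


def folder_sizes_gb(models_dir):
    """Map each immediate subfolder name to its total size in GB (10**9 bytes)."""
    sizes = {}
    for folder in Path(models_dir).expanduser().iterdir():
        if folder.is_dir():
            total = sum(f.stat().st_size for f in folder.rglob("*") if f.is_file())
            sizes[folder.name] = total / 1e9
    return sizes


if __name__ == "__main__":
    models_dir = Path("~/.local-tts/models").expanduser()
    if models_dir.is_dir():
        # Print the largest folders first.
        ranked = sorted(folder_sizes_gb(models_dir).items(), key=lambda kv: -kv[1])
        for name, gb in ranked:
            print(f"{gb:6.2f} GB  {name}")
```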


⚠️ Requirements

  • Python ≥ 3.11
  • Node.js ≥ 18 (for building the web UI)
  • GPU (recommended): NVIDIA GPU with ≥ 8 GB VRAM, or Apple Silicon Mac with Metal support
  • CPU mode: Works but inference is significantly slower
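
The hardware options above map naturally onto a device-selection step. The sketch below assumes the backend runs on PyTorch, which the README does not state explicitly; the preference order (CUDA, then MPS, then CPU) and the names `pick_device` and `detect_device` are illustrative assumptions, not the project's code.

```python
def pick_device(cuda_available, mps_available):
    """Prefer an NVIDIA GPU, then Apple Metal (MPS), then CPU (assumed order)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device():
    """Probe PyTorch if it is installed; otherwise fall back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds may lack the MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_available = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_available)
```

Keeping the choice in a pure function (`pick_device`) makes the fallback order easy to check without any GPU present.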

📄 License

MIT License
