Vox (GitHub Repository) - A Voice Therapy Coach for Trans Individuals
Vox is an open-source, affirming voice training web app designed specifically for trans people. It empowers users to explore, analyze, and develop their authentic voice through real-time feedback, personalized coaching, and inclusive design.
Created with pride by Shelbeely (Linktree), a trans woman developer.
Licensed under GPL-3.0 — share the love, keep it open!
-
Real-Time Voice Analysis:
- Detects pitch, harmonic-to-noise ratio (HNR), harmonics, and formants live as you speak or sing.
- Visualizes data with charts and gauges for instant feedback.
-
Personalized Coaching:
- Integrates with OpenRouter LLM API to provide supportive, pronoun-aware feedback tailored to your goals.
- Chat with Vox, your affirming AI coach, for guidance and encouragement.
-
Pronoun Inclusivity:
- Extensive pronoun options, including neopronouns and mixed sets.
- Dynamic pronoun handling in feedback and chat, respecting your identity.
-
Target Pitch Practice:
- Play a reference tone to help you match your desired pitch.
- Visual cues indicate if your pitch is within your target range.
-
Session History:
- Stores past recordings and analysis results locally.
- Review your progress over time and replay recordings.
-
User Customization:
- Set your name, pronouns, and target gender presentation.
- Adjust target pitch frequency.
-
Security & Fair Use:
- CSRF protection, rate limiting, and session management.
- Cleans up old recordings automatically.
-
Backend:
- Python Flask app (
app.py) serving REST API and Socket.IO for real-time communication. - Uses
librosaandaubiofor audio analysis (pitch, harmonics, formants). - Stores user info and vocal data in SQLite (
vocal_data.db). - Connects to OpenRouter API for LLM-powered feedback and chat.
- Python Flask app (
-
Frontend:
- HTML5 interface (
index.html) with accessible controls and visualizations. - JavaScript (
scripts.js) handles audio capture, visualization, Socket.IO events, and API calls. - Styled with CSS (
styles.css), including responsive design and animations.
- HTML5 interface (
- User records voice via browser microphone.
- Audio is streamed to backend via Socket.IO.
- Backend analyzes audio, extracts pitch, HNR, harmonics, formants.
- Frontend updates charts and gauges in real time.
- User can save recordings, which are stored on server and linked to their session.
- LLM generates personalized feedback based on vocal metrics and user profile.
- User can chat with Vox for additional support.
- Python 3.8+
- Node.js (for frontend development, optional)
- pip packages:
flask flask-socketio flask-wtf flask-limiter requests numpy librosa aubio gunicorn eventlet
Create a .env file or set environment variables:
FLASK_SECRET_KEY— secret key for Flask sessionsOPENROUTER_API_KEY— your OpenRouter API key for LLM access
- Clone the repository
git clone https://github.com/shelbeely/vox.git
cd vox/0.1.3- Install Python dependencies
pip install flask flask-socketio flask-wtf flask-limiter requests numpy librosa aubio gunicorn eventlet- Run the app with Gunicorn + eventlet
gunicorn --worker-class eventlet -w 1 app:app- Access Vox
Open your browser and navigate to http://localhost:5000
- Click the mic button to start recording.
- Speak or sing; watch your pitch, HNR, harmonics, and formants update live.
- Click stop to end recording.
- Review your performance in the history list.
- Play back saved recordings.
- Enter your name and select your pronouns (including neopronouns or mixed sets).
- Choose your target gender (feminine, masculine, or unspecified).
- Save your info to personalize feedback and chat.
- Set a target pitch frequency (Hz).
- Click start target pitch to hear a reference tone.
- Match your voice to the tone; visual cues help guide you.
- Click stop target pitch to silence the tone.
- Type messages in the chat box to talk with Vox.
- Receive affirming, pronoun-aware responses.
- After recordings, Vox provides detailed vocal feedback powered by LLM.
- View past performances with pitch, HNR, harmonics, and formants.
- Play back recordings.
- Clear history to start fresh.
- Backend: Python, Flask, Flask-SocketIO, SQLite, librosa, aubio, requests
- Frontend: HTML5, CSS3, JavaScript, Socket.IO, Tone.js, Chart.js, JustGage, Raphael
- AI Integration: OpenRouter API with DeepSeek Chat model
- Security: Flask-WTF CSRF, Flask-Limiter rate limiting
- License: GPL-3.0
- Designed with trans inclusivity and accessibility at its core.
- Modular codebase:
app.py— backend server, API, audio analysis, LLM integrationindex.html— UI layoutscripts.js— client logic, audio capture, visualizationstyles.css— styling and animations
- Real-time audio analysis combines browser-side (Tone.js) and server-side (librosa, aubio) processing.
- LLM prompts are carefully crafted to respect pronouns and uplift users.
- Cleans up old recordings after ~40 days to save space.
- Contributions welcome! Please respect the GPL license and trans-affirming mission.
This project is licensed under the GNU General Public License v3.0 (GPL-3.0).
You are free to use, modify, and share it, but keep it open and respect the community.
With love and pride,
Shelbeely