ExactTranscriber is a Streamlit application for transcribing audio and for editing and managing the resulting transcripts. It uses Google's Gemini API for transcription and provides an interface for refining results and exporting them in various formats.
This tool is ideal for journalists, researchers, students, podcasters, and anyone who needs to convert spoken audio into text efficiently. Whether you're transcribing interviews, lectures, meetings, or personal notes, ExactTranscriber aims to streamline your workflow.
ExactTranscriber offers a comprehensive suite of features for a smooth transcription experience:
- Audio File Upload:
- Easily upload your audio files directly through the web interface.
- Supported formats: MP3, WAV, M4A, FLAC, OGG.
- Advanced Transcription:
- Utilizes Google's Gemini API for state-of-the-art speech-to-text conversion.
- Option to select between different Gemini models (e.g., "Gemini 2.0 Flash", "Gemini 2.5 Flash") to balance speed and accuracy based on your needs.
- In-App Transcript Editor:
- A built-in text editor allows for immediate review and correction of the generated transcript.
- Make changes, fix errors, and refine speaker labels directly within the application.
- Flexible Export Options:
- Download your original or edited transcript in multiple formats:
- TXT: Plain text for easy sharing and universal compatibility.
- SRT: SubRip Subtitle format, perfect for video captions.
- JSON: Structured data format for programmatic use or integration with other tools.
- Efficient Handling of Large Files:
- Automatic audio chunking for files exceeding a configurable size (e.g., 20MB), ensuring reliable processing of longer recordings.
- Contextual Information:
- Option to provide context like audio type (podcast, interview), topic, description, and number of speakers to improve transcription accuracy.
- Password Protection:
- Includes a basic password authentication mechanism for self-hosted instances to secure access.
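The automatic chunking mentioned in the feature list can be pictured as a planning step that decides where to cut a recording. The following is a minimal sketch, not the application's actual implementation: the function name `plan_chunks`, its parameters, and the assumption of a roughly constant bitrate (so that equal time slices give roughly equal byte sizes) are all illustrative.

```python
import math

# Hypothetical default mirroring the 20MB example threshold from the feature list.
DEFAULT_MAX_CHUNK_BYTES = 20 * 1024 * 1024


def plan_chunks(file_size_bytes, duration_seconds, max_chunk_bytes=DEFAULT_MAX_CHUNK_BYTES):
    """Return (start, end) offsets in seconds for each chunk.

    Assumes a roughly constant bitrate, so splitting the duration evenly
    keeps every chunk at or below max_chunk_bytes.
    """
    if file_size_bytes <= 0 or duration_seconds <= 0:
        raise ValueError("file size and duration must be positive")
    if max_chunk_bytes <= 0:
        raise ValueError("max_chunk_bytes must be positive")

    # Smallest number of equal slices that keeps each slice within the limit.
    chunk_count = math.ceil(file_size_bytes / max_chunk_bytes)
    chunk_length = duration_seconds / chunk_count

    chunks = []
    for index in range(chunk_count):
        start = index * chunk_length
        # Pin the final boundary to the full duration to avoid float drift.
        is_last = index == chunk_count - 1
        end = duration_seconds if is_last else (index + 1) * chunk_length
        chunks.append((start, end))
    return chunks
```

Under these assumptions, a 50 MB recording lasting ten minutes is split into three slices of 200 seconds each, while a file at or below the limit is returned as a single slice. The resulting time ranges could then be passed to FFmpeg to cut the audio itself.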
Follow these steps to get ExactTranscriber running on your local machine:
-
Clone the Repository:
git clone https://github.com/your-username/ExactTranscriber.git
cd ExactTranscriber
(Replace your-username/ExactTranscriber.git with the actual repository URL if different.)
-
Install FFmpeg: FFmpeg is required for audio processing.
- Ubuntu/Debian:
sudo apt-get update && sudo apt-get install ffmpeg
- macOS (using Homebrew):
brew install ffmpeg
- Windows: Download the latest build from the official FFmpeg website. Ensure you add FFmpeg to your system's PATH environment variable.
-
Create a Python Virtual Environment (Recommended):
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
-
Install Python Dependencies:
pip install -r requirements.txt
-
Set Your Gemini API Key: You need a Google Cloud API key with the Gemini API enabled.
- Method 1: Environment Variable (Recommended for local development):
Set the GOOGLE_API_KEY environment variable:
export GOOGLE_API_KEY='YOUR_API_KEY_HERE'
(On Windows, use set GOOGLE_API_KEY=YOUR_API_KEY_HERE or set it via System Properties.) You can also use GEMINI_API_KEY as a fallback.
- Method 2: Streamlit Secrets (Recommended for Streamlit Cloud deployment):
If deploying on Streamlit Cloud, create a secrets file .streamlit/secrets.toml with the following content:
GOOGLE_API_KEY = "YOUR_API_KEY_HERE"
# or alternatively
# GEMINI_API_KEY = "YOUR_API_KEY_HERE"
# For password protection (optional)
APP_PASSWORD = "your_secure_password"
Make sure this secrets.toml file is not committed to version control. The .gitignore file already excludes it so your credentials remain private.
-
Run the Application:
streamlit run main.py
Your default web browser should open with the ExactTranscriber application.
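If the application fails to start, the two prerequisites from the setup steps are worth checking first: an API key (`GOOGLE_API_KEY`, with `GEMINI_API_KEY` as a fallback) and FFmpeg on the PATH. The helper below is an illustrative sketch rather than code taken from the application; the function names `resolve_api_key` and `ffmpeg_available` are assumptions about how such a startup check might look.

```python
import os
import shutil

# Variable names come from the setup steps; the order is primary, then fallback.
API_KEY_VARIABLES = ("GOOGLE_API_KEY", "GEMINI_API_KEY")


def resolve_api_key(environment=None):
    """Return the first non-empty API key found, or None if none is set."""
    env = os.environ if environment is None else environment
    for name in API_KEY_VARIABLES:
        value = env.get(name, "").strip()
        if value:
            return value
    return None


def ffmpeg_available():
    """Report whether an ffmpeg executable is discoverable on PATH."""
    return shutil.which("ffmpeg") is not None
```

Calling `resolve_api_key()` with no argument reads the real process environment; passing a plain dictionary makes the lookup order easy to test. This sketch covers only environment variables, so a Streamlit Cloud deployment would need the equivalent lookup against its secrets file.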
- Enter Password: If password protection is enabled for your instance, you'll be prompted to enter it.
- Select Model: Choose your preferred Gemini transcription model (e.g., "Gemini 2.5 Flash" for speed, or others for potentially higher accuracy if available).
- Upload Audio File: Click the upload button and select your audio file (MP3, WAV, M4A, FLAC, or OGG).
- Provide Context (Optional): Expand the "Optional Context" section to specify the audio type, topic, language, description, and number of speakers. This can significantly improve transcription quality.
- Transcribe: Click the "Transcribe" button to start the process. For larger files, this may take some time.
- View & Edit: Once complete, the transcript will appear in the "Transcript" tab. Use the "Edit" tab to make any necessary corrections. Remember to click "Save Edits".
- Export: Go to the "Export" tab, select your desired format (TXT, SRT, JSON), and click "Download".
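As a rough illustration of what the SRT export involves, the sketch below converts timed segments into SubRip text. The segment layout (`start`, `end`, `text` tuples with times in seconds) and the function names are assumptions made for this example; the application's own data model may differ.

```python
def format_srt_timestamp(seconds):
    """Format a time offset in seconds as HH:MM:SS,mmm (SubRip style)."""
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text from an iterable of (start, end, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Consecutive cues are separated by a blank line.
    return "\n".join(blocks)


example = [(0.0, 2.5, "Hello and welcome."), (2.5, 5.0, "Let's get started.")]
print(segments_to_srt(example))
```

Each cue consists of a running number, a time range, and the text, which is why SRT output needs per-segment timing while the TXT export does not.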
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! If you'd like to help improve ExactTranscriber, please see our CONTRIBUTING.md guide for more information on how to get started, report bugs, or suggest new features.
To run the unit test suite, install dependencies and execute pytest:
pip install -r requirements.txt
pytest
For a deeper look at the project, refer to the ExactTranscriber Deepwiki page.