CLI tool for transcribing audio files with OpenAI Whisper.
Requirements:

- `uv` (install with `pip install uv` or `pipx install uv`)
- `ffmpeg` (install with `brew install ffmpeg` on macOS or `sudo apt install ffmpeg` on Linux)
Install dependencies:

```
uv sync
```

Simplest case: add your video/audio file to `data/input` and run:

```
uv run python main.py
```

Or:
- Extract an MP3 from your recording (optional helper script):

  ```
  ./scripts/extract_audio.sh data/input/<filename>.mov
  ```

- Transcribe the MP3:

  ```
  uv run python main.py data/input/<filename>.mp3
  ```

By default, the transcript is saved to `data/output/<filename>.txt`.
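The helper script's internals aren't shown here. As a sketch of what an extraction step typically involves, a hypothetical Python wrapper might build an `ffmpeg` command like this (the `build_extract_cmd` helper and its exact flags are illustrative assumptions, not part of this repo):

```python
from pathlib import Path

def build_extract_cmd(mov_path: str) -> list[str]:
    """Build an ffmpeg command that drops the video stream and writes
    an MP3 next to the input file. Illustrative only; the repo's
    extract_audio.sh may use different flags."""
    src = Path(mov_path)
    dst = src.with_suffix(".mp3")
    return [
        "ffmpeg",
        "-i", str(src),           # input recording
        "-vn",                    # discard the video stream
        "-acodec", "libmp3lame",  # encode audio as MP3
        str(dst),
    ]

# Pass the result to subprocess.run(...) to perform the extraction.
print(build_extract_cmd("data/input/demo.mov"))
```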
Pick a different model:

```
uv run python main.py data/input/<filename>.mp3 --model small.en
```

Write output to a custom directory:

```
uv run python main.py data/input/<filename>.mp3 --output-dir data/output/custom
```

Translate non-English audio to English:

```
uv run python main.py data/input/<filename>.mp3 --language Japanese --task translate --model medium
```

Stream results as they are decoded:

```
uv run python main.py data/input/<filename>.mp3 --realtime
```

Include timestamps:

```
uv run python main.py data/input/<filename>.mp3 --timestamps
```

Disable fast decoding:

```
uv run python main.py data/input/<filename>.mp3 --no-fast-decode
```

`--timestamps` writes lines like `[00:36.000 --> 00:49.000] ...` to the output file.

`--realtime` is off by default because streaming logs can slow down long CPU transcriptions.
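The bracketed timestamp prefix can be reproduced with plain string formatting. A minimal sketch (this `format_span` helper is illustrative, not the tool's actual code):

```python
def format_span(start: float, end: float) -> str:
    """Format a start/end pair in seconds as a
    [MM:SS.mmm --> MM:SS.mmm] prefix, matching the
    example line shown above."""
    def mmss(seconds: float) -> str:
        minutes = int(seconds // 60)
        return f"{minutes:02d}:{seconds % 60:06.3f}"
    return f"[{mmss(start)} --> {mmss(end)}]"

print(format_span(36.0, 49.0))  # → [00:36.000 --> 00:49.000]
```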
Run without an audio file argument to start batch mode:

```
uv run python main.py
```

Batch mode will:

- Scan `data/input` for `.mov` files
- Convert each `.mov` to `.mp3` with `scripts/extract_audio.sh`
- Skip conversion when the target `.mp3` already exists
- Transcribe all `.mp3` files in `data/input`
- Save transcripts into `data/output`
You can override paths:
```
uv run python main.py --input-dir data/input --output-dir data/output --extractor-script scripts/extract_audio.sh
```

Available models:
tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large, turbo
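The `.en` suffix marks English-only variants, which exist for `tiny`, `base`, `small`, and `medium` but not for `large` or `turbo`. A hypothetical helper for choosing between them (`resolve_model` is an illustration, not a function this tool provides):

```python
def resolve_model(size: str, english_only: bool = False) -> str:
    """Return the English-only '.en' variant when one exists,
    otherwise the model name unchanged. Assumes the model list
    above: large and turbo are multilingual only."""
    english_variants = {"tiny", "base", "small", "medium"}
    if english_only and size in english_variants:
        return f"{size}.en"
    return size

print(resolve_model("small", english_only=True))  # → small.en
print(resolve_model("turbo", english_only=True))  # → turbo
```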
- The CLI uses Typer + Rich for styled logs and progress states.
- Transcription uses the Python Whisper API (`model.transcribe`) to process the full audio file in one pass.
- See the Whisper repository for model details.
- Translation with `uv run python main.py --task translate --language sv --model tiny` is untested and has not produced good results, so use it with caution.
