etalab-ia/whisperx-openai-api


Whisper OpenAI API

A FastAPI-based wrapper around WhisperX that provides an OpenAI-compatible API for transcription and speaker diarization.

Deployment

docker run -d \
  --gpus all \
  -p 8000:8000 \
  -e API_KEY=your-api-key \
  -e HF_TOKEN=your-hf-token \
  -v /data/models:/data/models \
  ghcr.io/etalab-ia/whisperx-openai-api:latest

Models are downloaded on first startup and cached in /data/models. Mount a persistent volume to avoid re-downloading on restart.

A single worker is recommended. GPU inference is serialized internally, so each additional worker loads a full copy of the model into VRAM without improving throughput unless you have multiple GPUs.

To scale workers (each worker loads its own model in VRAM):

docker run -d --gpus all ... -e WORKERS=2 whisperx-openai-api
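Once the container is running, a transcription request can be sketched as follows. This is a minimal sketch rather than the project's documented client: the `/v1/audio/transcriptions` path, the `Bearer` authorization scheme, and the `model` and `file` form fields are assumptions borrowed from OpenAI's API conventions, and `build_transcription_request` is a hypothetical helper. Check the server's actual routes before relying on it.

```python
import urllib.request
import uuid


def build_transcription_request(base_url, api_key, audio_bytes,
                                filename="audio.wav", model="large-v3-turbo"):
    """Build (but do not send) a multipart POST carrying an audio file.

    Assumed, not confirmed by the README: the endpoint path, the Bearer
    auth scheme, and the "model" / "file" form field names.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Plain form field naming the model to use.
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f'{model}\r\n'.encode(),
        # File field holding the raw audio bytes.
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode(),
        audio_bytes,
        b"\r\n",
        # Closing boundary.
        f'--{boundary}--\r\n'.encode(),
    ])
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send the request, pass the result to `urllib.request.urlopen` and read the response body; the builder itself performs no network I/O, which keeps it easy to inspect.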

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| API_KEY | API key for API access | Required |
| HF_TOKEN | Hugging Face token (required for diarization) | Required |
| TRANSCRIBE_MODEL | WhisperX model to load | large-v3-turbo |
| BATCH_SIZE | Transcription batch size | 32 |
| DIARIZE_MODEL | Pyannote diarization model | pyannote/speaker-diarization-community-1 |
| PRELOADED_ALIGN_MODEL_LANGUAGES | Languages to pre-load alignment models for | ["en", "fr", "nl", "de"] |
| RETURN_CHAR_ALIGNMENTS | Return character-level alignments (diarization only) | false |
| INTERPOLATE_METHOD | WhisperX interpolation method (diarization only) | nearest |
| FILL_NEAREST | Fill nearest gaps in speaker assignment (diarization only) | false |
| TIMEOUT_KEEP_ALIVE | Keep-alive timeout (seconds) | 60 |
| PORT | Server port | 8000 |
| WORKERS | Number of uvicorn workers (each loads its own model in VRAM) | 1 |
| RELOAD | Enable auto-reload | false |
| ROOT_PATH | API root path | None |
| LOGGING_CONFIG | Path to logging config file | logging-config.yaml |
| DEBUG | Enable debug logging | false |
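To illustrate how these variables and their defaults fit together, here is a minimal sketch of a settings loader. It is not the project's actual configuration code: `load_settings` and `_as_bool` are hypothetical names, only a subset of the variables is covered, and parsing PRELOADED_ALIGN_MODEL_LANGUAGES as a JSON list is an assumption based on how its default is written.

```python
import json
import os


def _as_bool(value):
    """Interpret common truthy strings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Read a subset of the documented variables, applying their defaults.

    Hypothetical helper: the real service may parse these differently.
    """
    env = os.environ if env is None else env

    # API_KEY and HF_TOKEN have no default, so fail early if either is unset.
    missing = [name for name in ("API_KEY", "HF_TOKEN") if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )

    return {
        "api_key": env["API_KEY"],
        "hf_token": env["HF_TOKEN"],
        "transcribe_model": env.get("TRANSCRIBE_MODEL", "large-v3-turbo"),
        "batch_size": int(env.get("BATCH_SIZE", "32")),
        # Assumption: the language list is supplied as a JSON array.
        "preloaded_align_model_languages": json.loads(
            env.get("PRELOADED_ALIGN_MODEL_LANGUAGES", '["en", "fr", "nl", "de"]')
        ),
        "port": int(env.get("PORT", "8000")),
        "workers": int(env.get("WORKERS", "1")),
        "debug": _as_bool(env.get("DEBUG", "false")),
    }
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check, for example `load_settings({"API_KEY": "k", "HF_TOKEN": "t"})`.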

Local Development

Install uv

Follow the installation instructions in the uv documentation.

API-only Development

Inference libraries (whisperx, pytorch, etc.) are heavy and may not run on all devices. The dev dependency group installs only what is needed to run the API tests locally and to get IDE autocompletion. To install it:

uv sync --group dev

Full Inference Development

To develop with a fully functional transcription pipeline:

uv sync --group dev --group inference

Run the server locally:

export PORT=8010
export RELOAD=true
export LOGGING_CONFIG=logging-config.yaml
python app/main.py

Testing

Tests mock actual inference and can be run locally:

cd app
python -m pytest tests/ -v
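As an illustration of what mocking inference can look like, the sketch below replaces the model with a `unittest.mock.MagicMock`, so no GPU or model weights are needed. It does not come from the project's test suite: `transcribe_audio` is a hypothetical function, and the shape of the mocked result (a dict with a `segments` list) is an assumption.

```python
from unittest.mock import MagicMock


def transcribe_audio(model, audio, batch_size=32):
    """Hypothetical service function: run the model and join segment texts."""
    result = model.transcribe(audio, batch_size=batch_size)
    return " ".join(segment["text"].strip() for segment in result["segments"])


def test_transcribe_audio_uses_mocked_model():
    # Stand-in for the real model; no weights are loaded.
    model = MagicMock()
    model.transcribe.return_value = {
        "segments": [{"text": " Hello"}, {"text": " world "}],
        "language": "en",
    }

    text = transcribe_audio(model, audio=b"fake-bytes", batch_size=8)

    # The function should join the stripped segment texts...
    assert text == "Hello world"
    # ...and call the model exactly once with the arguments it was given.
    model.transcribe.assert_called_once_with(b"fake-bytes", batch_size=8)
```

Because the model is injected as an argument, the same function can run against the real model in integration tests and against the mock everywhere else.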

Integration tests

See the project documentation for how to run the integration tests on a GPU.
