Dograh's modular STT architecture is a great fit for adding FunASR as a self-hosted STT provider. This would give users a fully self-hosted STT option, a step toward a voice AI stack with no external API dependencies.
Why FunASR?
- OpenAI-compatible API: FunASR ships with funasr-server, which serves /v1/audio/transcriptions (the same API as OpenAI Whisper), so integration is minimal
- 170x realtime on GPU, 17x on CPU — ideal for real-time voice agent pipelines
- 50+ languages with strong CJK support
- Built-in VAD + punctuation — no need for separate voice activity detection
- Speaker diarization — cam++ model included for multi-speaker scenarios
- Streaming support — Paraformer-streaming for low-latency real-time ASR
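To put the speed figures above in concrete terms, here is a rough back-of-the-envelope sketch. It assumes "Nx realtime" means N seconds of audio processed per second of compute, which is a throughput figure and not a guarantee of end-to-end streaming latency. The 10-second utterance and the helper name `processing_seconds` are illustrative choices, not part of FunASR.

```python
def processing_seconds(audio_seconds: float, speedup: float) -> float:
    """Estimated compute time for a clip, given an 'Nx realtime' speedup."""
    return audio_seconds / speedup


# Speedup figures quoted in the list above; the 10 s clip length is arbitrary.
for label, speedup in (("GPU", 170.0), ("CPU", 17.0)):
    ms = processing_seconds(10.0, speedup) * 1000
    print(f"{label}: ~{ms:.0f} ms for a 10 s utterance")
```

Under that assumption, a 10-second utterance would take roughly 59 ms on GPU and roughly 588 ms on CPU, excluding network and model-loading overhead.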
Integration
# Start FunASR server (OpenAI-compatible)
pip install funasr
funasr-server --device cuda
# Serves at localhost:8000/v1/audio/transcriptions
Since Dograh already supports custom STT providers, pointing to the FunASR server endpoint should work with minimal code changes.
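As a rough illustration of what "pointing to the endpoint" could look like from the client side, here is a minimal sketch using only the Python standard library. It assumes the server accepts an OpenAI-style multipart upload with `file` and `model` fields and returns JSON with a `text` key, as the OpenAI transcription API does. The model name `placeholder-model` and the helper names are hypothetical, and this is not Dograh's actual provider interface.

```python
import json
import urllib.request
import uuid

# URL taken from the comment above; adjust host and port to your deployment.
FUNASR_URL = "http://localhost:8000/v1/audio/transcriptions"


def build_multipart(audio_bytes: bytes, filename: str, model: str, boundary: str):
    """Encode an OpenAI-style multipart/form-data body with 'model' and 'file' fields."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + audio_bytes + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(audio_bytes: bytes, url: str = FUNASR_URL,
               model: str = "placeholder-model", timeout: float = 30.0) -> str:
    """POST audio to the endpoint and return the transcript text.

    Assumes the response is JSON shaped like {"text": "..."}.
    """
    body, content_type = build_multipart(audio_bytes, "audio.wav", model, uuid.uuid4().hex)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["text"]
```

With a server running, usage would be along the lines of `transcribe(open("sample.wav", "rb").read())`. Whether the `model` field is required or ignored by funasr-server would need to be checked against its documentation.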
Resources
This would make Dograh one of the few voice AI platforms with a truly zero-external-dependency option for STT.