Feature Request
Dograh is building a great open-source voice AI platform. Suggesting SenseVoice / FunASR as an additional STT provider option.
Why SenseVoice for voice AI?
- 5x faster than Whisper — non-autoregressive architecture, critical for real-time voice agents
- 50+ languages in a single 234M param model
- Emotion detection — identifies speaker emotions (happy, angry, sad), useful for sentiment-aware agents
- Audio events — detects laughter, applause, music, background noise
- OpenAI-compatible API: funasr-server serves /v1/audio/transcriptions, easy integration
- Streaming support — WebSocket-based real-time streaming with partial results
Self-hosted advantage
FunASR runs entirely locally — perfect for self-hosted voice AI:
pip install funasr vllm
funasr-server --device cuda # OpenAI-compatible /v1/audio/transcriptions
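For illustration, a client call against that endpoint could look roughly like the sketch below. It only uses the Python standard library and builds a multipart upload in the shape of OpenAI's transcription API (form fields `file` and `model`). The base URL `http://localhost:8000`, the model name passed in the form, and the helper name `build_transcription_request` are assumptions for this example, not something confirmed from funasr-server's docs.

```python
import io
import urllib.request
import uuid


def build_transcription_request(
    audio: bytes,
    base_url: str = "http://localhost:8000",  # assumed host/port, adjust to your server
    model: str = "iic/SenseVoiceSmall",       # assumed model identifier
    filename: str = "audio.wav",
) -> urllib.request.Request:
    """Build a multipart/form-data POST shaped like OpenAI's transcription API."""
    boundary = uuid.uuid4().hex
    sep = b"--" + boundary.encode()
    crlf = b"\r\n"
    body = io.BytesIO()

    # "model" form field
    body.write(sep + crlf)
    body.write(b'Content-Disposition: form-data; name="model"' + crlf + crlf)
    body.write(model.encode() + crlf)

    # "file" form field carrying the raw audio
    body.write(sep + crlf)
    disposition = f'Content-Disposition: form-data; name="file"; filename="{filename}"'
    body.write(disposition.encode() + crlf)
    body.write(b"Content-Type: application/octet-stream" + crlf + crlf)
    body.write(audio + crlf)

    # closing boundary
    body.write(sep + b"--" + crlf)

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body.getvalue(),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Usage against a running server (not executed here):
#   with open("sample.wav", "rb") as f:
#       req = build_transcription_request(f.read())
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```

If the endpoint really is OpenAI-compatible, an existing OpenAI-style STT client pointed at a different base URL might be all the integration needs, which would keep the provider code small.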
Or integrate directly:
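from funasr import AutoModel

# Load SenseVoiceSmall together with the FSMN VAD model, which splits long audio into segments
model = AutoModel(model="iic/SenseVoiceSmall", vad_model="fsmn-vad")
# audio_bytes is a placeholder for the caller's audio; a path to an audio file is also accepted as input
result = model.generate(input=audio_bytes)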
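To make the emotion and audio-event points above concrete: as far as I can tell, SenseVoice returns those labels as inline `<|...|>` tags at the start of the transcribed text, so a provider would need to split them from the transcript. Below is a minimal sketch of that step. The tag format, the two label sets, the sample string, and the function name `parse_tagged_text` are assumptions for illustration and should be checked against real model output.

```python
import re

# Matches inline tags of the form <|LABEL|>
TAG = re.compile(r"<\|([^|]*)\|>")

# Assumed label sets (not exhaustive); adjust to what the model actually emits.
EMOTIONS = {"HAPPY", "SAD", "ANGRY", "NEUTRAL"}
EVENTS = {"Speech", "BGM", "Applause", "Laughter"}


def parse_tagged_text(raw: str) -> dict:
    """Split a tagged transcript into clean text, emotions, events, and other tags."""
    tags = TAG.findall(raw)
    return {
        "text": TAG.sub("", raw).strip(),
        "emotions": [t for t in tags if t in EMOTIONS],
        "events": [t for t in tags if t in EVENTS],
        "other_tags": [t for t in tags if t not in EMOTIONS and t not in EVENTS],
    }


# Made-up sample string in the assumed tag format
sample = "<|en|><|HAPPY|><|Laughter|><|woitn|>thanks, that works"
parsed = parse_tagged_text(sample)
print(parsed["text"], parsed["emotions"], parsed["events"])
```

For plain display text, FunASR's own `rich_transcription_postprocess` helper in `funasr.utils.postprocess_utils` is probably the better route. The sketch above is only meant to show how the labels could be surfaced as structured fields for sentiment-aware agents.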
Resources