Feature Request: Add FunASR/SenseVoice as STT backend #385

## Feature Request

Dograh, as an open-source voice AI platform, could benefit from supporting **FunASR/SenseVoice** as a speech-to-text (STT) backend.

### Why

- **OpenAI-compatible API** — `funasr-server` provides a `/v1/audio/transcriptions` endpoint
- **Real-time WebSocket streaming** — built-in streaming server at `ws://localhost:10095`
- **Self-hosted** — fully offline, no external APIs
- **50+ languages** with automatic language detection
- **170x real-time** on GPU — non-autoregressive, very fast
- **Speaker diarization** — built-in cam++ model for multi-speaker scenarios
- **Emotion detection** — SenseVoice classifies speech emotion

### Quick Setup

```bash
pip install funasr
funasr-server --device cuda
```
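Once the server is running, a client could call the OpenAI-compatible endpoint roughly as follows. This is a minimal sketch using only the Python standard library, not a tested integration. The port in `BASE_URL` and the `model` value are assumptions (neither is stated above), and the `file`/`model` form fields and the `text` response key follow the OpenAI transcription request shape, which the server is assumed to mirror.

```python
import json
import os
import urllib.request
import uuid

# Assumption: the HTTP port is not stated in this issue; adjust to your server.
BASE_URL = "http://localhost:8000"


def build_transcription_request(audio_bytes, filename="sample.wav",
                                model="sensevoice", base_url=BASE_URL):
    """Build a multipart/form-data POST in the OpenAI transcription shape.

    `model="sensevoice"` is a hypothetical placeholder, not a confirmed name.
    """
    boundary = uuid.uuid4().hex
    parts = [
        # Text field carrying the model name.
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="model"\r\n\r\n'
         f"{model}\r\n").encode(),
        # File field carrying the raw audio bytes.
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         f"Content-Type: application/octet-stream\r\n\r\n").encode()
        + audio_bytes + b"\r\n",
        # Closing boundary.
        f"--{boundary}--\r\n".encode(),
    ]
    return urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(path):
    """Send one audio file and return the transcript text."""
    with open(path, "rb") as f:
        req = build_transcription_request(f.read(), os.path.basename(path))
    with urllib.request.urlopen(req, timeout=60) as resp:
        # Assumes an OpenAI-style JSON body: {"text": "..."}.
        return json.load(resp)["text"]
```

Calling `transcribe("sample.wav")` would return the transcript if the server behaves as assumed. Because the request shape matches the OpenAI API, an existing OpenAI STT client in Dograh might only need its base URL changed.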

### Streaming Mode

```bash
python -m funasr.bin.ws_server --device cuda
# WebSocket endpoint: ws://localhost:10095
```
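A streaming client could then look something like the sketch below. It assumes the server accepts the message sequence used by FunASR's runtime WebSocket clients: a JSON config frame, binary 16 kHz 16-bit mono PCM chunks, then a JSON end marker. The field names, the 60 ms chunk length, and the `text` key in replies are assumptions to verify against the server version in use. `stream_frames` and `stream_audio` are hypothetical helper names, and sending relies on the third-party `websockets` package.

```python
import asyncio
import json

SAMPLE_RATE = 16000      # assumption: 16 kHz mono input
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM


def stream_frames(pcm, chunk_ms=60, mode="2pass", wav_name="call"):
    """Yield frames in send order: JSON config, PCM chunks, JSON end marker."""
    yield json.dumps({
        "mode": mode,                 # assumed values: "online", "offline", "2pass"
        "wav_name": wav_name,
        "wav_format": "pcm",
        "audio_fs": SAMPLE_RATE,
        "chunk_size": [5, 10, 5],
        "is_speaking": True,
    })
    step = SAMPLE_RATE * chunk_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm), step):
        yield pcm[start:start + step]
    # Tell the server that no more audio will follow.
    yield json.dumps({"is_speaking": False})


async def stream_audio(pcm, url="ws://localhost:10095", idle_timeout=5.0):
    """Send audio, then print replies until the server goes quiet or closes."""
    import websockets  # third-party: pip install websockets

    async with websockets.connect(url) as ws:
        for frame in stream_frames(pcm):
            await ws.send(frame)  # str goes out as text, bytes as binary
        while True:
            try:
                message = await asyncio.wait_for(ws.recv(), idle_timeout)
            except (asyncio.TimeoutError, websockets.ConnectionClosed):
                break
            print(json.loads(message).get("text", ""))
```

With raw PCM bytes in `pcm`, `asyncio.run(stream_audio(pcm))` would drive the exchange. For simplicity the sketch sends all audio before reading replies; a live call would send and receive concurrently.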

Happy to help with integration!
