Feature Request: Add SenseVoice/FunASR as ASR engine option

## Feature Request

SoniTranslate handles video dubbing with speech recognition and translation. **SenseVoice/Paraformer** from [FunASR](https://github.com/modelscope/FunASR) would be a strong ASR engine option.

### Why

- **170x real-time on GPU**: decoding is non-autoregressive, so it is much faster than Whisper on long videos
- **Built-in speaker diarization** (cam++): essential for multi-speaker dubbing
- **Built-in punctuation**: punctuation is added automatically, which improves subtitle quality
- **50+ languages** with automatic language detection
- **Chinese/Japanese/Korean accuracy**: outperforms Whisper on CJK benchmarks
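
If it helps with evaluation, the speed claim can be checked on your own hardware with something like the sketch below. It times any transcription callable and reports how many seconds of audio are processed per second of wall time. The helper name `measure_speedup` and its parameters are hypothetical, not part of FunASR or SoniTranslate.

```python
import time
from typing import Callable


def measure_speedup(
    transcribe: Callable[[str], object],
    audio_path: str,
    audio_seconds: float,
) -> float:
    """Return seconds of audio processed per second of wall-clock time.

    `transcribe` is any callable that takes an audio path; the audio
    duration must be supplied by the caller (hypothetical helper).
    """
    start = time.perf_counter()
    transcribe(audio_path)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very coarse clocks.
    return audio_seconds / max(elapsed, 1e-9)
```

For example, `measure_speedup(lambda p: model.generate(input=p), "video_audio.wav", audio_seconds=600.0)` would time the FunASR model from the snippet in this issue, assuming the file is 10 minutes long. A first call usually includes model warm-up, so a second run is likely more representative.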

### Quick Integration

```python
from funasr import AutoModel

model = AutoModel(
    model="iic/SenseVoiceSmall",
    vad_model="fsmn-vad",
    spk_model="cam++",
)
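result = model.generate(input="video_audio.wav")
# result is a list with one dict per input; the transcript is in result[0]["text"].
# Timestamps and speaker labels depend on the ASR model chosen and the FunASR version.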
```
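
To plug this into a dubbing pipeline, the output would probably need to be normalized into per-segment records. Below is a minimal sketch, not a tested integration. It assumes each result dict may carry a `sentence_info` list whose items have `text`, `start` and `end` (in milliseconds) and optionally `spk`, and that downstream code wants WhisperX-style segments with times in seconds and `SPEAKER_NN` labels. Both of those are assumptions to verify against FunASR and SoniTranslate; `to_dubbing_segments` is a hypothetical name.

```python
from typing import Any, Dict, List


def to_dubbing_segments(funasr_result: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert FunASR-style output into start/end/text/speaker segments.

    Assumed input shape (verify against your FunASR version):
    [{"text": str, "sentence_info": [{"text", "start", "end", "spk"}, ...]}]
    with "start"/"end" in milliseconds.
    """
    segments: List[Dict[str, Any]] = []
    for item in funasr_result:
        sentences = item.get("sentence_info") or []
        if not sentences and item.get("text"):
            # No per-sentence timing available: keep the text, leave timing unset.
            segments.append(
                {"start": None, "end": None, "text": item["text"].strip(), "speaker": None}
            )
            continue
        for sent in sentences:
            spk = sent.get("spk")
            segments.append(
                {
                    "start": sent["start"] / 1000.0,  # milliseconds to seconds
                    "end": sent["end"] / 1000.0,
                    "text": sent["text"].strip(),
                    # Label format is an assumption about the downstream pipeline.
                    "speaker": f"SPEAKER_{int(spk):02d}" if spk is not None else None,
                }
            )
    return segments
```

Called as `to_dubbing_segments(result)` on the output of `model.generate(...)`, this keeps the engine-specific parsing in one place, so the rest of the pipeline would not need to know which ASR backend produced the segments.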

Happy to help with integration!
