Feature: Add SenseVoice/FunASR as local STT engine — 5x faster, 234M params

Hi! EchoKit is a great open-source voice agent platform.

For the ASR/STT component, **SenseVoice** could significantly reduce latency:

## Why SenseVoice?

- **Non-autoregressive**: completes transcription in a single forward pass
- **5x faster than Whisper**: critical for real-time voice agents
- **234M params**: lightweight enough to run locally
- **OpenAI-compatible API**: `funasr-server --device cuda` serves at `/v1/audio/transcriptions`
- **50+ languages** with automatic language detection
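
Since the endpoint is described as OpenAI-compatible, a client could probably talk to it the same way it talks to OpenAI's transcription API. Below is a minimal stdlib-only sketch of that request. The host and port (`http://localhost:8000`), the `model` value `"sensevoice"`, and the helper names are assumptions for illustration. Only the route `/v1/audio/transcriptions` comes from this issue, and the `file`/`model` form fields and the `text` response field follow the OpenAI transcription API shape.

```python
import json
import urllib.request
import uuid

# Assumption: where the local server listens; the issue only names the route.
BASE_URL = "http://localhost:8000"
ROUTE = "/v1/audio/transcriptions"


def build_transcription_request(wav_bytes, filename="chunk.wav", model="sensevoice"):
    """Build a multipart/form-data POST shaped like an OpenAI transcription call.

    `model` is a hypothetical placeholder; use whatever name the server expects.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        BASE_URL + ROUTE,
        data=head + wav_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(wav_bytes):
    """Send one audio chunk and return the transcript from the JSON response."""
    req = build_transcription_request(wav_bytes)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())["text"]
```

If EchoKit already supports an OpenAI-style STT backend, pointing its base URL at the local server may be all the integration needs. The sketch is only meant to show how small the wire-level contract is.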

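## Quick integration

```python
from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess
# audio_chunk is a placeholder for the audio input, e.g. a path to a WAV file
model = AutoModel(model="iic/SenseVoiceSmall", vad_model="fsmn-vad")
result = model.generate(input=audio_chunk, language="auto", use_itn=True)
text = rich_transcription_postprocess(result[0]["text"])  # clean up tag tokens
```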
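
One detail that matters for a voice agent: as far as I know, the raw `text` returned by SenseVoice is prefixed with `<|...|>` tokens (language, emotion, audio event, ITN flag), and FunASR ships `rich_transcription_postprocess` to clean them up. The sketch below is a dependency-free stand-in that separates the tags from the transcript, in case the agent wants to keep the detected language. The tag layout and the `split_tags` helper are assumptions for illustration, not part of the FunASR API.

```python
import re

# Assumed tag format: zero or more "<|name|>" tokens mixed into the transcript.
TAG = re.compile(r"<\|([^|]*)\|>")


def split_tags(raw):
    """Return (list of tag names, transcript with all tags removed)."""
    tags = TAG.findall(raw)
    text = TAG.sub("", raw).strip()
    return tags, text


tags, text = split_tags("<|en|><|NEUTRAL|><|Speech|><|woitn|>hello world")
print(tags)  # ['en', 'NEUTRAL', 'Speech', 'woitn']
print(text)  # hello world
```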

## Links
- SenseVoice: https://github.com/FunAudioLLM/SenseVoice (8.3K stars)
- FunASR: https://github.com/modelscope/FunASR (16.7K stars)
