Feature Request: Add FunASR for training data annotation

## Feature Request

Coqui TTS is one of the best open-source TTS libraries. FunASR could help with training data preparation by automatically transcribing audio datasets (speech-to-text annotation).

**Use case:** Annotate raw audio with FunASR → get timestamped transcripts → use for TTS fine-tuning.

**Why FunASR?**

- **SenseVoice**: 50+ languages, matching Coqui TTS's multilingual ambition
- **Paraformer**: Character-level timestamps for precise audio-text alignment
- **Built-in VAD + punctuation**: Segmentation, recognition, and punctuation restoration in a single call
- **170x realtime on GPU**: Efficient for large dataset annotation
- **Open source**: Apache 2.0 license

**Example:**
```python
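from funasr import AutoModel

# paraformer-zh is the Mandarin model; fsmn-vad segments long audio, ct-punc restores punctuation.
model = AutoModel(model="paraformer-zh", vad_model="fsmn-vad", punc_model="ct-punc")
result = model.generate(input="audio.wav")
print(result)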
```

- GitHub: https://github.com/modelscope/FunASR (16K+ stars)
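To make the "annotate → transcripts → fine-tuning" step more concrete, here is a rough sketch of how the recognition results might be turned into an LJSpeech-style `metadata.csv` (`id|text|normalized_text`). It is only an illustration: the helper name `results_to_metadata` is hypothetical, the `key` and `text` fields are an assumption about the shape of FunASR's result dicts, and the three-column layout is an assumption about what the dataset formatter expects.

```python
def results_to_metadata(results):
    """Format ASR results as LJSpeech-style `id|text|normalized_text` lines.

    `results` is assumed to be a list of dicts with a "key" (clip id)
    and a "text" (transcript) field; adjust to the actual output shape.
    """
    lines = []
    for item in results:
        # The pipe is the column separator, so strip it from transcripts,
        # then collapse any runs of whitespace into single spaces.
        text = str(item.get("text", "")).replace("|", " ")
        text = " ".join(text.split())
        if not text:
            # Skip clips with an empty transcript (e.g. silence or noise).
            continue
        lines.append(f"{item['key']}|{text}|{text}")
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # Hand-written stand-in for recognition output, not real model results.
    demo = [
        {"key": "clip_0001", "text": "hello  world"},
        {"key": "clip_0002", "text": "  "},
    ]
    print(results_to_metadata(demo), end="")
```

Clips with empty transcripts are dropped here rather than written out, since they would probably need manual review before being used for fine-tuning.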

Feature Request: Add FunASR for training data annotation #4422
