Feature request: Add FunASR/Paraformer as alternative ASR backend #1450

@LauraGPT

Feature Request

Would it be possible to add FunASR's Paraformer model as an alternative ASR backend alongside Whisper?

Why

  • Paraformer is a non-autoregressive model, so it runs significantly faster than autoregressive Whisper, especially on long audio (170x realtime on GPU)
  • For Chinese/Japanese/Korean audio, Paraformer and SenseVoice consistently outperform Whisper (see SenseVoice benchmark)
  • Fun-ASR-Nano (LLM-based, 800M params) supports 31 languages with word-level timestamps
  • FunASR models are available as ONNX via Sherpa-ONNX, which faster-whisper could potentially integrate

FunASR Models

| Model | Params | Languages | Speed | Timestamps |
|---|---|---|---|---|
| SenseVoice-Small | 234M | 50+ | 25x realtime (CPU) | |
| Paraformer-large | 220M | zh/en/ja/ko/yue | 170x realtime (GPU) | |
| Fun-ASR-Nano | 800M | 31 | GPU via vLLM | ✅ word-level |
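
To put the speed figures in concrete terms: "Nx realtime" means N seconds of audio are processed per second of wall-clock time. A minimal sketch of that arithmetic, using the numbers quoted above (the helper name `processing_seconds` is made up for illustration and is not part of FunASR or faster-whisper):

```python
def processing_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock seconds needed to transcribe audio at an 'Nx realtime' speed."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


one_hour = 3600.0
# Paraformer-large at the quoted 170x realtime on GPU: roughly 21 seconds.
paraformer_gpu = processing_seconds(one_hour, 170)
# SenseVoice-Small at the quoted 25x realtime on CPU: 144 seconds.
sensevoice_cpu = processing_seconds(one_hour, 25)
```

These are best-case estimates from the quoted throughput only; actual times will also depend on model loading, batching, and hardware.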

Quick test

pip install funasr

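# Quick check with SenseVoice-Small (the model is downloaded on first use)
from funasr import AutoModel

model = AutoModel(model="iic/SenseVoiceSmall")
result = model.generate(input="audio.wav")  # audio.wav: path to a local audio file
print(result)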
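
For discussion, here is one rough way an optional backend could be wrapped behind a common interface. Everything in this sketch is hypothetical: `ASRBackend`, `Transcript`, and `FunASRBackend` are illustrative names and do not exist in faster-whisper or FunASR. It also assumes `generate()` returns a list of dicts with a `"text"` key, which is how the FunASR examples read the result.

```python
from dataclasses import dataclass
from typing import Any, Protocol


@dataclass
class Transcript:
    text: str
    backend: str


class ASRBackend(Protocol):
    """Hypothetical backend interface, for illustration only."""

    name: str

    def transcribe(self, audio_path: str) -> Transcript: ...


class FunASRBackend:
    """Hypothetical adapter around funasr.AutoModel."""

    name = "funasr"

    def __init__(self, model_id: str = "iic/SenseVoiceSmall", model: Any = None):
        if model is None:
            # Imported lazily so funasr stays an optional dependency.
            from funasr import AutoModel

            model = AutoModel(model=model_id)
        self._model = model

    def transcribe(self, audio_path: str) -> Transcript:
        result = self._model.generate(input=audio_path)
        # Assumption: result is a list of dicts carrying a "text" field.
        text = result[0]["text"] if result else ""
        return Transcript(text=text, backend=self.name)
```

Accepting a pre-built `model` in the constructor keeps the adapter testable without downloading weights; a Whisper-backed class exposing the same `transcribe` method could then be swapped in by name.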
