Feature: Add SenseVoice/FunASR as local STT engine — 5x faster, 234M params #50

@LauraGPT

Hi! EchoKit is a great open-source voice agent platform.

For the ASR/STT component, SenseVoice could significantly reduce latency:

Why SenseVoice?

  • Non-autoregressive: the full transcription is produced in a single forward pass
  • 5x faster than Whisper, which matters for real-time voice agents
  • 234M params, so it is lightweight
  • OpenAI-compatible API: funasr-server --device cuda serves at /v1/audio/transcriptions
  • 50+ languages with automatic language detection
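
To make the OpenAI-compatible point concrete, here is a minimal client sketch using only the Python standard library. It assumes the server accepts the same multipart form fields as OpenAI's transcription endpoint (`file` and `model`) and returns JSON with a `text` field. The base URL `http://localhost:8000` and the helper names are hypothetical; I have not confirmed the default host or port.

```python
import json
import urllib.request
import uuid


def build_transcription_request(base_url, audio_bytes, filename="audio.wav",
                                model="iic/SenseVoiceSmall"):
    """Build a multipart/form-data POST for /v1/audio/transcriptions.

    Assumption: the server reads the OpenAI-style `file` and `model` fields.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Plain form field carrying the model name.
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="model"\r\n\r\n'
         f"{model}\r\n").encode(),
        # File field carrying the raw audio bytes.
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode(),
        audio_bytes,
        b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(base_url, audio_bytes):
    """Send the request and return the transcript (assumes a {"text": ...} reply)."""
    req = build_transcription_request(base_url, audio_bytes)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["text"]
```

With a server running, `transcribe("http://localhost:8000", open("sample.wav", "rb").read())` would return the transcript, so an existing OpenAI-style STT client in EchoKit might only need its base URL changed.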

Quick integration

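from funasr import AutoModel

# Load SenseVoiceSmall together with the FSMN VAD model, which segments long audio.
model = AutoModel(model="iic/SenseVoiceSmall", vad_model="fsmn-vad")
# audio_chunk is the audio to transcribe, e.g. a path to an audio file.
result = model.generate(input=audio_chunk)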
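
For downstream use, the transcript has to be pulled out of `result`. The sketch below assumes `generate` returns a list of dicts with a `text` key, and that SenseVoice prefixes the text with tags such as `<|en|><|NEUTRAL|><|Speech|>`. The helper `plain_text` is hypothetical and only strips those tags. FunASR also ships `rich_transcription_postprocess` in `funasr.utils.postprocess_utils`, which is probably the better choice when funasr is installed.

```python
import re

# Matches SenseVoice-style tags such as <|en|> or <|NEUTRAL|> (assumed format).
_TAG = re.compile(r"<\|[^|]*\|>")


def plain_text(results):
    """Join the "text" fields of generate()-style results, dropping <|...|> tags."""
    cleaned = (_TAG.sub("", item.get("text", "")).strip() for item in results)
    return " ".join(text for text in cleaned if text)
```

Usage would be `transcript = plain_text(result)`, after which the string can be passed to the LLM stage of the voice agent.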

Links
