Add FunASR/SenseVoice models as supported ASR in SlamKit

Hi! Great toolkit for efficient SpeechLM training.

Would you consider supporting [FunASR](https://github.com/modelscope/FunASR) models (Paraformer, SenseVoice) in SlamKit?

## Relevant models

- **Fun-ASR-Nano**: Audio encoder + Qwen2.5-0.5B LLM, end-to-end speech LM
- **SenseVoice**: Multi-task speech model (ASR + emotion + audio events), non-autoregressive
- **Paraformer**: Non-autoregressive ASR built on a CIF (continuous integrate-and-fire) predictor

## Why integrate?

- Pre-trained audio encoders from FunASR could serve as speech encoders for SpeechLM training
- SenseVoice's joint ASR, emotion, and audio-event outputs are relevant to multi-task SpeechLM research
- Fun-ASR-Nano demonstrates encoder-LLM fusion for speech understanding
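
To make the request a bit more concrete, here is a rough sketch of what a thin wrapper could look like. I have not checked how SlamKit structures its ASR backends, so `Transcriber` and `FunASRTranscriber` are hypothetical names, and the model ID `iic/SenseVoiceSmall` and the `device` argument are assumptions taken from the FunASR/SenseVoice READMEs. The only FunASR calls used are `AutoModel(...)` and `model.generate(input=...)`.

```python
from typing import Any, List, Protocol


class Transcriber(Protocol):
    """Hypothetical interface; SlamKit's real ASR abstraction may differ."""

    def transcribe(self, audio_paths: List[str]) -> List[str]:
        ...


class FunASRTranscriber:
    """Sketch of a wrapper around a FunASR model (assumed, not SlamKit code)."""

    def __init__(
        self,
        model_name: str = "iic/SenseVoiceSmall",  # assumed model ID
        device: str = "cpu",
        model: Any = None,
    ) -> None:
        if model is None:
            # Imported lazily so funasr stays an optional dependency.
            from funasr import AutoModel

            model = AutoModel(model=model_name, device=device)
        self._model = model

    def transcribe(self, audio_paths: List[str]) -> List[str]:
        texts = []
        for path in audio_paths:
            # generate() returns a list of dicts, each with a "text" field.
            results = self._model.generate(input=path)
            texts.append(results[0]["text"] if results else "")
        return texts
```

The `model` argument is only there so an already-loaded (or stubbed) model can be passed in for testing. SenseVoice output carries language/emotion/event tags, which the SenseVoice README strips with `rich_transcription_postprocess`; whether SlamKit would want the tags kept or removed is an open question.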

## References

- Fun-ASR-Nano: https://github.com/FunAudioLLM/Fun-ASR (1.2K stars)
- SenseVoice: https://github.com/FunAudioLLM/SenseVoice (8K+ stars)
- FunASR: https://github.com/modelscope/FunASR (16K+ stars)
- Paper (FunAudioLLM, covers SenseVoice): https://arxiv.org/abs/2407.04051
