Commit ea16013


[Feature] Add Qwen3-ASR MLX model and transcriber (1/2) (#150)

## Summary

Add Qwen3-ASR-0.6B as a second STT model alongside Whisper: model implementation and offline transcriber only. No vLLM server changes.

- MLX model: Conv2d audio encoder → Qwen3 causal LM (GQA, QK-norm, RoPE, tied embeddings)
- `Qwen3ASRTranscriber`: prompt construction, greedy decode, post-processing
- Weight sanitization: HF `thinker.*` prefix mapping, Conv2d NCHW→NHWC transpose
- 38 unit tests

Part 1 of 2. Part 2 (`stt/qwen3-asr-integration`) wires this into `vllm serve`.

## Test

```bash
# Unit tests (38, <1s)
pytest tests/test_qwen3_asr.py -v -m "not slow"

# Regression (existing tests unaffected)
pytest tests/ -m "not slow" -q

# Slow tests (require a local model)
pytest tests/test_qwen3_asr.py -v -m slow
```

<img width="1433" height="276" alt="Screenshot 2026-03-09 8:59:16 PM" src="https://github.com/user-attachments/assets/be5f3571-3f6c-411b-bc47-6f909c2e44de" />
<img width="1418" height="356" alt="Screenshot 2026-03-09 9:01:57 PM" src="https://github.com/user-attachments/assets/632a5da3-2de5-41fb-abac-635f9e27bdcd" />
<img width="1423" height="477" alt="Screenshot 2026-03-09 9:02:19 PM" src="https://github.com/user-attachments/assets/efd2de93-bf15-4d28-95d1-e219c3d2df60" />

---------

Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>
Signed-off-by: RickyChen / 陳昭儒 <rickychen@infinirc.com>
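For readers unfamiliar with the weight sanitization step named in the summary, here is a minimal sketch of what such a pass might look like. It is not the code from this commit. The function name `sanitize_weights`, the example key names, and the "4-D tensor with `conv` in its name" heuristic are all assumptions made for illustration, and NumPy stands in for MLX arrays so the snippet runs anywhere. The one firm fact it relies on is the layout difference: PyTorch stores Conv2d kernels as `(out, in, kH, kW)`, while MLX expects `(out, kH, kW, in)`.

```python
import numpy as np

HF_PREFIX = "thinker."  # prefix named in the commit summary


def sanitize_weights(weights):
    """Return a new dict with the HF prefix stripped and conv kernels channels-last.

    Hypothetical helper: the real mapping in the commit may rename keys differently.
    """
    sanitized = {}
    for name, value in weights.items():
        # Assumption: keys under `thinker.` keep their name minus the prefix.
        if name.startswith(HF_PREFIX):
            name = name[len(HF_PREFIX):]
        # Assumption: any 4-D tensor with "conv" in its name is a Conv2d kernel.
        if value.ndim == 4 and "conv" in name:
            # (out, in, kH, kW) -> (out, kH, kW, in)
            value = np.transpose(value, (0, 2, 3, 1))
        sanitized[name] = value
    return sanitized


# Hypothetical checkpoint keys, chosen only to exercise both branches.
hf_weights = {
    "thinker.encoder.conv_in.weight": np.zeros((8, 1, 3, 3), dtype=np.float32),
    "thinker.model.norm.weight": np.ones((16,), dtype=np.float32),
}
mlx_ready = sanitize_weights(hf_weights)
print({k: v.shape for k, v in mlx_ready.items()})
```

With real MLX arrays the same permutation could be expressed with `mx.transpose(value, (0, 2, 3, 1))`; only the array library changes, not the axis order.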

1 parent 4b3166d, commit ea16013

5 files changed

tests
- test_qwen3_asr.py
vllm_metal/stt
