Faster-Whisper vs GGML - Transcription Speed Discrepancy of 1 : 4 - Seeking Advice #222

JilReg · 2025-04-01T07:50:02Z

JilReg
Apr 1, 2025

Hi Kolja,

First off, thank you so much for your amazing work on RealtimeSTT - I really love what you put together! 🙌 (And I am trying really hard to make it work for me. 😄)

Quick question: I’ve been testing on an M1 Max MacBook Pro using faster-whisper-large-v3-turbo-ct2, but I’m consistently getting around 4 seconds of latency per transcription.

With superwhisper (using ggml-large-v3-turbo), I’m seeing <1s latency for the same spoken text, at the same time, and on the same hardware.

Could this difference be due to the model format (CT2 vs GGML)? Or is there something I might be missing in the RealtimeSTT config? (Which would be my hope. 🙃)

I’ve tried enabling and adjusting so many settings 😅 but haven’t had luck reducing latency.

If you have any recommended settings for fastest response on Apple Silicon, I’d really appreciate it!

Thanks again 🙏
Jil

Answered by KoljaB

KoljaB · 2025-04-02T18:26:54Z

KoljaB
Apr 2, 2025
Maintainer

Superwhisper uses Apple-specific optimizations and takes full advantage of MPS and CoreML. faster-whisper does not use these Apple-specific low-level tools. It's fast in the cross-platform sense, but I doubt it will ever reach Superwhisper's inference speed.

That said, maybe I can optimize RealtimeSTT more for Apple. I'll make a new release soon exposing faster-whisper's cpu_threads and num_workers parameters. With compute_type="int8" we can probably make use of multithreading on a Mac. I'm hoping that will speed things up, but I'm not sure whether it will, or by how much.
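
For anyone who wants to try these settings before such a release, here is a minimal sketch that passes them to faster-whisper directly rather than through RealtimeSTT. The helper names (`whisper_cpu_kwargs`, `load_model`, `measure_latency`), the choice of one thread per core, and `beam_size=1` are assumptions for illustration, not recommended or tested values. `model_path` stands for whatever model name or local CT2 directory you already use.

```python
import os
import time


def whisper_cpu_kwargs(cpu_threads=None, num_workers=1):
    """Build keyword arguments for faster_whisper.WhisperModel on CPU.

    Assumption: using every available core is a reasonable starting point.
    Tune cpu_threads and num_workers on your own machine.
    """
    if cpu_threads is None:
        cpu_threads = os.cpu_count() or 1
    if cpu_threads < 1 or num_workers < 1:
        raise ValueError("cpu_threads and num_workers must be >= 1")
    return {
        "device": "cpu",
        "compute_type": "int8",
        "cpu_threads": cpu_threads,
        "num_workers": num_workers,
    }


def load_model(model_path, **overrides):
    """Load a faster-whisper model with the CPU settings above."""
    # Imported lazily so the helpers can be used without faster-whisper installed.
    from faster_whisper import WhisperModel

    return WhisperModel(model_path, **whisper_cpu_kwargs(**overrides))


def measure_latency(model, audio_path):
    """Return (seconds, text) for one full transcription.

    model.transcribe() returns a lazy generator of segments, so the
    segments must be consumed inside the timed region.
    """
    start = time.perf_counter()
    segments, _info = model.transcribe(audio_path, beam_size=1)
    text = "".join(segment.text for segment in segments)
    return time.perf_counter() - start, text


if __name__ == "__main__":
    # Prints the settings only; loading a model needs faster-whisper installed.
    print(whisper_cpu_kwargs())
```

To compare configurations, load the model once with `load_model(model_path, cpu_threads=8)` and call `measure_latency(model, "sample.wav")` a few times on the same clip. The first call usually includes warm-up cost, so it is worth discarding.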

1 reply

JilReg Apr 4, 2025
Author

Thanks @KoljaB – that sounds really promising! 🙌 Appreciate the insight and all the work you're putting into this. Looking forward to the next release – but no pressure on my account, only if it makes sense for you 😊
