Evaluate Granite-4.0-1B-Speech as alternative STT engine #408

@arvanus

Summary

IBM's Granite-4.0-1B-Speech (~2B params, Apache 2.0) scores 5.52 WER for English ASR on the OpenASR leaderboard, roughly 2 points lower (better) than Whisper Large V3 (~7.4 WER). It supports 7 languages, including Portuguese, and runs at 280x real-time on GPU.
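
To put those figures in perspective, here is a small sketch that turns them into a relative WER reduction and a wall-clock estimate. The function names are illustrative only. The inputs are the numbers quoted above, and the 280x figure is taken at face value (actual throughput will depend on hardware and batch size).

```rust
/// Relative WER reduction of `candidate` versus `baseline`, as a fraction.
fn relative_wer_reduction(baseline: f64, candidate: f64) -> f64 {
    (baseline - candidate) / baseline
}

/// Seconds of compute needed for `audio_secs` of audio at a given real-time speedup.
fn processing_secs(audio_secs: f64, speedup: f64) -> f64 {
    audio_secs / speedup
}

fn main() {
    let whisper_wer = 7.4; // approximate Whisper Large V3 figure quoted above
    let granite_wer = 5.52; // Granite figure quoted above

    println!("absolute: {:.2} points", whisper_wer - granite_wer);
    println!(
        "relative: {:.1}%",
        100.0 * relative_wer_reduction(whisper_wer, granite_wer)
    );
    println!("1 h of audio: {:.1} s", processing_secs(3600.0, 280.0));
}
```

With these inputs the gap is 1.88 points absolute, about 25% relative, and one hour of audio would take roughly 13 seconds of GPU time.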

This model is worth evaluating as a future alternative or complement to our current Whisper.cpp-based transcription pipeline.

Current Blockers

The following requirements must be met before integration is practical:

  • Native runtime availability: No C/C++ runtime exists today (no whisper.cpp equivalent). Integration would require orchestrating a multi-component ONNX pipeline (encoder + projector + decoder) via the ort Rust crate, which is significantly more complex than our current whisper-rs setup.
  • Streaming / partial transcription support: Granite currently supports only batch inference. Meetily relies on real-time partial transcriptions of VAD-filtered audio chunks, so without streaming Granite cannot replace our current pipeline.
  • Mature Rust/native bindings: No dedicated Rust bindings exist. The ecosystem is Python-first (HuggingFace Transformers, vLLM).
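
To illustrate why batch-only inference is a blocker rather than an inconvenience, the sketch below emulates partial results by re-running a batch engine on all audio received so far. Everything here is hypothetical: `BatchStt`, `MockEngine`, and `PseudoStreaming` are made-up names, the mock does no real inference, and the 16 kHz sample rate is an assumption. It is not how Meetily or Granite works, only a way to see the cost of faking streaming.

```rust
/// Hypothetical interface for an engine that can only transcribe a complete buffer.
trait BatchStt {
    fn transcribe(&self, samples: &[f32]) -> String;
}

/// Mock engine for illustration: reports how many samples it was given.
struct MockEngine;

impl BatchStt for MockEngine {
    fn transcribe(&self, samples: &[f32]) -> String {
        format!("<{} samples>", samples.len())
    }
}

/// Emulates partial results by re-transcribing the whole buffer on every chunk.
struct PseudoStreaming<E: BatchStt> {
    engine: E,
    buffer: Vec<f32>,
    /// Total samples fed through the engine across all calls.
    samples_processed: usize,
}

impl<E: BatchStt> PseudoStreaming<E> {
    fn new(engine: E) -> Self {
        Self { engine, buffer: Vec::new(), samples_processed: 0 }
    }

    /// Appends a chunk, then runs batch inference over everything seen so far.
    fn push_chunk(&mut self, chunk: &[f32]) -> String {
        self.buffer.extend_from_slice(chunk);
        self.samples_processed += self.buffer.len();
        self.engine.transcribe(&self.buffer)
    }
}

fn main() {
    let mut stt = PseudoStreaming::new(MockEngine);
    let chunk = vec![0.0_f32; 16_000]; // one second at an assumed 16 kHz

    for _ in 0..10 {
        let partial = stt.push_chunk(&chunk);
        println!("{partial}");
    }
    println!("samples of audio: {}", stt.buffer.len());
    println!("samples processed: {}", stt.samples_processed);
}
```

After ten one-second chunks the engine has processed 880,000 samples for 160,000 samples of audio, and the ratio keeps growing with session length. A workaround of this kind therefore scales quadratically, which is why native streaming or chunked inference is listed as a requirement.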

When to Revisit

This issue should be revisited if any of the following occur:

  • IBM or the community releases a lightweight C/C++ inference runtime
  • Streaming/chunked inference support is added to the model
  • A Rust crate wrapping Granite Speech inference becomes available
  • Meetily's architecture changes to support a server-side transcription backend (where Python/vLLM would be acceptable)

Key Comparisons

| Factor | Granite 4.0 1B Speech | Whisper.cpp (current) |
| --- | --- | --- |
| English WER | 5.52 | ~7.4 (Large V3) |
| Languages | 7 | 99+ |
| Native C/C++ runtime | None | Yes |
| Streaming support | No | Yes |
| Memory (fp16) | ~4 GB | ~1.5 GB (Large V3) |
| License | Apache 2.0 | MIT |
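
The ~4 GB figure for Granite is consistent with a back-of-the-envelope estimate from the parameter count, sketched below. This counts weights only and ignores activations and decoder caches, so real usage would be higher. The int8 line is a hypothetical illustration of the same arithmetic, not a claim that a quantized Granite build exists.

```rust
/// Approximate weight memory in gigabytes (10^9 bytes) for a given parameter count.
fn weight_memory_gb(params: f64, bytes_per_param: f64) -> f64 {
    params * bytes_per_param / 1e9
}

fn main() {
    let granite_params = 2.0e9; // "~2B params" from the summary

    // fp16 stores each weight in 2 bytes.
    println!("fp16: ~{:.1} GB", weight_memory_gb(granite_params, 2.0));
    // Hypothetical 1-byte-per-weight quantization, for comparison only.
    println!("int8: ~{:.1} GB", weight_memory_gb(granite_params, 1.0));
}
```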
