Releases: second-state/cohere_transcribe_rs

v0.1.1

27 Mar 19:09

MLX backend optimizations

  • GPU-side argmax: Decoder now computes argmax on GPU via mlx_argmax_axis, transferring a single i32 per step instead of the full 16,384-element logits vector (64 KB → 4 bytes per token)
  • GPU-native batch norm: Replaced CPU round-trip sqrt/recip with mlx_rsqrt, eliminating 96 GPU→CPU→GPU transfers per encoder forward pass (48 layers × 2 ops)
  • O(1) weight cloning: shallow_clone() uses mlx_array_set ref-counted sharing instead of full CPU round-trip, reducing encoder construction time from ~75s to near-instant
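
The three optimizations above can be illustrated on the CPU with plain Rust. This is a minimal sketch, not the project's MLX code: the `Weights` type, the helper names, and the use of `Arc` as a stand-in for MLX's ref-counted arrays are all assumptions made for illustration. It shows where the 64 KB → 4 bytes figure comes from, what a reciprocal square root replaces in batch norm, and why a ref-counted clone is O(1).

```rust
use std::sync::Arc;

/// Vocabulary size quoted in the release notes (16,384 logits per step).
const VOCAB_SIZE: usize = 16_384;

/// Bytes copied to the host when the full f32 logits vector is transferred.
fn full_logits_bytes() -> usize {
    VOCAB_SIZE * std::mem::size_of::<f32>()
}

/// Bytes copied to the host when only the winning token id (one i32) is transferred.
fn argmax_bytes() -> usize {
    std::mem::size_of::<i32>()
}

/// CPU reference argmax: index of the largest logit, or None for empty input.
fn argmax(logits: &[f32]) -> Option<i32> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i as i32)
}

/// Batch-norm scale 1 / sqrt(var + eps): a single rsqrt replaces the
/// separate sqrt and reciprocal steps.
fn rsqrt_scale(var: f32, eps: f32) -> f32 {
    1.0 / (var + eps).sqrt()
}

/// Hypothetical weight tensor whose storage is shared by reference counting.
struct Weights {
    data: Arc<Vec<f32>>,
}

impl Weights {
    /// O(1) clone: bumps the reference count instead of copying the buffer.
    fn shallow_clone(&self) -> Self {
        Weights {
            data: Arc::clone(&self.data),
        }
    }
}
```

Here `full_logits_bytes()` evaluates to 16,384 × 4 = 65,536 bytes (64 KB), while `argmax_bytes()` is 4. A `shallow_clone()` of a `Weights` value points at the same buffer as the original, so its cost does not depend on the tensor size.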

Other changes

  • README updated with clearer title and streamlined intro
  • Added key learnings 13–19 to CLAUDE.md

v0.1.0

27 Mar 18:20

Cohere Transcribe RS v0.1.0

First release — pure-Rust CLI and OpenAI-compatible API server for the CohereLabs/cohere-transcribe-03-2026 speech recognition model.

Features

  • Two binaries: transcribe (CLI) and transcribe-server (HTTP API)
  • OpenAI Whisper API compatible — drop-in replacement, works with any OpenAI client
  • No Python or PyTorch at runtime — fully self-contained binaries
  • 14 languages: English, French, German, Spanish, Italian, Portuguese, Dutch, Polish, Greek, Arabic, Japanese, Chinese, Vietnamese, Korean
  • Multiple audio formats: WAV, FLAC, MP3, AAC, OGG (via symphonia)
  • Long audio support: files longer than 35 s are automatically split into overlapping chunks
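
Chunking with overlap can be sketched as follows. This is an illustrative sketch only: the function name, the 16 kHz sample rate, the 35 s window, and the 5 s overlap used in the example are assumptions, not values confirmed by the release notes beyond the 35 s threshold.

```rust
/// Split `total` samples into windows of at most `chunk` samples, where
/// consecutive windows share `overlap` samples.
/// Returns half-open (start, end) sample ranges.
fn chunk_ranges(total: usize, chunk: usize, overlap: usize) -> Vec<(usize, usize)> {
    assert!(chunk > 0 && overlap < chunk, "overlap must be smaller than chunk");
    let mut ranges = Vec::new();
    if total == 0 {
        return ranges;
    }
    // Each new window starts `step` samples after the previous one.
    let step = chunk - overlap;
    let mut start = 0;
    loop {
        let end = (start + chunk).min(total);
        ranges.push((start, end));
        if end == total {
            break;
        }
        start += step;
    }
    ranges
}

/// Convert a duration in whole seconds to a sample count.
fn seconds_to_samples(seconds: usize, sample_rate: usize) -> usize {
    seconds * sample_rate
}
```

Under these assumptions, an 80 s file at 16 kHz with 35 s windows and 5 s of overlap yields three ranges covering 0–35 s, 30–65 s, and 60–80 s, while a 30 s file yields a single range and is not split.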

Platforms

Asset                              Platform                   Backend
transcribe-linux-x86_64.zip        Linux x86_64 (CPU)         libtorch
transcribe-linux-x86_64-cuda.zip   Linux x86_64 (CUDA 12.6)   libtorch
transcribe-linux-aarch64.zip       Linux aarch64 (CPU, SVE)   libtorch
transcribe-linux-aarch64-cuda.zip  Linux aarch64 (CUDA 12.6)  libtorch
transcribe-macos-aarch64.zip       macOS Apple Silicon        MLX (Metal GPU)

Each zip contains both binaries, vocab.json, and platform-specific runtime libraries (libtorch/ on Linux, mlx.metallib on macOS). No LD_LIBRARY_PATH needed — RPATH is baked in.

Quick Start

See the README for setup instructions.