Skip to content

mattthewong/vox

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

36 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Vox

System-wide speech-to-text for macOS. Hold a hotkey, speak, release -- transcribed text appears wherever your cursor is. Runs entirely locally using whisper.cpp. No paid services, no rate limits.

vox setup

How it works

Hold hotkey --> Record mic --> Whisper transcribes --> Text pasted at cursor
  1. You run vox in a terminal (or as a background process)
  2. Switch to any app -- editor, browser, terminal, chat
  3. Hold your hotkey (e.g. Fn, Cmd+Shift), speak naturally
  4. Release -- text appears where your cursor is

You hear a gentle chime on start and stop.

Install

Quick start (recommended)

Requirements:

  • macOS
  • Homebrew
  • Go 1.24+
git clone https://github.com/mattthewong/vox.git
cd vox
make setup   # installs sox + whisper-cpp and downloads a default model if missing
make start   # starts whisper-server and runs vox

Manual setup (advanced)

brew install sox whisper-cpp
mkdir -p ~/.local/share/whisper-cpp
curl -L -o ~/.local/share/whisper-cpp/ggml-base.en.bin \
  "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin"

Build or install binary

make build      # outputs bin/vox
make install    # installs to /usr/local/bin/vox

Start manually

whisper-server --host 127.0.0.1 --port 2022 \
  --model ~/.local/share/whisper-cpp/ggml-base.en.bin
vox

macOS permissions

On first run, macOS will prompt for two permissions. Grant them to your terminal app (Terminal, iTerm2, Ghostty, etc.):

  • Microphone -- System Settings > Privacy & Security > Microphone
  • Accessibility -- System Settings > Privacy & Security > Accessibility

Configuration

All via environment variables:

Variable Default Description
VOX_HOTKEY option+space Hotkey to trigger recording. Comma-separated for multiple.
WHISPER_URL http://127.0.0.1:2022 Whisper server URL
VOX_HOLD_TO_TALK true true = hold to record, false = toggle on/off
VOX_LANGUAGE (auto-detect) BCP-47 language code (e.g. en, es)
VOX_VERBOSE false Debug logging

Hotkey formats

VOX_HOTKEY="fn"                 # Fn / Globe key
VOX_HOTKEY="cmd+shift"          # Modifier-only (no extra key needed)
VOX_HOTKEY="option+space"       # Modifier + key
VOX_HOTKEY="ctrl+shift+d"       # Multiple modifiers + key
VOX_HOTKEY="fn,cmd+shift"       # Multiple hotkeys (either triggers)

Available modifiers: ctrl, shift, option/alt, cmd/command Available keys: a-z, 0-9, f1-f20, space, return, escape, tab, delete, arrow keys

Architecture

cmd/vox/main.go          -- Entrypoint, hotkey event loop, orchestration
internal/hotkey/          -- CGEventTap-based global hotkey (supports modifier-only, fn, modifier+key)
  hotkey_darwin.go        -- Go listener with keydown/keyup channels
  bridge.c                -- C event tap callback
internal/audio/           -- Mic recording via ffmpeg/sox subprocess
  recorder.go             -- Start/stop recording, WAV output
  sound.go                -- Embedded chime sounds (start/stop)
internal/transcribe/      -- Whisper HTTP client
  client.go               -- Multipart upload, auto-detects /inference vs /v1/audio/transcriptions
internal/inject/          -- Text injection into focused app
  paste_darwin.go         -- pbcopy + CGEvent Cmd+V (works in any app)
internal/config/          -- Env var config + hotkey string parsing

Development

make build        # Build binary
make test         # Run all tests
make test-short   # Skip integration tests
make lint         # go vet
make fmt          # gofmt
make run          # Build and run

Why

I was using Whisper Flow for speech-to-text but kept hitting rate limits on their free plan. Vox does the same thing -- system-wide dictation with a hold-to-talk hotkey -- but runs entirely on your machine with no external dependencies.

License

MIT

About

System-wide speech-to-text for macOS. Hold a hotkey, speak, text appears at your cursor. Local Whisper, no paid services.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors