Vox

System-wide speech-to-text for macOS. Hold a hotkey, speak, release -- transcribed text appears wherever your cursor is. Runs entirely locally using whisper.cpp. No paid services, no rate limits.

How it works

Hold hotkey --> Record mic --> Whisper transcribes --> Text pasted at cursor

1. Run vox in a terminal (or as a background process).
2. Switch to any app -- editor, browser, terminal, chat.
3. Hold your hotkey (e.g. Fn, Cmd+Shift) and speak naturally.
4. Release -- text appears where your cursor is.

You hear a gentle chime on start and stop.
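
The hold-and-release flow above, together with the toggle mode described under Configuration, can be pictured as a small state machine. The following is a minimal sketch, not Vox's actual event loop: the `Trigger`, `Event`, and `Action` names are hypothetical, and ignoring key auto-repeat is an assumption about how a listener would want to behave.

```go
package main

import "fmt"

// Event is a hotkey transition as a listener might report it (hypothetical type).
type Event int

const (
	KeyDown Event = iota
	KeyUp
)

// Action is what the event loop should do in response to an Event.
type Action int

const (
	None Action = iota
	StartRecording
	StopAndTranscribe
)

// Trigger tracks hotkey and recording state. Hypothetical, for illustration only.
type Trigger struct {
	HoldToTalk bool // mirrors VOX_HOLD_TO_TALK
	pressed    bool // true while the hotkey is physically held
	recording  bool
}

// Handle maps one hotkey event to the action the event loop should take.
func (t *Trigger) Handle(e Event) Action {
	switch e {
	case KeyDown:
		if t.pressed {
			return None // auto-repeat while the key is held
		}
		t.pressed = true
		if !t.recording {
			t.recording = true
			return StartRecording
		}
		if !t.HoldToTalk {
			t.recording = false
			return StopAndTranscribe // second press ends a toggle-mode recording
		}
	case KeyUp:
		t.pressed = false
		if t.HoldToTalk && t.recording {
			t.recording = false
			return StopAndTranscribe // release ends a hold-to-talk recording
		}
	}
	return None
}

func main() {
	names := map[Action]string{
		None:              "none",
		StartRecording:    "start recording",
		StopAndTranscribe: "stop, transcribe, paste",
	}
	t := &Trigger{HoldToTalk: true}
	for _, e := range []Event{KeyDown, KeyDown, KeyUp} {
		fmt.Println(names[t.Handle(e)])
	}
}
```

Keeping the decision logic separate from recording and pasting makes it easy to test without a microphone or an event tap.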

Install

Quick start (recommended)

Requirements:

macOS
Homebrew
Go 1.24+

git clone https://github.com/mattthewong/vox.git
cd vox
make setup   # installs sox + whisper-cpp and downloads a default model if missing
make start   # starts whisper-server and runs vox

Manual setup (advanced)

brew install sox whisper-cpp
mkdir -p ~/.local/share/whisper-cpp
curl -L -o ~/.local/share/whisper-cpp/ggml-base.en.bin \
  "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin"

Build or install binary

make build      # outputs bin/vox
make install    # installs to /usr/local/bin/vox

Start manually

whisper-server --host 127.0.0.1 --port 2022 \
  --model ~/.local/share/whisper-cpp/ggml-base.en.bin
vox

macOS permissions

On first run, macOS will prompt for two permissions. Grant them to your terminal app (Terminal, iTerm2, Ghostty, etc.):

Microphone -- System Settings > Privacy & Security > Microphone
Accessibility -- System Settings > Privacy & Security > Accessibility

Configuration

All configuration is done through environment variables:

Variable	Default	Description
`VOX_HOTKEY`	`option+space`	Hotkey to trigger recording. Comma-separated for multiple.
`WHISPER_URL`	`http://127.0.0.1:2022`	Whisper server URL
`VOX_HOLD_TO_TALK`	`true`	`true` = hold to record, `false` = toggle on/off
`VOX_LANGUAGE`	(auto-detect)	BCP-47 language code (e.g. `en`, `es`)
`VOX_VERBOSE`	`false`	Debug logging
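
As an illustration of how these variables and defaults could be read, here is a minimal sketch. It is not the code in `internal/config`: the `Config` struct, `LoadConfig`, and `envOr` are hypothetical names, and treating an empty variable as unset is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Config mirrors the variables in the table above. Field names are hypothetical.
type Config struct {
	Hotkeys    []string // raw hotkey strings, one per comma-separated entry
	WhisperURL string
	HoldToTalk bool
	Language   string // empty means auto-detect
	Verbose    bool
}

// envOr returns the trimmed value of key, or def when it is unset or empty.
func envOr(getenv func(string) string, key, def string) string {
	if v := strings.TrimSpace(getenv(key)); v != "" {
		return v
	}
	return def
}

// LoadConfig reads settings through getenv; pass os.Getenv outside of tests.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		WhisperURL: envOr(getenv, "WHISPER_URL", "http://127.0.0.1:2022"),
		Language:   envOr(getenv, "VOX_LANGUAGE", ""),
	}
	for _, h := range strings.Split(envOr(getenv, "VOX_HOTKEY", "option+space"), ",") {
		if h = strings.TrimSpace(h); h != "" {
			cfg.Hotkeys = append(cfg.Hotkeys, strings.ToLower(h))
		}
	}
	var err error
	if cfg.HoldToTalk, err = strconv.ParseBool(envOr(getenv, "VOX_HOLD_TO_TALK", "true")); err != nil {
		return Config{}, fmt.Errorf("VOX_HOLD_TO_TALK: %w", err)
	}
	if cfg.Verbose, err = strconv.ParseBool(envOr(getenv, "VOX_VERBOSE", "false")); err != nil {
		return Config{}, fmt.Errorf("VOX_VERBOSE: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, lets tests supply a map instead of mutating the process environment.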

Hotkey formats

VOX_HOTKEY="fn"                 # Fn / Globe key
VOX_HOTKEY="cmd+shift"          # Modifier-only (no extra key needed)
VOX_HOTKEY="option+space"       # Modifier + key
VOX_HOTKEY="ctrl+shift+d"       # Multiple modifiers + key
VOX_HOTKEY="fn,cmd+shift"       # Multiple hotkeys (either triggers)

Available modifiers: ctrl, shift, option/alt, cmd/command
Available keys: a-z, 0-9, f1-f20, space, return, escape, tab, delete, arrow keys
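
To make the grammar concrete, here is a rough sketch of a parser for these strings. It is an illustration only, not the parser in `internal/config`. Several details are assumptions: the `Hotkey` type and function names are hypothetical, `fn` is treated like a modifier, the arrow keys are assumed to be spelled `left`, `right`, `up`, and `down`, and a bare key with no modifier is accepted.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Hotkey is a hypothetical parsed form: canonical modifiers plus an optional key.
type Hotkey struct {
	Modifiers []string // in the fixed order given by modifierOrder
	Key       string   // empty for modifier-only hotkeys such as "cmd+shift"
}

// modifierAliases maps accepted spellings to canonical names.
var modifierAliases = map[string]string{
	"ctrl": "ctrl", "shift": "shift", "option": "option", "alt": "option",
	"cmd": "cmd", "command": "cmd", "fn": "fn", // fn as a modifier is an assumption
}

var modifierOrder = []string{"ctrl", "shift", "option", "cmd", "fn"}

var namedKeys = map[string]bool{
	"space": true, "return": true, "escape": true, "tab": true, "delete": true,
	"left": true, "right": true, "up": true, "down": true, // assumed arrow-key names
}

// isKey reports whether tok is a-z, 0-9, f1-f20, or one of the named keys.
func isKey(tok string) bool {
	if namedKeys[tok] {
		return true
	}
	if len(tok) == 1 {
		c := tok[0]
		return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
	}
	if strings.HasPrefix(tok, "f") {
		n, err := strconv.Atoi(tok[1:])
		return err == nil && n >= 1 && n <= 20 && strconv.Itoa(n) == tok[1:]
	}
	return false
}

// ParseHotkey parses a single combination such as "ctrl+shift+d".
func ParseHotkey(s string) (Hotkey, error) {
	var hk Hotkey
	seen := map[string]bool{}
	for _, tok := range strings.Split(s, "+") {
		tok = strings.ToLower(strings.TrimSpace(tok))
		if mod, ok := modifierAliases[tok]; ok {
			seen[mod] = true
		} else if isKey(tok) && hk.Key == "" {
			hk.Key = tok
		} else {
			return Hotkey{}, fmt.Errorf("invalid hotkey %q: unexpected token %q", s, tok)
		}
	}
	for _, m := range modifierOrder {
		if seen[m] {
			hk.Modifiers = append(hk.Modifiers, m)
		}
	}
	return hk, nil
}

// ParseHotkeys parses a comma-separated list such as "fn,cmd+shift".
func ParseHotkeys(s string) ([]Hotkey, error) {
	var out []Hotkey
	for _, part := range strings.Split(s, ",") {
		hk, err := ParseHotkey(part)
		if err != nil {
			return nil, err
		}
		out = append(out, hk)
	}
	return out, nil
}

func main() {
	hks, err := ParseHotkeys("fn,cmd+shift")
	fmt.Println(hks, err)
}
```

Normalizing aliases and modifier order up front means `alt+space` and `option+space` compare equal later, when matching against incoming key events.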

Architecture

cmd/vox/main.go          -- Entrypoint, hotkey event loop, orchestration
internal/hotkey/          -- CGEventTap-based global hotkey (supports modifier-only, fn, modifier+key)
  hotkey_darwin.go        -- Go listener with keydown/keyup channels
  bridge.c                -- C event tap callback
internal/audio/           -- Mic recording via ffmpeg/sox subprocess
  recorder.go             -- Start/stop recording, WAV output
  sound.go                -- Embedded chime sounds (start/stop)
internal/transcribe/      -- Whisper HTTP client
  client.go               -- Multipart upload, auto-detects /inference vs /v1/audio/transcriptions
internal/inject/          -- Text injection into focused app
  paste_darwin.go         -- pbcopy + CGEvent Cmd+V (works in any app)
internal/config/          -- Env var config + hotkey string parsing
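
For a sense of what the multipart upload in `internal/transcribe` might look like, here is a minimal sketch against whisper-server's `/inference` endpoint. It is not the project's client: the function names are hypothetical, it skips the endpoint auto-detection mentioned above, and the `file`, `response_format`, and `language` form fields and the `{"text": ...}` JSON response follow the whisper.cpp server example as I understand it, so treat them as assumptions to verify against your server version.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"mime/multipart"
	"net/http"
	"os"
	"strings"
)

// NewInferenceRequest builds a multipart POST carrying WAV audio.
// An empty language leaves detection to the server (assumed behavior).
func NewInferenceRequest(baseURL string, wav []byte, language string) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, err := w.CreateFormFile("file", "recording.wav")
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(wav); err != nil {
		return nil, err
	}
	if err := w.WriteField("response_format", "json"); err != nil {
		return nil, err
	}
	if language != "" {
		if err := w.WriteField("language", language); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil { // writes the closing boundary
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/inference"
	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

// Transcribe sends the request and extracts the "text" field from the JSON reply.
func Transcribe(client *http.Client, req *http.Request) (string, error) {
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("whisper server returned %s", resp.Status)
	}
	var out struct {
		Text string `json:"text"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return strings.TrimSpace(out.Text), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: transcribe <file.wav>")
		return
	}
	wav, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	req, err := NewInferenceRequest("http://127.0.0.1:2022", wav, "en")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	text, err := Transcribe(http.DefaultClient, req)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(text)
}
```

With whisper-server running as shown under "Start manually", passing a WAV file path as the only argument should print the transcription.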

Development

make build        # Build binary
make test         # Run all tests
make test-short   # Skip integration tests
make lint         # go vet
make fmt          # gofmt
make run          # Build and run

Why

I was using Whisper Flow for speech-to-text but kept hitting rate limits on their free plan. Vox does the same thing -- system-wide dictation with a hold-to-talk hotkey -- but runs entirely on your machine: no cloud service, only the locally installed sox and whisper-cpp.

License

MIT
