This file provides guidance for coding agents working in this repo.
Hex is a macOS menu bar application for on‑device voice‑to‑text. It supports Whisper (Core ML via WhisperKit) and Parakeet TDT v3 (Core ML via FluidAudio). Users activate transcription with hotkeys; text can be auto‑pasted into the active app.
# Build the app
xcodebuild -scheme Hex -configuration Release
# Run tests (must be run from HexCore directory for unit tests)
cd HexCore && swift test
# Or run all tests via Xcode
xcodebuild test -scheme Hex
# Open in Xcode (recommended for development)
open Hex.xcodeproj

The app uses The Composable Architecture (TCA) for state management. Key architectural components:
Features:
- `AppFeature`: Root feature coordinating the app lifecycle
- `TranscriptionFeature`: Core recording and transcription logic
- `SettingsFeature`: User preferences and configuration
- `HistoryFeature`: Transcription history management

Clients:
- `TranscriptionClient`: WhisperKit integration for ML transcription
- `RecordingClient`: AVAudioRecorder wrapper for audio capture
- `PasteboardClient`: Clipboard operations
- `KeyEventMonitorClient`: Global hotkey monitoring via Sauce framework

Dependencies:
- WhisperKit: Core ML transcription (tracking main branch)
- FluidAudio (Parakeet): Core ML ASR (multilingual) default model
- Sauce: Keyboard event monitoring
- Sparkle: Auto-updates (feed: https://hex-updates.s3.amazonaws.com/appcast.xml)
- Swift Composable Architecture: State management
- Inject: Hot reloading for SwiftUI

- Hotkey Recording Modes: The app supports both press-and-hold and double-tap recording modes, implemented in `HotKeyProcessor.swift`. See `docs/hotkey-semantics.md` for detailed behavior specifications, including:
  - Modifier-only hotkeys (e.g., Option) use a 0.3s threshold to prevent accidental triggers from OS shortcuts
  - Regular hotkeys (e.g., Cmd+A) use the user's `minimumKeyTime` setting (default 0.2s)
  - Mouse clicks and extra modifiers discard the recording if they occur within the threshold; after the threshold they are ignored
  - Only ESC cancels recordings after the threshold
- Model Management: Models are managed by `ModelDownloadFeature`. Curated defaults live in `Hex/Resources/Data/models.json`. The Settings UI shows a compact opinionated list (Parakeet + three Whisper sizes). No dropdowns.
- Sound Effects: Audio feedback is provided via `SoundEffect.swift` using files in `Resources/Audio/`
- Window Management: Uses an `InvisibleWindow` for the transcription indicator overlay
- Permissions: Requires audio input and automation entitlements (see `Hex.entitlements`)
- Logging: All diagnostics should use the unified logging helper `HexLog` (`HexCore/Sources/HexCore/Logging.swift`). Pick an existing category (e.g., `.transcription`, `.recording`, `.settings`) or add a new case so Console predicates stay consistent. Avoid `print` and prefer privacy annotations (`, privacy: .private`) for anything potentially sensitive like transcript text or file paths.
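
The hotkey threshold rules listed above can be sketched as a small decision function. This is an illustrative TypeScript sketch (the language of the repo's release tooling), not the logic in `HotKeyProcessor.swift`. The names `HotkeyKind`, `thresholdSeconds`, and `reactToInterruption` are hypothetical, and treating ESC inside the threshold as a discard is an assumption; `docs/hotkey-semantics.md` remains the authoritative spec.

```typescript
// Hypothetical model of the documented thresholds; not the real HotKeyProcessor.
type HotkeyKind = "modifierOnly" | "regular";
type Interruption = "escape" | "mouseClick" | "extraModifier";
type Reaction = "discard" | "cancel" | "ignore";

const MODIFIER_ONLY_THRESHOLD = 0.3; // seconds, fixed for modifier-only hotkeys
const DEFAULT_MINIMUM_KEY_TIME = 0.2; // seconds, default of the minimumKeyTime setting

// Returns the threshold that applies to a given kind of hotkey.
function thresholdSeconds(
  kind: HotkeyKind,
  minimumKeyTime: number = DEFAULT_MINIMUM_KEY_TIME,
): number {
  return kind === "modifierOnly" ? MODIFIER_ONLY_THRESHOLD : minimumKeyTime;
}

// Decides what an interruption does, given how long the hotkey has been held.
function reactToInterruption(
  kind: HotkeyKind,
  heldSeconds: number,
  interruption: Interruption,
  minimumKeyTime: number = DEFAULT_MINIMUM_KEY_TIME,
): Reaction {
  if (heldSeconds < thresholdSeconds(kind, minimumKeyTime)) {
    // Within the threshold: the press is treated as accidental.
    // Assumption: ESC is handled like any other interruption here.
    return "discard";
  }
  // After the threshold only ESC cancels; everything else is ignored.
  return interruption === "escape" ? "cancel" : "ignore";
}
```

For example, `reactToInterruption("regular", 1.0, "mouseClick")` yields `"ignore"`, while the same click 0.1s into the press yields `"discard"`.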
- Default: Parakeet TDT v3 (multilingual) via FluidAudio
- Additional curated: Whisper Small (Tiny), Whisper Medium (Base), Whisper Large v3
- Note: Distil‑Whisper is English‑only and not shown by default
- WhisperKit models
  - `~/Library/Application Support/com.kitlangton.Hex/models/argmaxinc/whisperkit-coreml/<model>`
- Parakeet (FluidAudio)
  - We set `XDG_CACHE_HOME` on launch so Parakeet caches under the app container:
    - `~/Library/Containers/com.kitlangton.Hex/Data/Library/Application Support/FluidAudio/Models/parakeet-tdt-0.6b-v3-coreml`
  - Legacy `~/.cache/fluidaudio/Models/…` is not visible to the sandbox; re‑download or import.
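
As a quick reference, the storage locations above can be expressed as path builders. This is a minimal TypeScript sketch for tooling or scripts; the function names are hypothetical, `home` stands for whatever `~` expands to, and the app itself resolves these paths in Swift.

```typescript
// Hypothetical helpers mirroring the documented storage layout.
const BUNDLE_ID = "com.kitlangton.Hex";

// Directory for a WhisperKit model; `model` fills the documented <model> placeholder.
function whisperKitModelDir(home: string, model: string): string {
  return `${home}/Library/Application Support/${BUNDLE_ID}/models/argmaxinc/whisperkit-coreml/${model}`;
}

// Directory where Parakeet is cached inside the app container.
function parakeetModelDir(home: string): string {
  return `${home}/Library/Containers/${BUNDLE_ID}/Data/Library/Application Support/FluidAudio/Models/parakeet-tdt-0.6b-v3-coreml`;
}

// True for paths under the legacy cache, which the sandbox cannot see.
function isLegacyParakeetPath(home: string, path: string): boolean {
  return path.startsWith(`${home}/.cache/fluidaudio/Models/`);
}
```

A script could use `isLegacyParakeetPath` to warn that a model found in the legacy cache needs to be re-downloaded or imported.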
- WhisperKit: native progress
- Parakeet: best‑effort progress by polling the model directory size during download
- Availability detection scans both `Application Support/FluidAudio/Models` and our app cache path
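
The best-effort Parakeet progress described above amounts to comparing the model directory's current size with an expected total. The sketch below illustrates that idea in TypeScript under stated assumptions: the expected size is known up front, and `estimateProgress` and `makeProgressTracker` are hypothetical names rather than functions from the app.

```typescript
// Converts a directory-size reading into a progress fraction between 0 and 1.
function estimateProgress(currentBytes: number, expectedBytes: number): number {
  if (expectedBytes <= 0) return 0; // unknown total: report no progress
  return Math.min(1, Math.max(0, currentBytes / expectedBytes));
}

// Returns an update function whose reported progress never moves backwards,
// since directory size can dip briefly (for example when temp files are moved).
function makeProgressTracker(expectedBytes: number): (currentBytes: number) => number {
  let last = 0;
  return (currentBytes: number): number => {
    last = Math.max(last, estimateProgress(currentBytes, expectedBytes));
    return last;
  };
}
```

A poller would call the returned function on a timer with the latest directory size and stop once it returns `1` or the download finishes.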
- macOS 14+, Xcode 15+
- WhisperKit: `https://github.com/argmaxinc/WhisperKit`
- FluidAudio: `https://github.com/FluidInference/FluidAudio.git` (link `FluidAudio` to Hex target)
- `com.apple.security.app-sandbox = true`
- `com.apple.security.network.client = true` (HF downloads)
- `com.apple.security.files.user-selected.read-write = true` (optional import)
- `com.apple.security.automation.apple-events = true` (media control)
Set at app launch and logged:
XDG_CACHE_HOME = ~/Library/Containers/com.kitlangton.Hex/Data/Library/Application Support/com.kitlangton.Hex/cache
FluidAudio models reside under Application Support/FluidAudio/Models.
- Settings → Transcription Model shows a compact list with radio selection, accuracy/speed dots, size on right, and trailing menu / download‑check icon.
- Context menu offers Show in Finder / Delete.
- Repeated mic prompts during debug: ensure Debug signing uses "Apple Development" so TCC sticks
- Sandbox network errors (‑1003): add `com.apple.security.network.client = true` (already set)
- Parakeet not detected: ensure it resides under the container path above; downloading from Hex places it correctly.
- Always add a changeset: Any feature, UX change, or bug fix that ships to users must come with a `.changeset/*.md` fragment. The summary should mention the user-facing impact plus the GitHub issue/PR number (for example, "Improve Fn hotkey stability (#89)").
- Use non-interactive changeset creation: AI agents should use the non-interactive script:

      bun run changeset:add-ai patch "Your summary here"
      bun run changeset:add-ai minor "Add new feature"
      bun run changeset:add-ai major "Breaking change"

- Only create changesets, don't process them: Agents should only create changeset fragments. The release tool is responsible for running `changeset version` to collect changesets into `CHANGELOG.md` and syncing to `Hex/Resources/changelog.md`.
- Reference GitHub issues: When a change addresses a filed issue, link it in code comments and the changeset entry (`(#123)`) so release notes and Sparkle updates point users back to the discussion. If the work should close an issue, include "Fixes #123" (or "Closes #123") in the commit or PR description so GitHub auto-closes it once merged.
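
An agent can sanity-check a changeset summary before invoking the script. The following TypeScript sketch is illustrative only: `hasIssueReference` and `changesetArgs` are hypothetical helpers, not part of the repo, and the argument list simply mirrors the `bun run changeset:add-ai` invocations shown above.

```typescript
type Bump = "patch" | "minor" | "major";

// True when the summary contains an issue/PR reference such as "(#89)".
function hasIssueReference(summary: string): boolean {
  return /\(#\d+\)/.test(summary);
}

// Builds an argument vector for the documented script. Passing the summary as
// a single argument (no shell) avoids quoting problems.
function changesetArgs(bump: Bump, summary: string): string[] {
  return ["bun", "run", "changeset:add-ai", bump, summary];
}
```

If `hasIssueReference` returns `false` for a change that tracks a filed issue, add the reference to the summary before creating the fragment.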
- Use a concise, descriptive subject line that captures the user-facing impact (roughly 50–70 characters).
- Follow up with as much context as needed in the body. Include the rationale, notable tradeoffs, relevant logs, or reproduction steps—future debugging benefits from having the full story directly in git history.
- Reference any related GitHub issues in the body if the change tracks ongoing work.
Releases are automated via a local CLI tool that handles building, signing, notarizing, and uploading.
- AWS credentials must be set (for S3 uploads):

      export AWS_ACCESS_KEY_ID=...
      export AWS_SECRET_ACCESS_KEY=...

- Notarization credentials stored in keychain (one-time setup):

      xcrun notarytool store-credentials "AC_PASSWORD"

- Dependencies installed at project root and in tools:

      bun install                 # project root (for changesets)
      cd tools && bun install     # tools dependencies

- Ensure all changes are committed - the release tool requires a clean working tree
- Ensure changesets exist - any user-facing change should have a `.changeset/*.md` file:

      bun run changeset:add-ai patch "Fix microphone selection"

- Run the release command from project root:

      bun run tools/src/cli.ts release
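
Before running the release command it can help to confirm the AWS variables are present. This is a minimal TypeScript sketch of such a pre-flight check; `missingEnv` is a hypothetical helper and is not claimed to exist in the release tool.

```typescript
// Environment variables the release prerequisites name for S3 uploads.
const REQUIRED_ENV = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"] as const;

// Returns the names of required variables that are unset or empty.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]);
}
```

In a Bun or Node script this would be called as `missingEnv(process.env)`; a non-empty result means the S3 upload step is likely to fail.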
- Checks for clean working tree
- Finds pending changesets and applies them (bumps version in `package.json`)
- Syncs changelog to `Hex/Resources/changelog.md`
- Updates `Info.plist` and `project.pbxproj` with new version
- Increments build number
- Cleans DerivedData and archives with xcodebuild
- Exports and signs with Developer ID
- Notarizes app with Apple
- Creates and signs DMG
- Notarizes DMG
- Generates Sparkle appcast
- Uploads to S3 (versioned DMG + `hex-latest.dmg` + appcast.xml)
- Commits version changes, creates git tag, pushes
- Creates GitHub release with DMG and ZIP attachments
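
The release steps above run in a fixed order. The sketch below shows one way such an ordered pipeline can be modeled in TypeScript, under the assumption that the tool stops at the first failing step. `Step` and `runPipeline` are hypothetical names, and the real CLI in `tools/src/cli.ts` may be structured differently (for example, with asynchronous steps).

```typescript
// A named step that reports success (true) or failure (false).
type Step = { name: string; run: () => boolean };

// Runs steps in order and stops at the first failure, so later steps
// (such as uploading or tagging) never run after an earlier one fails.
function runPipeline(steps: Step[]): { completed: string[]; failedAt: string | null } {
  const completed: string[] = [];
  for (const step of steps) {
    if (!step.run()) {
      return { completed, failedAt: step.name };
    }
    completed.push(step.name);
  }
  return { completed, failedAt: null };
}
```

Reporting which steps completed makes it easier to see where a failed release stopped before re-running it.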
If no pending changesets are found, the tool will prompt you to either:
- Stop and create a changeset (recommended)
- Continue with manual version bump (useful for re-running failed releases)
Each release produces:
- `Hex-{version}.dmg` - Signed, notarized DMG
- `Hex-{version}.zip` - For Homebrew cask
- `hex-latest.dmg` - Always points to latest
- `appcast.xml` - Sparkle update feed
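
The artifact naming scheme above is simple enough to express directly. This TypeScript sketch uses a hypothetical `releaseArtifacts` helper, which could be used to verify that a release produced every expected file.

```typescript
// Returns the file names a release is documented to produce for `version`.
function releaseArtifacts(version: string): string[] {
  return [
    `Hex-${version}.dmg`, // signed, notarized DMG
    `Hex-${version}.zip`, // for the Homebrew cask
    "hex-latest.dmg",     // always points to the latest release
    "appcast.xml",        // Sparkle update feed
  ];
}
```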
- "Working tree is not clean": Commit or stash all changes before releasing
- Notarization fails: Check Apple ID credentials and app-specific password
- S3 upload fails: Verify AWS credentials and bucket permissions
- Build fails: Ensure Xcode 16+ and valid code signing certificates