An intelligence-first voice capture and analysis CLI for engineers.
Ten-Second Tom captures voice recordings and text notes, transcribes them locally with Whisper, analyzes sentiment and context with Claude AI (or local models), and provides semantic search across all your entries. Privacy-first — all data stays on your machine.
- Voice recording with real-time streaming transcription
- Text notes with optional voice dictation
- AI-powered analysis using Claude or local models (Ollama)
- Semantic search across all entries with vector embeddings
- Local-first architecture — your data never leaves your machine
- Node.js 20+
- SoX (for microphone capture): `brew install sox` on macOS, `choco install sox` on Windows
- Anthropic API key (for cloud analysis with Claude) OR Ollama (for local analysis)
- Whisper model (automatically downloaded on first use)
Install globally with npm:

```sh
npm install -g ten-second-tom
```

Or with pnpm:

```sh
pnpm add -g ten-second-tom
```

To run from source:

```sh
git clone https://github.com/sirkirby/ten-second-tom.git
cd ten-second-tom
pnpm install
pnpm build
pnpm -C packages/cli start
```

```sh
# Configure your LLM, embedding, and STT preferences
tom setup

# Record audio with live transcription
tom record

# Create a text note (type or dictate)
tom note

# Search your entries
tom search

# View help for any command
tom --help
tom record --help
```

`tom setup`: First-run configuration wizard. Choose your LLM provider (cloud Claude via the Anthropic API or a local model via Ollama), configure embeddings, and download the Whisper STT model.
`tom record`: Record audio from your microphone. Transcription happens in real time. After recording, Tom analyzes the entry for sentiment and context, creates a vector embedding, and saves everything to your local database.
`tom note`: Create a text-based note. You can type freely or use voice dictation (speech-to-text). Notes go through the same analysis and embedding pipeline as recordings.
`tom search`: Search your entries using natural language. Uses semantic search (vector similarity) when embeddings are enabled, and falls back to keyword search otherwise. Results show relevant entries ranked by similarity.
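The ranking strategy described above can be sketched in TypeScript. This is a minimal illustration, not Tom's actual implementation; the `Entry` shape and `searchEntries` function are assumptions made for the example:

```typescript
interface Entry {
  id: number;
  text: string;
  embedding?: number[]; // present only when an embedding provider is configured
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function searchEntries(
  entries: Entry[],
  query: string,
  queryEmbedding?: number[]
): Entry[] {
  if (queryEmbedding && entries.every((e) => e.embedding)) {
    // Semantic path: rank all entries by vector similarity, highest first.
    return [...entries].sort(
      (x, y) =>
        cosine(y.embedding!, queryEmbedding) - cosine(x.embedding!, queryEmbedding)
    );
  }
  // Fallback path: simple case-insensitive keyword match.
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return entries.filter((e) =>
    terms.some((t) => e.text.toLowerCase().includes(t))
  );
}
```

With no embeddings available, `searchEntries(entries, "deploy")` takes the keyword path; with a query embedding, every entry is scored and sorted by similarity.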
Tom stores all configuration and data in `~/.tom/`:

```
~/.tom/
├── config.json   # LLM, embedding, and STT settings
├── data.db       # SQLite database with entries and embeddings
└── recordings/   # Audio files from `tom record`
```
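The exact schema of `config.json` is not documented here; the fragment below is purely illustrative, assuming the wizard persists one section per provider choice, and the field names and model identifiers are hypothetical:

```json
{
  "llm": { "provider": "ollama", "model": "qwen2.5:7b" },
  "embedding": { "provider": "ollama" },
  "stt": { "model": "whisper-base" }
}
```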
Run `tom setup` to configure your LLM provider and embedding strategy. The setup wizard walks through:
- LLM Provider: Claude via Anthropic API (cloud) or local model via Ollama
- Embedding Provider: Ollama (local vectors), Voyage AI (cloud), or keyword-only search
- STT Model: Whisper (downloaded automatically)
For cloud analysis, you'll need an Anthropic API key. For local analysis, install Ollama and pull a model (e.g., `ollama pull qwen2.5:7b`).
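For the local path, a request to Ollama's `/api/generate` endpoint might look like the sketch below. The endpoint and body shape follow Ollama's documented HTTP API, but the prompt wording and the helper name are assumptions for illustration, not Tom's actual code:

```typescript
interface GeneratePayload {
  model: string;
  prompt: string;
  stream: boolean;
}

// Build the request body for a sentiment/context analysis of one entry.
function buildAnalysisRequest(
  entryText: string,
  model = "qwen2.5:7b"
): GeneratePayload {
  return {
    model,
    prompt: `Summarize the sentiment and context of this note:\n\n${entryText}`,
    stream: false, // ask for a single JSON object instead of a token stream
  };
}

// Sending it (requires a running Ollama server on the default port):
// const res = await fetch("http://localhost:11434/api/generate", {
//   method: "POST",
//   body: JSON.stringify(buildAnalysisRequest("Shipped the release today.")),
// });
// const { response } = await res.json();
```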
```sh
pnpm install

pnpm test          # Run all tests once
pnpm test:watch    # Watch mode

pnpm build         # Build all packages

pnpm lint          # Check linting
pnpm format        # Format code
pnpm format:check  # Verify formatting

pnpm -C packages/cli dev   # Watch + rebuild CLI
pnpm -C packages/core dev  # Watch + rebuild core
```

Tom follows a monorepo structure:

- `packages/cli/`: Terminal UI (Ink + React) and command routing (Commander)
- `packages/core/`: Business logic: transcription, analysis, embedding, search, and storage
See `CLAUDE.md` for detailed architecture and coding standards.
MIT. See LICENSE for details.