Whisper Typing

A background speech-to-text application for Windows that runs locally. It listens for a global hotkey to record your voice, transcribes it in real time using faster-whisper, and types the result into your active window with a natural, human-like rhythm and pace.

Features

Real-Time Transcription: See your words appear in the preview area instantly as you speak.
Human-like Typing: Simulates natural typing with variable speed, random jitter, and intelligent pauses after punctuation.
Global Hotkeys: Control recording and typing from any application.
- Record/Stop: F8 (default)
- Confirm Type: F9 (default)
- Improve Text: F10 (default) - Uses Gemini AI to fix grammar and refine text.
Window Refocus: Automatically switches back to your target window after recording stops (configurable).
Safe Focus: Automatically stops typing if you switch away from the target window.
Secure Storage: Sensitive API keys (Gemini) are kept in a local .env file, separate from config.json.
TUI Management: A sleek terminal interface for monitoring logs, previewing text, and configuring settings.
Microphone Selection: Choose your preferred input device directly from the configuration screen.
Local Processing: Audio is processed locally using faster-whisper (accelerated with CUDA if available).
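
To illustrate the Human-like Typing feature, here is a minimal sketch of how per-keystroke delays could be derived from a words-per-minute setting. It is not the project's actual implementation: the function name `typing_delays`, the `jitter` and `punctuation_pause` parameters, and the five-characters-per-word convention are all assumptions made for the example.

```python
import random


def typing_delays(text, wpm=350, jitter=0.3, punctuation_pause=0.25, rng=None):
    """Return one delay (in seconds) per character of `text`.

    Hypothetical helper: names and default values are illustrative only.
    """
    if rng is None:
        rng = random.Random()
    # Assumed convention: one "word" counts as five characters.
    base = 60.0 / (wpm * 5)
    delays = []
    for ch in text:
        # Vary each keystroke by up to +/- `jitter` of the base delay.
        delay = base * (1.0 + rng.uniform(-jitter, jitter))
        # Add a longer pause after punctuation.
        if ch in ".,;:!?":
            delay += punctuation_pause
        delays.append(delay)
    return delays
```

A typing loop would sleep for each delay before sending the matching character, so a higher `wpm` shortens every keystroke while punctuation still produces a noticeable pause.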

Prerequisites

Python 3.13+
NVIDIA GPU (Recommended): Enables CUDA-accelerated transcription. CPU fallback is supported but slower.
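
The CUDA-with-CPU-fallback behaviour can be sketched as follows. This is an illustration, not the application's own code: `pick_device` and `load_model` are hypothetical helpers, and the example assumes the `faster-whisper` package (and its CTranslate2 backend) is installed. The model name `"base.en"` is one of the standard faster-whisper size names and is used here only as an example.

```python
def pick_device():
    """Return "cuda" if a CUDA device appears usable, otherwise "cpu"."""
    try:
        # CTranslate2 is the inference backend used by faster-whisper.
        import ctranslate2

        return "cuda" if ctranslate2.get_cuda_device_count() > 0 else "cpu"
    except Exception:
        # Missing package or broken driver: fall back to the CPU.
        return "cpu"


def load_model(model_name="base.en", device=None, compute_type="auto"):
    """Load a faster-whisper model on the chosen (or detected) device."""
    # Imported lazily so that pick_device() works without faster-whisper.
    from faster_whisper import WhisperModel

    return WhisperModel(
        model_name,
        device=device or pick_device(),
        compute_type=compute_type,
    )
```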

Installation

This project uses uv for dependency management.

Clone the repository:

```
git clone https://github.com/rpfilomeno/whispher-typing.git
cd whispher-typing
```

Install dependencies:
```
uv sync
```

Usage

Run the application using uv:

```
uv run whisper-typing
```

Build EXE

Build a Windows executable application:

```
build_dist.ps1
```

TUI Shortcuts

Inside the application, you can use these keys:

c: Open Configuration screen.
p: Pause/Resume hotkeys.
r: Reload configuration.
q: Quit the application.

Workflow

Start Recording: Press F8. You will see "Recording" in the status bar.
Speak: You will see transcribed text appear in the Preview Area in real-time.
Stop: Press F8 again. If enabled, the application will automatically refocus the window you were in before recording.
Confirm Type: Switch to your target application (e.g., Notepad, Slack) and press F9. The text will be typed out with human-like timing.
Improve (Optional): Press F10 before typing to have Gemini AI refine your transcription.
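
The workflow above can be read as a small state machine. The sketch below models it in plain Python without any hotkey or audio libraries. The `State` and `Session` names, and the idea that an "improve" callback stands in for the Gemini call, are assumptions for illustration and do not describe the project's internals.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    READY = "ready"  # transcribed text is waiting in the preview


class Session:
    """Hypothetical model of the record / improve / type workflow."""

    def __init__(self, improve=None):
        self.state = State.IDLE
        self.preview = ""
        # Stand-in for the AI refinement step; defaults to a no-op.
        self._improve = improve or (lambda text: text)

    def toggle_record(self):  # F8
        if self.state is State.RECORDING:
            self.state = State.READY if self.preview else State.IDLE
        else:
            self.preview = ""
            self.state = State.RECORDING

    def on_transcript(self, text):
        # Called as transcription results arrive during recording.
        if self.state is State.RECORDING:
            self.preview = text

    def improve_text(self):  # F10
        if self.state is State.READY:
            self.preview = self._improve(self.preview)

    def confirm_type(self):  # F9
        # Return the text to type, or None if nothing is ready.
        if self.state is not State.READY:
            return None
        text, self.preview = self.preview, ""
        self.state = State.IDLE
        return text
```

For example, calling `toggle_record()`, feeding text through `on_transcript()`, calling `toggle_record()` again, and then `confirm_type()` returns the previewed text and resets the session to idle.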

Configuration

You can customize the application via the UI (press c) or by editing local files.

Secure API Keys

The Gemini API key is stored in a .env file. You can enter it through the UI on first run or by editing the file:

```
GEMINI_API_KEY=your_key_here
```
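
As a rough illustration of how a key in this format might be read, the following sketch parses simple `KEY=value` lines using only the standard library. The helper names `parse_env` and `load_gemini_key` are hypothetical, and the choice to prefer an already-set environment variable over the file is an assumption. The application itself may use a dedicated library for this.

```python
import os
from pathlib import Path


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_gemini_key(env_path=".env"):
    """Return the Gemini API key, or None if it is not configured."""
    # Assumption: a variable already set in the environment wins.
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    path = Path(env_path)
    if path.is_file():
        return parse_env(path.read_text(encoding="utf-8")).get("GEMINI_API_KEY")
    return None
```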

JSON Configuration (`config.json`)

Other settings are stored in config.json:

```
{
  "hotkey": "<f8>",
  "type_hotkey": "<f9>",
  "improve_hotkey": "<f10>",
  "model": "openai/whisper-base.en",
  "language": "en",
  "device": "cpu",
  "compute_type": "auto",
  "typing_wpm": 350,
  "refocus_window": false,
  "microphone_name": "Default System Mic",
  "gemini_model": "models/gemini-2.0-flash",
  "model_cache_dir": "./models/"
}
```
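
A minimal sketch of reading this file and filling in missing keys is shown below. It treats a few of the values from the example above as defaults, which is an assumption; `DEFAULTS`, `merge_config`, and `load_config` are hypothetical names, not the application's API.

```python
import json
from pathlib import Path

# Assumed defaults, copied from the example config.json above.
DEFAULTS = {
    "hotkey": "<f8>",
    "type_hotkey": "<f9>",
    "improve_hotkey": "<f10>",
    "device": "cpu",
    "compute_type": "auto",
    "typing_wpm": 350,
    "refocus_window": False,
}


def merge_config(user_values):
    """Overlay user-provided settings on top of the defaults."""
    config = dict(DEFAULTS)
    config.update(user_values)
    return config


def load_config(path="config.json"):
    """Load settings from `path`, falling back to defaults if it is absent."""
    p = Path(path)
    if not p.is_file():
        return dict(DEFAULTS)
    return merge_config(json.loads(p.read_text(encoding="utf-8")))
```

With this approach, a config.json that only contains `{"typing_wpm": 200}` still yields a complete settings dictionary.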

Model Storage

By default, Whisper models are downloaded and stored in the Hugging Face cache directory:

Windows: %USERPROFILE%\.cache\huggingface\hub
Linux/macOS: ~/.cache/huggingface/hub

Changing the Storage Location

You can change where models are stored in three ways:

Configuration Screen: Press c in the app and set the Model Cache Dir.
JSON Config: Manually add or edit the "model_cache_dir" field in config.json.
Environment Variable: Set the HF_HOME environment variable on your system.
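
The three options could be resolved in code along these lines. The precedence used here (config value first, then `HF_HOME`, then the default cache path) is an assumption for the sketch, as is the helper name `resolve_model_cache_dir`. The README does not state which setting wins when several are present.

```python
import os
from pathlib import Path


def resolve_model_cache_dir(config, environ=None):
    """Pick the directory where Whisper models are stored.

    Assumed precedence: config.json value, then HF_HOME, then the default.
    """
    environ = os.environ if environ is None else environ
    configured = config.get("model_cache_dir")
    if configured:
        return Path(configured).expanduser()
    hf_home = environ.get("HF_HOME")
    if hf_home:
        # The Hugging Face hub cache lives in the "hub" folder under HF_HOME.
        return Path(hf_home) / "hub"
    return Path.home() / ".cache" / "huggingface" / "hub"
```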

Troubleshooting

Slow Transcription: Check the logs to see if "cuda" or "cpu" is being used. You can change this in the Configuration screen.
Hotkeys not working: Ensure no other application is capturing the same keys.
Microphone Issues: Ensure the correct microphone is selected in the Configuration screen (c).
