A simple push-to-talk speech-to-text system for Linux/Wayland using the multi-language parakeet ASR model from NVIDIA and the onnx-asr library.
- Press F12 to start recording, press again to transcribe and paste.
- Super fast on low end hardware with only CPU
- Multilanguage - you can even switch language while speaking.
- Everything runs local, forever forever
- Model is kept in memory for fast response
- System tray icon shows status (gray=idle, red=recording, blue=transcribing)
- Use any model from the model that onnx-asr supports by changing the model name the python file.
- All credits go to the onnx-asr project and NVIDIA, thanks for making local ASR so great!
- Python 3.12+
- uv (Python package manager)
- ydotool (for keyboard automation)
- Hyprland (or another Wayland compositor)
- wl-copy (wayland clipboard utility)
- PyAudio dependencies
# Arch Linux
sudo pacman -S python-pyaudio ydotool wl-clipboard
# Ubuntu/Debian
sudo apt install python3 python3-pyaudio ydotool wl-clipboardcurl -LsSf https://astral.sh/uv/install.sh | shgit clone <repository-url> ~/.local/share/parakeet
cd ~/.local/share/parakeet
uv syncThis will automatically install all Python dependencies including:
- onnx-asr
- PyGObject
- pydbus
- pyaudio
ydotoold needs to run as root to simulate keyboard input.
sudo cp ydotoold.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable ydotoold.service
sudo systemctl start ydotoold.serviceVerify it's running:
sudo systemctl status ydotoold.servicecp parakeet-dbus.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable parakeet-dbus.service
systemctl --user start parakeet-dbus.serviceVerify it's running:
systemctl --user status parakeet-dbus.serviceThe first startup will take a while as it downloads and loads the Parakeet model (~2.8GB memory usage).
Some optional features use an LLM via OpenRouter (ask/transform/insert). To enable them, set your API key via a user environment file loaded by the systemd service.
- Create the environment file:
mkdir -p ~/.config/parakeet
printf 'OPENROUTER_API_KEY=sk-or-...\n' > ~/.config/parakeet/parakeet.env
chmod 600 ~/.config/parakeet/parakeet.env- Reload and restart the user service:
systemctl --user daemon-reload
systemctl --user restart parakeet-dbus.service- Verify the key is picked up (look for a log line like "API key found"):
journalctl --user -u parakeet-dbus.service -fIf the key is not set, LLM features will gracefully fall back (no external call).
Add this line to your Hyprland hotkeys config:
bind = , F12, exec, dbus-send --session --dest=com.parakeet.Transcribe --type=method_call /com/parakeet/Transcribe com.parakeet.Transcribe.Toggle
If you enabled the OpenRouter API key, you can bind additional hotkeys for the LLM-powered modes:
# Ask mode: speak a question/command; pastes concise answer
bind = , F9, exec, dbus-send --session --dest=com.parakeet.Transcribe --type=method_call /com/parakeet/Transcribe com.parakeet.Transcribe.ToggleAsk
# Transform mode: speak an instruction to transform current clipboard text; pastes transformed text
bind = , F10, exec, dbus-send --session --dest=com.parakeet.Transcribe --type=method_call /com/parakeet/Transcribe com.parakeet.Transcribe.ToggleTransform
# Insert mode: speak text to insert into clipboard content; pastes merged text
bind = , F11, exec, dbus-send --session --dest=com.parakeet.Transcribe --type=method_call /com/parakeet/Transcribe com.parakeet.Transcribe.ToggleLLM
- Press F12 to start recording
- Speak your text
- Press F12 again to stop recording
- The transcription will be automatically pasted at your cursor position using Ctrl+Shift+V
Parakeet includes three optional LLM-powered workflows that build on top of local speech recognition. These require an OpenRouter API key and internet connectivity. Without a key, these features are skipped gracefully.
-
Ask mode (
ToggleAsk)- Speak a question or instruction. The LLM returns a short, direct answer which is pasted.
- Example: “Translate ‘good morning’ to French.” → pastes “bonjour”.
-
Transform mode (
ToggleTransform)- Speak an instruction; the LLM applies it to your current clipboard content and pastes the result.
- Example: “Summarize in one sentence.”
-
Insert mode (
ToggleLLM)- Speak text to insert into the clipboard content. The LLM merges it in-place (or at the end), adjusting punctuation/casing minimally.
- Example: Insert a short sentence into a paragraph on your clipboard.
You can trigger these directly via D-Bus as well:
# Ask mode
dbus-send --session --dest=com.parakeet.Transcribe --type=method_call \
/com/parakeet/Transcribe com.parakeet.Transcribe.StartRecordingAsk
dbus-send --session --dest=com.parakeet.Transcribe --type=method_call \
/com/parakeet/Transcribe com.parakeet.Transcribe.StopRecordingAsk
# Transform mode
dbus-send --session --dest=com.parakeet.Transcribe --type=method_call \
/com/parakeet/Transcribe com.parakeet.Transcribe.StartRecordingTransform
dbus-send --session --dest=com.parakeet.Transcribe --type=method_call \
/com/parakeet/Transcribe com.parakeet.Transcribe.StopRecordingTransform
# Insert mode
dbus-send --session --dest=com.parakeet.Transcribe --type=method_call \
/com/parakeet/Transcribe com.parakeet.Transcribe.StartRecordingLLM
dbus-send --session --dest=com.parakeet.Transcribe --type=method_call \
/com/parakeet/Transcribe com.parakeet.Transcribe.StopRecordingLLM- Default OpenRouter model:
google/gemini-2.5-flash-preview-09-2025. - You may change the model by editing the
modelfield inparakeet_dbus.pywhere the OpenRouter request payload is constructed. - Using these features sends your prompt and relevant text to the selected OpenRouter model provider; avoid sensitive data.
Parakeet service:
journalctl --user -u parakeet-dbus.service -fydotoold service:
sudo journalctl -u ydotoold.service -f# Start recording
dbus-send --session --dest=com.parakeet.Transcribe --type=method_call /com/parakeet/Transcribe com.parakeet.Transcribe.StartRecording
# Stop recording (will transcribe and paste)
dbus-send --session --dest=com.parakeet.Transcribe --type=method_call /com/parakeet/Transcribe com.parakeet.Transcribe.StopRecordingMake sure ydotoold is running:
ls -la /run/ydotool/socketYou should see a socket file. If not, restart ydotoold:
sudo systemctl restart ydotoold.serviceSpeak for at least 1 second. Very short recordings are ignored.
- parakeet_dbus.py - Main D-Bus service that handles recording and transcription
- ydotoold.service - System service for keyboard automation (runs as root)
- parakeet-dbus.service - User service for the transcription D-Bus interface
Uses the NVIDIA Parakeet TDT 0.6B v3 model via onnx-asr:
- Automatic download on first run
- ~2.8GB memory usage when loaded
- Multilanguage - you can even switch language while speaking.