feat(telegram): add local voice transcription via whisper.cpp #1350
Closed
codefather-labs wants to merge 2 commits into anthropics:main from
Voice messages are now automatically transcribed using whisper.cpp before being forwarded to Claude. The transcribed text is prefixed with [voice transcription] so Claude knows it's machine-generated.

Key features:
- Cross-platform: auto-detects OS and package manager (brew, apt, dnf, pacman, winget, choco, scoop)
- Auto-installs whisper-cpp and ffmpeg if missing
- Downloads ggml-medium model (~1.5 GB) from HuggingFace on first use
- Graceful degradation: falls back to "(voice message)" if dependencies are unavailable
- Configurable via env vars: WHISPER_CLI_PATH, FFMPEG_PATH, WHISPER_MODEL_PATH, WHISPER_MODEL_NAME, WHISPER_MODEL_URL
- Gate check before transcription to avoid wasting CPU on unauthenticated messages
- Typing indicator shown during transcription
- Temp files cleaned up after each transcription

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Thanks for your interest! This repo only accepts contributions from Anthropic team members. If you'd like to submit a plugin to the marketplace, please submit your plugin here.
Summary
Adds automatic local speech-to-text transcription for Telegram voice messages using whisper.cpp. When a voice
message arrives, it is transcribed locally before being forwarded to Claude — no external API calls, no data leaves the machine.
Motivation
Voice messages are one of the most natural ways to communicate on Telegram, but Claude currently receives them as opaque
(voice message) placeholders with no content. This forces users to either type everything out or manually transcribe their own voice notes.
With this change, Claude receives the full text of every voice message, making voice a first-class input method for the Telegram channel.
How it works
The transcribed text is forwarded to Claude in the form [voice transcription] <text>.
A typing indicator is shown in Telegram while transcription is in progress.
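As a rough illustration of this flow, the sketch below builds argument lists for an ffmpeg conversion and a whisper-cli run, then formats the result for Claude. The helper names (`ffmpegArgs`, `whisperArgs`, `formatVoiceText`) are hypothetical and do not come from the PR. The 16 kHz mono WAV conversion reflects whisper.cpp's usual input requirement; the exact flags the PR passes are not shown in this description, so treat the ones here as assumptions.

```typescript
// Hypothetical helpers: names and flag choices are assumptions, not the PR's code.

// Telegram voice notes arrive as OGG/Opus; whisper.cpp expects 16 kHz mono 16-bit WAV.
function ffmpegArgs(inputPath: string, wavPath: string): string[] {
  return ["-y", "-i", inputPath, "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", wavPath];
}

// -m: model file, -f: input audio, -nt: print text without timestamps.
function whisperArgs(modelPath: string, wavPath: string): string[] {
  return ["-m", modelPath, "-f", wavPath, "-nt"];
}

// Prefix the transcript so Claude knows it is machine-generated;
// fall back to the existing placeholder when there is no usable transcript.
function formatVoiceText(transcript: string | null): string {
  const text = transcript?.trim() ?? "";
  return text.length > 0 ? `[voice transcription] ${text}` : "(voice message)";
}
```

The argument arrays are meant to be handed to a process runner such as Node's `child_process.execFile`, which avoids shell quoting issues with temp file paths.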
Auto-install
Dependencies are automatically installed on first voice message via the detected package manager:
Packages installed: whisper-cpp, ffmpeg
The whisper medium model (ggml-medium.bin, ~1.5 GB) is downloaded from HuggingFace automatically.
Graceful degradation
If whisper-cli, ffmpeg, or the model are unavailable, the plugin falls back to the existing
(voice message) behavior. Zero breakage for users who don't need or want voice transcription.
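The package-manager detection and the fallback can be sketched together as follows. This is a minimal illustration, not the PR's `ensureWhisper()`: the names `pickPackageManager` and `voiceTextOrFallback`, the per-platform preference order, and the `isAvailable` callback are all assumptions made for the example. Only the list of package managers and the two output strings come from the PR text.

```typescript
type PackageManager = "brew" | "apt" | "dnf" | "pacman" | "winget" | "choco" | "scoop";

// Candidate managers per Node.js platform string, in an assumed preference order.
const CANDIDATES: Record<string, PackageManager[]> = {
  darwin: ["brew"],
  linux: ["apt", "dnf", "pacman", "brew"],
  win32: ["winget", "choco", "scoop"],
};

// Return the first candidate whose command is available, or null if none is.
// isAvailable is a hypothetical probe (for example, a PATH lookup).
function pickPackageManager(
  platform: string,
  isAvailable: (cmd: string) => boolean,
): PackageManager | null {
  for (const pm of CANDIDATES[platform] ?? []) {
    if (isAvailable(pm)) return pm;
  }
  return null;
}

// Graceful degradation: any missing dependency or transcription error
// yields the pre-existing placeholder instead of failing the handler.
function voiceTextOrFallback(depsReady: boolean, transcribe: () => string): string {
  if (!depsReady) return "(voice message)";
  try {
    return `[voice transcription] ${transcribe()}`;
  } catch {
    return "(voice message)";
  }
}
```

Keeping the fallback in one place means the voice handler never has to distinguish between "not installed", "install failed", and "transcription crashed"; all three degrade to the same placeholder.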
Configuration
All paths are configurable via environment variables in ~/.claude/channels/telegram/.env:
- WHISPER_CLI_PATH
- FFMPEG_PATH
- WHISPER_MODEL_PATH (default: ~/.local/share/whisper-cpp/models/ggml-medium.bin)
- WHISPER_MODEL_NAME (default: ggml-medium.bin)
- WHISPER_MODEL_URL
Changes
external_plugins/telegram/server.ts: added transcribeVoice(), cross-platform ensureWhisper() with auto-install, modified voice message handler (+308 lines, -3 lines)
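For reference, one way the environment variables listed under Configuration might be resolved inside server.ts is sketched below. The `WhisperConfig` shape and the `resolveWhisperConfig` function are hypothetical. Deriving the model path from the model name is an assumption, as is leaving the CLI, ffmpeg, and URL values undefined when unset, since their defaults are not stated in this description.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface WhisperConfig {
  cliPath: string | undefined;    // WHISPER_CLI_PATH (default not stated in the PR text)
  ffmpegPath: string | undefined; // FFMPEG_PATH (default not stated in the PR text)
  modelName: string;              // WHISPER_MODEL_NAME
  modelPath: string;              // WHISPER_MODEL_PATH
  modelUrl: string | undefined;   // WHISPER_MODEL_URL (default not stated in the PR text)
}

function resolveWhisperConfig(
  env: Record<string, string | undefined>,
  homeDir: string,
): WhisperConfig {
  const modelName = env["WHISPER_MODEL_NAME"] ?? "ggml-medium.bin";
  // Assumption: when no explicit path is set, the path follows the model name.
  const modelPath =
    env["WHISPER_MODEL_PATH"] ?? `${homeDir}/.local/share/whisper-cpp/models/${modelName}`;
  return {
    cliPath: env["WHISPER_CLI_PATH"],
    ffmpegPath: env["FFMPEG_PATH"],
    modelName,
    modelPath,
    modelUrl: env["WHISPER_MODEL_URL"],
  };
}
```

In a Node.js process this could be called as `resolveWhisperConfig(process.env, os.homedir())` after the .env file has been loaded.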
Test plan
(voice message)
🤖 Generated with Claude Code