v0.8.2 - Configurable OpenAI speech-to-text model support for Telegram voice messages
Changed
- Added configurable speech-to-text model selection through config.ini.
- Added support for gpt-4o-transcribe, gpt-4o-mini-transcribe, and whisper-1.
- Changed default voice transcription model from legacy whisper-1 to gpt-4o-transcribe.
- Kept backward compatibility with the existing EnableWhisper flag.
- Added OPENAI_STT_MODEL environment-variable fallback.
- Refactored voice message handling so the STT model comes from the main bot config object.
- Improved Telegram HTML safety by escaping transcribed text before formatting.
- Fixed the duration check so that MaxDurationMinutes (configured in minutes) and the voice message duration (reported by Telegram in seconds) are compared in the same unit.
- Improved voice transcription logging with Telegram user/chat/file/model metadata.
- Cleaned up voice handler registration in main.py.
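
A minimal sketch of the duration and HTML-escaping fixes, for illustration only. The function names and the reply layout are hypothetical and not taken from the bot's code; the sketch assumes only that MaxDurationMinutes is given in minutes and that the voice duration arrives in seconds.

```python
import html


def exceeds_max_duration(voice_duration_seconds: int, max_duration_minutes: float) -> bool:
    """Return True when a voice message is longer than the configured limit.

    The limit is configured in minutes and the voice duration is in seconds,
    so the limit is converted to seconds before the comparison.
    """
    return voice_duration_seconds > max_duration_minutes * 60


def format_transcription(text: str) -> str:
    """Wrap transcribed text in Telegram HTML after escaping it.

    The heading and layout are hypothetical. Escaping first keeps characters
    such as <, > and & in the transcription from being parsed as markup.
    """
    return f"<b>Transcription:</b>\n{html.escape(text)}"
```

With these helpers, a 61-second message is rejected under a 1-minute limit, while a message of exactly 60 seconds is accepted.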
Notes
Existing configs should continue to work without changes.
To select the STT model explicitly, add this to config.ini:
STTModel = gpt-4o-transcribe
Other supported values:
STTModel = gpt-4o-mini-transcribe
STTModel = whisper-1
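
One way the model selection could be resolved is sketched below. This is an assumption-laden illustration, not the bot's actual code: the function name, the section name, the lookup order (config.ini first, then OPENAI_STT_MODEL, then the default), and the rejection of unknown values are all hypothetical. Only the option name, the environment variable name, and the three model names come from these notes.

```python
import configparser
import os

# Model names listed in the release notes.
SUPPORTED_STT_MODELS = ("gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1")
DEFAULT_STT_MODEL = "gpt-4o-transcribe"


def resolve_stt_model(config, section="DEFAULT", env=None):
    """Pick the STT model: config.ini value, then env var, then the default.

    `section` is a placeholder; the real section name is not given in the notes.
    `env` can be injected for testing and defaults to os.environ.
    """
    env = os.environ if env is None else env

    # 1. Explicit STTModel option in config.ini (option names are case-insensitive).
    candidate = config.get(section, "STTModel", fallback="").strip()

    # 2. Environment-variable fallback.
    if not candidate:
        candidate = env.get("OPENAI_STT_MODEL", "").strip()

    # 3. Built-in default.
    if not candidate:
        return DEFAULT_STT_MODEL

    # Assumption: unknown values are rejected rather than passed through.
    if candidate not in SUPPORTED_STT_MODELS:
        raise ValueError(f"Unsupported STT model: {candidate!r}")
    return candidate
```

Under these assumptions, a config without STTModel and without the environment variable set resolves to gpt-4o-transcribe.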