v0.8.2 - Configurable OpenAI STT model for voice messages

@FlyingFathead FlyingFathead released this 14 May 16:36

v0.8.2 - Configurable OpenAI speech-to-text model support for Telegram voice messages

Changed

  • Added configurable speech-to-text model selection through config.ini.
  • Added support for gpt-4o-transcribe, gpt-4o-mini-transcribe, and whisper-1.
  • Changed default voice transcription model from legacy whisper-1 to gpt-4o-transcribe.
  • Kept backward compatibility with the existing EnableWhisper flag.
  • Added OPENAI_STT_MODEL environment-variable fallback.
  • Refactored voice message handling so the STT model comes from the main bot config object.
  • Improved Telegram HTML safety by escaping transcribed text before formatting.
  • Fixed the duration check so that MaxDurationMinutes is interpreted as minutes while voice message durations are handled in seconds.
  • Improved voice transcription logging with Telegram user/chat/file/model metadata.
  • Cleaned up voice handler registration in main.py.
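
Two of the changes above, the minutes-versus-seconds duration check and escaping transcribed text before HTML formatting, can be illustrated with a minimal sketch. This is not the bot's actual code: the function names `exceeds_max_duration` and `format_transcript_html` are hypothetical, the `<i>` wrapper is only an example of formatting, and allowing a message that is exactly at the limit is an assumption.

```python
import html


def exceeds_max_duration(duration_seconds, max_duration_minutes):
    """Compare a voice duration in seconds against a limit given in minutes."""
    # Convert the limit to seconds so both sides use the same unit.
    # Assumption: a message exactly at the limit is still accepted.
    return duration_seconds > max_duration_minutes * 60


def format_transcript_html(transcript):
    """Escape the transcript first, then wrap it in HTML markup."""
    # html.escape with quote=False replaces &, < and >, so characters in the
    # transcribed text cannot be read as tags or entities.
    safe = html.escape(transcript, quote=False)
    # Hypothetical formatting: the real bot may wrap the text differently.
    return f"<i>{safe}</i>"
```

Escaping before wrapping matters because the transcript is untrusted text: if the markup were applied first, a later escape pass would also escape the bot's own tags.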

Notes

Existing configs should continue working.

To select the STT model explicitly, add this to config.ini:

STTModel = gpt-4o-transcribe

Other supported values:

STTModel = gpt-4o-mini-transcribe
STTModel = whisper-1
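
As a rough illustration of how the model selection might resolve, here is a minimal sketch using Python's standard `configparser`. It assumes the order config.ini first, then the OPENAI_STT_MODEL environment variable, then the gpt-4o-transcribe default. The function name `resolve_stt_model`, the `section` argument, and the choice to fall back to the default on an unrecognized model name are all hypothetical; these notes do not say which config.ini section holds STTModel or how the bot handles unknown values.

```python
import configparser
import os

# The three models named in these release notes.
SUPPORTED_STT_MODELS = ("gpt-4o-transcribe", "gpt-4o-mini-transcribe", "whisper-1")
DEFAULT_STT_MODEL = "gpt-4o-transcribe"


def resolve_stt_model(config, section="DEFAULT", environ=None):
    """Pick the STT model: config.ini, then OPENAI_STT_MODEL, then the default."""
    env = os.environ if environ is None else environ
    # Assumption: the section name is a placeholder; pass whichever section
    # actually contains STTModel. A missing section or key yields "".
    value = config.get(section, "STTModel", fallback="").strip()
    if not value:
        # Environment-variable fallback when config.ini does not set a model.
        value = env.get("OPENAI_STT_MODEL", "").strip()
    if not value:
        return DEFAULT_STT_MODEL
    # Assumption: unrecognized names fall back to the default instead of raising.
    return value if value in SUPPORTED_STT_MODELS else DEFAULT_STT_MODEL
```

A caller would load config.ini with `configparser.ConfigParser().read(...)` and pass the parser in; injecting `environ` as a plain dict keeps the function easy to test without touching the real environment.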