
fix(stt): better error logging and smarter DM when STT lazy-install fails#46127

Open
damiankluk wants to merge 4 commits into
NousResearch:mainfrom
damiankluk:main


@damiankluk


Problem

When a user sends a voice message and faster-whisper is not installed, the gateway:

  1. Attempts a lazy-install via uv pip install
  2. If it fails (common reason: venv owned by different UID), logs the error at DEBUG level only — invisible to operators
  3. Sends a generic DM telling the user to set stt.enabled: true even when it is already true
  4. The agent gets a note saying "no STT provider is configured" with no context about why

Changes

tools/transcription_tools.py

  • _try_lazy_install_stt(): Log failures at WARNING instead of DEBUG, with the actual exception message and actionable guidance for the most common cause — the Hermes process user cannot write to the virtual environment (UID mismatch in Docker/user-namespace setups).
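A minimal sketch of the logging change described above, not the actual `_try_lazy_install_stt()` body. The function name `try_lazy_install`, the `runner` parameter (added here only so the example can be exercised without running `uv`), and the hint wording are all assumptions for illustration.

```python
import logging
import subprocess

logger = logging.getLogger("stt_lazy_install_sketch")

# Hypothetical guidance text; the real message in the PR may differ.
PERMISSION_HINT = (
    "If the virtual environment is owned by a different UID than this process "
    "(common in Docker/user-namespace setups), fix its ownership or "
    "pre-install the package into a writable directory."
)


def try_lazy_install(package, runner=subprocess.run):
    """Try to install `package`; return True on success, log a WARNING on failure."""
    cmd = ["uv", "pip", "install", package]
    try:
        runner(cmd, check=True, capture_output=True, text=True, timeout=300)
        return True
    except subprocess.CalledProcessError as exc:
        # Prefer the installer's stderr, which usually names the real cause.
        detail = (exc.stderr or "").strip() or str(exc)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # OSError covers a missing `uv` binary.
        detail = str(exc)
    # WARNING rather than DEBUG so operators see the failure in default logs.
    logger.warning("Lazy install of %s failed: %s. %s", package, detail, PERMISSION_HINT)
    return False
```

Logging the exception text alongside the hint lets an operator tell a permission error apart from a network or resolver failure without raising the log level.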

gateway/run.py — STT failure DM (sent to user)

  • The DM now reads the actual stt config before composing advice:
    • If stt.enabled: false → says so specifically
    • If stt.provider: local or auto-detected → shows a permission-aware install command
    • For cloud providers → suggests checking the API key
    • Falls back to the original generic message if anything fails
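The branching above could look roughly like the following sketch. The config shape (a dict with an `stt` key), the treatment of a missing provider as auto-detected, the function name `compose_stt_failure_dm`, and every message string are assumptions; only the four branches themselves come from the description.

```python
# Hypothetical stand-in for the original generic message.
GENERIC_DM = (
    "Voice transcription is not available. "
    "Set stt.enabled: true in your config to enable it."
)


def compose_stt_failure_dm(config):
    """Pick user-facing advice based on the actual stt config."""
    try:
        stt = config.get("stt") or {}
        if not stt.get("enabled", True):
            return "Voice transcription is turned off (stt.enabled: false)."
        # Assumption: a missing provider means the provider is auto-detected.
        provider = stt.get("provider") or "auto"
        if provider in ("local", "auto"):
            return (
                "Local transcription needs faster-whisper, which could not be "
                "installed. Run `uv pip install faster-whisper` as the user "
                "that owns the virtual environment."
            )
        return f"Transcription via '{provider}' failed. Check that its API key is set."
    except Exception:
        # Any problem reading the config falls back to the generic message.
        return GENERIC_DM
```

Wrapping the whole lookup in one `try` keeps the failure path from ever raising while a DM is being composed, which matches the stated fallback behaviour.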

gateway/run.py — Agent context note

  • The agent-facing note now includes a config-aware hint about WHY transcription failed (e.g. "provider is local but faster-whisper failed to install — likely a venv write-permission issue") so the agent can give better responses.
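As an illustration of the agent-facing note, the sketch below appends a config-aware reason to a base sentence. The helper names and the exact wording are hypothetical; the fallback text mirrors the "no STT provider is configured" note mentioned in the problem statement.

```python
def stt_unavailable_hint(stt_config):
    """Return a short reason string derived from the stt config (a dict)."""
    if not stt_config.get("enabled", True):
        return "stt.enabled is false"
    provider = stt_config.get("provider") or "auto"
    if provider in ("local", "auto"):
        return (
            f"provider is {provider} but faster-whisper failed to install, "
            "likely a venv write-permission issue"
        )
    return f"provider is {provider} but transcription failed, possibly an API key problem"


def agent_note(stt_config):
    """Build the note the agent sees when a voice message was not transcribed."""
    base = "The user sent a voice message that could not be transcribed"
    try:
        return f"{base}: {stt_unavailable_hint(stt_config)}."
    except Exception:
        # Keep the old context-free wording if the config cannot be read.
        return f"{base}: no STT provider is configured."
```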

Root cause investigated

In the reporter's environment (Docker with user-namespace mapping), the gateway process runs as uid=1000 but the Hermes venv (/opt/hermes/.venv/) is owned by uid=10000. Every lazy-install fails silently because uv pip install cannot write to the venv. As a workaround, the .env file now sets PYTHONPATH=/home/hermes/.hermes/venv_ext, which points to a writable extension directory where faster-whisper has been pre-installed.
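A quick way to confirm this kind of UID mismatch is to compare the venv owner with the current process user. The helper below is a diagnostic sketch, not part of the PR; `describe_venv_access` is a hypothetical name, and `os.getuid()` is only available on Unix-like systems.

```python
import os
import sys


def describe_venv_access(venv_path=sys.prefix):
    """Report who owns `venv_path` and whether this process can write to it."""
    st = os.stat(venv_path)
    return {
        "path": venv_path,
        "owner_uid": st.st_uid,          # e.g. 10000 in the reported setup
        "process_uid": os.getuid(),      # e.g. 1000 in the reported setup
        "writable": os.access(venv_path, os.W_OK),
    }


if __name__ == "__main__":
    print(describe_venv_access())
```

If `writable` is false and the two UIDs differ, installs into that environment will fail regardless of how the install command is invoked.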

Testing

  • Verified that faster-whisper == 1.2.1 works when PYTHONPATH includes the extension directory
  • Transcribed a real Polish voice message successfully
  • All existing tests pass

@alt-glitch added the type/bug, P3, comp/gateway, and tool/tts labels Jun 14, 2026

@tonydwb tonydwb left a comment


Code Review Summary

Verdict: Approved

Improves STT lazy-install failure logging and adds smarter DM message when STT fails to install.

Looks Good

  • Better error context for debugging STT installation failures.
  • Cleaner user-facing messaging for install failures.
  • Well-scoped change.

Reviewed by Hermes Agent

Three improvements to the voice-transcription setup flow:

1. tools/transcription_tools.py - Log lazy-install failures at WARNING
   instead of DEBUG, and include actionable guidance about venv
   permission issues (the most common cause of silent STT failures).

2. gateway/run.py - Smart DM message: check the actual stt config
   before sending setup instructions. If stt.enabled is already true
   and provider is 'local', skip the redundant 'set stt.enabled'
   advice and show a permission-aware install hint instead.

3. gateway/run.py - Agent note now includes a config-aware hint
   about why STT is unavailable (e.g. 'provider is local but
   faster-whisper failed to install').

kill -0 returns success on zombie processes (the PID still occupies
the process table), causing the restart watcher to loop forever.
Check /proc/PID/status for State:Z and bail out early. Also cap
the wait at 120 s (600 * 0.2s) as a safety net.

Fixes the /restart command hanging on Telegram when the old gateway
becomes a zombie because the dashboard (PID 1) doesn't reap children.
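The zombie check from this commit message can be sketched in Python as follows. The actual watcher may well be a shell script; the function names here are hypothetical, and the `/proc` lookup is Linux-specific (on systems without `/proc`, `is_zombie` simply reports False).

```python
import os
import time


def is_zombie(pid):
    """Return True if /proc/<pid>/status reports state Z (Linux only)."""
    try:
        with open(f"/proc/{pid}/status") as fh:
            for line in fh:
                if line.startswith("State:"):
                    return line.split()[1] == "Z"
    except OSError:
        # Process is gone, or /proc is not available on this platform.
        return False
    return False


def pid_exists(pid):
    """Equivalent of `kill -0`: also succeeds for zombies."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


def wait_for_exit(pid, max_polls=600, interval=0.2):
    """Poll until `pid` is gone or a zombie; give up after max_polls * interval seconds."""
    for _ in range(max_polls):
        if not pid_exists(pid) or is_zombie(pid):
            return True
        time.sleep(interval)
    return False
```

With the defaults, the loop gives up after 600 polls of 0.2 s each, which is the 120 s cap described above.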