What happened?
POST /bots/{platform}/{native_meeting_id}/speak returns HTTP 202 with body {"message":"Speak command sent","meeting_id":<id>} but the bot never plays any audio in the meeting and the WebSocket at
wss://api.cloud.vexa.ai/ws (subscribed to the same meeting) emits neither speak.started nor speak.completed within 20+ seconds. There is also no error event. chat and transcription work fine on
the same bot — only /speak is silently no-op'd.
Reproduced on three independent meetings / call IDs, across two API keys, all under the same account.
What did you expect?
After POST /speak returns 202:
- The WebSocket should emit speak.started (and eventually speak.completed or speak.interrupted).
- The bot should briefly unmute, play the TTS audio in the meeting, and re-mute — per the documented behavior: "The bot unmutes, plays the audio, then re-mutes."
Alternatively, if /speak cannot be fulfilled (account tier, missing provider key, etc.), the response should be a 4xx with a clear error — not a misleading 202.
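The expected event sequence can be written as a small check over the event types collected from the WebSocket. This is a minimal sketch, assuming each WS message is a JSON object whose "type" field carries the event name (as the subscribed message does); the function name is hypothetical and not part of any Vexa client.

```python
# Events that would end a speak request, per the expectation above.
TERMINAL_EVENTS = {"speak.completed", "speak.interrupted"}


def speak_sequence_ok(event_types):
    """Return True if speak.started appears and a terminal event follows it."""
    started = False
    for event_type in event_types:
        if event_type == "speak.started":
            started = True
        elif started and event_type in TERMINAL_EVENTS:
            return True
    return False


# In the failing runs only "subscribed" was received, so the check fails.
print(speak_sequence_ok(["subscribed"]))
print(speak_sequence_ok(["subscribed", "speak.started", "speak.completed"]))
```

A terminal event with no preceding speak.started is treated as a failure as well, since the bot should announce the start before finishing.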
How to reproduce?
export VEXA_API_KEY=vxa_bot_…
- Create a bot in a fresh Google Meet:
curl -X POST -H "X-API-Key: $VEXA_API_KEY" -H "Content-Type: application/json" \
-d '{"platform":"google_meet","native_meeting_id":"<id>","bot_name":"Juno",
"language":"en","transcribe_enabled":true,"voice_agent_enabled":true}' \
https://api.cloud.vexa.ai/bots
Returns 201 with a call_id. Admit the bot in the Meet host UI. Status transitions to active.
- Run this diagnostic (subscribes to /ws, sends one /speak, watches for events):
python3 scripts/diag_tts.py google_meet <native_meeting_id>
The script connects to wss://api.cloud.vexa.ai/ws, sends {"action":"subscribe","meetings":[{"platform":"google_meet","native_id":""}]}, gets back {"type":"subscribed",...}, then POSTs /speak with
{"text":"...","provider":"openai","voice":"alloy"}. Watches WS for 20 s.
- Observed every time: POST /speak → 202 OK, but speak.started is never emitted; no audio in the meeting.
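For anyone reproducing without the attached script, the two messages it sends can be built as follows. This is a sketch of the payloads described in the steps above, not the script itself: the helper names are hypothetical, and it assumes the native_id in the subscribe message is the meeting's native ID.

```python
import json


def subscribe_message(platform, native_id):
    """Build the WS subscribe frame (assumes native_id is the meeting's native ID)."""
    return json.dumps({
        "action": "subscribe",
        "meetings": [{"platform": platform, "native_id": native_id}],
    })


def speak_request(platform, native_id, text, provider="openai", voice="alloy"):
    """Return the (path, JSON body) pair for POST /bots/{platform}/{native_meeting_id}/speak."""
    path = f"/bots/{platform}/{native_id}/speak"
    body = {"text": text, "provider": provider, "voice": voice}
    return path, body


# Placeholder meeting ID, for illustration only.
path, body = speak_request("google_meet", "abc-defg-hij", "hello")
print(path)
print(subscribe_message("google_meet", "abc-defg-hij"))
```

The path is relative to https://api.cloud.vexa.ai and the request carries the same X-API-Key header as the bot-creation call.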
Logs / screenshots?
Reproduction 1 (call_id 12674, native jcr-pnrn-tbw, 2026-05-18):
- chat worked (chat appeared in Meet UI)
- 3 /speak calls with voice: nova, voice: alloy, and provider: elevenlabs — all 202, all silent
Reproduction 2 (call_id 12677, native stj-keti-zoz, 2026-05-18) — diag_tts.py output:
WS [ 0.20s] subscribed: subscribed
POST /speak HTTP 202: {"message":"Speak command sent","meeting_id":12677}
speak events: []
result: TTS STILL SILENT
Reproduction 3 (call_id 12752, native gcm-isji-yds, 2026-05-19) — diag_tts.py output:
WS [ 0.17s] subscribed: subscribed
POST /speak HTTP 202: {"message":"Speak command sent","meeting_id":12752}
RESULT: TTS SILENT. /speak returned 202 but no speak.* events fired.
meetings/{call_id} for the affected bots echoes data.transcribe_enabled: true but does not echo any voice_agent_enabled / agent_enabled / TTS-related field — possible hint that the flag is being silently dropped
on input.
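The echo comparison above can be expressed as a small helper. It is a sketch only: the function name is hypothetical, and it assumes the data object from meetings/{call_id} has already been parsed into a dict.

```python
def unechoed_flags(request_body, echoed_data,
                   flags=("transcribe_enabled", "voice_agent_enabled")):
    """List flags that were sent on bot creation but are absent from the echoed data."""
    return [f for f in flags if f in request_body and f not in echoed_data]


# Mirrors the observation: transcribe_enabled is echoed, voice_agent_enabled is not.
sent = {"transcribe_enabled": True, "voice_agent_enabled": True}
echoed = {"transcribe_enabled": True}
print(unechoed_flags(sent, echoed))  # ['voice_agent_enabled']
```

A missing echo does not prove the flag was dropped; the endpoint may simply not report it.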
Version / env?
- Vexa: hosted (api.cloud.vexa.ai), accessed via REST + wss://api.cloud.vexa.ai/ws
- Affected call_ids: 12674, 12677, 12752 (all google_meet, all admitted, all status: active at the time of /speak)
- Client: Python 3.11, aiohttp 3.x, certifi for TLS; bare-metal macOS (Darwin 24.4)
- Reproduction script: attached (diag_tts.py)
- Tried voices: alloy, nova (OpenAI). Tried providers: openai, elevenlabs. All 202, all silent.