Commit 0079352
feat(gateway): Google Chat attachment support (image / file / audio + STT) (#762)
* feat(gateway): inbound attachment support for Google Chat
Implements image / text file / audio download from Google Chat via
Media API + service account token, following the PR #731 base64 pattern.
Changes:
- GoogleChatMessage: parse attachment[] array (Attachment / AttachmentDataRef / DriveDataRef structs)
- GoogleChatMediaRef enum: Image / File / Audio variants for typed dispatch
- parse_attachments(): branches on contentType prefix, skips DRIVE_FILE source
- download_googlechat_image(): resize → 1200px JPEG q75, max 10MB, GIF preserved
- download_googlechat_file(): text extension whitelist (.txt/.md/.py/...), max 512KB
- download_googlechat_audio(): forwarded as-is for core STT pipeline, max 25MB
- media_url(): percent-encode resource_name as path segment
- webhook handler: parses attachments, async-downloads via adapter token, populates Content.attachments
- Empty-text events with attachments are now forwarded (previously dropped)
- Tests: 11 new (parse, download success/skip/oversized, URL encoding)
Refs: #731 (Feishu pattern)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
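The prefix-based dispatch described for parse_attachments() could look roughly like the sketch below. `MediaKind` and `classify` are hypothetical names standing in for GoogleChatMediaRef and the real parsing code, and the rule that everything other than image/ and audio/ becomes a File candidate (to be filtered later by the extension whitelist) is an assumption, not something the commit spells out.

```rust
/// Hypothetical, simplified stand-in for the commit's GoogleChatMediaRef enum.
#[derive(Debug, PartialEq)]
pub enum MediaKind {
    Image,
    File,
    Audio,
}

/// Classify one attachment by its `contentType` prefix.
/// Returns `None` when the attachment should be skipped.
pub fn classify(content_type: &str, source: &str) -> Option<MediaKind> {
    // The commit skips attachments whose source is DRIVE_FILE.
    if source == "DRIVE_FILE" {
        return None;
    }
    if content_type.starts_with("image/") {
        Some(MediaKind::Image)
    } else if content_type.starts_with("audio/") {
        Some(MediaKind::Audio)
    } else {
        // Assumption: anything else is a File candidate; the text
        // extension whitelist is applied at download time.
        Some(MediaKind::File)
    }
}
```

For example, `classify("audio/ogg", "UPLOADED_CONTENT")` yields `Some(MediaKind::Audio)`, while any attachment with source `DRIVE_FILE` yields `None`.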
* feat(core): STT for Custom Gateway audio attachments
Extends src/gateway.rs attachment handling to transcribe audio attachments
via the existing STT pipeline (previously only Discord/Slack adapters
went through download_and_transcribe; Custom Gateway adapters got no
audio path even though stt::transcribe was available).
When a gateway adapter (Feishu, Google Chat, etc.) sends an Attachment
with attachment_type = "audio", core now:
1. Decodes base64 → audio bytes
2. Calls stt::transcribe with the configured SttConfig
3. Wraps the transcript as a ContentBlock::Text:
"[Voice message transcript]: ..."
The audio branch is gated on stt_config.enabled — if STT is disabled in
config, audio attachments fall through unchanged (same as before).
Threads stt_config through GatewayParams and run_gateway_adapter.
This closes the audio attachment gap left by the (now-closed) PR #726
without re-introducing the HTTP MediaStore proxy approach. Pairs with
the Google Chat adapter audio download (separate PR) — once both land,
Google Chat voice/audio attachments work end-to-end.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(gateway): address googlechat attachment review feedback
Addresses canyugs#4 must-fix items:
#1+#2 Webhook timeout safety:
- Spawn background tokio task for attachment downloads so the webhook
returns 200 within Google Chat's 30s deadline regardless of how long
downloads take.
- Add 30s per-request timeout to all Media API GET calls — a single
hung connection can no longer stall the download task indefinitely.
- Refactor event emission into send_googlechat_event helper to share
between sync (no-attachment) and async (background download) paths.
#4 Text file caps (matches Discord/Slack):
- TEXT_FILE_COUNT_CAP = 5: skip text files past the 5th with a warning.
- TEXT_TOTAL_CAP = 1 MB: skip text files that would push the running
aggregate past 1 MB with a warning.
- Image and audio attachments are not capped (same as Discord/Slack).
#6 STT silent failure:
- When stt::transcribe returns None, push a fallback ContentBlock::Text
("[Voice message — transcription failed for ...]") so the agent
knows a voice message was attempted and can ask the user to re-send.
Previously the failure was silent and confusing.
Skipped from issue #4: #3 (streaming download), #5 (cross-adapter
refactor — adapters stay independent per design), #7-#10 (cosmetic).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
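A minimal sketch of the two text file caps from item #4, assuming the count cap applies to accepted files and that 1 MB means 1024 * 1024 bytes (the commit states neither). `select_text_files` is a hypothetical helper that works on sizes only; the real code downloads each file and logs a warning on every skip.

```rust
pub const TEXT_FILE_COUNT_CAP: usize = 5;
pub const TEXT_TOTAL_CAP: usize = 1024 * 1024; // assumed meaning of "1 MB"

/// Given text file sizes in arrival order, return the indices of the
/// files that would be kept under both caps.
pub fn select_text_files(sizes: &[usize]) -> Vec<usize> {
    let mut kept = Vec::new();
    let mut total = 0usize;
    for (i, &size) in sizes.iter().enumerate() {
        // Count cap: once five files are accepted, skip the rest.
        if kept.len() >= TEXT_FILE_COUNT_CAP {
            continue;
        }
        // Aggregate cap: skip a file that would push the running total past 1 MB.
        if total + size > TEXT_TOTAL_CAP {
            continue;
        }
        total += size;
        kept.push(i);
    }
    kept
}
```

Under these assumptions a file that is skipped for size does not consume a count slot, so a later, smaller file can still be accepted.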
* fix(gateway): correct media_url encoding, remove lossy UTF-8 round-trip, add spawn panic logging
- media_url: preserve `/` as literal path separators per Google Chat
Media API's RFC 6570 reserved expansion (`{+resourceName}`). Previously
all `/` were encoded as `%2F` which is fragile.
- download_googlechat_file: base64-encode raw bytes directly instead of
round-tripping through String::from_utf8_lossy which silently replaces
invalid bytes with U+FFFD.
- Spawned attachment download task: log an error if the task panics so
silent message drops are diagnosable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
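The media_url fix can be illustrated with a std-only sketch that percent-encodes every byte except RFC 3986 unreserved characters and `/`. `encode_resource_name` is a hypothetical helper, and the base URL and `alt=media` query are assumptions about the Media API endpoint; the commit does not show them. This sketch is also stricter than full RFC 6570 reserved expansion, which lets other reserved characters through as well.

```rust
/// Percent-encode a resource name, keeping `/` as a literal path separator.
pub fn encode_resource_name(resource_name: &str) -> String {
    let mut out = String::with_capacity(resource_name.len());
    for &b in resource_name.as_bytes() {
        match b {
            // RFC 3986 unreserved characters plus '/' pass through untouched.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' | b'/' => {
                out.push(b as char)
            }
            // Everything else becomes an uppercase %XX escape.
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Build a download URL (endpoint shape is an assumption).
pub fn media_url(resource_name: &str) -> String {
    format!(
        "https://chat.googleapis.com/v1/media/{}?alt=media",
        encode_resource_name(resource_name)
    )
}
```

With this approach `spaces/X/attachments/Y` stays readable in the path, while a stray `?` or space in the resource name is still escaped.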
* fix(gateway): address review — remove .env from whitelist, add audio decode fallback
- Remove `"env"` from TEXT_EXTS whitelist to prevent credential exposure
if a user accidentally uploads a .env file.
- Audio base64 decode failure now produces a fallback text block
("[Voice message — decode failed for X]") instead of silently dropping.
- Audio attachments when STT is disabled now log at debug level instead
of being silently discarded.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
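Taken together with the earlier STT changes, the audio branch now has four outcomes. The sketch below models them with hypothetical names (`AudioOutcome`, `handle_audio`); the base64 decoder and the transcriber are passed in as closures so the example needs no external crates, whereas the real code calls stt::transcribe with the configured SttConfig.

```rust
#[derive(Debug, PartialEq)]
pub enum AudioOutcome {
    /// STT disabled: no text block is added (the real code logs at debug level).
    Skipped,
    /// Text block to push into the message content.
    Text(String),
}

pub fn handle_audio<D, T>(
    stt_enabled: bool,
    name: &str,
    b64: &str,
    decode: D,
    transcribe: T,
) -> AudioOutcome
where
    D: Fn(&str) -> Option<Vec<u8>>,
    T: Fn(&[u8]) -> Option<String>,
{
    if !stt_enabled {
        return AudioOutcome::Skipped;
    }
    // \u{2014} is the dash used in the fallback strings quoted in the commit.
    let Some(bytes) = decode(b64) else {
        return AudioOutcome::Text(format!("[Voice message \u{2014} decode failed for {name}]"));
    };
    match transcribe(&bytes) {
        Some(text) => AudioOutcome::Text(format!("[Voice message transcript]: {text}")),
        None => AudioOutcome::Text(format!(
            "[Voice message \u{2014} transcription failed for {name}]"
        )),
    }
}
```

The point of the design is that every enabled-STT path produces a text block, so the agent always learns that a voice message was attempted.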
* refactor(gateway): simplify text file cap logic, defer text allocation to spawn path
- Flatten nested if/else in File download cap check using early
continue, improving readability.
- Defer text .to_string() allocation to the tokio::spawn path so the
no-attachment fast path avoids a heap allocation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* refactor(gateway): address remaining review nits
- Replace double-spawn panic logging with single spawn +
catch_unwind — more idiomatic, same observability.
- Remove unused content_type from Image/File variants of
GoogleChatMediaRef; only Audio needs it. Drops #[allow(dead_code)]
on the enum.
- Pass remaining aggregate budget to download_googlechat_file so
Content-Length is checked against the budget before downloading,
avoiding wasted bandwidth on files that would exceed the cap.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(gateway): enforce aggregate budget on post-download check, log skipped video attachments
- download_googlechat_file: post-download size check now uses max_size
(min of FILE_MAX_DOWNLOAD and remaining_budget) instead of only
FILE_MAX_DOWNLOAD, ensuring TEXT_TOTAL_CAP is respected even when
Content-Length header is absent.
- parse_attachments: video/ MIME type now gets an explicit info! log
and is skipped early, instead of silently failing the text extension
whitelist downstream.
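The budget enforcement in download_googlechat_file can be sketched as two checks against the same limit. `effective_max` and `within_limit` are hypothetical helpers, and FILE_MAX_DOWNLOAD is assumed to be 512 * 1024 bytes based on the "max 512KB" figure from the first commit in this series.

```rust
pub const FILE_MAX_DOWNLOAD: usize = 512 * 1024; // assumed from "max 512KB"

/// The per-file limit is the smaller of the fixed cap and what is left
/// of the aggregate text budget.
pub fn effective_max(remaining_budget: usize) -> usize {
    FILE_MAX_DOWNLOAD.min(remaining_budget)
}

/// Returns true when a download should be accepted.
pub fn within_limit(
    content_length: Option<usize>,
    body_len: usize,
    remaining_budget: usize,
) -> bool {
    let max_size = effective_max(remaining_budget);
    // Pre-download check: only possible when the server sent Content-Length.
    if let Some(len) = content_length {
        if len > max_size {
            return false;
        }
    }
    // Post-download check: also covers responses without Content-Length.
    body_len <= max_size
}
```

Checking the downloaded body against `max_size` rather than only FILE_MAX_DOWNLOAD is what keeps the aggregate cap intact when the header is missing.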
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: chaodu-agent <chaodu-agent@openab.dev>
1 parent 1f8864c commit 0079352
3 files changed
Lines changed: 770 additions & 30 deletions