fix(discord): transcribe inbound audio attachments #2700
| Cohort / File(s) | Summary |
|---|---|
| Discord Voice Transcription `src/channels/discord.rs` | Added `transcription: Option<TranscriptionConfig>` field and `with_transcription()` method to `DiscordChannel`. Extended `process_attachments()` to accept optional transcription config, classify audio attachments, apply duration limits, fetch audio data, invoke transcription service, and inject `[Voice:<filename>]` markers into output. Introduced helper functions for audio detection, content-type normalization, filename inference, and audio extension mapping. Updated call-sites and added tests for transcription flows. |
| Channel Configuration `src/channels/mod.rs` | Chained `.with_transcription(config.transcription.clone())` to the Discord channel construction in `collect_configured_channels` to apply transcription settings from config. |
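The helper functions summarized in the table are not shown on this page. As a rough illustration only, audio detection, content-type normalization, and filename inference might look like the sketch below. Every function name, the extension list, the MIME mapping, and the `voice`/`ogg` fallbacks are assumptions made for the example, not the PR's actual code.

```rust
/// Hypothetical extension list; the PR's real mapping is not shown in this thread.
const AUDIO_EXTENSIONS: &[&str] = &["ogg", "oga", "opus", "mp3", "m4a", "wav", "flac", "webm"];

/// Strip parameters (e.g. "; codecs=opus") and lowercase a Content-Type value.
fn normalize_content_type(raw: &str) -> String {
    raw.split(';').next().unwrap_or("").trim().to_ascii_lowercase()
}

/// True when the filename ends in one of the assumed audio extensions.
fn has_audio_extension(filename: &str) -> bool {
    match filename.rsplit_once('.') {
        Some((stem, ext)) if !stem.is_empty() => {
            let ext = ext.to_ascii_lowercase();
            AUDIO_EXTENSIONS.iter().any(|known| *known == ext.as_str())
        }
        _ => false,
    }
}

/// Classify by content type first, then fall back to the file extension.
fn is_audio_attachment(content_type: Option<&str>, filename: &str) -> bool {
    if let Some(ct) = content_type {
        if normalize_content_type(ct).starts_with("audio/") {
            return true;
        }
    }
    has_audio_extension(filename)
}

/// Assumed MIME-to-extension mapping, used only when the filename has no audio extension.
fn extension_for_content_type(content_type: &str) -> Option<&'static str> {
    match normalize_content_type(content_type).as_str() {
        "audio/ogg" | "audio/opus" => Some("ogg"),
        "audio/mpeg" => Some("mp3"),
        "audio/mp4" | "audio/x-m4a" => Some("m4a"),
        "audio/wav" | "audio/x-wav" => Some("wav"),
        "audio/flac" => Some("flac"),
        "audio/webm" => Some("webm"),
        _ => None,
    }
}

/// Produce a filename with an audio extension so a transcription backend can guess the format.
fn infer_audio_filename(filename: &str, content_type: Option<&str>) -> String {
    if has_audio_extension(filename) {
        return filename.to_string();
    }
    // Fallbacks ("voice", "ogg") are illustrative defaults, not taken from the PR.
    let ext = content_type.and_then(extension_for_content_type).unwrap_or("ogg");
    let base = if filename.is_empty() { "voice" } else { filename };
    format!("{base}.{ext}")
}
```

Checking the content type before the extension keeps attachments with generic names (for example `voice-message`) classified as audio, while the extension fallback covers uploads where the content type is missing.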
Sequence Diagram(s)
sequenceDiagram
participant Discord as Discord Channel Handler
participant Fetch as HTTP Client
participant Transcribe as Transcription Service
participant Output as Message Output
rect rgba(100, 150, 200, 0.5)
Note over Discord,Output: Audio Transcription Flow
Discord->>Discord: Detect audio attachment
Discord->>Discord: Check duration limits
Discord->>Fetch: Fetch audio file
Fetch-->>Discord: Audio data
Discord->>Transcribe: transcribe_audio(data, config)
Transcribe-->>Discord: Transcript text
Discord->>Output: Inject [Voice:filename] transcript
end
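The flow in the diagram can be approximated in a few lines. This is a minimal synchronous sketch, not the PR's implementation: the real code is presumably async, and here plain closures stand in for the HTTP client and the transcription service. The struct fields, the `max_duration_secs` name, the function name, and the exact marker layout (a space between the marker and the transcript) are assumptions.

```rust
/// Hypothetical stand-in for the PR's `TranscriptionConfig`; only the fields this sketch needs.
#[derive(Clone)]
struct TranscriptionConfig {
    enabled: bool,
    max_duration_secs: u64,
}

/// Hypothetical, trimmed-down view of a Discord attachment.
struct Attachment {
    filename: String,
    url: String,
    content_type: Option<String>,
    duration_secs: Option<f64>,
}

/// Returns `Some("[Voice:<filename>] <transcript>")`, or `None` when the attachment is skipped.
fn transcribe_attachment<F, T>(
    att: &Attachment,
    config: Option<&TranscriptionConfig>,
    fetch: F,
    transcribe: T,
) -> Option<String>
where
    F: Fn(&str) -> Result<Vec<u8>, String>,
    T: Fn(&[u8], &str, &TranscriptionConfig) -> Result<String, String>,
{
    // No config, or transcription disabled: leave the attachment alone.
    let config = config.filter(|c| c.enabled)?;

    // 1. Detect audio attachment (content type only, to keep the sketch short).
    let is_audio = att
        .content_type
        .as_deref()
        .map(|ct| ct.trim().to_ascii_lowercase().starts_with("audio/"))
        .unwrap_or(false);
    if !is_audio {
        return None;
    }

    // 2. Check duration limits when Discord reports a duration.
    if let Some(duration) = att.duration_secs {
        if duration > config.max_duration_secs as f64 {
            return None;
        }
    }

    // 3. Fetch the audio bytes, then 4. hand them to the transcription service.
    let data = fetch(&att.url).ok()?;
    let text = transcribe(data.as_slice(), &att.filename, config).ok()?;

    // 5. Inject the voice marker ahead of the transcript.
    Some(format!("[Voice:{}] {}", att.filename, text.trim()))
}
```

Passing the fetcher and transcriber in as parameters is one way to exercise the flow without a network; the PR itself appears to use a local mock server for that, as the note on the ignored test suggests.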
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related issues
- [Bug]: The transcription on the Discord channel isn't working. #2686: This PR directly addresses the bug where Discord channel transcription was not functioning; implements the missing transcription feature with audio detection, duration filtering, and transcript injection.
- [Feature]: Support voice transcription in Matrix channel #2668: Proposes per-channel voice transcription for Matrix channels using the same `TranscriptionConfig` and shared `transcribe_audio` utility pattern.
Possibly related PRs
- feat: support config-level api_key for transcription #2112: Modifies transcription code paths and updates the `transcribe_audio` function signature to accept `&TranscriptionConfig` and handle API key configuration, directly integrating with the transcription calls introduced here.
- fix(channel:discord): robust inbound image marker detection #2237: Modifies `src/channels/discord.rs` attachment processing logic, including helper functions for attachment classification and fetching that overlap with audio detection utilities added in this PR.
- fix(discord): recover text attachments when content type is missing #1871: Updates `process_attachments()` logic in Discord channel to handle attachment classification and fetching, with overlapping concerns around audio detection and content-type inference.
Suggested labels
size: M, risk: medium, channel, channel: transcription
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 inconclusive)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The description is incomplete. While it covers the problem, solution, and validation steps, it lacks required metadata sections including risk level, size label, scope labels, module labels, change type, security impact, privacy status, backward compatibility, and rollback plan. | Complete all required sections in the description template: Label Snapshot, Change Metadata, Security Impact, Privacy/Data Hygiene, Compatibility, and Rollback Plan. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'fix(discord): transcribe inbound audio attachments' clearly and concisely describes the main change—adding transcription support for audio attachments in Discord channels. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #2686: it detects audio attachments in Discord, integrates transcription via the shared Whisper module, respects duration limits, and provides functional parity with other channels like Telegram. |
| Out of Scope Changes check | ✅ Passed | All changes are narrowly scoped to fixing Discord audio transcription: adding transcription config wiring, audio detection, transcription integration, and supporting tests. No unrelated refactoring or scope creep detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 86.49% which is sufficient. The required threshold is 80.00%. |
PR intake checks found warnings (non-blocking)
Fast safe checks found advisory issues. CI lint/test/build gates still enforce merge quality.
Action items:
Detected Linear keys: none
Run logs: https://github.com/zeroclaw-labs/zeroclaw/actions/runs/22666032846
Detected blocking line issues (sample):
Detected advisory line issues (sample):
ae41b17 to c7b58b5
Summary
- `.with_transcription(config.transcription.clone())`
- `duration_secs`, while preserving existing image/text attachment behavior
Root Cause
Discord inbound attachments only handled images and text; audio attachments were skipped, so no transcription path was ever invoked.
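For orientation, the wiring named in the summary, `.with_transcription(config.transcription.clone())`, follows the usual consuming-builder shape. The sketch below is an assumption-laden reduction: both structs are stripped to a single field each, and storing the config only when it is enabled is a guess about the behavior, not something this thread confirms.

```rust
/// Hypothetical stand-in for the shared `TranscriptionConfig`.
#[derive(Clone, Debug, PartialEq)]
struct TranscriptionConfig {
    enabled: bool,
}

/// Reduced `DiscordChannel`: only the new optional field is modeled.
#[derive(Debug, Default)]
struct DiscordChannel {
    transcription: Option<TranscriptionConfig>,
}

impl DiscordChannel {
    /// Consuming builder method, so it can be chained onto channel construction.
    fn with_transcription(mut self, config: TranscriptionConfig) -> Self {
        // Assumption: a disabled config is treated the same as no config at all.
        self.transcription = config.enabled.then_some(config);
        self
    }
}
```

With this shape, attachment processing can receive `self.transcription.as_ref()` and skip the audio path entirely when the value is `None`.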
Validation
- `cargo fmt --all -- --check`
- `cargo test --lib discord::tests::process_attachments_ -- --nocapture`
- `cargo test --lib discord::tests::is_audio_attachment -- --nocapture`
- `cargo test --lib discord::tests::infer_audio_filename_ -- --nocapture`
- `cargo test --lib discord::tests::with_transcription_ -- --nocapture`
- `cargo check --package zeroclaw --lib`
Notes
- `process_attachments_transcribes_audio_when_enabled` is included as an ignored test because this sandbox disallows loopback TCP binds; it can run in unrestricted CI/dev environments.
Closes #2686
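For readers unfamiliar with ignored tests, the pattern looks roughly like this. The test name matches the one in the note, but the body is a placeholder: `bind_loopback` is a hypothetical helper standing in for whatever local mock server the real test starts.

```rust
use std::net::TcpListener;

/// Hypothetical helper: bind an ephemeral loopback port, as a local mock server would.
fn bind_loopback() -> std::io::Result<TcpListener> {
    TcpListener::bind("127.0.0.1:0")
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    #[ignore = "needs a loopback TCP bind, which restricted sandboxes disallow"]
    fn process_attachments_transcribes_audio_when_enabled() {
        // Placeholder body: the real test would serve fake audio and a fake
        // transcription endpoint from this listener.
        let listener = bind_loopback().expect("loopback bind should succeed outside the sandbox");
        assert!(listener.local_addr().unwrap().ip().is_loopback());
    }
}
```

A plain `cargo test` skips the test and reports it as ignored. In an unrestricted environment, `cargo test -- --ignored` runs only the ignored tests, and `cargo test -- --include-ignored` runs everything.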