fix(discord): transcribe inbound audio attachments by theonlyhennygod · Pull Request #2700 · zeroclaw-labs/zeroclaw

theonlyhennygod · 2026-03-04T08:52:40Z

Summary

Add Discord channel transcription config wiring via `.with_transcription(config.transcription.clone())`.
Extend inbound Discord attachment processing to detect audio attachments and transcribe them with the shared Whisper-compatible transcription module.
Enforce configured duration limits when Discord exposes `duration_secs`, while preserving existing image/text attachment behavior.
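
The wiring and the duration check described above could look roughly like the sketch below. It is illustrative only and is not taken from the PR diff: the fields of `TranscriptionConfig` (`enabled`, `max_duration_secs`), the by-value parameter of `with_transcription`, and the helper `within_duration_limit` are assumptions, and the real `DiscordChannel` has more fields than shown here.

```rust
/// Hypothetical stand-in for the shared transcription settings.
/// Field names are assumptions made for this sketch.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct TranscriptionConfig {
    pub enabled: bool,
    /// Longest clip, in seconds, that will be sent for transcription.
    pub max_duration_secs: u64,
}

/// Reduced stand-in for the channel type; only the new field is modelled.
#[derive(Debug, Default)]
pub struct DiscordChannel {
    transcription: Option<TranscriptionConfig>,
}

impl DiscordChannel {
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter: keeps the config only when transcription is enabled,
    /// so later code can treat `None` as "feature off".
    pub fn with_transcription(mut self, config: TranscriptionConfig) -> Self {
        if config.enabled {
            self.transcription = Some(config);
        }
        self
    }

    pub fn transcription(&self) -> Option<&TranscriptionConfig> {
        self.transcription.as_ref()
    }
}

/// Returns true when a clip may be transcribed. The limit is enforced only
/// when a duration is reported; a missing duration is let through.
pub fn within_duration_limit(duration_secs: Option<f64>, max_duration_secs: u64) -> bool {
    match duration_secs {
        Some(secs) => secs <= max_duration_secs as f64,
        None => true,
    }
}
```

Under these assumptions, `DiscordChannel::new().with_transcription(cfg)` leaves the channel unchanged when `cfg.enabled` is false, which keeps the existing image and text paths untouched.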

Root Cause

Discord inbound attachments only handled images and text; audio attachments were skipped, so no transcription path was ever invoked.
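
To make the root cause concrete, here is a minimal sketch of a content-type dispatch with an explicit audio branch. The enum `AttachmentKind` and the function `classify` are hypothetical names for this illustration; the actual code in `src/channels/discord.rs` may be organized differently.

```rust
/// Hypothetical classification of an inbound attachment.
#[derive(Debug, PartialEq, Eq)]
pub enum AttachmentKind {
    Image,
    Text,
    Audio,
    Unsupported,
}

/// Classifies an attachment by its declared content type.
/// Before the fix, the equivalent of the `Audio` arm did not exist,
/// so audio fell through to the "skip" case.
pub fn classify(content_type: Option<&str>) -> AttachmentKind {
    let mime = match content_type {
        Some(raw) => raw.trim().to_ascii_lowercase(),
        None => return AttachmentKind::Unsupported,
    };
    if mime.starts_with("image/") {
        AttachmentKind::Image
    } else if mime.starts_with("text/") {
        AttachmentKind::Text
    } else if mime.starts_with("audio/") {
        AttachmentKind::Audio
    } else {
        AttachmentKind::Unsupported
    }
}
```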

Validation

cargo fmt --all -- --check
cargo test --lib discord::tests::process_attachments_ -- --nocapture
cargo test --lib discord::tests::is_audio_attachment -- --nocapture
cargo test --lib discord::tests::infer_audio_filename_ -- --nocapture
cargo test --lib discord::tests::with_transcription_ -- --nocapture
cargo check --package zeroclaw --lib

Notes

process_attachments_transcribes_audio_when_enabled is included as an ignored test because this sandbox disallows loopback TCP binds; it can run in unrestricted CI/dev environments.
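
For readers unfamiliar with ignored tests, the pattern looks roughly like this. The function and test names below are hypothetical and the body is not the PR's test; it only shows how a test that needs a loopback bind can be marked `#[ignore]` and then run on demand with `cargo test -- --ignored`.

```rust
use std::net::TcpListener;

/// Tries to bind an ephemeral port on the loopback interface.
/// A restricted sandbox may reject this call.
pub fn bind_loopback() -> std::io::Result<TcpListener> {
    TcpListener::bind("127.0.0.1:0")
}

// Skipped by a plain `cargo test`; run it with `cargo test -- --ignored`.
#[test]
#[ignore = "requires loopback TCP binds, which some sandboxes disallow"]
fn mock_server_can_bind_loopback() {
    let listener = bind_loopback().expect("loopback bind should succeed outside the sandbox");
    assert!(listener.local_addr().unwrap().ip().is_loopback());
}
```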

Closes #2686

Summary by CodeRabbit

New Features
- Added voice transcription for Discord audio attachments. Audio files are now automatically transcribed to text and included in message content when enabled per channel. Supports configurable maximum duration limits to control transcription processing.
Tests
- Extended test coverage for voice transcription functionality, including configuration options and various enabled/disabled scenarios.

github-actions · 2026-03-04T08:53:00Z

Thanks for contributing to ZeroClaw.

For faster review, please ensure:

PR template sections are fully completed
cargo fmt --all -- --check, cargo clippy --all-targets -- -D warnings, and cargo test are included
If automation/agents were used heavily, add brief workflow notes
Scope is focused (prefer one concern per PR)

See CONTRIBUTING.md and docs/pr-workflow.md for full collaboration rules.

coderabbitai · 2026-03-04T08:53:01Z

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6d72bc92-2f89-4996-a890-93529696d23e

📥 Commits

Reviewing files that changed from the base of the PR and between ae41b17 and c7b58b5.

📒 Files selected for processing (2)

src/channels/discord.rs
src/channels/mod.rs

Note

`.coderabbit.yaml` has unrecognized properties

CodeRabbit is using all valid settings from your configuration. Unrecognized properties (listed below) have been ignored and may indicate typos or deprecated fields that can be removed.

⚠️ Parsing warnings (1)

Validation error: Unrecognized key(s) in object: 'tools', 'path_filters', 'review_instructions'

📝 Walkthrough

This PR implements voice transcription support for Discord channels by introducing a TranscriptionConfig field to DiscordChannel, extending the attachment processing logic to detect, fetch, and transcribe audio files when enabled, and wiring the configuration through the Discord channel initialization.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Discord Voice Transcription `src/channels/discord.rs` | Added `transcription: Option<TranscriptionConfig>` field and `with_transcription()` method to `DiscordChannel`. Extended `process_attachments()` to accept optional transcription config, classify audio attachments, apply duration limits, fetch audio data, invoke transcription service, and inject `[Voice:<filename>]` markers into output. Introduced helper functions for audio detection, content-type normalization, filename inference, and audio extension mapping. Updated call-sites and added tests for transcription flows. |
| Channel Configuration `src/channels/mod.rs` | Chained `.with_transcription(config.transcription.clone())` to the Discord channel construction in `collect_configured_channels` to apply transcription settings from config. |
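
The helper functions mentioned in the table (audio detection, content-type normalization, filename inference, extension mapping) might look something like the following. This is a sketch under stated assumptions: the signatures, the list of MIME types, and the `"audio"` fallback stem are guesses for illustration, and only the names `is_audio_attachment` and `infer_audio_filename` are suggested by the PR's test filters.

```rust
/// Strips parameters such as `; codecs=opus` and lowercases the MIME type.
pub fn normalize_content_type(raw: &str) -> String {
    raw.split(';').next().unwrap_or("").trim().to_ascii_lowercase()
}

/// Maps a normalized audio MIME type to a file extension (assumed mapping).
pub fn audio_extension_for(mime: &str) -> Option<&'static str> {
    match mime {
        "audio/ogg" | "audio/opus" => Some("ogg"),
        "audio/mpeg" | "audio/mp3" => Some("mp3"),
        "audio/wav" | "audio/x-wav" => Some("wav"),
        "audio/mp4" | "audio/x-m4a" => Some("m4a"),
        "audio/webm" => Some("webm"),
        "audio/flac" => Some("flac"),
        _ => None,
    }
}

const AUDIO_EXTENSIONS: [&str; 6] = ["ogg", "mp3", "wav", "m4a", "webm", "flac"];

fn has_audio_extension(filename: &str) -> bool {
    let lower = filename.to_ascii_lowercase();
    AUDIO_EXTENSIONS.iter().any(|ext| lower.ends_with(&format!(".{ext}")))
}

/// Audio if the content type says so, or if the filename has a known audio extension.
pub fn is_audio_attachment(content_type: Option<&str>, filename: &str) -> bool {
    let by_type = content_type
        .map(|ct| normalize_content_type(ct).starts_with("audio/"))
        .unwrap_or(false);
    by_type || has_audio_extension(filename)
}

/// Returns a filename that carries an audio extension when one can be inferred.
pub fn infer_audio_filename(filename: &str, content_type: Option<&str>) -> String {
    let trimmed = filename.trim();
    let stem = if trimmed.is_empty() { "audio" } else { trimmed };
    if has_audio_extension(stem) {
        return stem.to_string();
    }
    let ext = content_type
        .map(normalize_content_type)
        .and_then(|mime| audio_extension_for(&mime));
    match ext {
        Some(ext) => format!("{stem}.{ext}"),
        None => stem.to_string(),
    }
}
```

A filename with a recognizable extension matters because Whisper-compatible endpoints commonly use it to determine the audio format of the uploaded data.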

Sequence Diagram(s)

sequenceDiagram
    participant Discord as Discord Channel Handler
    participant Fetch as HTTP Client
    participant Transcribe as Transcription Service
    participant Output as Message Output

    rect rgba(100, 150, 200, 0.5)
    Note over Discord,Output: Audio Transcription Flow
    Discord->>Discord: Detect audio attachment
    Discord->>Discord: Check duration limits
    Discord->>Fetch: Fetch audio file
    Fetch-->>Discord: Audio data
    Discord->>Transcribe: transcribe_audio(data, config)
    Transcribe-->>Discord: Transcript text
    Discord->>Output: Inject [Voice:filename] transcript
    end
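
The last two steps of the diagram (transcribe, then inject the marker) can be sketched as below. The trait `Transcriber`, the stub `FixedTranscriber`, and the exact spacing after the `[Voice:<filename>]` marker are assumptions for illustration. The real path is presumably asynchronous and calls the shared `transcribe_audio` function; this synchronous version only shows the shape of the output.

```rust
/// Hypothetical abstraction over the transcription service.
pub trait Transcriber {
    fn transcribe(&self, audio: &[u8], filename: &str) -> Result<String, String>;
}

/// Stub that always returns the same transcript; useful for tests and examples.
pub struct FixedTranscriber(pub &'static str);

impl Transcriber for FixedTranscriber {
    fn transcribe(&self, _audio: &[u8], _filename: &str) -> Result<String, String> {
        Ok(self.0.to_string())
    }
}

/// Formats one transcript line. The separator after the marker is an assumption.
pub fn voice_marker(filename: &str, transcript: &str) -> String {
    format!("[Voice:{filename}] {}", transcript.trim())
}

/// Transcribes `audio` and appends the marked transcript to `content`.
/// Empty transcripts are dropped; errors are returned to the caller.
pub fn append_transcript<T: Transcriber>(
    content: &mut String,
    transcriber: &T,
    audio: &[u8],
    filename: &str,
) -> Result<(), String> {
    let transcript = transcriber.transcribe(audio, filename)?;
    if transcript.trim().is_empty() {
        return Ok(());
    }
    if !content.is_empty() {
        content.push('\n');
    }
    content.push_str(&voice_marker(filename, &transcript));
    Ok(())
}
```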

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

[Bug]: The transcription on the Discord channel isn't working. #2686: This PR directly addresses the bug where Discord channel transcription was not functioning; implements the missing transcription feature with audio detection, duration filtering, and transcript injection.
[Feature]: Support voice transcription in Matrix channel #2668: Proposes per-channel voice transcription for Matrix channels using the same TranscriptionConfig and shared transcribe_audio utility pattern.

Possibly related PRs

feat: support config-level api_key for transcription #2112: Modifies transcription code paths and updates the transcribe_audio function signature to accept &TranscriptionConfig and handle API key configuration, directly integrating with the transcription calls introduced here.
fix(channel:discord): robust inbound image marker detection #2237: Modifies src/channels/discord.rs attachment processing logic, including helper functions for attachment classification and fetching that overlap with audio detection utilities added in this PR.
fix(discord): recover text attachments when content type is missing #1871: Updates process_attachments() logic in Discord channel to handle attachment classification and fetching, with overlapping concerns around audio detection and content-type inference.

Suggested labels

size: M, risk: medium, channel, channel: transcription

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | The description is incomplete. While it covers the problem, solution, and validation steps, it lacks required metadata sections including risk level, size label, scope labels, module labels, change type, security impact, privacy status, backward compatibility, and rollback plan. | Complete all required sections in the description template: Label Snapshot, Change Metadata, Security Impact, Privacy/Data Hygiene, Compatibility, and Rollback Plan. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix(discord): transcribe inbound audio attachments' clearly and concisely describes the main change: adding transcription support for audio attachments in Discord channels. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue `#2686`: it detects audio attachments in Discord, integrates transcription via the shared Whisper module, respects duration limits, and provides functional parity with other channels like Telegram. |
| Out of Scope Changes check | ✅ Passed | All changes are narrowly scoped to fixing Discord audio transcription: adding transcription config wiring, audio detection, transcription integration, and supporting tests. No unrelated refactoring or scope creep detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 86.49%, which is sufficient. The required threshold is 80.00%. |

github-actions · 2026-03-04T08:53:05Z

PR intake checks found warnings (non-blocking)

Fast safe checks found advisory issues. CI lint/test/build gates still enforce merge quality.

Missing required PR template sections: ## Validation Evidence (required), ## Security Impact (required), ## Privacy and Data Hygiene (required), ## Rollback Plan (required)
Incomplete required PR template fields: summary problem, summary why it matters, summary what changed, validation commands, security risk/mitigation, privacy status, rollback plan
Missing Linear issue key reference (RMN-<id>, CDV-<id>, or COM-<id>) in PR title/body (recommended for traceability, non-blocking).

Action items:

Complete required PR template sections/fields.
(Recommended) Link this PR to one active Linear issue key (RMN-xxx/CDV-xxx/COM-xxx) for traceability.
Remove tabs, trailing whitespace, and merge conflict markers from added lines.
Re-run local checks before pushing:
- ./scripts/ci/rust_quality_gate.sh
- ./scripts/ci/rust_strict_delta_gate.sh
- ./scripts/ci/docs_quality_gate.sh

Detected Linear keys: none

Run logs: https://github.com/zeroclaw-labs/zeroclaw/actions/runs/22666032846

Detected blocking line issues (sample):

none

Detected advisory line issues (sample):

none

theonlyhennygod self-assigned this Mar 4, 2026

theonlyhennygod mentioned this pull request Mar 4, 2026

[Bug]: The transcription on the Discord channel isn't working. #2686

Closed

3 tasks

theonlyhennygod changed the base branch from main to dev March 4, 2026 10:33

fix(discord): transcribe inbound audio attachments

c7b58b5

theonlyhennygod force-pushed the issue-2686-discord-transcription branch from ae41b17 to c7b58b5 Compare March 4, 2026 10:47

theonlyhennygod merged 1 commit into dev from issue-2686-discord-transcription
