fix(main): backport discord transcription + windows shell detection#2732
Conversation
PR intake checks found warnings (non-blocking)Fast safe checks found advisory issues. CI lint/test/build gates still enforce merge quality.
Action items:
Detected Linear keys: none Run logs: https://github.com/zeroclaw-labs/zeroclaw/actions/runs/22683645390 Detected blocking line issues (sample):
Detected advisory line issues (sample):
|
|
Thanks for contributing to ZeroClaw. For faster review, please ensure:
See |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
Note
|
| Cohort / File(s) | Summary |
|---|---|
Discord transcription & attachment processing src/channels/discord.rs, src/channels/mod.rs |
Adds transcription: Option<TranscriptionConfig> to DiscordChannel, with_transcription() builder, changes process_attachments(...) to accept Option<&TranscriptionConfig>, threads transcription through call sites, implements audio content-type handling, duration checks, filename inference, transcription invocation, and test updates. |
Windows native shell detection src/runtime/native.rs |
Adds is_windows_wsl_bash_launcher() helper and a Windows-specific guard in detect_native_shell_with to skip System32/Sysnative WSL bash launchers; adds tests for detection and fallback behavior. |
Sequence Diagram(s)
sequenceDiagram
participant Discord as DiscordChannel
participant HTTP as reqwest::Client
participant Transcriber as TranscriptionService (TranscriptionConfig)
participant Agent as MessageProcessor
Discord->>Agent: receive message with attachments
Agent->>Discord: call process_attachments(attachments, HTTP, transcription?)
alt attachment is image
Discord->>HTTP: GET image URL
HTTP-->>Discord: image bytes
Discord-->>Agent: inject [Image:<name>]
else attachment is audio
Discord->>Transcriber: check transcription config present?
alt transcription configured
Discord->>HTTP: GET audio URL
HTTP-->>Discord: audio bytes
Discord->>Transcriber: transcribe(audio bytes)
Transcriber-->>Discord: transcript text
Discord-->>Agent: inject [Voice:<filename>] transcript
else no transcription
Discord-->>Agent: skip audio or inject placeholder
end
end
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
- fix(discord): transcribe inbound audio attachments #2700 — Implements the same Discord transcription changes (transcription field, builder, process_attachments signature, audio handling and tests).
- fix(runtime): skip Windows WSL bash shim in shell detection #2701 — Implements the same Windows WSL bash launcher detection and related test updates in
src/runtime/native.rs. - fix(channel:discord): robust inbound image marker detection #2237 — Related changes to Discord attachment handling and content-type normalization that touch the same code paths.
Suggested labels
bug, tool: shell
Suggested reviewers
- chumyin
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | The description covers key areas but is missing most required template sections like risk label, size label, scope labels, module labels, change type, primary scope, linked issue numbers, and security/privacy/compatibility metadata. | Complete the PR description by filling in all required template sections including labels, change metadata, linked issue closure statements, validation evidence, security impact, and side effects analysis. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly identifies the main changes: backporting Discord transcription support and Windows shell detection fixes to main branch. |
| Linked Issues check | ✅ Passed | The code changes directly address both linked issues: Discord transcription is restored via new with_transcription() method and attachment audio processing, while Windows WSL bash detection is fixed via is_windows_wsl_bash_launcher() helper. |
| Out of Scope Changes check | ✅ Passed | All changes are narrowly scoped to the two production bug fixes: Discord transcription addition, Windows shell detection guard, and supporting utilities. No unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 81.82% which is sufficient. The required threshold is 80.00%. |
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
- Create PR with unit tests
- Post copyable unit tests in a comment
- Commit unit tests in branch
backport/main-discord-transcription-wsl-shell
Comment @coderabbitai help to get the list of available commands and usage tips.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/channels/discord.rs (2)
2045-2047: Keep one non-ignored transcription happy-path test in the default suite.The end-to-end success-path test is currently
#[ignore], so default CI won’t catch regressions in the primary transcription flow.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/discord.rs` around lines 2045 - 2047, Remove the #[ignore = "requires local loopback TCP bind"] on the async test process_attachments_transcribes_audio_when_enabled (or add a new non-ignored happy-path test with the same name/behavior) so CI runs a successful transcription end-to-end test by default; if the existing test truly requires special networking, extract the network-dependent setup into a separate ignored test and keep a lightweight, deterministic happy-path test (e.g., using a mocked HTTP/TCP client or a prerecorded local file) in the main suite to validate the primary transcription flow.
69-74: Makewith_transcriptionoverwrite semantics explicit.Calling this with a disabled config currently leaves any prior enabled config intact; setting the field unconditionally avoids stale builder state.
♻️ Proposed refactor
pub fn with_transcription(mut self, config: TranscriptionConfig) -> Self { - if config.enabled { - self.transcription = Some(config); - } + self.transcription = config.enabled.then_some(config); self }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/discord.rs` around lines 69 - 74, The current with_transcription method leaves a previous enabled transcription config in place when called with a disabled TranscriptionConfig; change it to overwrite unconditionally by setting the struct field transcription based on the passed config (for example assign Some(config) when config.enabled is true and None when false) so with_transcription always updates/clears prior builder state; update the method referencing with_transcription, TranscriptionConfig, and the transcription field.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/channels/discord.rs`:
- Around line 291-301: The function is_audio_attachment wrongly treats explicit
non-audio MIME types (except application/octet-stream) as non-audio, which
causes application/ogg to be skipped; update is_audio_attachment to treat
application/ogg as audio (or more generally treat any content type that
audio_extension_from_content_type maps to an audio extension as audio) so that
when normalized_content_type == "application/ogg" the function returns true (or
consult audio_extension_from_content_type within is_audio_attachment) and allow
such attachments to reach transcription; reference function is_audio_attachment
and helper audio_extension_from_content_type to locate the logic to change.
---
Nitpick comments:
In `@src/channels/discord.rs`:
- Around line 2045-2047: Remove the #[ignore = "requires local loopback TCP
bind"] on the async test process_attachments_transcribes_audio_when_enabled (or
add a new non-ignored happy-path test with the same name/behavior) so CI runs a
successful transcription end-to-end test by default; if the existing test truly
requires special networking, extract the network-dependent setup into a separate
ignored test and keep a lightweight, deterministic happy-path test (e.g., using
a mocked HTTP/TCP client or a prerecorded local file) in the main suite to
validate the primary transcription flow.
- Around line 69-74: The current with_transcription method leaves a previous
enabled transcription config in place when called with a disabled
TranscriptionConfig; change it to overwrite unconditionally by setting the
struct field transcription based on the passed config (for example assign
Some(config) when config.enabled is true and None when false) so
with_transcription always updates/clears prior builder state; update the method
referencing with_transcription, TranscriptionConfig, and the transcription
field.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d26c0ad2-7eaa-4cf4-b40a-c55d10f2197d
📒 Files selected for processing (3)
src/channels/discord.rssrc/channels/mod.rssrc/runtime/native.rs
48c496f to
3ddbad5
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/channels/discord.rs (1)
291-301:⚠️ Potential issue | 🟠 MajorTreat
application/oggas audio in the attachment gate.At Line 295-300,
is_audio_attachmentrejects explicit non-audio/*MIME types (except octet-stream), soapplication/oggis skipped before transcription. That conflicts with Line 346 whereaudio_extension_from_content_typeexplicitly mapsapplication/ogg.🐛 Proposed fix
fn is_audio_attachment(content_type: &str, filename: &str, url: &str) -> bool { let normalized_content_type = normalize_content_type(content_type); if !normalized_content_type.is_empty() { - if normalized_content_type.starts_with("audio/") { + if normalized_content_type.starts_with("audio/") + || audio_extension_from_content_type(content_type).is_some() + { return true; } // Trust explicit non-audio MIME to avoid false positives from filename extensions. if normalized_content_type != "application/octet-stream" { return false; } } has_audio_extension(filename) || has_audio_extension(url) }Also applies to: 340-347
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/discord.rs` around lines 291 - 301, The is_audio_attachment function currently rejects non-audio MIME types (except application/octet-stream) which causes application/ogg to be skipped; update is_audio_attachment to treat "application/ogg" as audio by allowing it to pass the MIME check (e.g., consider "application/ogg" equivalent to audio/*) so transcription proceeds; ensure this aligns with the existing audio_extension_from_content_type mapping (keep audio_extension_from_content_type as-is or confirm it maps "application/ogg" to the correct extension) and reference is_audio_attachment and audio_extension_from_content_type when making the change.
🧹 Nitpick comments (1)
src/channels/discord.rs (1)
2045-2047: Avoid ignoring the only success-path transcription test by default.Line 2046 disables the one end-to-end assertion that Discord audio is actually transcribed, so regressions in the happy path can slip through CI.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/channels/discord.rs` around lines 2045 - 2047, The test function process_attachments_transcribes_audio_when_enabled is marked #[ignore = "requires local loopback TCP bind"] which prevents the only success-path transcription assertion from running in CI; remove the #[ignore] attribute (or replace it with a deterministic approach) and make the test CI-safe by either (a) binding to an ephemeral loopback port or using the existing test helper that allocates a free port, or (b) refactoring the test to use a mock/fake network/audio provider so it no longer requires a local TCP bind; update the setup inside process_attachments_transcribes_audio_when_enabled to use that CI-safe binding or mock so the test can run unignored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/channels/discord.rs`:
- Around line 291-301: The is_audio_attachment function currently rejects
non-audio MIME types (except application/octet-stream) which causes
application/ogg to be skipped; update is_audio_attachment to treat
"application/ogg" as audio by allowing it to pass the MIME check (e.g., consider
"application/ogg" equivalent to audio/*) so transcription proceeds; ensure this
aligns with the existing audio_extension_from_content_type mapping (keep
audio_extension_from_content_type as-is or confirm it maps "application/ogg" to
the correct extension) and reference is_audio_attachment and
audio_extension_from_content_type when making the change.
---
Nitpick comments:
In `@src/channels/discord.rs`:
- Around line 2045-2047: The test function
process_attachments_transcribes_audio_when_enabled is marked #[ignore =
"requires local loopback TCP bind"] which prevents the only success-path
transcription assertion from running in CI; remove the #[ignore] attribute (or
replace it with a deterministic approach) and make the test CI-safe by either
(a) binding to an ephemeral loopback port or using the existing test helper that
allocates a free port, or (b) refactoring the test to use a mock/fake
network/audio provider so it no longer requires a local TCP bind; update the
setup inside process_attachments_transcribes_audio_when_enabled to use that
CI-safe binding or mock so the test can run unignored.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 76cdc0ee-280f-4635-a0ea-c8fa01dae0c7
📒 Files selected for processing (3)
src/channels/discord.rssrc/channels/mod.rssrc/runtime/native.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/runtime/native.rs
- src/channels/mod.rs
bf27239 to
a749fde
Compare
a749fde to
b9f65fe
Compare
b9f65fe to
b35a7c1
Compare
Summary
System32\\bash.exelauncher[transcription]config on mainValidation
Notes
#2683TLS transport-root-cause formatting is already present onmaininsrc/providers/compatible.rs, so no additional code change was required here.dev -> mainpromotion PR#2713is currently conflict-heavy; this backport PR isolates the two production bug fixes needed now.Fixes #2686
Fixes #2684
Summary by CodeRabbit
New Features
Bug Fixes
Tests