feat(channel): Add Mistral Voxtral support for voice transcription #2778
WAlexandreW wants to merge 4 commits into zeroclaw-labs:dev
PR intake checks found warnings (non-blocking)
Fast safe checks found advisory issues. CI lint/test/build gates still enforce merge quality.
Detected Linear keys: none
Run logs: https://github.com/zeroclaw-labs/zeroclaw/actions/runs/22705515205
| Cohort / File(s) | Summary |
|---|---|
| Transcription logic `src/channels/transcription.rs` | Adds `fn is_mistral_host(api_url: &str) -> bool`; changes `transcribe_audio` signature to accept `&TranscriptionConfig` and return `Result<String>`; implements provider-aware `api_key` resolution (config first, then `MISTRAL_API_KEY` or `GROQ_API_KEY` env fallback); selects proxy (`transcription.mistral` or `transcription.groq`) and uses `build_runtime_proxy_client(proxy_name)`; updates error messaging and retains payload/response handling. |
| Configuration schema `src/config/schema.rs` | Adds `pub api_key: Option<String>` to `TranscriptionConfig`; `Default` sets `api_key: None`; `Debug` impl redacts the `api_key`; adds `"transcription.mistral"` to `SUPPORTED_PROXY_SERVICE_KEYS`; encrypts/decrypts `transcription.api_key` during config save/load. |
| Tests `src/channels/...tests`, `tests/...` | Adds tests for key resolution (config key present, whitespace trimming, env fallback) and `is_mistral_host` variants; updates existing MIME/filename and rejection tests to follow the new proxy/key resolution paths. |
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant TranscriptionModule
participant ProxyBuilder
participant ProxyService
Caller->>TranscriptionModule: transcribe_audio(audio, filename, &config)
TranscriptionModule->>TranscriptionModule: is_mistral_host(config.api_url)
TranscriptionModule->>TranscriptionModule: resolve api_key (config.api_key or env by host)
TranscriptionModule->>ProxyBuilder: build_runtime_proxy_client(proxy_name)
ProxyBuilder->>ProxyService: create client for proxy_name
TranscriptionModule->>ProxyService: send transcription request (api_key, payload)
ProxyService-->>TranscriptionModule: return transcription response
TranscriptionModule-->>Caller: return transcript / error
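The proxy-selection step in the diagram above can be sketched in a few lines. This is only an illustration: the two service-key strings come from the walkthrough table, while the function name `proxy_service_key` is hypothetical and not taken from the PR.

```rust
/// Hypothetical helper: pick the proxy service key for the detected provider.
/// The key strings appear in the PR summary; the function itself is illustrative.
fn proxy_service_key(is_mistral: bool) -> &'static str {
    if is_mistral {
        "transcription.mistral"
    } else {
        "transcription.groq"
    }
}
```

The returned name would then be handed to `build_runtime_proxy_client(proxy_name)`, as the diagram shows.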
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related issues
- [Feature]: Support voice transcription in Matrix channel #2668: Callers (e.g., Matrix channel) must be updated to provide the new `TranscriptionConfig` and adapt to the changed `transcribe_audio` signature and proxy/key resolution.
Possibly related PRs
- fix(discord): transcribe inbound audio attachments #2700: Updates channel code to pass `TranscriptionConfig` into attachment processing and adjust calls to `transcribe_audio`.
- feat(whatsapp-web): supersede #1992 transcription flow [RMN-205] #2192: WhatsApp Web transcription flow that depends on the updated `TranscriptionConfig` and `transcribe_audio` behavior.
- feat: support config-level api_key for transcription #2112: Prior changes that add `transcription.api_key` and modify transcription key resolution; overlaps with this PR's config work.
Suggested labels
channel: transcription
Suggested reviewers
- theonlyhennygod
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main feature change: adding Mistral Voxtral support for voice transcription, which aligns with the primary code changes to enable multi-provider STT support. |
| Description check | ✅ Passed | The description comprehensively covers all required template sections including problem statement, what changed, validation evidence, security impact, compatibility, and rollback plan with proper detail and structure. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/config/schema.rs (1)
493-503: Strengthen schema-contract docs for transcription provider expansion.
The updated comments improve clarity, but this config-surface change should explicitly document default behavior, compatibility impact, and migration/rollback guidance in-place.
📝 Suggested doc update
-/// Voice transcription configuration (Whisper API via Groq, Mistral, etc.).
+/// Voice transcription configuration (Groq Whisper default; also supports Mistral Voxtral endpoints).
+/// Default endpoint/model remain Groq:
+/// - api_url: `https://api.groq.com/openai/v1/audio/transcriptions`
+/// - model: `whisper-large-v3-turbo`
+/// Compatibility: additive and backward-compatible for existing Groq configs.
+/// Migration: set `transcription.api_url` to the Mistral endpoint and provide `api_key` or `MISTRAL_API_KEY`.
+/// Rollback: revert `transcription.api_url`/model to Groq defaults.
 pub struct TranscriptionConfig {
@@
-    /// Whisper or Voxtral model name (e.g. `whisper-large-v3-turbo`, `voxtral-mini-latest`).
+    /// Whisper or Voxtral model name (e.g. `whisper-large-v3-turbo`, `voxtral-mini-latest`).
As per coding guidelines, "Treat config keys as public contract: document defaults, compatibility impact, and migration/rollback path for schema changes in `src/config/schema.rs`."
src/channels/transcription.rs (1)
62-80: Add unit tests for the new provider-branch key resolution paths.
This block now has provider-dependent behavior and whitespace filtering, but current tests only cover the legacy GROQ-missing-key path. Please add focused tests for Mistral URL + `MISTRAL_API_KEY`, Mistral URL without key, non-Mistral URL + `GROQ_API_KEY`, and whitespace-only keys.
As per coding guidelines, "Run `cargo fmt --all -- --check`, `cargo clippy --all-targets -- -D warnings`, and `cargo test` before PR submission."
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/channels/transcription.rs`:
- Around line 69-74: The provider detection using
config.api_url.contains("mistral.ai") is fragile; parse config.api_url once
(e.g., with Url::parse) and inspect the host component (case-insensitive) to
decide provider, then reuse that boolean for both env-var selection and proxy
routing logic (replace the contains checks in the block around config.api_url
and the similar checks at the second occurrence). Ensure you handle parse errors
(fallback to explicit default) and match either exact host or valid subdomains
(e.g., host.ends_with("mistral.ai")) so MISTRAL_API_KEY is chosen only for true
Mistral hosts and GROQ_API_KEY otherwise.
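The host-based check described in this comment can be sketched as follows. This is a minimal, standard-library-only illustration: the PR itself parses with `reqwest::Url`, whereas the hand-rolled `host_of` helper here is hypothetical and only handles simple `scheme://authority/...` URLs (no IPv6 literals). It matches the exact host or a dot-prefixed suffix, because a bare `ends_with("mistral.ai")` would also accept a host such as `notmistral.ai`.

```rust
/// Hypothetical helper: extract the lowercase host from a simple URL.
/// Returns None when there is no `scheme://` prefix or the host is empty.
fn host_of(api_url: &str) -> Option<String> {
    let (_scheme, rest) = api_url.trim().split_once("://")?;
    // The authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()?;
    // Drop any `user:pass@` prefix.
    let host_port = authority.rsplit('@').next()?;
    // Drop a trailing `:port` (IPv6 literals are out of scope for this sketch).
    let host = host_port.split(':').next()?;
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}

/// True only for `mistral.ai` itself or a real subdomain of it.
/// Unparseable input falls back to `false`, i.e. the non-Mistral default.
fn is_mistral_host(api_url: &str) -> bool {
    match host_of(api_url) {
        Some(host) => host == "mistral.ai" || host.ends_with(".mistral.ai"),
        None => false,
    }
}
```

Computing this boolean once and reusing it for both the env-var choice and the proxy routing keeps the two decisions from drifting apart.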
---
Nitpick comments:
In `@src/channels/transcription.rs`:
- Around line 62-80: Add unit tests for the provider-dependent API key
resolution in transcription.rs that exercise the api_key computation: create
tests that set config.api_url to a Mistral URL and verify behavior when
MISTRAL_API_KEY is present (valid key) and when it is absent (error path),
create a test for a non-Mistral URL that uses GROQ_API_KEY from the environment,
and add a test that supplies whitespace-only keys (both via config.api_key and
env vars) to ensure trimming/filtering removes them; use the same
module/constructor that computes api_key (the logic referencing config.api_key
and config.api_url and environment vars MISTRAL_API_KEY/GROQ_API_KEY) and assert
success or the expected context error, then run cargo fmt, cargo clippy, and
cargo test.
In `@src/config/schema.rs`:
- Around line 493-503: Update the doc comments on the TranscriptionConfig struct
and its fields (TranscriptionConfig, enabled, api_url,
default_transcription_api_url, default_transcription_model) to explicitly state
the default behavior (what enabled=false means and the concrete default values
returned by default_transcription_api_url/default_transcription_model), note
compatibility impact when adding new transcription providers (which consumers
may need to configure or may differ in supported models/encoding), and provide a
short migration/rollback guidance line (how to revert to previous behavior by
restoring prior config keys or setting enabled=false and specifying prior
api_url/model); keep the wording concise and place these notes adjacent to the
existing field docs so the schema file serves as the public contract.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c7f77d99-06d1-4b49-b209-56c0a672d4e5
📒 Files selected for processing (2)
src/channels/transcription.rs
src/config/schema.rs
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/channels/transcription.rs (1)
172-241: ⚠️ Potential issue | 🟠 Major
Add test serialization or extract a pure key-resolution helper to eliminate environment variable leakage between tests.
Tests at lines 172–227 call `std::env::remove_var()` without restoration or serialization. When tests run in parallel, mutations to `GROQ_API_KEY` and `MISTRAL_API_KEY` can leak across test boundaries since `transcribe_audio` reads these variables at runtime (lines 97–99), not at config initialization. Either:
- Use `#[serial]` or `tokio::test(flavor = "multi_thread")` with a global lock to serialize these four tests, or
- Extract key resolution into a testable, dependency-injected function so tests can inject mock env lookups without mutating process state.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config/schema.rs`:
- Around line 518-523: Config now contains a sensitive field
config.transcription.api_key that is not being processed by the existing secrets
encryption/decryption paths; update Config::load_or_init to decrypt
transcription.api_key when secrets.encrypt is enabled and update Config::save to
encrypt transcription.api_key before persisting (and ensure the in-memory struct
stores the decrypted value after load). Locate the existing secret handling
logic in Config::load_or_init and Config::save and mirror that flow for
transcription.api_key, reusing the same helper functions or utilities used for
other secret fields so encryption and decryption are consistent with the
secrets.encrypt flag.
---
Outside diff comments:
In `@src/channels/transcription.rs`:
- Around line 172-241: Tests mutate process environment
(GROQ_API_KEY/MISTRAL_API_KEY) causing cross-test leakage; fix by either
serializing the tests or extracting key-resolution into an injectable helper.
Option A: mark the tests that touch env vars as serialized (e.g., use a global
test lock or a serial attribute) so the four tests around transcribe_audio do
not run concurrently. Option B (preferred): extract the environment lookup
currently inside transcribe_audio (the logic that reads GROQ_API_KEY /
MISTRAL_API_KEY and whitespace-checks keys) into a pure function like
resolve_transcription_api_key(&TranscriptionConfig) -> Result<String, Error> and
update transcribe_audio to accept an injected key or resolver; update tests to
call the resolver directly or pass a mocked key so they no longer mutate
std::env.
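Option B can be sketched roughly as below. This is an assumption-laden illustration, not the PR's code: the struct is a cut-down stand-in for `TranscriptionConfig`, and the extra `is_mistral` and `env_lookup` parameters are hypothetical additions that let tests inject behaviour instead of mutating `std::env`. The error type is a plain `String` to keep the sketch dependency-free.

```rust
/// Hypothetical stand-in for the relevant fields of `TranscriptionConfig`.
struct TranscriptionConfig {
    api_key: Option<String>,
    api_url: String,
}

/// Treat empty or whitespace-only values as absent; trim the rest.
fn non_blank(value: String) -> Option<String> {
    let trimmed = value.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

/// Resolve the API key: config value first, then a provider-specific env var.
/// `is_mistral` and `env_lookup` are injected so tests never touch `std::env`.
fn resolve_transcription_api_key(
    config: &TranscriptionConfig,
    is_mistral: impl Fn(&str) -> bool,
    env_lookup: impl Fn(&str) -> Option<String>,
) -> Result<String, String> {
    if let Some(key) = config.api_key.clone().and_then(non_blank) {
        return Ok(key);
    }
    let env_name = if is_mistral(&config.api_url) {
        "MISTRAL_API_KEY"
    } else {
        "GROQ_API_KEY"
    };
    env_lookup(env_name).and_then(non_blank).ok_or_else(|| {
        format!("Missing transcription API key: set [transcription].api_key or {env_name}")
    })
}
```

In production code the lookup argument could simply be `|name| std::env::var(name).ok()`, while tests pass a closure returning fixed values.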
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cf40fd03-c167-44ae-85e8-57bbc87fd3a8
📒 Files selected for processing (2)
src/channels/transcription.rs
src/config/schema.rs
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/config/schema.rs (1)
5519-5523: Nice fix wiring `transcription.api_key` into secrets lifecycle; please add regression assertions.
The load/save secret handling is correct now. I’d still add assertions in the existing config secret roundtrip test to explicitly cover `transcription.api_key`, so this path doesn’t regress silently.
Also applies to: 6358-6362
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config/schema.rs`:
- Around line 512-523: TranscriptionConfig currently derives Debug which will
print the sensitive api_key; remove Debug from the derive list and add a manual
impl fmt::Debug for TranscriptionConfig that prints enabled normally but redacts
or masks api_key (e.g., show "<redacted>" or "Some(****)" when
api_key.is_some()); keep Serialize, Deserialize, JsonSchema derives intact and
ensure the impl references the struct name TranscriptionConfig and its field
api_key so any debug logging no longer exposes the secret.
---
Nitpick comments:
In `@src/config/schema.rs`:
- Around line 5519-5523: Add explicit assertions to the existing config secrets
roundtrip test to cover transcription.api_key: after the test saves the config
secrets, assert that the secrets store contains an entry for
"config.transcription.api_key" (or that it is present/absent as expected), and
after loading/decrypting the config assert that config.transcription.api_key
equals the original value (or Option state) used in the test; reference the
decrypt_optional_secret call and the config.transcription.api_key field to
locate where to add these checks so this secret path is exercised and cannot
regress.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2fb448bd-a5d9-4722-b691-9121176e3de2
📒 Files selected for processing (1)
src/config/schema.rs
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config/schema.rs`:
- Around line 554-572: The Debug implementation for TranscriptionConfig
currently prints api_url verbatim; update the impl of std::fmt::Debug for
TranscriptionConfig (the fmt method) to redact or sanitize the api_url field
before logging (similar to api_key). Replace the direct .field("api_url",
&self.api_url) with a masked representation (e.g., None if empty, or a string
that strips credentials and query params or shows only the scheme+host or
"<redacted_url>") so no embedded credentials or tokenized query params are
emitted; keep the rest of the fields unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 68417b91-09be-4ac4-b9ff-06eca09cd876
📒 Files selected for processing (1)
src/config/schema.rs
impl std::fmt::Debug for TranscriptionConfig {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("TranscriptionConfig")
            .field("enabled", &self.enabled)
            .field(
                "api_key",
                &if self.api_key.is_some() {
                    Some("<redacted>")
                } else {
                    None::<&str>
                },
            )
            .field("api_url", &self.api_url)
            .field("model", &self.model)
            .field("language", &self.language)
            .field("max_duration_secs", &self.max_duration_secs)
            .finish()
    }
}
Redact or sanitize api_url in Debug output.
api_key is redacted, but api_url is still logged verbatim. URLs can contain embedded credentials or tokenized query params, which risks secret leakage in logs.
🔒 Proposed fix
 impl std::fmt::Debug for TranscriptionConfig {
     fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
         f.debug_struct("TranscriptionConfig")
             .field("enabled", &self.enabled)
             .field(
                 "api_key",
                 &if self.api_key.is_some() {
                     Some("<redacted>")
                 } else {
                     None::<&str>
                 },
             )
-            .field("api_url", &self.api_url)
+            .field("api_url_configured", &!self.api_url.trim().is_empty())
             .field("model", &self.model)
             .field("language", &self.language)
             .field("max_duration_secs", &self.max_duration_secs)
             .finish()
     }
 }
Based on learnings: "Never log secrets, raw tokens, or sensitive payloads".
- Replace fragile contains("mistral.ai") with proper URL host parsing
via is_mistral_host() using reqwest::Url
- Add api_key field to TranscriptionConfig for explicit key configuration
- Enrich TranscriptionConfig docs with defaults, compatibility, migration
- Add 8 new unit tests: Mistral/Groq key resolution, whitespace
filtering, URL host detection, and spoofed-path rejection
Add decrypt_optional_secret and encrypt_optional_secret calls for config.transcription.api_key in Config::load_or_init and Config::save, matching the pattern used by other sensitive credential fields.
138240a to 5b05f2f
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/channels/transcription.rs (1)
172-241: ⚠️ Potential issue | 🟡 Minor
Add env synchronization guard to prevent test race condition.
Multiple tests call `std::env::remove_var()` without synchronization. Since `cargo test` runs tests in parallel by default, concurrent env mutations can interfere with each other's state, leading to flaky results.
The codebase already uses `env_override_lock()` in `src/config/schema.rs` for exactly this purpose (40+ tests rely on it). Wrap each env-dependent test with this existing pattern:
    #[tokio::test]
    async fn rejects_missing_api_key() {
        let _env_guard = env_override_lock().await; // Add this
        std::env::remove_var("GROQ_API_KEY");
        // ... rest of test
    }
First, export `env_override_lock` from `src/config/schema.rs` or define a local equivalent in the transcription test module. This ensures deterministic test behavior per coding guidelines.
♻️ Duplicate comments (1)
src/config/schema.rs (1)
603-621: ⚠️ Potential issue | 🟠 Major
Sanitize `api_url` in `Debug` output to avoid secret leakage.
At Line 615, `api_url` is logged verbatim. URLs can carry embedded credentials or tokenized query params.
🔒 Proposed fix
 impl std::fmt::Debug for TranscriptionConfig {
     fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
         f.debug_struct("TranscriptionConfig")
             .field("enabled", &self.enabled)
             .field(
                 "api_key",
                 &if self.api_key.is_some() {
                     Some("<redacted>")
                 } else {
                     None::<&str>
                 },
             )
-            .field("api_url", &self.api_url)
+            .field("api_url_configured", &!self.api_url.trim().is_empty())
             .field("model", &self.model)
             .field("language", &self.language)
             .field("max_duration_secs", &self.max_duration_secs)
             .finish()
     }
 }
Based on learnings: "Never log secrets, raw tokens, or sensitive payloads."
🧹 Nitpick comments (1)
src/channels/transcription.rs (1)
102-104: Error message could be more precise about provider-specific env vars.
The message suggests setting either `MISTRAL_API_KEY` or `GROQ_API_KEY`, but the env-var fallback is provider-specific: only `MISTRAL_API_KEY` is checked for Mistral URLs, and only `GROQ_API_KEY` for others. A user with a Mistral URL who sets `GROQ_API_KEY` will still see this error.
Proposed fix for clearer error messaging
-    .context(
-        "Missing transcription API key: set [transcription].api_key, MISTRAL_API_KEY, or GROQ_API_KEY environment variable",
-    )?;
+    .with_context(|| {
+        let env_hint = if mistral { "MISTRAL_API_KEY" } else { "GROQ_API_KEY" };
+        format!(
+            "Missing transcription API key: set [transcription].api_key or {env_hint} environment variable"
+        )
+    })?;
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/channels/transcription.rs`:
- Around line 172-241: Tests mutate environment variables concurrently causing
races; wrap each env-dependent test (e.g., rejects_missing_api_key,
uses_config_api_key_without_groq_env, mistral_url_falls_back_to_mistral_env_key,
whitespace_only_api_key_is_rejected) with the existing env_override_lock() guard
to serialize env changes, and ensure env_override_lock is imported/exported into
the transcription test module (or provide a local equivalent) so
transcribe_audio and TranscriptionConfig-based tests use let _env_guard =
env_override_lock().await before calling std::env::remove_var or modifying
config.
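A "local equivalent" of the guard mentioned above might look like the sketch below. It is an assumption, not the codebase's `env_override_lock()`: the names `ENV_TEST_LOCK` and `env_test_lock` are hypothetical, and it uses a synchronous `std::sync::Mutex`. Since the existing helper is awaited, the real one is presumably an async-aware lock, which would be the better fit for `#[tokio::test]` functions that hold the guard across `.await` points.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical process-wide lock that env-mutating tests hold for their
/// whole duration, so their environment changes cannot interleave.
static ENV_TEST_LOCK: Mutex<()> = Mutex::new(());

/// Acquire the lock. A test that panicked while holding it poisons the
/// mutex; the guarded value is `()`, so recovering the guard is harmless.
fn env_test_lock() -> MutexGuard<'static, ()> {
    ENV_TEST_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

A synchronous test would start with `let _env_guard = env_test_lock();` and keep the binding alive until the end of the function.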
---
Duplicate comments:
In `@src/config/schema.rs`:
- Around line 603-621: The Debug impl for TranscriptionConfig currently prints
api_url verbatim in fmt (impl std::fmt::Debug for TranscriptionConfig -> fn
fmt), which can leak credentials in userinfo or query params; change the api_url
field rendering to a sanitized representation: parse the stored api_url (e.g.,
with url::Url) and emit only non-sensitive parts (scheme + host + optional path)
or a fixed placeholder like "<redacted>" when userinfo or query/fragment exist
or parsing fails, similar to how api_key is redacted, ensuring the field name
"api_url" is present but its value never contains credentials or query tokens.
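The sanitized rendering described here can be sketched with the standard library alone. The comment suggests `url::Url`; this version avoids that dependency, so `sanitize_url_for_debug` is a hypothetical helper that assumes a simple `scheme://authority/path` shape and errs toward the placeholder whenever the input looks unusual.

```rust
const REDACTED: &str = "<redacted>";

/// Hypothetical helper: render a URL for logs without userinfo, query, or
/// fragment. Anything that does not look like `scheme://host...` is replaced
/// by a fixed placeholder.
fn sanitize_url_for_debug(api_url: &str) -> String {
    let (scheme, rest) = match api_url.trim().split_once("://") {
        Some(parts) => parts,
        None => return REDACTED.to_string(),
    };
    // Cut off the query string and fragment, which may carry tokens.
    let without_query = rest
        .split(|c: char| c == '?' || c == '#')
        .next()
        .unwrap_or("");
    // Split the authority from the path at the first '/'.
    let (authority, path) = match without_query.find('/') {
        Some(idx) => without_query.split_at(idx),
        None => (without_query, ""),
    };
    // Drop any `user:pass@` prefix from the authority.
    let host = authority.rsplit('@').next().unwrap_or("");
    // Be conservative: an '@' left in the path hints at malformed userinfo.
    if scheme.is_empty() || host.is_empty() || path.contains('@') {
        return REDACTED.to_string();
    }
    format!("{scheme}://{host}{path}")
}
```

Inside the `Debug` impl this would be used as `.field("api_url", &sanitize_url_for_debug(&self.api_url))`, keeping the field name while dropping credentials and query tokens.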
---
Nitpick comments:
In `@src/channels/transcription.rs`:
- Around line 102-104: Update the error context message to be provider-aware:
instead of suggesting both MISTRAL_API_KEY and GROQ_API_KEY unconditionally in
the .context(...) call, detect the transcription provider (e.g., inspect the
transcription URL or the branch that handles Mistral vs others) and produce a
provider-specific message that tells users which env var to set (for Mistral
URLs recommend MISTRAL_API_KEY, otherwise recommend GROQ_API_KEY). Locate the
.context(...) invocation that builds the "Missing transcription API key..."
message and change it to provide conditional text based on the provider
detection logic used elsewhere in this module (same code path that checks for
Mistral URLs), ensuring the error references the correct env var for the
selected provider.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c6f78bee-76c9-4846-a885-be05ca824fd2
📒 Files selected for processing (2)
src/channels/transcription.rs
src/config/schema.rs
@willsarg Hi! Why was this changed from dev to master, please? Thanks
Summary
- `dev`
- `voxtral-mini-2507` (via `voxtral-mini-latest`) offers a competitive, zero-friction alternative transcription API with an endpoint-compatible multipart request format.
- `src/channels/transcription.rs` now infers the proxy key (`transcription.mistral` vs `transcription.groq`) and env-var fallback (`MISTRAL_API_KEY` vs `GROQ_API_KEY`) dynamically from the configured `api_url`.
- `src/config/schema.rs` adds `transcription.mistral` to the supported proxy service keys and updates documentation comments.
- `whisper-large-v3-turbo`. No new dependencies. No breaking changes to existing config.
Label Snapshot (required)
- `risk: low`
- `size: XS`
- `channel`, `config`
- `channel: telegram`
Change Metadata
- `feature`
- `channel`
Linked Issue
Supersede Attribution (required when `Supersedes #` is used)
N/A
Validation Evidence (required)
Security Impact (required)
- `MISTRAL_API_KEY` env var is now checked as a fallback when the configured `api_url` points to `mistral.ai`
Privacy and Data Hygiene (required)
- `pass`
Compatibility / Migration
- `MISTRAL_API_KEY` env var now recognized as a fallback (additive only)
i18n Follow-Through (required when docs or user-facing wording changes)
Human Verification (required)
- `api_key` explicitly set in config takes priority over all env vars regardless of endpoint.
- `MISTRAL_API_KEY`).
Side Effects / Blast Radius (required)
- `channels/transcription.rs` and `config/schema.rs` only.
Agent Collaboration Notes (recommended)
- `AGENTS.md` + `CONTRIBUTING.md`.
Rollback Plan (required)
- `src/channels/transcription.rs` and `src/config/schema.rs`; no DB or config migrations needed.
- `api_url` are completely unaffected.
- `MISTRAL_API_KEY` is unset with a Mistral endpoint configured (clear error message in logs).
Risks and Mitigations
Summary by CodeRabbit