fix: always use default_input_config to avoid shared-mode stream rejection #1163

Open
dan-vernon wants to merge 2 commits into cjpais:main from
dan-vernon:fix/shared-mode-audio-stream-config

@dan-vernon dan-vernon commented Mar 27, 2026

Summary

  • Replaces get_preferred_config()'s format-selection logic with a direct call to default_input_config()
  • Fixes the consistent "Failed to build input stream: The requested stream configuration is not supported by the device" error on macOS and Windows

Problem

get_preferred_config() iterated supported_input_configs() and picked the highest-scored sample format (preferring F32 over I16). However, supported_input_configs() enumerates exclusive-mode formats on both CoreAudio (macOS) and WASAPI (Windows). Since cpal opens streams in shared mode, the OS requires the stream to use the mixer's native format — which is what default_input_config() returns. Passing a different format caused build_input_stream to fail immediately on every attempt.
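The mismatch described above can be sketched with mock types. Everything in this block (`MockDevice`, `StreamConfig`, `pick_by_preference`, `pick_default`, `example_device`) is a hypothetical stand-in for illustration, not cpal's API and not the code in this PR. It assumes a device whose enumeration advertises F32 while its shared-mode mixer runs at I16.

```rust
// Hypothetical stand-ins for illustration only; these are NOT cpal's types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SampleFormat {
    F32,
    I16,
    I32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct StreamConfig {
    format: SampleFormat,
    sample_rate: u32,
    channels: u16,
}

/// Mock device: what enumeration advertises vs. what the mixer actually uses.
struct MockDevice {
    enumerated: Vec<StreamConfig>,
    mixer_default: StreamConfig,
}

impl MockDevice {
    /// Simulates a shared-mode open: only the mixer's format is accepted.
    fn build_input_stream(&self, config: StreamConfig) -> Result<(), String> {
        if config == self.mixer_default {
            Ok(())
        } else {
            Err("The requested stream configuration is not supported by the device".to_string())
        }
    }
}

/// Old approach as described: score enumerated formats, preferring F32.
fn pick_by_preference(device: &MockDevice) -> Option<StreamConfig> {
    fn score(format: SampleFormat) -> u8 {
        match format {
            SampleFormat::F32 => 3,
            SampleFormat::I16 => 2,
            SampleFormat::I32 => 1,
        }
    }
    device.enumerated.iter().copied().max_by_key(|c| score(c.format))
}

/// New approach: defer to whatever the device reports as its default.
fn pick_default(device: &MockDevice) -> StreamConfig {
    device.mixer_default
}

/// Assumed example: enumeration lists F32, but the mixer format is I16.
fn example_device() -> MockDevice {
    let cfg = |format| StreamConfig { format, sample_rate: 48_000, channels: 1 };
    MockDevice {
        enumerated: vec![
            cfg(SampleFormat::I32),
            cfg(SampleFormat::I16),
            cfg(SampleFormat::F32),
        ],
        mixer_default: cfg(SampleFormat::I16),
    }
}
```

With `example_device()`, the preference-based pick selects F32 and the mock open fails, while the default-based pick opens successfully.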

This root cause was identified by @VirenMohindra in #990. The existing FrameResampler in run_consumer() already handles downsampling to the 16 kHz rate required by Whisper/Parakeet, so no other changes are needed.
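To illustrate why opening at the device's native rate is safe, here is a rough sketch of downsampling to 16 kHz after capture. It is a naive linear-interpolation stand-in with no anti-aliasing filter, not the project's FrameResampler; the function name `resample_linear` is an assumption made for this example.

```rust
/// Naive linear-interpolation resampler (illustrative only).
/// A production resampler would low-pass filter before decimating.
fn resample_linear(input: &[f32], in_rate: u32, out_rate: u32) -> Vec<f32> {
    if input.is_empty() || in_rate == 0 || out_rate == 0 {
        return Vec::new();
    }
    // Number of output samples covering the same duration as the input.
    let out_len = (input.len() as u64 * out_rate as u64 / in_rate as u64) as usize;
    // Distance, in input samples, between consecutive output samples.
    let step = in_rate as f64 / out_rate as f64;
    let last = input.len() - 1;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = (pos.floor() as usize).min(last);
            let frac = (pos - idx as f64) as f32;
            let a = input[idx];
            let b = input[(idx + 1).min(last)];
            // Interpolate between the two neighbouring input samples.
            a + (b - a) * frac
        })
        .collect()
}
```

For example, 48 samples captured at 48 kHz become 16 samples at 16 kHz, regardless of which sample rate the device's default config reported.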

Testing

  • Tested locally on macOS 26 with built-in mic
  • Tested with standard transcription and post-processing modes
  • rustfmt --check passes
  • Existing unit tests unaffected (they test string matching helpers, not device config)

Related issues

Closes #990

AI assistance

This PR was written with the help of Claude Code. It was used extensively — log analysis, root cause diagnosis, code change, and PR writeup.

…ction

CoreAudio (macOS) and WASAPI (Windows) only accept streams in the OS
mixer's native shared-mode format. The previous get_preferred_config()
tried to find a higher-quality format (preferring F32 over I16) by
iterating supported_input_configs(), but those enumerate exclusive-mode
formats that both backends reject with "stream configuration not
supported" when opening in shared mode.

Fixes the consistent recording failure on macOS and Windows where
build_input_stream fails immediately. Resolves the same class of issue
as cjpais#990.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Some devices (notably Bluetooth HFP on macOS, e.g. Sony MDR-1000X)
report F32 in default_input_config() but CoreAudio rejects that format
when the stream is actually opened — only I16 works for the HFP
shared-mode stream. Pre-clone the sender and stop-flag (both cheap
ref-count bumps) so we can retry with I16 before giving up.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@VirenMohindra
Contributor

what was the reasoning behind closing this PR - did the fix not work?

@dan-vernon
Author

> what was the reasoning behind closing this PR - did the fix not work?

Hey @VirenMohindra - the same error reoccurred when I subsequently tested the fix with a Bluetooth input device, so I need to isolate this from a possible macOS Bluetooth / CoreAudio issue before re-opening. Thx

@dan-vernon
Author

Reopening with an additional I16 retry if stream creation still fails.

Looking through git blame, the old implementation came from two separate changes. First, an explicit format preference (F32 > I16 > I32) was added when the recorder was still trying to open directly at 16 kHz. Later, the recorder switched to the device’s default sample rate and relied on run_consumer() to downsample, but kept that format-preference heuristic. That combination turns out not to be robust.

The root issue is that the old get_preferred_config() walked supported_input_configs() and preferred F32 over I16. That is not robust with the current cpal backend: enumeration can advertise F32 while default_input_config() reports the actual device format as I16. That mismatch can cause build_input_stream() to fail immediately.

Using default_input_config() fixes the reproduced macOS failure by deferring to the device's native/default format. The existing FrameResampler still handles conversion to the 16 kHz rate required by Whisper/Parakeet, so downstream processing is unchanged.

The new I16 retry is the additional change in this reopening. It is a defensive fallback for cases where the default format is still rejected at stream-open time. It improves robustness, but it is not meant as a claim that every backend or device-state issue is now covered.
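A minimal sketch of the retry shape, under stated assumptions: `MockStream`, `open_stream`, `make_callback`, and `open_with_i16_fallback` are hypothetical names, and the mock callback receives f32 samples regardless of format for brevity. It is not the code in this PR. The point it shows is why the sender and stop flag are cloned up front: the first attempt moves them into its callback, so the retry needs its own handles.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::Sender;
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SampleFormat {
    F32,
    I16,
}

/// Hypothetical stand-in for an opened input stream.
struct MockStream {
    format: SampleFormat,
    callback: Box<dyn FnMut(&[f32]) + Send>,
}

impl MockStream {
    /// Simulates the backend delivering one buffer of samples.
    fn feed(&mut self, samples: &[f32]) {
        (self.callback)(samples);
    }
}

/// Hypothetical open call: succeeds only for the format the device accepts.
/// It takes ownership of the callback, as a real stream builder would.
fn open_stream(
    accepted: SampleFormat,
    requested: SampleFormat,
    callback: Box<dyn FnMut(&[f32]) + Send>,
) -> Result<MockStream, String> {
    if requested == accepted {
        Ok(MockStream { format: requested, callback })
    } else {
        Err(format!("{requested:?} rejected at stream-open time"))
    }
}

/// Builds a callback that forwards samples until the stop flag is set.
fn make_callback(tx: Sender<Vec<f32>>, stop: Arc<AtomicBool>) -> Box<dyn FnMut(&[f32]) + Send> {
    Box::new(move |samples: &[f32]| {
        if !stop.load(Ordering::Relaxed) {
            let _ = tx.send(samples.to_vec());
        }
    })
}

fn open_with_i16_fallback(
    accepted: SampleFormat,
    default_format: SampleFormat,
    tx: Sender<Vec<f32>>,
    stop: Arc<AtomicBool>,
) -> Result<MockStream, String> {
    // Pre-clone (cheap ref-count bumps): the first attempt consumes tx and stop.
    let retry_tx = tx.clone();
    let retry_stop = Arc::clone(&stop);

    match open_stream(accepted, default_format, make_callback(tx, stop)) {
        Ok(stream) => Ok(stream),
        // Retry once with I16 if the default format was something else.
        Err(first_err) if default_format != SampleFormat::I16 => {
            open_stream(accepted, SampleFormat::I16, make_callback(retry_tx, retry_stop))
                .map_err(|retry_err| format!("{first_err}; I16 retry also failed: {retry_err}"))
        }
        Err(err) => Err(err),
    }
}
```

When the default is already I16, the sketch returns the original error instead of retrying the same format a second time.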

For Windows, to be precise: this is still the safer, more API-aligned behaviour for shared-mode capture, but the exact failure mode reproduced here was on macOS; the change has not been tested on WASAPI.

Validation so far:

  • tested with the built-in mic and normal transcription/post-processing flows on macOS
  • confirmed effective during a window where a Bluetooth device was actively in HFP mode and the old code was failing
  • consistent HFP reproduction is difficult to force on demand because it depends on external Bluetooth/VOIP/OS state
  • cargo test --manifest-path src-tauri/Cargo.toml --lib passes with 62 tests
  • cargo fmt --manifest-path src-tauri/Cargo.toml --check passes

Overall assessment: strong confidence for the observed macOS bug, lower risk because this mostly removes custom config-selection logic, but still limited by the lack of automated hardware-level coverage.

@dan-vernon dan-vernon reopened this Mar 27, 2026
@VirenMohindra
Contributor

thanks for the comment and the context -- I'm just not too sure about this one. we added custom code for a reason in #1084

always a fan of removing cruft, but we set this up for a reason to fix a number of other issues

what do you think @cjpais

@@ -282,56 +303,16 @@ impl AudioRecorder {
fn get_preferred_config(
Owner

this is effectively a dead function now

@cjpais
Owner

cjpais commented Mar 28, 2026

@VirenMohindra I will need more time with this one, or we need to get a test build out to everyone with this

I am generally in favor of using default configs always if possible. I do suspect it will help avoid some issues, but not sure if it will cause others. We just need to make sure Handy can handle the cases for the new configs it might be thrown
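As a rough illustration of what handling new configs could involve, the sketch below normalizes integer samples to f32 and downmixes interleaved frames to mono. The helper names (`i16_to_f32`, `i32_to_f32`, `downmix_to_mono`) are assumptions for this example and are not taken from Handy's code.

```rust
/// Maps a signed 16-bit sample to the range [-1.0, 1.0).
fn i16_to_f32(sample: i16) -> f32 {
    sample as f32 / 32_768.0
}

/// Maps a signed 32-bit sample to the range [-1.0, 1.0).
/// The division is done in f64 to limit rounding before narrowing.
fn i32_to_f32(sample: i32) -> f32 {
    (sample as f64 / 2_147_483_648.0) as f32
}

/// Averages each interleaved frame down to a single mono sample.
/// Trailing samples that do not form a full frame are dropped.
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    if channels == 0 {
        return Vec::new();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}
```

With helpers like these, whatever format and channel count the default config reports can be reduced to mono f32 before resampling.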

@github-actions

🧪 Test Build Ready

Build artifacts for PR #1163 are available for testing.

Download artifacts from workflow run

Artifacts expire after 30 days.

Development

Successfully merging this pull request may close these issues.

[BUG] Windows 11: "backend-specific error"