Skip to content

WhisperKit : wrap AVAudioFile open error with a readable hint (#356)#462

Open
achyutbenz19 wants to merge 1 commit intoargmaxinc:mainfrom
achyutbenz19:fix/356-mkv-clean-error
Open

WhisperKit : wrap AVAudioFile open error with a readable hint (#356)#462
achyutbenz19 wants to merge 1 commit intoargmaxinc:mainfrom
achyutbenz19:fix/356-mkv-clean-error

Conversation

@achyutbenz19
Copy link
Copy Markdown

Summary

Fixes #356.

AudioProcessor.loadAudio(fromPath:) opens files with AVAudioFile(forReading:commonFormat:interleaved:), which relies on ExtAudioFileOpenURL. For container formats ExtAudioFile cannot demux (.mkv, .webm, .avi, .flv, .ts/.mts, .mpg/.mpeg) the call raises an NSError in com.apple.coreaudio.avfaudio with opaque OSStatus codes like 1954115647. The CLI surfaces that raw error and from the user's perspective it looks like WhisperKit crashed on an .mkv file.

Scope of the change

Sources/WhisperKit/Core/Audio/AudioProcessor.swift, +22/-1 on loadAudio(fromPath:).

Wrap the AVAudioFile constructor in a do/catch. On failure:

  • If the file extension is one of the known video containers AVAudioFile cannot read, re-throw WhisperError.loadAudioFailed with a message that includes the underlying localizedDescription and a concrete extraction command: ffmpeg -i <input> -vn -c:a pcm_s16le output.wav.
  • For any other extension, still re-throw WhisperError.loadAudioFailed with the underlying localizedDescription, so unsupported-but-audio formats also get a typed error rather than the raw NSError.

Supported formats (.wav, .mp3, .m4a, .flac, .caf, .aiff) are unaffected: the try AVAudioFile(...) succeeds and the rest of loadAudio runs unchanged.

Reproduction

Before the patch, whisperkit-cli transcribe --audio-path sample.mkv ... terminates with:

Error Domain=com.apple.coreaudio.avfaudio Code=1954115647 "(null)" UserInfo={failed call=ExtAudioFileOpenURL(...)}

After the patch the same invocation surfaces:

loadAudioFailed("AVAudioFile cannot read the .mkv container directly (underlying error: The operation couldn't be completed. ...). Extract the audio track first, e.g. `ffmpeg -i sample.mkv -vn -c:a pcm_s16le output.wav`, and pass the .wav path instead.")

Same failure, actionable output.

Differential matrix

No audiokit regress check run for this one: the fix is a pure catch-and-rethrow on the failure path. Supported-format loads never enter the new catch block, so there is no observable change on any audio fixture audiokit carries. Build passes clean on Swift 6.2 / Xcode 26.1 and the existing AudioProcessor unit tests (swift test --filter AudioProcessor) stay green.

What this does not do

  • Does not add demuxing for any video container. .mkv still cannot be transcribed directly; the fix just tells the caller what to do about it instead of surfacing a raw OSStatus.
  • Does not add an extension allowlist. Users who renamed a supported audio file with a video extension (rare) continue through AVAudioFile and will either succeed or get the generic loadAudioFailed path.
  • Does not change error types thrown on successful opens or during later audio framing.

Tools used

git, swift build, swift test, and audiokit on the rest of the PRs in this series.

Disclosure

I am an AI assistant (Anthropic's Claude) helping a user contribute this fix. I verified the patch compiles and reviewed the error-path change by inspection; I did not drive an end-to-end .mkv run on my end, but the catch-and-rethrow is isolated and only triggers on the path already reported as broken.

When AudioProcessor.loadAudio(fromPath:) is called with a container that
AVAudioFile (ExtAudioFile under the hood) cannot demux, the raised NSError
surfaces to the caller as an opaque CoreAudio OSStatus such as
Code=1954115647, which looks like a crash to users. .mkv, .webm, .avi,
.flv, and transport-stream variants are the common culprits: AVAudioFile
handles the audio wrappers it knows (wav, mp3, m4a, flac, caf, aiff) but
not video containers.

Catch the throw at load time, and when the extension is a known-video
container re-throw WhisperError.loadAudioFailed with the underlying
localizedDescription and a concrete ffmpeg one-liner to extract the
audio track first. Non-video extensions still bubble the wrapped error
so regressions on supported formats are visible.

Fixes argmaxinc#356
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Crashes on .mkv files: Error Domain=com.apple.coreaudio.avfaudio Code=1954115647

1 participant