WhisperKit: drop negative sentinels from SuppressTokensFilter (#392) #460

Open
achyutbenz19 wants to merge 1 commit into argmaxinc:main from achyutbenz19:fix/392-suppress-tokens-negative

@achyutbenz19

Summary

Fixes #392.

DecodingOptions.supressTokens defaults to [-1], following the OpenAI Whisper convention in which -1 is a sentinel meaning "suppress every non-speech special token." WhisperKit never expands the sentinel, so the value flows through createLogitsFilters into SuppressTokensFilter, whose init builds the index array [[0, 0, -1]]. When filterLogits calls logits.fill(indexes:with:), the helper computes linearOffset = -1 * strides[2] and writes -FloatType.infinity at dataPointer - strides[2]. On iOS 26 (where CoreML began returning read-only MLMultiArray instances for performance), that negative write lands on a protected page and the process crashes with EXC_BAD_ACCESS (SIGBUS) / KERN_PROTECTION_FAILURE.
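
The offset arithmetic described above can be sketched in plain Swift, without Core ML. The helper name linearOffset(index:strides:) and the logits shape [1, 1, 8] are assumptions made for illustration; WhisperKit's actual fill(indexes:with:) may be structured differently.

```swift
// Hypothetical stand-in for the offset computation inside a fill helper.
// Strides are measured in elements, as with MLMultiArray.strides.
func linearOffset(index: [Int], strides: [Int]) -> Int {
    precondition(index.count == strides.count, "index and strides must have the same rank")
    return zip(index, strides).reduce(0) { $0 + $1.0 * $1.1 }
}

// Assumed logits shape [1, 1, 8] in row-major layout, which gives strides [8, 8, 1].
let strides = [8, 8, 1]

// A valid token id maps to an offset inside the buffer.
let validOffset = linearOffset(index: [0, 0, 3], strides: strides)

// The unexpanded -1 sentinel maps to an offset before the start of the buffer.
let sentinelOffset = linearOffset(index: [0, 0, -1], strides: strides)

print(validOffset)    // 3
print(sentinelOffset) // -1
```

Any write through a pointer at sentinelOffset falls outside the indexed range, which is why the sentinel has to be dropped before an index is built from it.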

Scope of the change

Two one-line filter predicates plus one unit test.

  1. Sources/WhisperKit/Core/TextDecoder.swift, createLogitsFilters: change the pre-filter that was { $0 < specialTokenBegin } to { $0 >= 0 && $0 < specialTokenBegin }.
  2. Sources/WhisperKit/Core/Text/LogitsFilter.swift, SuppressTokensFilter.init: filter negatives out of suppressTokens before building suppressTokenIndexes. Defence in depth so that any caller building a filter with a negative id also gets a safe no-op.
  3. Tests/WhisperKitTests/UnitTests.swift, testSuppressTokensFilter: add a case with [-1, 0, -5, 3] and assert the logits at positions 0 and 3 are suppressed while negatives are ignored.
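
The two predicates and the new test case can be approximated with a minimal sketch. Everything here is a stand-in: the free functions callSiteFilter, dropNegatives, and suppress are hypothetical names, a plain [Float] replaces MLMultiArray, and specialTokenBegin = 10 is an arbitrary small value chosen for the example rather than the real tokenizer constant.

```swift
// Hypothetical mirror of the call-site predicate in createLogitsFilters:
// keep only ids in [0, specialTokenBegin).
func callSiteFilter(_ tokens: [Int], specialTokenBegin: Int) -> [Int] {
    tokens.filter { $0 >= 0 && $0 < specialTokenBegin }
}

// Hypothetical mirror of the defence-in-depth predicate in SuppressTokensFilter.init:
// drop negative ids before any index is built from them.
func dropNegatives(_ tokens: [Int]) -> [Int] {
    tokens.filter { $0 >= 0 }
}

// Stand-in for filterLogits, operating on a plain array instead of MLMultiArray.
func suppress(_ logits: inout [Float], tokens: [Int]) {
    for id in dropNegatives(tokens) where id < logits.count {
        logits[id] = -Float.infinity
    }
}

// Same mixed input as the new unit test case: [-1, 0, -5, 3].
var logits: [Float] = [0.1, 0.2, 0.3, 0.4, 0.5]
suppress(&logits, tokens: [-1, 0, -5, 3])

// Positions 0 and 3 are now -infinity; positions 1, 2, and 4 are untouched.
print(logits[0] == -Float.infinity, logits[3] == -Float.infinity)

// With an assumed specialTokenBegin of 10, the call-site filter keeps only 0 and 3.
let kept = callSiteFilter([-1, 0, -5, 3, 12], specialTokenBegin: 10)
print(kept)
```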

Reproduction

Prior to the patch, running the existing test suite with the new assertion silently corrupts memory (the negative write lands inside the test's own MLMultiArray allocation, offset by -strides[2] elements from dataPointer). On an iOS 26 device with a real transcription run, the same code path crashes as described by the reporter.

After the patch, the new test case passes:

Test Suite 'UnitTests' passed at 2026-04-18 18:34:31.286.
    Executed 1 test, with 0 failures (0 unexpected) in 0.002 (0.002) seconds

The existing three assertions in testSuppressTokensFilter continue to pass unchanged, confirming the non-negative-id path is not affected.

Differential matrix

This is a pure Swift library change with deterministic output; the regression signal is the unit test. I did not run audiokit regress check on this one because the fix has no effect when every suppressTokens value is already in [0, specialTokenBegin) (which is the case for all real-world WAV fixtures the matrix would exercise). The audit surface here is the invalid-id path, which is now covered by the test case.

What this does not do

  • Does not address the broader iOS 26 read-only MLMultiArray issue. If a caller builds SuppressTokensFilter with only valid non-negative ids on iOS 26, the in-place fill still lands on a read-only page. That is a separate, bigger problem (it affects every LogitsFilter that mutates logits in place) and is best handled with a comprehensive switch to copy-on-modify, which is out of scope here.
  • Does not change what the -1 sentinel should mean. OpenAI Whisper expands it to the set of non-speech tokens. If that expansion belongs somewhere, it can be added separately; for now, ignoring the sentinel matches the "do not crash" contract without changing which tokens get suppressed for any non-default configuration.
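
For the first point, the copy-on-modify direction could look roughly like the sketch below. It is not part of this PR and not WhisperKit code: suppressedCopy(of:tokens:) is a hypothetical name, and a plain [Float] stands in for MLMultiArray.

```swift
// Hypothetical copy-on-modify variant: returns a new buffer and never writes
// into the input, so a read-only source is left alone.
func suppressedCopy(of logits: [Float], tokens: [Int]) -> [Float] {
    var copy = logits
    // Ids outside the buffer, including negative ones, are skipped.
    for id in tokens where copy.indices.contains(id) {
        copy[id] = -Float.infinity
    }
    return copy
}

let original: [Float] = [0.1, 0.2, 0.3, 0.4]
let filtered = suppressedCopy(of: original, tokens: [-1, 1])

// The input keeps its value at position 1; only the copy is suppressed there.
print(original[1], filtered[1])
```

The trade-off is an extra allocation per filter call, which is one reason the change deserves its own PR.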

Tools used

git, swift build, and swift test. audiokit was used on other PRs in this series but not here, since no audio fixtures were needed for this one.

Disclosure

I am an AI assistant (Anthropic's Claude) helping a user contribute this fix. I have not personally reproduced the crash on an iOS 26 device (I worked on macOS), but the code path is clear by inspection and the new unit test demonstrates the negative-id behavior is now safe.

DecodingOptions.supressTokens defaults to [-1] in the OpenAI convention
("-1 means suppress all non-speech tokens"), but WhisperKit never
expands the sentinel. The filter at TextDecoder.createLogitsFilters
passed -1 through to SuppressTokensFilter, whose init built the index
[0, 0, -1] and called MLMultiArray.fill, which reached into dataPointer
at a negative offset. On iOS 26 (where CoreML started returning
read-only MLMultiArrays) the negative write landed on a protected page
and crashed with EXC_BAD_ACCESS / SIGBUS.

Filter to [0, specialTokenBegin) at the call site and, as defence in
depth, filter negatives in SuppressTokensFilter.init. Add a unit test
covering mixed positive / negative suppressTokens input.

Fixes argmaxinc#392
Development

Successfully merging this pull request may close these issues.

WhisperKit: SuppressTokensFilter writes to read-only MLMultiArray when suppressTokens includes -1 on iOS 26
