fix(silero): bump onnxruntime-node to 1.24.3 to fix libc++abi mutex abort (#1375) #1377

Merged
toubatbrian merged 1 commit into livekit:main from sgzrov:fix/silero-onnxruntime-1.24-mutex-abort
May 4, 2026

Conversation

Contributor

@sgzrov sgzrov commented May 1, 2026

Summary

Closes #1375.

@livekit/agents-plugin-silero pins onnxruntime-node to 1.21.0 exactly. That version (and 1.21.x..1.23.x) has a static-destructor race that aborts the Node process on shutdown when LiveKit's tokio runtime threads are still alive at exit — a 100%-reproducible crash on macOS arm64 with a libc++abi mutex error.

The bug was fixed upstream in onnxruntime-node@1.24.1. This PR bumps the pin (and the matching onnxruntime-common devDep) to 1.24.3.

Bisect

Same minimal repro, only changing the onnxruntime-node version:

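| onnxruntime-node                  | Result                          |
|-----------------------------------|---------------------------------|
| 1.20.1                            | clean exit                      |
| 1.21.0                            | crash (silero's pinned version) |
| 1.21.1 / 1.22.0 / 1.23.0 / 1.23.2 | crash                           |
| 1.24.1                            | clean exit (fix landed here)    |
| 1.24.2 / 1.24.3 / 1.25.1          | clean exit                      |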
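
For anyone who wants to check which installed versions fall inside the crashing range, the bisect results can be encoded as a small version check. This is a minimal sketch, not code from the PR: `parseVersion`, `compareVersions`, and `isAffected` are hypothetical names, and it only handles plain `x.y.z` strings. The bisect did not test 1.24.0, so the sketch conservatively treats everything from 1.21.0 up to (but not including) 1.24.1 as affected; that choice is an assumption.

```typescript
// Hypothetical helper, not part of the PR: classify an onnxruntime-node
// version against the bisect results.
interface Version {
  major: number;
  minor: number;
  patch: number;
}

// Parse a plain "x.y.z" string; pre-release tags and ranges are not handled.
function parseVersion(v: string): Version {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v.trim());
  if (!m) throw new Error(`not a plain x.y.z version: ${v}`);
  return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) };
}

// Returns -1, 0, or 1, comparing major, then minor, then patch.
function compareVersions(a: Version, b: Version): number {
  if (a.major !== b.major) return a.major < b.major ? -1 : 1;
  if (a.minor !== b.minor) return a.minor < b.minor ? -1 : 1;
  if (a.patch !== b.patch) return a.patch < b.patch ? -1 : 1;
  return 0;
}

// Bounds from the bisect: first crashing version 1.21.0, first fixed 1.24.1.
// Assumption: the untested 1.24.0 is treated as affected.
const FIRST_BAD: Version = { major: 1, minor: 21, patch: 0 };
const FIRST_FIXED: Version = { major: 1, minor: 24, patch: 1 };

function isAffected(version: string): boolean {
  const v = parseVersion(version);
  return compareVersions(v, FIRST_BAD) >= 0 && compareVersions(v, FIRST_FIXED) < 0;
}
```

Calling `isAffected` with the version string from an installed `onnxruntime-node` package would indicate whether that install sits in the range the bisect found to crash.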

Native stack trace at the abort

frame #8:  libonnxruntime.1.21.0.dylib`__clang_call_terminate + 12
frame #9:  libonnxruntime.1.21.0.dylib`std::__1::unique_ptr<OrtEnv,
              std::__1::default_delete<OrtEnv>>::~unique_ptr() + 84
frame #10: libsystem_c.dylib`__cxa_finalize_ranges + 416
frame #11: libsystem_c.dylib`exit + 44
frame #12: node`node::Exit(node::ExitCode) + 12
frame #15: node`Builtins_CallApiCallbackGeneric + 184  ← process.exit(0) from JS

10 tokio-rt-worker threads from rtc-node.darwin-arm64.node are still alive at the moment of abort (parked in __psynch_cvwait/kevent). Full lldb trace and bisection in #1375.

Test plan

  • pnpm install resolves cleanly (lockfile regenerated, only the silero workspace is affected)
  • pnpm --filter @livekit/agents build succeeds
  • pnpm --filter @livekit/agents-plugin-silero build succeeds
  • Minimal repro from #1375 (silero.VAD.load() + process.exit(0)) exits cleanly with status 0 against the new version
  • pnpm -w format:write and pnpm -w lint:fix produce no new warnings (existing 120 warnings in agents/src/voice/generation_tools.test.ts are unrelated)
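
The repro check in the test plan (run the script, then look at how the process ended) could be automated by classifying the child's exit result. The sketch below is illustrative only: `ExitResult`, `Verdict`, and `classifyExit` are hypothetical names that do not appear in the PR, and it assumes the libc++abi abort surfaces as SIGABRT (or as status 134 when a shell reports 128 + 6).

```typescript
// Hypothetical harness helper; these names do not appear in the PR.
interface ExitResult {
  status: number | null; // exit code, or null if the process was killed by a signal
  signal: string | null; // e.g. "SIGABRT", or null on a normal exit
}

type Verdict = 'clean exit' | 'crash' | 'other failure';

function classifyExit(result: ExitResult): Verdict {
  // Normal termination with status 0 is the outcome the test plan expects.
  if (result.signal === null && result.status === 0) return 'clean exit';
  // Assumption: the abort shows up as SIGABRT, or as 134 (128 + 6) via a shell.
  if (result.signal === 'SIGABRT' || result.status === 134) return 'crash';
  // Anything else (non-zero exit, other signals) is unrelated to this bug.
  return 'other failure';
}
```

The object returned by `spawnSync` from `node:child_process` carries `status` and `signal` fields of this shape, so the result of running the repro script in a child process could be passed to `classifyExit` directly.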

onnxruntime-node 1.21 → 1.24 is a minor bump within the same major. The exposed API surface used by silero/src/onnx_model.ts (InferenceSession.create, Tensor, run-options) is unchanged across this range, and the silero plugin's own build + types pass.

…bort on shutdown (macOS arm64)

Loading the Silero VAD plugin and exiting Node aborts with
"libc++abi: terminating due to uncaught exception of type
std::__1::system_error: mutex lock failed" on macOS arm64.

The abort fires inside ~unique_ptr<OrtEnv> in
libonnxruntime.1.21.0.dylib's static destructor while LiveKit's
tokio runtime threads are still alive — a static-destructor race
present in onnxruntime-node 1.21.0..1.23.2 and fixed upstream in
1.24.1.

Bumping the pin to 1.24.3 (and matching onnxruntime-common) resolves
the crash. Verified via the minimal repro in livekit#1375.

changeset-bot Bot commented May 1, 2026

🦋 Changeset detected

Latest commit: c609294

The changes in this PR will be included in the next version bump.


Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 1 additional finding.

Contributor Author

sgzrov commented May 1, 2026

Upstream root cause confirmed.

The bisect range matches the documented upstream regression and fix:

  • Upstream issue: microsoft/onnxruntime#24579, "mutex issue at process exit on MacOS since v1.21." It has the identical error string, the identical regression at 1.21, and the identical destructor-order diagnosis.
  • Upstream fix: microsoft/onnxruntime#26445, "[node] Fix logging mutex crash at exit on macOS", merged 2025-10-31, included in onnxruntime-node 1.24.1. Description: "Now we don't let the destructor of OrtEnv to be called if the program exits unexpectedly." The files touched are exactly the OrtEnv lifecycle: js/node/src/ort_singleton_data.{cc,h} and inference_session_wrap.cc.

That matches the lldb trace in #1375 (frame ~unique_ptr<OrtEnv> in libonnxruntime.1.21.0.dylib) precisely. Bumping silero's onnxruntime-node pin from 1.21.0 → 1.24.3 picks up that upstream fix.

@toubatbrian toubatbrian merged commit 90a2b2b into livekit:main May 4, 2026
6 checks passed
@github-actions github-actions Bot mentioned this pull request May 4, 2026

Development

Successfully merging this pull request may close these issues.

silero.VAD.load() causes libc++abi mutex abort on process exit (macOS arm64)

2 participants