Symptom
After the napi-rs v3 migration (PR #1385), the edr_napi mocha suite crashes during process teardown on macos-15:
46 passing (...)
SIGBUS / SIGSEGV (run-to-run alternation)
The crash fires after the last test prints, inside Node's atexit cleanup — well past mocha's success line. EDR #590 reports the same crash on Linux with a stack trace pointing at napi-rs's ThreadsafeFunction finalizer; macOS-15 is just a more visible expression of the same race.
Root cause
The Node bug is nodejs/node#55706: a use-after-free (UAF) on ThreadsafeFunction references during env teardown. It is fixed only in Node 25 (nodejs/node#55877, commit 350b0ea895), with no LTS backport.
The napi-rs v3 migration did not introduce the bug — but it dropped v2's explicit tsfn.unref(env) escape hatch in favor of weak::<true>(), removing the workaround that was masking it. The race becomes visible against any ThreadsafeFunction (TSFN) dropped during atexit, which is every TSFN the EDR provider holds (subscriptions, console-log decode, print-line, coverage, gas-report, call-override, suite-completion).
Fix
Add an explicit async Provider.close() that runs the cleanup cascade — IntervalMiner::Drop → Runtime::shutdown_background() → TSFN release — on the JS event loop, before atexit runs. Hardhat's network manager (and EDR's mocha suite) call it in afterEach / shutdown.
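A minimal sketch of the calling pattern, not the actual implementation. `FakeProvider`, its `steps` log, and the `withProvider` helper are hypothetical stand-ins invented for illustration, and making `close()` idempotent is an assumption rather than something this PR guarantees. The sketch only shows the intent: teardown awaits `close()` on the event loop, even when the test body throws.

```typescript
// Hypothetical stand-in for the native provider; the real class lives in edr_napi.
class FakeProvider {
  private closed = false;

  // Labels for the cleanup cascade, recorded in order. Illustration only:
  // nothing is actually stopped or released here.
  readonly steps: string[] = [];

  async close(): Promise<void> {
    // Assumption: a second close() is a no-op.
    if (this.closed) return;
    this.closed = true;
    this.steps.push("drop interval miner");
    this.steps.push("shut down background runtime");
    this.steps.push("release threadsafe functions");
  }

  get isClosed(): boolean {
    return this.closed;
  }
}

// Runs `body` against a fresh provider and always awaits close() afterwards,
// which is the job an afterEach hook would do in a mocha suite.
async function withProvider<T>(
  body: (provider: FakeProvider) => Promise<T>,
): Promise<T> {
  const provider = new FakeProvider();
  try {
    return await body(provider);
  } finally {
    // Runs on the JS event loop, well before process exit.
    await provider.close();
  }
}
```

In a mocha suite the same effect would come from `afterEach(async () => { await provider.close(); })`, since mocha waits for the promise returned by an async hook.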
Background
Full investigation (Node bug timeline, why macOS-15, why EDR's tokio runtime amplifies the race, alternatives considered, and the design of Provider.close()):
edr-provider-close-investigation.md (on the fix branch)
Related
macos-15 to keep CI green pending this fix