[objective_c] Fix GC safepoint crash in ObjCProtocolBuilder.implementMethod#3307
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
2e8601e to
daf049c
Compare
PR HealthChangelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs. This check can be disabled by tagging the PR with Breaking changes ✔️
This check can be disabled by tagging the PR with API leaks ✔️The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
This check can be disabled by tagging the PR with |
|
Thanks for the PR @henawey-t. The first thing I'm interested in is the regression test. I'm still confused about what's going on with this bug, and I'd really like to be able to reproduce the failure locally. If this regression test is a simple repro of the error, that alone would be extremely helpful. But when I clone the test into my local main branch, it passes (after I comment out the compile time I have some ideas for how to fix this bug without making |
I was able to reproduce it, but I have to enforce Dart GC from the Objc before calling the actual implementation of I documented everything in the |
0a6ec2f to
f77d876
Compare
|
Great work on the test. Looks like it was pretty fiddly to implement, but it works. I can repro now, so I'll try some alternative fixes. |
|
Played around with some options. Here's a simple fix that doesn't require changing the block inheritance heirarchy: void implementMethod(
Pointer<r.ObjCSelector> sel,
Pointer<Char> signature,
Pointer<Void> trampoline,
ObjCBlockBase block,
) {
if (_built) {
throw StateError('Protocol is already built');
}
final blockRef = block.ref; // Store the ref in a local.
_builder.implementMethod(
sel,
withBlock: blockRef.pointer.cast(),
withTrampoline: trampoline,
withSignature: signature,
);
}We should probably do a quick scan of the repo to see if there's anywhere else that we're doing |
|
I did the changes as recommended and updated the tests. However, the repo scan is done. Running One important distinction though: the Since |
0a4c2c5 to
9aab841
Compare
Figures. I'll fix FFIgen myself in a follow up PR. Did you find any locations using this pattern outside of generated code? Did you want to try to land this PR, or just show me the repro? If you want to land it, start by removing REPRODUCTION.md, and reduce the amount of documentation in finalizable_test.dart and gc_inject.m. It's too verbose atm. A short comment of every non-obvious function, and maybe comment on any particularly obscure lines of code is sufficient. Also, the CHANGELOG entry is a bit too long. The first sentence is good, but the second is unnecessary. If people want to know the details of the fix they can click through on the bug link. |
No unsafe manual usages outside generated code — all 24 sites were in
I'm pushing a commit for these changes. Let me know if anything is unclear. |
ab92e7e to
91b4fa8
Compare
91b4fa8 to
b669edc
Compare
437ee0a to
387795b
Compare
36a7038 to
437e9eb
Compare
…Method Fixes dart-lang#3209 When implementMethod extracted a raw pointer via block.ref.pointer.cast() and passed it across the FFI boundary, the Dart compiler considered `block` dead at that point. A GC safepoint inside the native call could then fire before ObjC had a chance to retain the block pointer, triggering objc_release through the finalizer chain and causing EXC_BAD_ACCESS. Fix: add `implements Finalizable` to `_ObjCRefHolder`, which propagates to ObjCObject and ObjCBlockBase via inheritance. The Dart compiler now treats any local variable of those types as reachable until end of scope, keeping the block alive across the safepoint. Validated in production: crashes dropped from ~1.1K/week to zero after applying this fix via dependency_override on objective_c 9.1.0.
Adds runtime tests to complement the compile-time type assertions for issue dart-lang#3209. Includes: - Runtime isA<Finalizable>() check on NSObject instances - GC pressure test verifying a protocol object's retain count survives doGC() calls (exercises the retain-count machinery around build()) Note: the exact race (GC at a safepoint inside an FFI call) cannot be triggered with doGC(), which runs between Dart instructions. The compile-time assertions in the same file remain the primary regression guard.
…lder) The initial fix added implements Finalizable to _ObjCRefHolder, which propagated to ObjCObject. But Finalizable objects are non-sendable across Dart isolates — this broke NSInputStream tests that use Isolate.run() to pass ObjCObject instances across isolate boundaries. ObjCBlockBase is safe to annotate because blocks capture Dart closures and are never sent across isolates. ObjCObject intentionally stays sendable. Updated finalizable_test accordingly: removed ObjCObject compile-time assertion, added a runtime check that verifies ObjCObject is NOT Finalizable (preserving its isolate-sendability invariant).
…dart-lang#3209 Adds test infrastructure that deterministically reproduces the EXC_BAD_ACCESS crash described in issue dart-lang#3209, where a Dart GC event during an FFI safepoint collects an ObjCBlockBase wrapper before ObjC retains the raw pointer. gc_inject.m (new) ObjC method swizzle for -[DOBJCDartProtocolBuilder implementMethod:withBlock:withTrampoline:withSignature:]. The swizzle calls Dart_ExecuteInternalCommand("gc-now") while the Dart thread is blocked at an FFI safepoint — exactly the window where the bug fires. It then reads the block retain count from the flags field; a count of 0 means the block was freed before ObjC could retain it. hook/build.dart Compiles gc_inject.m into the test dylib on macOS alongside util.c. test/util.dart Dart @Native bindings for the gc_inject.m entry points. callGCNowFromNative and gcAndGetRetainCount intentionally omit isLeaf:true — they call Dart_ExecuteInternalCommand, which requires the Dart thread to be at a proper native-mode safepoint. test/finalizable_test.dart Extends the existing type assertions with: - Compile-time guard (_checkObjCBlockBaseIsFinalizable): dart analyze catches the regression before tests run. - Swizzle test: 1000 iterations, GC injected before ObjC retain; fails if wasBlockFreedBeforeRetain() returns true. - Direct liveness probe (_gcAndCheckBlock): never-inlined function with a local ObjCBlockBase, WeakReference, and a non-leaf FFI call that forces gc-now; sensitive to JIT liveness analysis. - Diagnostic test confirming gc-now works from native code. test/desymbolize.py (new) Script to symbolize AOT crash dumps using atos. Auto-detects the ASLR slide from Dart_UnloadMachODylib annotations (Dart VM format) with a fallback to the (in binary_name) format (macOS crash reporter). Auto-detects binary architecture via lipo. Only sends Unknown symbol frames to atos; already-annotated frames pass through unchanged. test/REPRODUCTION.md (new) Step-by-step guide to reproduce the crash (remove fix, AOT compile, run, crash) and verify the fix (restore, recompile, all pass). Explains the GC injection mechanism, the swizzle call chain, and how to interpret the desymbolized crash output. To reproduce the crash: # Remove `implements Finalizable` from ObjCBlockBase in internal.dart # Also comment out _checkObjCBlockBaseIsFinalizable in finalizable_test.dart dart compile exe test/finalizable_test.dart -o /tmp/ft/finalizable_test DYLD_INSERT_LIBRARIES=.dart_tool/lib/objective_c.dylib /tmp/ft/finalizable_test # EXC_BAD_ACCESS in objc_retain, called from implementMethod Fixes: dart-lang#3209
…approach Apply the Flutter team's recommended fix for the GC-at-FFI-safepoint crash: instead of adding `implements Finalizable` to ObjCBlockBase, extract `block.ref` into a local `blockRef` (type ObjCBlockRef, which transitively implements Finalizable via _ObjCReference) before each FFI call site. The Finalizable contract then keeps `blockRef` live across the non-leaf safepoint, preventing EXC_BAD_ACCESS before ObjC retains the pointer. Update finalizable_test.dart to match: - Remove the isA<Finalizable>() type assertion for ObjCBlockBase - Update _gcAndCheckBlock to use blockRef extraction (mirrors the fix) - Replace bare return with markTestSkipped in canDoGC guards - Return bool from _gcAndCheckBlock instead of int - Update group/test names and comments throughout
…e assets >=3.10.0 in AOT repro
- Remove moot ObjCObject Finalizable test (sendability covered by isolate_test) - Fix 'protocol object survives GC after build': add expect(obj, isNotNull) to keep obj live through doGC() calls; JIT optimizer drops it from the GC stack map after raw pointer extraction, causing premature collection - Rename group to 'block wrapper not freed at GC safepoints' - Remove unused dart:ffi import and issue dart-lang#3209 comment from file header
Extract obj.ref (ObjCObjectRef, which is Finalizable) into a local variable. The Finalizable contract keeps ref in the GC stack map for its entire scope, preventing the ObjC object from being released at GC safepoints inside objectRetainCount (non-leaf FFI calls via isValidClass/calloc). The previous expect(obj, isNotNull) approach was insufficient: in the async state machine, obj is considered dead before the first await, so the optimizer drops it from the GC stack map at safepoints in the initial synchronous segment, causing the first retain count check to see 0.
Replace the ad-hoc BlockHeader flags bit manipulation in gc_inject.m and gcAndGetRetainCount with the existing util.c functions: - gc_inject_imp: use getBlockRetainCount (extern from util.c) instead of raw (flags & 0xFFFF) >> 1 inline - _gcAndCheckBlock: split into callGCNowFromNative() + blockRetainCount(), which reads the block ABI flags field correctly on all architectures - Remove gcAndGetRetainCount (now unused) and the BlockHeader struct
The async test was unreliable on ARM64: Dart's Finalizable guarantee does not extend across await suspension points in async state machines. Variables not referenced after an await may be pruned from the continuation closure, allowing GC to release the ObjC object. Restructure into a @pragma('vm:never-inline') synchronous helper, mirroring _gcAndCheckBlock, where ref is a true stack local with reliable Finalizable semantics.
437e9eb to
47d6a58
Compare
@liamappelbe done |
47d6a58 to
fe08d03
Compare
…e redundant object test The 'protocol object survives GC after build' test kept failing on CI (ARM64) because doGC() is inlinable — after inlining, the optimizer dropped ref from the GC root set before the GC trigger, causing a spurious retain-count-0. The test didn't cover the issue dart-lang#3209 fix; general ObjCObjectRef lifecycle is already covered by autorelease_test, nsarray_test, nsset_test, nsdictionary_test, ns_input_stream_test, and observer_test. Renamed to protocol_builder_test.dart to match the <feature>_test.dart convention.
fe08d03 to
4d3343f
Compare
|
Great, the PR is green now. @liamappelbe let me know if anything is needed from my side to merge the PR. |
liamappelbe
left a comment
There was a problem hiding this comment.
Looks good, but you'll have a merge conflict with #3325. You're adding a test-only util, so it should be gated behind the includeTestUtils flag, just like test/util.c
|
Actually it's a small enough merge that I can do it myself through github's UI |
Fixes #3209
Root cause
In
ObjCProtocolBuilder.implementMethod, the call toblock.ref.pointer.cast()extracts a raw pointer and passes it across the FFI boundary. After that expression the Dart compiler considersblockdead — no more Dart-level references. A GC safepoint inside the native call can fire before ObjC has had a chance toretainthe block pointer, triggeringobjc_releasethrough the finalizer chain and producingEXC_BAD_ACCESSatprotocol.m:33.Why the existing
Finalizableon_ObjCReference/ObjCBlockRefdoesn't help: Dart'sFinalizableguarantee only protects local variables whose static type is a subtype ofFinalizable. The parameterblockhas static typeObjCBlockBase, which was not in theFinalizablehierarchy. The transient field accessblock.refis never stored in a local variable of aFinalizabletype, so noreachabilityFencewas inserted for any part of the chain.Fix
Add
implements FinalizabletoObjCBlockBasein_ObjCRefHolder's subclass chain:The Dart compiler now keeps
blockalive until the end ofimplementMethod's scope — including across the native safepoint — giving ObjC time toretainthe block pointer before any finalizer can fire.Why
ObjCObjectis intentionally excluded:ObjCObjectinstances can be sent across Dart isolates (e.g.NSInputStreamusesIsolate.run).Finalizableobjects are non-sendable; adding it toObjCObjectwould break that. Blocks capture Dart closures and are never sent across isolates, soFinalizableis safe forObjCBlockBaseonly.Production validation
Applied this fix (as
implements FinalizableonObjCBlockBase) to a forkedobjective_c 9.1.0viadependency_override. Crashlytics confirmed crashes dropped from ~1.1K events/week to zero in the app version that previously exhibited the crash. No new crashes or memory regressions observed.Testing
finalizable_test.dart): a static type check thatdart analyzeenforces — ifObjCBlockBaseever losesimplements Finalizablea type error is emitted immediately.expect(obj, isNot(isA<Finalizable>()))onNSObjectverifiesObjCObjectremains non-Finalizable(preserving isolate sendability).doGC(), and verifies the retain count remains non-zero.doGC(), which runs between Dart instructions rather than at FFI safepoints. The compile-time assertion is the primary regression guard.Pre-existing test failures (unrelated)
Two tests fail identically on
mainbefore any of these changes:nsdate_test: NSDate from DateTime— date format locale differencensdictionary/nsarray_test: ref counting— flaky GC-timing interference between test files🤖 Generated with Claude Code