[objective_c] Fix GC safepoint crash in ObjCProtocolBuilder.implementMethod #3307

Merged
liamappelbe merged 15 commits into dart-lang:main from Henawey:DOBJCDartProtocolBuilder-crash-fix on Apr 28, 2026

@henawey-t
Contributor

Fixes #3209

Root cause

In ObjCProtocolBuilder.implementMethod, the call to block.ref.pointer.cast() extracts a raw pointer and passes it across the FFI boundary. After that expression the Dart compiler considers block dead — no more Dart-level references. A GC safepoint inside the native call can fire before ObjC has had a chance to retain the block pointer, triggering objc_release through the finalizer chain and producing EXC_BAD_ACCESS at protocol.m:33.

Why the existing Finalizable on _ObjCReference / ObjCBlockRef doesn't help: Dart's Finalizable guarantee only protects local variables whose static type is a subtype of Finalizable. The parameter block has static type ObjCBlockBase, which was not in the Finalizable hierarchy. The transient field access block.ref is never stored in a local variable of a Finalizable type, so no reachabilityFence was inserted for any part of the chain.
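
The difference between the two access shapes can be sketched in plain Dart. This is only an illustration under stated assumptions, not the package's code: FakeRef, FakeHolder, and nativeCall are hypothetical stand-ins for ObjCBlockRef, ObjCBlockBase, and a non-leaf FFI call. Only Finalizable and Pointer come from dart:ffi.

```dart
import 'dart:ffi';

// Hypothetical stand-in for ObjCBlockRef: its static type is Finalizable.
class FakeRef implements Finalizable {
  FakeRef(this.pointer);
  final Pointer<Void> pointer;
}

// Hypothetical stand-in for ObjCBlockBase: it is NOT Finalizable itself.
class FakeHolder {
  FakeHolder(this.ref);
  final FakeRef ref;
}

// Stand-in for a non-leaf FFI call; a real one could hit a GC safepoint.
int nativeCall(Pointer<Void> p) => p.address;

// Risky shape: no local variable has a Finalizable static type, so nothing
// obliges the compiler to keep `holder` or its ref alive during the call.
int inlineAccess(FakeHolder holder) => nativeCall(holder.ref.pointer);

// Safer shape: `ref` is a local whose static type is Finalizable, so it is
// kept alive until the end of this scope, including across the call.
int localRef(FakeHolder holder) {
  final ref = holder.ref;
  return nativeCall(ref.pointer);
}

void main() {
  final holder = FakeHolder(FakeRef(Pointer.fromAddress(0x10)));
  // Both shapes compute the same value; they differ only in liveness.
  print(inlineAccess(holder) == localRef(holder));
}
```

In this toy version nothing can actually be collected, so both functions behave identically. The point is only which local variables carry a Finalizable static type.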

Fix

Add implements Finalizable to ObjCBlockBase in _ObjCRefHolder's subclass chain:

class ObjCBlockBase extends _ObjCRefHolder<c.ObjCBlockImpl, ObjCBlockRef>
    implements Finalizable {

The Dart compiler now keeps block alive until the end of implementMethod's scope — including across the native safepoint — giving ObjC time to retain the block pointer before any finalizer can fire.

Why ObjCObject is intentionally excluded: ObjCObject instances can be sent across Dart isolates (e.g. NSInputStream uses Isolate.run). Finalizable objects are non-sendable; adding it to ObjCObject would break that. Blocks capture Dart closures and are never sent across isolates, so Finalizable is safe for ObjCBlockBase only.

Production validation

Applied this fix (as implements Finalizable on ObjCBlockBase) to a forked objective_c 9.1.0 via dependency_override. Crashlytics confirmed crashes dropped from ~1.1K events/week to zero in the app version that previously exhibited the crash. No new crashes or memory regressions observed.

Testing

  • Compile-time assertion (finalizable_test.dart): a static type check that dart analyze enforces — if ObjCBlockBase ever loses implements Finalizable a type error is emitted immediately.
  • Runtime assertion: expect(obj, isNot(isA<Finalizable>())) on NSObject verifies ObjCObject remains non-Finalizable (preserving isolate sendability).
  • GC pressure test: builds a protocol object, forces doGC(), and verifies the retain count remains non-zero.
  • The exact race (GC at a safepoint inside an FFI call) cannot be triggered with doGC(), which runs between Dart instructions rather than at FFI safepoints. The compile-time assertion is the primary regression guard.

Pre-existing test failures (unrelated)

Two tests fail identically on main before any of these changes:

  • nsdate_test: NSDate from DateTime — date format locale difference
  • nsdictionary/nsarray_test: ref counting — flaky GC-timing interference between test files

🤖 Generated with Claude Code

@google-cla

google-cla Bot commented Apr 15, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@github-actions

github-actions Bot commented Apr 16, 2026

PR Health

Changelog Entry ✔️
Package Changed Files

Changes to files need to be accounted for in their respective changelogs.

This check can be disabled by tagging the PR with skip-changelog-check.

Breaking changes ✔️
Package | Change | Current Version | New Version | Needed Version | Looking good?
objective_c | Breaking | 9.3.0 | 9.4.0-wip | 9.4.0-wip | ✔️

This check can be disabled by tagging the PR with skip-breaking-check.

API leaks ✔️

The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.

Package Leaked API symbol Leaking sources

This check can be disabled by tagging the PR with skip-leaking-check.

@liamappelbe
Contributor

Thanks for the PR @henawey-t.

The first thing I'm interested in is the regression test. I'm still confused about what's going on with this bug, and I'd really like to be able to reproduce the failure locally. If this regression test is a simple repro of the error, that alone would be extremely helpful.

But when I clone the test into my local main branch, it passes (after I comment out the compile time Finalizable check), even when I add a for loop to run it 1000 times. I think the first step is to focus on trying to repro the failure, even if it's a 0.1% flake. For example, if it's a threading issue, we have other tests that call into native code, spawn multiple threads, and even poke at the Dart embedder APIs. Take a look at the pkgs/ffigen/test/native_objc_test directory for examples (a good example is callOnSameThreadOutsideIsolate in block_test.m)

I have some ideas for how to fix this bug without making ObjCBlockBase finalizable, but I can't try them out without a repro.

@henawey-t
Contributor Author

henawey-t commented Apr 16, 2026

> Thanks for the PR @henawey-t.
>
> The first thing I'm interested in is the regression test. I'm still confused about what's going on with this bug, and I'd really like to be able to reproduce the failure locally. If this regression test is a simple repro of the error, that alone would be extremely helpful.
>
> But when I clone the test into my local main branch, it passes (after I comment out the compile time Finalizable check), even when I add a for loop to run it 1000 times. I think the first step is to focus on trying to repro the failure, even if it's a 0.1% flake. For example, if it's a threading issue, we have other tests that call into native code, spawn multiple threads, and even poke at the Dart embedder APIs. Take a look at the pkgs/ffigen/test/native_objc_test directory for examples (a good example is callOnSameThreadOutsideIsolate in block_test.m)
>
> I have some ideas for how to fix this bug without making ObjCBlockBase finalizable, but I can't try them out without a repro.

I was able to reproduce it, but only by forcing a Dart GC from the ObjC side before calling the actual implementation of -[DOBJCDartProtocolBuilder implementMethod:withBlock:...]. That produces the same crash we are seeing. I force the GC at the very start of the swizzled method, before calling the original ObjC method.

I documented everything in test/REPRODUCTION.md; let me know if anything needs clarification.

@henawey-t henawey-t force-pushed the DOBJCDartProtocolBuilder-crash-fix branch from 0a6ec2f to f77d876 Compare April 16, 2026 16:49
@liamappelbe
Contributor

Great work on the test. Looks like it was pretty fiddly to implement, but it works. I can repro now, so I'll try some alternative fixes.

@liamappelbe
Contributor

Played around with some options. Here's a simple fix that doesn't require changing the block inheritance hierarchy:

  void implementMethod(
    Pointer<r.ObjCSelector> sel,
    Pointer<Char> signature,
    Pointer<Void> trampoline,
    ObjCBlockBase block,
  ) {
    if (_built) {
      throw StateError('Protocol is already built');
    }
    final blockRef = block.ref;  // Store the ref in a local.
    _builder.implementMethod(
      sel,
      withBlock: blockRef.pointer.cast(),
      withTrampoline: trampoline,
      withSignature: signature,
    );
  }

We should probably do a quick scan of the repo to see if there's anywhere else that we're doing block.ref.pointer or object.ref.pointer, and change them to store the .ref in a local.

@henawey-t
Contributor Author

I made the changes as recommended and updated the tests. The repo scan is also done: running dart run ffigen revealed 24 unsafe sites in objective_c_bindings_generated.dart where block parameters have .ref.pointer passed inline to non-leaf FFI calls, for example:

  // NSArray.enumerateObjectsAtIndexes:options:usingBlock:
  _objc_msgSend_a3wp08(
    object$.ref.pointer,
    _sel_enumerateObjectsAtIndexes_options_usingBlock_,
    s.ref.pointer,
    options,
    usingBlock.ref.pointer,  // ← unsafe, same pattern
  );

One important distinction though: the object$.ref.pointer and s.ref.pointer usages are safe. Those variables are NSArray/NSIndexSet, which extend ObjCObject, and ObjCObjectRef transitively implements Finalizable via _ObjCReference, so the VM keeps them live. Only block parameters are at risk because ObjCBlockBase does not implement Finalizable.

Since objective_c_bindings_generated.dart is auto-generated, patching it directly would be overwritten on the next dart run ffigen. The fix should go into the FFIgen generator, whenever a method parameter has type ObjCBlock, emit the .ref extraction pattern instead of inlining .ref.pointer in the argument list.
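
As a rough sketch of that generator-side idea, the snippet below emits a call in which every block parameter has its .ref hoisted into a local first. It is not FFIgen's actual code: Param and emitCall are hypothetical names, and only object and block parameters are modelled (selectors and plain values are left out for brevity).

```dart
// Hypothetical parameter model; FFIgen's real classes differ.
class Param {
  const Param(this.name, {this.isBlock = false});
  final String name;
  final bool isBlock;
}

/// Emits a call to [fn], hoisting `.ref` of each block parameter into a
/// local so the generated code never inlines `block.ref.pointer`.
String emitCall(String fn, List<Param> params) {
  final hoisted = StringBuffer();
  final args = <String>[];
  for (final p in params) {
    if (p.isBlock) {
      // Emit `final xRef = x.ref;` and pass `xRef.pointer` instead.
      hoisted.writeln('final ${p.name}Ref = ${p.name}.ref;');
      args.add('${p.name}Ref.pointer');
    } else {
      // Object parameters keep the existing inline form.
      args.add('${p.name}.ref.pointer');
    }
  }
  final argList = args.join(', ');
  return '$hoisted$fn($argList);';
}

void main() {
  print(emitCall('_objc_msgSend_a3wp08', const [
    Param(r'object$'),
    Param('usingBlock', isBlock: true),
  ]));
}
```

Hoisting only block parameters mirrors the distinction drawn above, where object parameters are described as already safe.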

@henawey-t henawey-t force-pushed the DOBJCDartProtocolBuilder-crash-fix branch 2 times, most recently from 0a4c2c5 to 9aab841 Compare April 17, 2026 15:55
@liamappelbe
Contributor

> Since objective_c_bindings_generated.dart is auto-generated, patching it directly would be overwritten on the next dart run ffigen. The fix should go into the FFIgen generator, whenever a method parameter has type ObjCBlock, emit the .ref extraction pattern instead of inlining .ref.pointer in the argument list.

Figures. I'll fix FFIgen myself in a follow up PR. Did you find any locations using this pattern outside of generated code?

Did you want to try to land this PR, or just show me the repro? If you want to land it, start by removing REPRODUCTION.md, and reduce the amount of documentation in finalizable_test.dart and gc_inject.m. It's too verbose atm. A short comment of every non-obvious function, and maybe comment on any particularly obscure lines of code is sufficient. Also, the CHANGELOG entry is a bit too long. The first sentence is good, but the second is unnecessary. If people want to know the details of the fix they can click through on the bug link.

@henawey-t
Contributor Author

> Figures. I'll fix FFIgen myself in a follow up PR. Did you find any locations using this pattern outside of generated code?

No unsafe manual usages outside generated code — all 24 sites were in objective_c_bindings_generated.dart (auto-generated, so FFIgen itself needs fixing).

> Did you want to try to land this PR, or just show me the repro? If you want to land it, start by removing REPRODUCTION.md, and reduce the amount of documentation in finalizable_test.dart and gc_inject.m. It's too verbose atm. A short comment of every non-obvious function, and maybe comment on any particularly obscure lines of code is sufficient. Also, the CHANGELOG entry is a bit too long. The first sentence is good, but the second is unnecessary. If people want to know the details of the fix they can click through on the bug link.

I'm pushing a commit for these changes. Let me know if anything is unclear.

@henawey-t henawey-t force-pushed the DOBJCDartProtocolBuilder-crash-fix branch from ab92e7e to 91b4fa8 Compare April 20, 2026 15:08
Comment thread pkgs/objective_c/test/gc_safepoint_test.dart Outdated
Comment thread pkgs/objective_c/test/gc_safepoint_test.dart Outdated
Comment thread pkgs/objective_c/test/protocol_builder_test.dart
@liamappelbe
Contributor

liamappelbe commented Apr 24, 2026

Formatting errors: https://github.com/dart-lang/native/actions/runs/24822639870/job/72798914910?pr=3307

@henawey-t henawey-t force-pushed the DOBJCDartProtocolBuilder-crash-fix branch 2 times, most recently from 36a7038 to 437e9eb Compare April 24, 2026 04:42
Henawey added 4 commits April 24, 2026 08:43
…Method

Fixes dart-lang#3209

When implementMethod extracted a raw pointer via block.ref.pointer.cast()
and passed it across the FFI boundary, the Dart compiler considered `block`
dead at that point. A GC safepoint inside the native call could then fire
before ObjC had a chance to retain the block pointer, triggering objc_release
through the finalizer chain and causing EXC_BAD_ACCESS.

Fix: add `implements Finalizable` to `_ObjCRefHolder`, which propagates to
ObjCObject and ObjCBlockBase via inheritance. The Dart compiler now treats
any local variable of those types as reachable until end of scope, keeping
the block alive across the safepoint.

Validated in production: crashes dropped from ~1.1K/week to zero after
applying this fix via dependency_override on objective_c 9.1.0.

Adds runtime tests to complement the compile-time type assertions for
issue dart-lang#3209. Includes:
- Runtime isA<Finalizable>() check on NSObject instances
- GC pressure test verifying a protocol object's retain count survives
  doGC() calls (exercises the retain-count machinery around build())

Note: the exact race (GC at a safepoint inside an FFI call) cannot be
triggered with doGC(), which runs between Dart instructions. The
compile-time assertions in the same file remain the primary regression
guard.
…lder)

The initial fix added implements Finalizable to _ObjCRefHolder, which
propagated to ObjCObject. But Finalizable objects are non-sendable across
Dart isolates — this broke NSInputStream tests that use Isolate.run() to
pass ObjCObject instances across isolate boundaries.

ObjCBlockBase is safe to annotate because blocks capture Dart closures and
are never sent across isolates. ObjCObject intentionally stays sendable.

Updated finalizable_test accordingly: removed ObjCObject compile-time
assertion, added a runtime check that verifies ObjCObject is NOT Finalizable
(preserving its isolate-sendability invariant).
…dart-lang#3209

Adds test infrastructure that deterministically reproduces the
EXC_BAD_ACCESS crash described in issue dart-lang#3209, where a Dart GC event
during an FFI safepoint collects an ObjCBlockBase wrapper before ObjC
retains the raw pointer.

gc_inject.m (new)
  ObjC method swizzle for -[DOBJCDartProtocolBuilder
  implementMethod:withBlock:withTrampoline:withSignature:]. The swizzle
  calls Dart_ExecuteInternalCommand("gc-now") while the Dart thread is
  blocked at an FFI safepoint — exactly the window where the bug fires.
  It then reads the block retain count from the flags field; a count of 0
  means the block was freed before ObjC could retain it.

hook/build.dart
  Compiles gc_inject.m into the test dylib on macOS alongside util.c.

test/util.dart
  Dart @Native bindings for the gc_inject.m entry points.
  callGCNowFromNative and gcAndGetRetainCount intentionally omit
  isLeaf:true — they call Dart_ExecuteInternalCommand, which requires
  the Dart thread to be at a proper native-mode safepoint.

test/finalizable_test.dart
  Extends the existing type assertions with:
  - Compile-time guard (_checkObjCBlockBaseIsFinalizable): dart analyze
    catches the regression before tests run.
  - Swizzle test: 1000 iterations, GC injected before ObjC retain;
    fails if wasBlockFreedBeforeRetain() returns true.
  - Direct liveness probe (_gcAndCheckBlock): never-inlined function
    with a local ObjCBlockBase, WeakReference, and a non-leaf FFI call
    that forces gc-now; sensitive to JIT liveness analysis.
  - Diagnostic test confirming gc-now works from native code.

test/desymbolize.py (new)
  Script to symbolize AOT crash dumps using atos. Auto-detects the
  ASLR slide from Dart_UnloadMachODylib annotations (Dart VM format)
  with a fallback to the (in binary_name) format (macOS crash reporter).
  Auto-detects binary architecture via lipo. Only sends Unknown symbol
  frames to atos; already-annotated frames pass through unchanged.

test/REPRODUCTION.md (new)
  Step-by-step guide to reproduce the crash (remove fix, AOT compile,
  run, crash) and verify the fix (restore, recompile, all pass).
  Explains the GC injection mechanism, the swizzle call chain, and
  how to interpret the desymbolized crash output.

To reproduce the crash:
  # Remove `implements Finalizable` from ObjCBlockBase in internal.dart
  # Also comment out _checkObjCBlockBaseIsFinalizable in finalizable_test.dart
  dart compile exe test/finalizable_test.dart -o /tmp/ft/finalizable_test
  DYLD_INSERT_LIBRARIES=.dart_tool/lib/objective_c.dylib /tmp/ft/finalizable_test
  # EXC_BAD_ACCESS in objc_retain, called from implementMethod

Fixes: dart-lang#3209
Henawey added 9 commits April 24, 2026 08:43
…approach

Apply the Flutter team's recommended fix for the GC-at-FFI-safepoint crash:
instead of adding `implements Finalizable` to ObjCBlockBase, extract
`block.ref` into a local `blockRef` (type ObjCBlockRef, which transitively
implements Finalizable via _ObjCReference) before each FFI call site. The
Finalizable contract then keeps `blockRef` live across the non-leaf safepoint,
preventing EXC_BAD_ACCESS before ObjC retains the pointer.

Update finalizable_test.dart to match:
- Remove the isA<Finalizable>() type assertion for ObjCBlockBase
- Update _gcAndCheckBlock to use blockRef extraction (mirrors the fix)
- Replace bare return with markTestSkipped in canDoGC guards
- Return bool from _gcAndCheckBlock instead of int
- Update group/test names and comments throughout
- Remove moot ObjCObject Finalizable test (sendability covered by isolate_test)
- Fix 'protocol object survives GC after build': add expect(obj, isNotNull)
  to keep obj live through doGC() calls; JIT optimizer drops it from the
  GC stack map after raw pointer extraction, causing premature collection
- Rename group to 'block wrapper not freed at GC safepoints'
- Remove unused dart:ffi import and issue dart-lang#3209 comment from file header

Extract obj.ref (ObjCObjectRef, which is Finalizable) into a local variable.
The Finalizable contract keeps ref in the GC stack map for its entire scope,
preventing the ObjC object from being released at GC safepoints inside
objectRetainCount (non-leaf FFI calls via isValidClass/calloc).

The previous expect(obj, isNotNull) approach was insufficient: in the async
state machine, obj is considered dead before the first await, so the optimizer
drops it from the GC stack map at safepoints in the initial synchronous
segment, causing the first retain count check to see 0.

Replace the ad-hoc BlockHeader flags bit manipulation in gc_inject.m and
gcAndGetRetainCount with the existing util.c functions:
- gc_inject_imp: use getBlockRetainCount (extern from util.c) instead of
  raw (flags & 0xFFFF) >> 1 inline
- _gcAndCheckBlock: split into callGCNowFromNative() + blockRetainCount(),
  which reads the block ABI flags field correctly on all architectures
- Remove gcAndGetRetainCount (now unused) and the BlockHeader struct
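
For reference, the inline expression being retired, (flags & 0xFFFF) >> 1, can be written out as a small function. This is a Dart sketch for illustration only: the real code is C/ObjC in gc_inject.m and util.c, retainCountFromFlags is a hypothetical name, and the bit layout described in the comment (bit 0 as a deallocating flag, the count in the bits above it) is an assumption about the block ABI rather than something taken from this PR.

```dart
// Decodes a retain count from a block's flags word using the expression
// quoted in the commit message: keep the low 16 bits, then drop bit 0
// (assumed here to be the "deallocating" flag).
int retainCountFromFlags(int flags) => (flags & 0xFFFF) >> 1;

void main() {
  // Only the low 16 bits matter; higher flag bits are masked away.
  print(retainCountFromFlags(0x0002));
  print(retainCountFromFlags(0x10004));
}
```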

The async test was unreliable on ARM64: Dart's Finalizable guarantee
does not extend across await suspension points in async state machines.
Variables not referenced after an await may be pruned from the
continuation closure, allowing GC to release the ObjC object.

Restructure into a @pragma('vm:never-inline') synchronous helper,
mirroring _gcAndCheckBlock, where ref is a true stack local with
reliable Finalizable semantics.
@Henawey Henawey force-pushed the DOBJCDartProtocolBuilder-crash-fix branch from 437e9eb to 47d6a58 Compare April 24, 2026 04:45
@Henawey Henawey force-pushed the DOBJCDartProtocolBuilder-crash-fix branch from 47d6a58 to fe08d03 Compare April 24, 2026 05:26
…e redundant object test

The 'protocol object survives GC after build' test kept failing on CI (ARM64)
because doGC() is inlinable — after inlining, the optimizer dropped ref from
the GC root set before the GC trigger, causing a spurious retain-count-0.
The test didn't cover the issue dart-lang#3209 fix; general ObjCObjectRef lifecycle is
already covered by autorelease_test, nsarray_test, nsset_test, nsdictionary_test,
ns_input_stream_test, and observer_test.

Renamed to protocol_builder_test.dart to match the <feature>_test.dart convention.
@Henawey Henawey force-pushed the DOBJCDartProtocolBuilder-crash-fix branch from fe08d03 to 4d3343f Compare April 24, 2026 05:28
@henawey-t
Contributor Author

Great, the PR is green now. @liamappelbe let me know if anything is needed from my side to merge the PR.

Contributor

@liamappelbe liamappelbe left a comment

Looks good, but you'll have a merge conflict with #3325. You're adding a test-only util, so it should be gated behind the includeTestUtils flag, just like test/util.c

@liamappelbe
Contributor

Actually it's a small enough merge that I can do it myself through github's UI

@liamappelbe liamappelbe merged commit 8452189 into dart-lang:main Apr 28, 2026
30 checks passed

Development

Successfully merging this pull request may close these issues.

Crash EXC_BAD_ACCESS (KERN_INVALID_ADDRESS) on [DOBJCDartProtocolBuilder implementMethod:withBlock:withTrampoline:withSignature:] + 33 (protocol.m:33)

3 participants