fix(bindings): refcount Pool to fix teardown panic#352
fix(bindings): refcount Pool to fix teardown panic#352
Conversation
NAPI env cleanup hooks fire before module-level JS holders of native objects (e.g. `const seedState = createFromBytes(...)`) get finalized. The previous unconditional `pool.state.deinit()` in cleanup freed the Pool's MultiArrayList while live `BeaconStateView` instances still held references. When their finalizers later ran, the chained `pool.unref()` walked freed memory and panicked with "incorrect alignment" — process exited 134 / SIGABRT. Wraps `Node.Pool` in the existing `RefCount(T)`. Module init holds ref=1; each `BeaconStateView` that owns a `cached_state` holds another. `pool.state.deinit()` releases the module ref but only actually destroys the pool when the count reaches zero, deferring teardown past every `BeaconStateView` finalizer. - `state_transition/root.zig`: export `RefCount` - `bindings/napi/pool.zig`: wrap pool, expose `pool()` and `poolRc()` - `bindings/napi/BeaconStateView.zig`: hold a `pool_rc` alongside `cached_state`, ref on each constructor that builds one, unref in `deinit` after `cached_state.deinit` finishes its `pool.unref` calls. Adds `bindings/test/teardown.test.ts`: spawns a subprocess that creates a module-scope `BeaconStateView` and exits, expects exit 0 and no "panic:" on stderr. Reproduces the bug on main; passes with this fix.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical teardown panic occurring in native bindings when Highlights
New Features🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces reference counting for the shared Node.Pool used by BeaconStateView to prevent 'incorrect alignment' panics during NAPI environment cleanup at process exit. The pool.zig module now wraps the pool in a RefCount structure, and BeaconStateView instances hold a strong reference to this pool for their lifetime. A new teardown test ensures that module-scope views exit cleanly. Feedback focuses on adhering to the repository's style guide regarding proper acronym capitalization (e.g., renaming PoolRc to PoolRC) and increasing assertion density to verify that the pool is initialized before access.
|
|
||
| const default_pool_size: u32 = 0; | ||
|
|
||
| pub const PoolRc = RefCount(Node.Pool); |
There was a problem hiding this comment.
According to the style guide (Rule 224), acronyms should use proper capitalization. Since RC stands for Reference Count, it should be capitalized as PoolRC.
pub const PoolRC = RefCount(Node.Pool);
References
- Use proper capitalization for acronyms (VSRState, not VsrState). (link)
| pub const State = struct { | ||
| pool: Node.Pool = undefined, | ||
| initialized: bool = false, | ||
| pool_rc: ?*PoolRc = null, |
| if (self.pool_rc != null) return; | ||
| var pool_value = try Node.Pool.init(allocator, default_pool_size); | ||
| errdefer pool_value.deinit(); | ||
| self.pool_rc = try PoolRc.init(allocator, pool_value); |
| pub fn pool(self: *State) *Node.Pool { | ||
| return &self.pool_rc.?.instance; | ||
| } |
There was a problem hiding this comment.
The style guide (Rule 51) requires asserting function arguments and invariants. Adding an explicit assertion here improves safety and follows the 'fail-fast' principle.
pub fn pool(self: *State) *Node.Pool {
std.debug.assert(self.pool_rc != null);
return &self.pool_rc.?.instance;
}
References
- Assert all function arguments and return values, pre/postconditions and invariants. (link)
| pub fn poolRc(self: *State) *PoolRc { | ||
| return self.pool_rc.?; | ||
| } |
There was a problem hiding this comment.
| /// `cached_state`. Released in `deinit` AFTER `cached_state.deinit` finishes | ||
| /// its `pool.unref` calls, so the pool stays alive even if the module's | ||
| /// NAPI env cleanup hook fired before this view's JS finalizer. | ||
| pool_rc: ?*pool.PoolRc = null, |
| const slot_value = fork_types.readSlotFromAnyBeaconStateBytes(byte_slice); | ||
| const fork_seq = config.state.config.forkSeq(slot_value); | ||
| state.* = try AnyBeaconState.deserialize(allocator, &pool.state.pool, fork_seq, byte_slice); | ||
| state.* = try AnyBeaconState.deserialize(allocator, pool.state.pool(), fork_seq, byte_slice); |
There was a problem hiding this comment.
| return .{ .cached_state = cached_state }; | ||
| return .{ | ||
| .cached_state = cached_state, | ||
| .pool_rc = pool.state.poolRc().ref(), |
|
|
||
| try st.processSlots(allocator, napi_io.get(), post_state, slot_value, .{}); | ||
| return .{ .cached_state = post_state }; | ||
| return .{ |
| return .{ .cached_state = post_state }; | ||
| return .{ | ||
| .cached_state = post_state, | ||
| .pool_rc = pool.state.poolRc().ref(), |
Motivation
Reproduces on
mainand every binding-feature branch (#165, #346, #351): any process that holds aBeaconStateViewat module scope panics during process exit:Process exits with code 134 (SIGABRT) instead of 0.
Root cause: NAPI env cleanup hook (
napi_add_env_cleanup_hook) fires before module-level JS holders of native objects are finalized. The previouscleanupcallback unconditionally calledpool.state.deinit(), freeing theNode.Pool'sMultiArrayListstorage. LiveBeaconStateViewinstances rooted by module-levelconsthad not yet been finalized; when their finalizers eventually ran, the chainedpool.unref(self.root)walked freed memory and panicked.Module-scope is the standard pattern for benchmark fixtures (
@chainsafe/benchmark+ perf tests). Without the fix, every such test exits non-zero even when the bench itself succeeds.Description
Wrap
Node.Poolin the existingRefCount(T)(src/state_transition/utils/ref_count.zig):ref = 1.BeaconStateViewthat owns acached_stateholds another ref.pool.state.deinit()releases the module ref but only actually destroys the pool when the count reaches zero — deferring destruction past every live view's finalizer.Order in
BeaconStateView.deinit:cached_state.deinit()— does all thepool.unref(...)callspool_rc.unref()— releases this view's pool ref; if last, realpool.deinitruns here