fix(bindings): refcount Pool to fix teardown panic #352

Open
GrapeBaBa wants to merge 5 commits into main from gr/binding-pool-refcount

@GrapeBaBa (Contributor) commented May 7, 2026

Motivation

Reproduces on main and every binding-feature branch (#165, #346, #351): any process that holds a BeaconStateView at module scope panics during process exit:

thread N panic: incorrect alignment
src/persistent_merkle_tree/Node.zig:387:9 in unref
src/ssz/tree_view/chunks.zig:38:30 in deinit
src/ssz/tree_view/container.zig:97:49 in clearChildrenDataCache
src/state_transition/cache/state_cache.zig:102:26 in deinit

Process exits with code 134 (SIGABRT) instead of 0.

Root cause: the NAPI env cleanup hook (registered via napi_add_env_cleanup_hook) fires before module-level JS holders of native objects are finalized. The previous cleanup callback unconditionally called pool.state.deinit(), freeing the Node.Pool's MultiArrayList storage. Live BeaconStateView instances rooted by a module-level const had not yet been finalized, so when their finalizers eventually ran, the chained pool.unref(self.root) walked freed memory and panicked.

Holding fixtures at module scope is the standard pattern for benchmarks (@chainsafe/benchmark and perf tests). Without the fix, every such test exits non-zero even when the benchmark itself succeeds.

Description

Wrap Node.Pool in the existing RefCount(T) (src/state_transition/utils/ref_count.zig):

  • Module init holds ref = 1.
  • Each BeaconStateView that owns a cached_state holds another ref.
  • pool.state.deinit() releases the module ref but only actually destroys the pool when the count reaches zero — deferring destruction past every live view's finalizer.

Order in BeaconStateView.deinit:

  1. cached_state.deinit() — does all the pool.unref(...) calls
  2. pool_rc.unref() — releases this view's pool ref; if last, real pool.deinit runs here
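
The ref-counting scheme and the deinit order above can be modelled in a few lines. The following TypeScript sketch is only a model of the lifetime rules, not the Zig implementation: Pool, RefCount, View, and unrefNode are hypothetical stand-ins for Node.Pool, RefCount(T), BeaconStateView, and pool.unref.

```typescript
// Model only: names mirror the PR description, not the actual Zig code.

/** Stand-in for Node.Pool: touching it after destruction is the bug. */
class Pool {
  destroyed = false;
  unrefNode(root: number): void {
    if (this.destroyed) throw new Error(`use after free: node ${root}`);
  }
  deinit(): void {
    this.destroyed = true;
  }
}

/** Stand-in for RefCount(T): destroys the instance when the count hits zero. */
class RefCount<T extends { deinit(): void }> {
  private count = 1; // the creator (module init) holds the first ref
  constructor(readonly instance: T) {}
  ref(): this {
    this.count += 1;
    return this;
  }
  unref(): void {
    if (this.count === 0) throw new Error("unref on a released RefCount");
    this.count -= 1;
    if (this.count === 0) this.instance.deinit();
  }
}

/** Stand-in for BeaconStateView: owns a tree root plus one pool ref. */
class View {
  private readonly poolRc: RefCount<Pool>;
  constructor(moduleRc: RefCount<Pool>, private readonly root: number) {
    this.poolRc = moduleRc.ref();
  }
  deinit(): void {
    this.poolRc.instance.unrefNode(this.root); // 1. cached_state.deinit()
    this.poolRc.unref(); // 2. release this view's pool ref
  }
}

const moduleRc = new RefCount(new Pool()); // module init: count = 1
const view = new View(moduleRc, 42); // count = 2

moduleRc.unref(); // env cleanup hook fires first: count = 1
const aliveAfterCleanup = !moduleRc.instance.destroyed;

view.deinit(); // finalizer runs later: count = 0, pool destroyed
const destroyedAfterFinalizer = moduleRc.instance.destroyed;
```

In this model the cleanup hook only drops the module's ref, so the view's finalizer can still walk the pool. Swapping the two steps inside View.deinit would release the pool before the node unref and reintroduce the use-after-free.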

GrapeBaBa added 2 commits May 7, 2026 20:59
NAPI env cleanup hooks fire before module-level JS holders of native
objects (e.g. `const seedState = createFromBytes(...)`) get finalized.
The previous unconditional `pool.state.deinit()` in cleanup freed the
Pool's MultiArrayList while live `BeaconStateView` instances still
held references. When their finalizers later ran, the chained
`pool.unref()` walked freed memory and panicked with "incorrect
alignment" — process exited 134 / SIGABRT.

Wraps `Node.Pool` in the existing `RefCount(T)`. Module init holds
ref=1; each `BeaconStateView` that owns a `cached_state` holds
another. `pool.state.deinit()` releases the module ref but only
actually destroys the pool when the count reaches zero, deferring
teardown past every `BeaconStateView` finalizer.

- `state_transition/root.zig`: export `RefCount`
- `bindings/napi/pool.zig`: wrap pool, expose `pool()` and `poolRc()`
- `bindings/napi/BeaconStateView.zig`: hold a `pool_rc` alongside
  `cached_state`, ref on each constructor that builds one, unref in
  `deinit` after `cached_state.deinit` finishes its `pool.unref` calls.

Adds `bindings/test/teardown.test.ts`: spawns a subprocess that creates
a module-scope `BeaconStateView` and exits, expects exit 0 and no
"panic:" on stderr. Reproduces the bug on main; passes with this fix.
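
A subprocess-based teardown check along these lines might look like the sketch below. The test file itself is not shown in this conversation, so runInSubprocess and exitedCleanly are hypothetical helper names, and the inline scripts stand in for a real fixture that would create a module-scope `BeaconStateView` through the bindings.

```typescript
import { spawnSync } from "node:child_process";

interface ExitReport {
  status: number | null; // null when the child was killed by a signal
  signal: string | null;
  stderr: string;
}

/** Runs `script` in a fresh Node process and captures how it exited. */
function runInSubprocess(script: string): ExitReport {
  const result = spawnSync(process.execPath, ["-e", script], { encoding: "utf8" });
  return { status: result.status, signal: result.signal, stderr: result.stderr };
}

/** True when the process exited 0 and printed no panic to stderr. */
function exitedCleanly(report: ExitReport): boolean {
  return report.status === 0 && !report.stderr.includes("panic:");
}

// Stand-in scripts. A real fixture would hold a view at module scope,
// e.g. `const seedState = createFromBytes(...)`, and then simply exit.
const cleanRun = runInSubprocess("process.exit(0)");
const panickingRun = runInSubprocess(
  'console.error("thread 1 panic: incorrect alignment"); process.exit(134)',
);
```

Checking both the exit status and stderr matters here: when a child is actually terminated by SIGABRT, spawnSync reports `status: null` with `signal: "SIGABRT"` rather than 134, and `status === 0` rejects both cases.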
@GrapeBaBa GrapeBaBa requested a review from a team as a code owner May 7, 2026 13:10
@gemini-code-assist (Contributor) commented

Summary of Changes

This pull request addresses a critical teardown panic occurring in native bindings when BeaconStateView instances are held at the module scope. By implementing a reference counting wrapper for the Node.Pool, the system now correctly defers pool destruction until all active views have been finalized, ensuring memory safety during NAPI environment cleanup.

Highlights

  • Reference Counting for Node.Pool: Introduced a reference counting mechanism for the Node.Pool to ensure it remains alive until all dependent BeaconStateView instances are finalized, preventing premature memory deallocation.
  • NAPI Teardown Fix: Updated the NAPI environment cleanup hook to release the module's reference to the pool rather than unconditionally destroying it, resolving SIGABRT panics during process exit.
  • Regression Testing: Added a new integration test that verifies module-scope BeaconStateView instances do not cause panics during process teardown.

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces reference counting for the shared Node.Pool used by BeaconStateView to prevent 'incorrect alignment' panics during NAPI environment cleanup at process exit. The pool.zig module now wraps the pool in a RefCount structure, and BeaconStateView instances hold a strong reference to this pool for their lifetime. A new teardown test ensures that module-scope views exit cleanly. Feedback focuses on adhering to the repository's style guide regarding proper acronym capitalization (e.g., renaming PoolRc to PoolRC) and increasing assertion density to verify that the pool is initialized before access.

Comment thread bindings/napi/pool.zig

const default_pool_size: u32 = 0;

pub const PoolRc = RefCount(Node.Pool);

medium

According to the style guide (Rule 224), acronyms should use proper capitalization. Since RC stands for Reference Count, it should be capitalized as PoolRC.

pub const PoolRC = RefCount(Node.Pool);

References
  1. Use proper capitalization for acronyms (VSRState, not VsrState).

Comment thread bindings/napi/pool.zig
 pub const State = struct {
-    pool: Node.Pool = undefined,
-    initialized: bool = false,
+    pool_rc: ?*PoolRc = null,

medium

Update the type name to PoolRC to adhere to the acronym capitalization rule.

    pool_rc: ?*PoolRC = null,

Comment thread bindings/napi/pool.zig
if (self.pool_rc != null) return;
var pool_value = try Node.Pool.init(allocator, default_pool_size);
errdefer pool_value.deinit();
self.pool_rc = try PoolRc.init(allocator, pool_value);

medium

Update the type name to PoolRC to adhere to the acronym capitalization rule.

        self.pool_rc = try PoolRC.init(allocator, pool_value);

Comment thread bindings/napi/pool.zig
Comment on lines +40 to +42
pub fn pool(self: *State) *Node.Pool {
    return &self.pool_rc.?.instance;
}

medium

The style guide (Rule 51) requires asserting function arguments and invariants. Adding an explicit assertion here improves safety and follows the 'fail-fast' principle.

    pub fn pool(self: *State) *Node.Pool {
        std.debug.assert(self.pool_rc != null);
        return &self.pool_rc.?.instance;
    }

References
  1. Assert all function arguments and return values, pre/postconditions and invariants.

Comment thread bindings/napi/pool.zig
Comment on lines +44 to 46
pub fn poolRc(self: *State) *PoolRc {
    return self.pool_rc.?;
}

medium

Rename the function to poolRC and update the return type to PoolRC to adhere to the acronym capitalization rule (Rule 224). Additionally, add an assertion for the pool initialization invariant.

    pub fn poolRC(self: *State) *PoolRC {
        std.debug.assert(self.pool_rc != null);
        return self.pool_rc.?;
    }

/// `cached_state`. Released in `deinit` AFTER `cached_state.deinit` finishes
/// its `pool.unref` calls, so the pool stays alive even if the module's
/// NAPI env cleanup hook fired before this view's JS finalizer.
pool_rc: ?*pool.PoolRc = null,

medium

Update the type reference to PoolRC to match the updated naming convention in pool.zig.

pool_rc: ?*pool.PoolRC = null,

 const slot_value = fork_types.readSlotFromAnyBeaconStateBytes(byte_slice);
 const fork_seq = config.state.config.forkSeq(slot_value);
-state.* = try AnyBeaconState.deserialize(allocator, &pool.state.pool, fork_seq, byte_slice);
+state.* = try AnyBeaconState.deserialize(allocator, pool.state.pool(), fork_seq, byte_slice);

medium

Add an assertion to verify that the shared pool is initialized before use, as required by the style guide's assertion density and safety rules (Rule 51).

    std.debug.assert(pool.state.pool_rc != null);
    state.* = try AnyBeaconState.deserialize(allocator, pool.state.pool(), fork_seq, byte_slice);

-return .{ .cached_state = cached_state };
+return .{
+    .cached_state = cached_state,
+    .pool_rc = pool.state.poolRc().ref(),

medium

Update the call to poolRC() to match the renamed function in pool.zig.

        .pool_rc = pool.state.poolRC().ref(),


 try st.processSlots(allocator, napi_io.get(), post_state, slot_value, .{});
-return .{ .cached_state = post_state };
+return .{

medium

Add an assertion to verify that the shared pool is initialized before returning a new view, ensuring the invariant holds and increasing assertion density.

    std.debug.assert(pool.state.pool_rc != null);
    return .{

-return .{ .cached_state = post_state };
+return .{
+    .cached_state = post_state,
+    .pool_rc = pool.state.poolRc().ref(),

medium

Update the call to poolRC() to match the renamed function in pool.zig.

        .pool_rc = pool.state.poolRC().ref(),
