
fix: make ReferenceCount thread-safe with atomic operations #261

Closed
lodekeeper-z wants to merge 1 commit into ChainSafe:main from lodekeeper-z:fix/refcount-atomics


@lodekeeper-z

Motivation

ReferenceCount uses a plain usize for its reference counter, so concurrent acquire/release calls from multiple threads are a data race. ReferenceCount is used by:

  • EffectiveBalanceIncrements
  • SyncCommitteeCache
  • EpochShuffling

These caches are accessed via EpochCache, which will be shared across threads when NAPI bindings are used with multi-threaded BLS verification.
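As a rough illustration of the concurrent access described above, the sketch below has several threads increment and decrement one shared counter through atomic operations. The `worker` and `run` functions, the thread count, and the iteration count are all hypothetical and not taken from the repository. It assumes a Zig release that provides `std.atomic.Value` and the lowercase ordering names used in this PR.

```zig
const std = @import("std");

// Hypothetical worker: take and drop a reference many times.
fn worker(count: *std.atomic.Value(usize)) void {
    var i: usize = 0;
    while (i < 10_000) : (i += 1) {
        // Same orderings as the PR: monotonic increment, acq_rel decrement.
        _ = count.fetchAdd(1, .monotonic);
        _ = count.fetchSub(1, .acq_rel);
    }
}

// Spawns four workers on one counter and returns its final value.
fn run() !usize {
    // Start at 1, like a freshly initialised reference count.
    var count = std.atomic.Value(usize).init(1);

    var threads: [4]std.Thread = undefined;
    for (&threads) |*t| {
        t.* = try std.Thread.spawn(.{}, worker, .{&count});
    }
    for (threads) |t| t.join();

    return count.load(.acquire);
}

pub fn main() !void {
    std.debug.print("final count: {d}\n", .{try run()});
}
```

Because every read-modify-write is atomic, the counter should return to 1 once all threads have joined. With a plain `usize` and `+= 1` / `-= 1`, the same loop would be a data race and the final value would be unpredictable.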

Changes

  • Atomic ref count: Switch _ref_count from usize to std.atomic.Value(usize)
  • Acquire: fetchAdd(1, .monotonic). The increment needs no ordering guarantees, because the caller already holds a reference
  • Release: fetchSub(1, .acq_rel). Ensures all prior writes are visible before deinit
  • Fix clone() bug: The call was missing its allocator argument (a compile error if clone() were ever called)
  • Remove resolved TODO comment

The memory ordering follows the standard reference counting pattern (similar to Rust's Arc).
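A minimal sketch of that pattern follows, assuming a Zig release with `std.atomic.Value` and lowercase ordering names. `RefCounted`, its `ref_count` field, and the direct `destroy` call are hypothetical simplifications for illustration, not the repository's `ReferenceCount`, which calls `deinit()` instead.

```zig
const std = @import("std");

// Hypothetical stand-in for the PR's ReferenceCount, reduced to the counting logic.
fn RefCounted(comptime T: type) type {
    return struct {
        const Self = @This();

        allocator: std.mem.Allocator,
        ref_count: std.atomic.Value(usize),
        instance: T,

        pub fn init(allocator: std.mem.Allocator, instance: T) !*Self {
            const ptr = try allocator.create(Self);
            ptr.* = .{
                .allocator = allocator,
                .ref_count = std.atomic.Value(usize).init(1),
                .instance = instance,
            };
            return ptr;
        }

        // The caller already owns a reference, so the increment only has to be atomic.
        pub fn acquire(self: *Self) *Self {
            _ = self.ref_count.fetchAdd(1, .monotonic);
            return self;
        }

        // acq_rel publishes this thread's writes and, for the last owner,
        // makes every other owner's earlier writes visible before the free.
        pub fn release(self: *Self) void {
            if (self.ref_count.fetchSub(1, .acq_rel) == 1) {
                const allocator = self.allocator;
                allocator.destroy(self);
            }
        }
    };
}

pub fn main() !void {
    const rc = try RefCounted(u64).init(std.heap.page_allocator, 42);
    const shared = rc.acquire(); // count is now 2
    shared.release(); // count is now 1
    rc.release(); // count reaches 0 and the instance is freed
}
```

Rust's Arc decrements with release ordering and issues a separate acquire fence only on the path that frees. Using `.acq_rel` on every decrement, as this PR does, is a simpler variant that gives the same visibility guarantee.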

🤖 Generated with AI assistance

- Switch _ref_count from plain usize to std.atomic.Value(usize)
- Use fetchAdd(.monotonic) for acquire, fetchSub(.acq_rel) for release
- Fix clone() bug: was missing allocator argument (would not compile if called)
- Remove stale TODO comment about switching to std.atomic

The ReferenceCount utility is used by EffectiveBalanceIncrements,
SyncCommitteeCache, and EpochShuffling caches. Without atomics,
concurrent acquire/release from multiple threads (e.g. via NAPI
bindings with multi-threaded BLS) would be a data race.

@lodekeeper-z lodekeeper-z requested a review from a team as a code owner March 22, 2026 04:08
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a data race condition in the ReferenceCount struct by replacing the plain usize reference counter with an atomic counter. This change ensures thread safety when the ReferenceCount is accessed concurrently, particularly in scenarios involving multi-threaded BLS verification. Additionally, a bug in the clone() method was fixed, and a resolved TODO comment was removed.

Highlights

  • Thread Safety: The ReferenceCount struct was updated to use atomic operations for its reference counter, ensuring thread safety.
  • Memory Ordering: The increment uses fetchAdd with .monotonic ordering and the decrement uses fetchSub with .acq_rel ordering, so writes made before a release are visible to the thread that frees the instance.
  • Bug Fix: Corrected a bug in the clone() method by adding the missing allocator argument.



@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request correctly implements thread-safety for ReferenceCount by using atomic operations, which is a crucial improvement given its use in shared caches. The memory ordering choices for acquire and release are appropriate and follow standard practices for atomic reference counting. The bug fix in the clone function is also a good catch. I've included a couple of suggestions to further align the code with the repository's style guide, focusing on using explicitly-sized integers and adding assertions for increased robustness.

Comment on lines +9 to +17
+        _ref_count: std.atomic.Value(usize),
         instance: T,

         pub fn init(allocator: Allocator, instance: T) !*@This() {
             const ptr = try allocator.create(@This());
             errdefer allocator.destroy(ptr);
             ptr.* = .{
                 .allocator = allocator,
-                ._ref_count = 1,
+                ._ref_count = std.atomic.Value(usize).init(1),

medium

According to the repository style guide (L44), architecture-specific types like usize should be avoided in favor of explicitly-sized types. Please consider using u32 for the reference counter, as it's unlikely to be exceeded and improves portability.

        _ref_count: std.atomic.Value(u32),
        instance: T,

        pub fn init(allocator: Allocator, instance: T) !*@This() {
            const ptr = try allocator.create(@This());
            errdefer allocator.destroy(ptr);
            ptr.* = .{
                .allocator = allocator,
                ._ref_count = std.atomic.Value(u32).init(1),
References
  1. Use explicitly-sized types like u32 for everything, avoid architecture-specific usize. (link)

Comment on lines 37 to 47
         pub fn acquire(self: *@This()) *@This() {
-            self._ref_count += 1;
+            _ = self._ref_count.fetchAdd(1, .monotonic);
             return self;
         }

         pub fn release(self: *@This()) void {
-            self._ref_count -= 1;
-            if (self._ref_count == 0) {
+            const prev = self._ref_count.fetchSub(1, .acq_rel);
+            if (prev == 1) {
                 self.deinit();
             }
         }

medium

The repository style guide (L51) recommends asserting pre/postconditions and invariants to improve safety. In acquire, you can assert against overflow, and in release, you can assert against underflow (which could indicate a double-free).

        pub fn acquire(self: *@This()) *@This() {
            const prev = self._ref_count.fetchAdd(1, .monotonic);
            // Assert against overflow.
            std.debug.assert(prev < std.math.maxInt(@TypeOf(prev)));
            return self;
        }

        pub fn release(self: *@This()) void {
            const prev = self._ref_count.fetchSub(1, .acq_rel);
            // Assert against underflow (e.g. double release).
            std.debug.assert(prev > 0);
            if (prev == 1) {
                self.deinit();
            }
        }
References
  1. Assert all function arguments and return values, pre/postconditions and invariants. A function must not operate blindly on data it has not checked. The assertion density of the code must average a minimum of two assertions per function. (link)

@GrapeBaBa

There is already a PR #62 for this.

@lodekeeper-z

Closing in favor of #62, which addresses the same issue and includes the rename to RefCount with a ref/unref API. Thanks for pointing that out, @GrapeBaBa!
