Conversation
Summary of Changes — Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the performance of BLS signature verification by introducing multithreaded execution capabilities.
Activity
Code Review
This pull request introduces multi-threaded verification for verifyMultipleAggregateSignatures by updating the blst-z dependency and leveraging its new thread pool capabilities: initializing and de-initializing the thread pool, adapting to the updated blst-z API, and using `*const` pointers. However, it also introduces a critical security vulnerability: several unsafe `@alignCast` calls on buffers that are not explicitly aligned, which can lead to undefined behavior, crashes, or instability. These buffers must be properly aligned at their declaration point to mitigate this risk. Additionally, there are concerns regarding build configuration and dependency management that could impact safety and reproducibility.
```diff
   },
   "scripts": {
-    "prepare": "zig build build-lib:bindings -Doptimize=ReleaseSafe",
+    "prepare": "zig build build-lib:bindings -Doptimize=ReleaseFast",
```
Changing the optimization level to ReleaseFast disables all runtime safety checks in Zig, which can lead to undefined behavior (e.g., from out-of-bounds memory access) going undetected. The style guide (line 5) explicitly prioritizes safety over performance. For a cryptographic library, sacrificing these safety checks for performance is a critical risk. Please revert this to ReleaseSafe to maintain the project's safety guarantees. If ReleaseFast is deemed essential for performance, this trade-off should be explicitly justified and ideally confined to benchmark-specific build configurations, not the default prepare script.
| "prepare": "zig build build-lib:bindings -Doptimize=ReleaseFast", | |
| "prepare": "zig build build-lib:bindings -Doptimize=ReleaseSafe", |
References

- The style guide prioritizes safety over performance: "Our design goals are safety, performance, and developer experience. In that order." The change to `ReleaseFast` contradicts this by disabling runtime safety checks. (link)
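If ReleaseFast is genuinely needed for benchmarking, one way to confine it while keeping the safe default would be a separate script in package.json. This is only a sketch: the `prepare:bench` script name is hypothetical and not from this repository.

```json
{
  "scripts": {
    "prepare": "zig build build-lib:bindings -Doptimize=ReleaseSafe",
    "prepare:bench": "zig build build-lib:bindings -Doptimize=ReleaseFast"
  }
}
```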
The underlying blst library uses -O2 for the cryptographic operations, but the blst-ts library uses -O3, so we should do the same.
```diff
     const result = sig.aggregateVerify(
         sig_groupcheck,
-        &pairing_buf,
+        @alignCast(&pairing_buf),
```
The use of `@alignCast` on `&pairing_buf` is unsafe because `pairing_buf` is declared as a byte array `[Pairing.sizeOf()]u8` (line 549), which has a default alignment of 1. If the `aggregateVerify` function expects a pointer with a higher alignment (e.g., aligned to the Pairing struct), this results in undefined behavior if the buffer is not correctly aligned at runtime. This can lead to crashes on architectures with strict alignment requirements or when using SIMD instructions. To fix this, the declaration of `pairing_buf` should be updated to include an explicit alignment: `var pairing_buf: [Pairing.sizeOf()]u8 align(@alignOf(Pairing)) = undefined;`
```diff
     var pairing_buf: [Pairing.sizeOf()]u8 = undefined;
     // `pks_validate` is always false here since we assume proof of possession for public keys.
-    const result = sig.fastAggregateVerify(sigs_groupcheck, &pairing_buf, msg_info.data[0..32], DST, pks, false) catch {
+    const result = sig.fastAggregateVerify(sigs_groupcheck, @alignCast(&pairing_buf), msg_info.data[0..32], DST, pks, false) catch {
```
The use of @alignCast on &pairing_buf is unsafe because pairing_buf is declared as a byte array [Pairing.sizeOf()]u8 (line 599), which has a default alignment of 1. If the fastAggregateVerify function expects a pointer with a higher alignment, this results in Undefined Behavior. To fix this, the declaration of pairing_buf should be updated to include an explicit alignment: var pairing_buf: [Pairing.sizeOf()]u8 align(@alignOf(Pairing)) = undefined;
On cayman/blst-mt, it seems to outperform this in larger sets. Probably because the pooling is done on the binding level instead of the native layer:
Integrated the changes from cayman/blst-mt here, and numbers are a lot more reasonable for verifyMultipleAggregateSignatures.
Force-pushed 92a3dd5 to 36b5d38 (compare)
/gemini review
Code Review
This pull request aims to significantly improve performance by introducing a persistent thread pool (PairingPool) for parallelizing BLS verification operations, reducing thread-management overhead and memory allocations. However, the current implementation of the PairingPool in bindings/napi/blst.zig contains critical race conditions: it uses a global singleton and shared scratch buffers without proper synchronization. This poses a significant security risk, including potential signature bypass and denial of service in multi-threaded environments. Additionally, the review identified areas for improvement in adherence to the project's style guide, specifically regarding constant naming, explicit type usage, assertion density, an unused import, and safer handling of pointer operations.
Force-pushed 4d086ed to ed21a61 (compare)
Force-pushed d2e52ea to e923570 (compare)
/gemini review
Code Review
This pull request introduces a persistent thread pool in Zig to parallelize BLS verification, which significantly improves performance as demonstrated by the new benchmarks. The implementation is well-structured, and the zero-copy optimizations are a great addition. My review includes several suggestions to align the new code with the repository's style guide, a critical fix for a potential memory alignment issue, and a point for discussion regarding build optimization settings.
Note: Security Review did not run due to the size of the PR.
```zig
const AggVerifyJob = struct {
    pairing_bufs: [][Pairing.sizeOf()]u8,
    has_work: []bool,
    counter: std.atomic.Value(usize),
    err_flag: std.atomic.Value(bool),
    sig: *const Signature,
    sig_groupcheck: bool,
    msgs: []const *const [32]u8,
    dst: []const u8,
    pks: []const *const PublicKey,
    pks_validate: bool,
    n_elems: usize,
};
```
This struct definition violates a few style guide rules:

- Rule 216: `snake_case` is used for field names (e.g., `pairing_bufs`, `has_work`) instead of the required `camelCase`.
- Rule 221: Several field names are abbreviated (e.g., `sig`, `msgs`, `pks`, `dst`, `n_elems`).

Please use descriptive, camelCase names to improve readability and adhere to the style guide. This applies to `VerifyMultiJob` and other new variables as well.
```zig
const AggVerifyJob = struct {
    pairingBufs: [][Pairing.sizeOf()]u8,
    hasWork: []bool,
    counter: std.atomic.Value(usize),
    errorFlag: std.atomic.Value(bool),
    signature: *const Signature,
    signatureGroupcheck: bool,
    messages: []const *const [32]u8,
    domainSeparationTag: []const u8,
    publicKeys: []const *const PublicKey,
    publicKeysValidate: bool,
    elementsCount: usize,
};
```
I think this is very pedantic; we should probably change the style guide.
```zig
.scratch_msg_ptrs = allocator.alloc(*const [32]u8, SCRATCH_MAX) catch @panic("PairingPool: OOM"),
.scratch_pk_ptrs = allocator.alloc(*const PublicKey, SCRATCH_MAX) catch @panic("PairingPool: OOM"),
.scratch_sig_ptrs = allocator.alloc(*const Signature, SCRATCH_MAX) catch @panic("PairingPool: OOM"),
```
The style guide (line 400) enforces a hard line length limit of 100 columns. These lines exceed that limit. Please reformat them to comply; `zig fmt` can often help with this if you add a trailing comma inside the function call.
```zig
.scratch_msg_ptrs = allocator.alloc(*const [32]u8, SCRATCH_MAX) catch @panic(
    "PairingPool: OOM",
),
.scratch_pk_ptrs = allocator.alloc(*const PublicKey, SCRATCH_MAX) catch @panic(
    "PairingPool: OOM",
),
.scratch_sig_ptrs = allocator.alloc(*const Signature, SCRATCH_MAX) catch @panic(
    "PairingPool: OOM",
),
```
References
- Hard limit all line lengths, without exception, to at most 100 columns for a good typographic "measure". (link)
I think having it all in one line looks cleaner
In this test file we have benchmarks for both blst-z and blst-ts, but this is really just for the PR's sake; we should remove the TypeScript benchmarks once we're happy with the results.
Force-pushed 81f6a42 to e5c9294 (compare)
twoeths left a comment:
Did not finish the review; overall looks good.
- I want to improve the documentation part to make it easier to maintain and follow.
- I also feel like PairingPool should stay at the native layer, and only napi-specific logic should be pulled into this binding layer (maybe the scratch buffers are the only ones). The goal is to also use PairingPool for a native Zig client; we will need it anyway.
```zig
for (work_ready) |*e| e.* = .{};
for (work_done) |*e| e.* = .{};

const threads = allocator.alloc(std.Thread, n_bg) catch @panic("PairingPool: OOM");
```
When is it freed? How about adding a deinit and wiring it into the existing InstanceData.finalize path:

```zig
fn deinit(pool: *PairingPool) void {
    // Signal all workers to exit, then wait (or just accept the leak is intentional)
    allocator.free(pool.workers);
    allocator.free(pool.work_items);
    // ... all other fields
    allocator.destroy(pool);
    instance = null;
}
```
Hmm, that's kinda the problem I ran into. My previous iteration put the memory pools at the native layer, but that was slow versus @wemeetagain's suggested branch. Perhaps my previous approach was wrong; I'll try again in another branch and see if it beats this. I suspect the reason is that managing memory on the layer above the native layer is better than at the native layer itself, since we can batch things before crossing the napi boundary.
Force-pushed 63dfbd4 to 1ec9d3f (compare)
Is it caused by turning on the portable mode in ChainSafe/blst-z#60? The Rust binding uses false by default.
It's possible but not guaranteed. We did performance testing when we made that change and there was virtually no difference. We have a number of users that had hard crashes on older CPUs, and it required the portable flag to resolve the issue.
Consolidating in #306.
pre-requisite for ChainSafe/lodestar#8900
Integrated the changes from cayman/blst-mt here, and numbers are a lot more reasonable for verifyMultipleAggregateSignatures. Turns out the pooling done on the binding layer instead of natively in Zig was the difference. The previous attempts targeted multi-threaded verification at the native layer (see ChainSafe/blst-z#60), but that turned out to be pretty bad (see the upstream PR). This PR instead uses blst-z as is, without any changes, and does the pooling/multi-threading on the napi layer instead. I ran this a few times and mostly got results consistently roughly equal to @chainsafe/blst. To run:
On an M2 Pro:
With @wemeetagain 's branch, it was about similar: