Description
We currently aggregate buckets inside each thread. Instead, we should aggregate only once, after the buckets from all threads have been calculated. There should be a global buckets instance that simply accumulates the buckets from each thread on thread completion: when a thread finishes calculating its buckets, it adds them to the global buckets, and when all threads have finished, the global buckets are aggregated. This will likely not change the total time of the parallel version significantly, but it should greatly reduce CPU cycles, because bucket aggregation will be done only once. The sequential version, however, should become significantly faster. PeerDAS functions that use a lot of MSM should improve significantly.
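A rough sketch of the idea, not taken from the current code: each worker only fills its own buckets, the per-thread buckets are added elementwise into a global set, and the running-sum aggregation runs once at the end. Everything here is an assumption for illustration. Integers modulo a prime stand in for curve points, the names `fill_buckets`, `aggregate` and `msm` are hypothetical, and the shifted points are computed on the fly, whereas BGMW would read them from the precomputed table.

```rust
use std::thread;

// Toy stand-in for the curve group: integers under addition modulo a prime.
const MODULUS: u64 = 1_000_000_007;
const WINDOW_BITS: u32 = 4;
const NUM_BUCKETS: usize = 1 << WINDOW_BITS; // bucket 0 is never used

fn add(a: u64, b: u64) -> u64 {
    (a + b) % MODULUS
}

/// Fill buckets for one chunk of (scalar, point) pairs. No aggregation here.
fn fill_buckets(scalars: &[u64], points: &[u64]) -> Vec<u64> {
    let mut buckets = vec![0u64; NUM_BUCKETS];
    for (&scalar, &point) in scalars.iter().zip(points) {
        // `shifted` plays the role of the precomputed 2^(j * WINDOW_BITS) * P.
        let mut shifted = point % MODULUS;
        let mut rest = scalar;
        while rest != 0 {
            let digit = (rest & (NUM_BUCKETS as u64 - 1)) as usize;
            if digit != 0 {
                buckets[digit] = add(buckets[digit], shifted);
            }
            rest >>= WINDOW_BITS;
            shifted = (shifted << WINDOW_BITS) % MODULUS;
        }
    }
    buckets
}

/// Compute sum over k of k * buckets[k] with the usual running-sum trick.
fn aggregate(buckets: &[u64]) -> u64 {
    let mut running = 0;
    let mut total = 0;
    for &bucket in buckets.iter().skip(1).rev() {
        running = add(running, bucket);
        total = add(total, running);
    }
    total
}

/// Each thread fills its own buckets; the global buckets are aggregated once.
fn msm(scalars: &[u64], points: &[u64], num_threads: usize) -> u64 {
    let threads = num_threads.max(1);
    let chunk = ((scalars.len() + threads - 1) / threads).max(1);
    let mut global = vec![0u64; NUM_BUCKETS];
    thread::scope(|scope| {
        let handles: Vec<_> = scalars
            .chunks(chunk)
            .zip(points.chunks(chunk))
            .map(|(s, p)| scope.spawn(move || fill_buckets(s, p)))
            .collect();
        for handle in handles {
            // On thread completion, add its buckets into the global buckets.
            let local = handle.join().expect("worker thread panicked");
            for (g, l) in global.iter_mut().zip(&local) {
                *g = add(*g, *l);
            }
        }
    });
    // Single aggregation pass, regardless of the number of threads.
    aggregate(&global)
}
```

With this shape the number of aggregation passes no longer depends on the thread count: calling `msm` with one thread or several does the same single `aggregate` call, and only the cheap elementwise merge grows with the number of threads.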
@ArtiomTr could you try to adjust the current BGMW implementation in a separate branch?