Add support for syncing replica_2_9 data from peers #812
Open
ldmberman wants to merge 90 commits into master from lb/sync-replica-2-9
Conversation
A summary of the recent changes:
- …when syncing data roots
- …nt count to avoid constant recalculating
- also try to reduce some of the coordinator and peer_worker reductions by replacing gen_server:call with gen_server:cast and caching the formatted peer
- …unk from each footprint. This prevents the node from syncing the same footprint 1024 times (once for each chunk)
- also remove a bunch of debugging logs
ldmberman commented Jan 14, 2026
This change makes `enqueue_task` synchronous, but should not impact performance. Previously, `enqueue_task` was asynchronous while `process_queue` was synchronous, meaning we lost the benefit of the asynchronous call *and* introduced a race condition. The new version removes the async call but does not add any net new synchronous calls.
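A minimal sketch of that shape, using a hypothetical module and state rather than the PR's actual sync modules: `enqueue_task/1` goes through a synchronous `gen_server:call`, so the caller only continues once the task is actually in the queue.

```erlang
%% Sketch only: hypothetical module illustrating a synchronous enqueue_task.
-module(enqueue_example).
-behaviour(gen_server).
-export([start_link/0, enqueue_task/1, process_queue/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Previously this was an asynchronous gen_server:cast, so the caller could
%% move on and trigger queue processing before the task had been enqueued.
enqueue_task(Task) ->
    gen_server:call(?MODULE, {enqueue, Task}).

process_queue() ->
    gen_server:call(?MODULE, process_queue).

init([]) ->
    {ok, queue:new()}.

handle_call({enqueue, Task}, _From, Q) ->
    {reply, ok, queue:in(Task, Q)};
handle_call(process_queue, _From, Q) ->
    {reply, queue:to_list(Q), queue:new()}.

handle_cast(_Msg, Q) ->
    {noreply, Q}.
```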
The main challenge with syncing replica_2_9 data is limiting the amount of entropy that has to be generated. In the optimal case a given set of entropy is only generated once. We refer to a set of entropy as a "footprint" of entropy because the entropy is assigned to a set of non-contiguous chunks that are distributed throughout the partition. Not the best analogy but it avoids yet another version of batch/set/group/chunk/interval/etc...
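As a rough illustration of that layout (illustrative names and an assumed spacing rule, not code from this PR), the chunks that share one footprint can be pictured as sitting a fixed stride apart within the partition:

```erlang
%% Illustrative only: assumes the chunks of one footprint are spaced
%% EntropyCount chunk positions apart, so a footprint covers non-contiguous
%% chunks scattered across the whole partition.
-module(footprint_layout_example).
-export([chunk_offsets/3]).

%% FootprintIndex: which footprint within the partition (0-based).
%% EntropyCount: assumed stride between consecutive chunks of one footprint
%%               (cf. ar_block:get_replica_2_9_entropy_count()).
%% ChunksPerFootprint: how many chunks share one footprint of entropy.
chunk_offsets(FootprintIndex, EntropyCount, ChunksPerFootprint) ->
    [FootprintIndex + I * EntropyCount || I <- lists:seq(0, ChunksPerFootprint - 1)].
```

Under that assumption, generating one footprint's entropy covers every chunk returned by `chunk_offsets/3`, which is exactly the duplication the rest of this PR tries to avoid.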
In order to optimize the entropy generation during syncing, three things need to happen: change the order in which chunks are synced, cache the generated entropy, and limit how many footprints can be synced at once. Each is covered in its own section below.

Changing the order in which chunks are synced
This PR introduces a new `footprint` syncing mode in addition to the pre-existing `normal` sync mode. This mode has its own data discovery, peer interval gathering, and syncing logic. To aid with data discovery, two new endpoints are added:

- `GET /footprint_buckets`: returns an ETF-serialized, compact but imprecise representation of the synced footprints - a bucket size and a map where every key is the sequence number of a bucket and every value is the percentage of data synced in the reported bucket. Every bucket contains one or more footprints. Note that while footprints in the buckets are adjacent in the sense that the first chunk in the first footprint is adjacent to the first chunk in the second footprint, chunks are generally not adjacent because of how the logic of 2.9 footprints goes. (See the sketch after this list.)
- `GET /footprints/<partition>/<footprint>`: returns information about the presence of the data from the given footprint in the given partition. The returned intervals contain the numbers of the chunks, starting from 0, belonging to the given footprint and present on this node. The footprint is constructed like a replica 2.9 entropy footprint where chunks are spread out across the partition. Therefore, the interval [0, 2] does not denote two adjacent chunks but rather two chunks separated by `ar_block:get_replica_2_9_entropy_count()` chunks.
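For illustration, a peer's footprint buckets could be fetched and decoded roughly as follows. This is a sketch only: the module name, the `{BucketSize, BucketMap}` payload shape, and the percentage scale are assumptions, not taken from the PR.

```erlang
%% Sketch of querying GET /footprint_buckets on a peer and decoding the
%% ETF-serialized body. Assumes the payload decodes to
%% {BucketSize, #{BucketSeq => SyncedPercentage}} with percentages in 0..100.
-module(footprint_buckets_example).
-export([fetch/1]).

fetch(Peer) ->
    %% Peer is e.g. "http://peer.example:1984"; inets must be started
    %% (application:ensure_all_started(inets)).
    URL = Peer ++ "/footprint_buckets",
    {ok, {{_, 200, _}, _Headers, Body}} =
        httpc:request(get, {URL, []}, [], [{body_format, binary}]),
    {BucketSize, Buckets} = binary_to_term(Body, [safe]),
    %% Keep only the buckets the peer reports as mostly synced.
    WellSynced = [Seq || {Seq, Share} <- maps:to_list(Buckets), Share >= 90],
    {BucketSize, lists:sort(WellSynced)}.
```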
The node will query the above endpoints from peers and maintain a mapping of peer -> footprint -> chunk so that it can sync all chunks in a given footprint from the same peer. The node will then alternate between `normal` syncing (not footprint-aligned, and only from nodes advertising `unpacked` or `spora_2_6` data) and `footprint` syncing (footprint-aligned, and only from nodes advertising `replica_2_9` data).
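The peer -> footprint -> chunk bookkeeping could look roughly like the nested-map sketch below (assumed structure and names, not the PR's code):

```erlang
%% Sketch of a peer -> {partition, footprint} -> chunk-numbers mapping used to
%% keep all chunks of one footprint assigned to a single peer.
-module(footprint_map_example).
-export([record_chunks/4, chunks_for/3]).

%% Map shape: #{Peer => #{{Partition, Footprint} => [ChunkNumber]}}.
record_chunks(Map, Peer, Key, Chunks) ->
    PeerMap = maps:get(Peer, Map, #{}),
    Map#{Peer => PeerMap#{Key => Chunks}}.

chunks_for(Map, Peer, Key) ->
    maps:get(Key, maps:get(Peer, Map, #{}), []).
```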
Entropy cache

The size allocated to the entropy cache is set via the new `replica_2_9_entropy_cache_size_mb` config option: the maximum cache size (in MiB) to allocate for replica.2.9 entropy. Each cached entropy is 256 MiB. The bigger the cache, the more replica.2.9 data can be synced concurrently.
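As a purely illustrative calculation (the 256 MiB figure comes from the text above; deriving the concurrency limit as cache size divided by 256 is an assumption about how the limit follows from the option):

```erlang
1> CacheSizeMiB = 1024.   %% e.g. replica_2_9_entropy_cache_size_mb set to 1024
1024
2> CacheSizeMiB div 256.  %% roughly four footprints' entropy held at once
4
```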
Limiting how many footprints can be synced at once

The `ar_data_sync_coordinator` and `ar_peer_worker` modules now track how many footprints are being synced at once and will queue up additional sync requests once the maximum number is hit. The maximum number of concurrent footprints is determined by the size of the footprint cache.
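A sketch of how such a cap and overflow queue can be tracked (the state shape and function names are assumptions for illustration, not the actual `ar_data_sync_coordinator` API):

```erlang
%% Sketch only: cap concurrent footprint syncs and queue the overflow.
-module(footprint_limit_example).
-export([request_sync/2, sync_complete/1]).

%% State: #{active := non_neg_integer(), max := pos_integer(), queue := queue:queue()}.
request_sync(Footprint, #{active := Active, max := Max} = State) when Active < Max ->
    %% Capacity available: start syncing immediately.
    {start, Footprint, State#{active := Active + 1}};
request_sync(Footprint, #{queue := Q} = State) ->
    %% At the limit: park the request until a slot frees up.
    {queued, State#{queue := queue:in(Footprint, Q)}}.

sync_complete(#{active := Active, queue := Q} = State) ->
    case queue:out(Q) of
        {{value, Next}, Q2} ->
            %% Hand the freed slot to the oldest queued footprint.
            {start, Next, State#{queue := Q2}};
        {empty, Q} ->
            {idle, State#{active := Active - 1}}
    end.
```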
Other Changes

- A `data_roots` index that gets downloaded from peers in the background. With this index a node is able to validate a historical chunk even if the node hasn't synced the block and transaction headers corresponding to the chunk. The `data_roots` index is much, much smaller than the full blockchain data.
- `./bin/test` and `./bin/e2e` can now run individual tests and not just modules, e.g. `./bin/test ar_my_test_module:test1 ar_my_test_module:test2 ar_your_test_module ar_their_test_module:test4`
- `sync_tasks` and `data_discovery`
- `./bin/benchmark packing`