
Conversation

@ldmberman (Member) commented May 22, 2025

The main challenge with syncing replica_2_9 data is limiting the amount of entropy that has to be generated. In the optimal case, a given set of entropy is generated only once. We refer to a set of entropy as a "footprint" of entropy because the entropy is assigned to a set of non-contiguous chunks distributed throughout the partition. Not the best analogy, but it avoids yet another overloading of batch/set/group/chunk/interval/etc.
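
To make the footprint layout concrete, here is a minimal Erlang sketch of how one footprint's chunk numbers could be enumerated. The module and function names are hypothetical and not part of this PR; the only assumption carried over from the endpoint description below is that consecutive chunks of a footprint are separated by the replica 2.9 entropy count.

```erlang
%% Hypothetical sketch, not part of this PR: enumerate the chunk numbers
%% covered by one footprint within a partition.
-module(footprint_layout_sketch).
-export([chunk_numbers/3]).

%% FootprintIndex: which footprint within the partition (0-based).
%% EntropyCount: assumed spacing between consecutive chunks of the same
%%               footprint (ar_block:get_replica_2_9_entropy_count() in the node).
%% ChunksPerFootprint: how many chunks one footprint's entropy covers.
chunk_numbers(FootprintIndex, EntropyCount, ChunksPerFootprint) ->
    [FootprintIndex + I * EntropyCount || I <- lists:seq(0, ChunksPerFootprint - 1)].
```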

In order to optimize entropy generation during syncing, three things need to happen:

  1. We need to change the order in which chunks are synced from peers so that we sync one complete footprint worth of chunks (or as complete a footprint as possible) from the same peer. This allows us to generate one footprint worth of entropy and use it to unpack all of those chunks.
  2. We need to cache entropy after we generate it. Even once the sync order is updated, the node will receive chunks asynchronously from many peers across many footprints; an entropy cache makes it easier to match these chunks with the appropriate entropy when they come in.
  3. We need to restrict how many footprints we sync at one time. Even with the two changes above, if a node tries to sync too many footprints in parallel, it will overrun its entropy cache and force entropy to be regenerated multiple times.

Changing the order in which chunks are synced

This PR introduces a new footprint syncing mode in addition to the pre-existing normal sync mode. This mode has its own data discovery, peer interval gathering, and syncing logic. To aid with data discovery, two new endpoints are added:

  • GET /footprint_buckets: returns an ETF-serialized, compact but imprecise representation of the synced footprints - a bucket size and a map where every key is the sequence number of a bucket and every value is the percentage of data synced in that bucket. Every bucket contains one or more footprints. Note that while footprints in the buckets are adjacent in the sense that the first chunk of the first footprint is adjacent to the first chunk of the second footprint, the chunks themselves are generally not adjacent because of how replica 2.9 footprints are laid out. (A client-side sketch follows this list.)
  • GET /footprints/<partition>/<footprint>: returns information about the presence of data from the given footprint in the given partition. The returned intervals contain the numbers of the chunks (starting from 0) that belong to the given footprint and are present on this node. The footprint is constructed like a replica 2.9 entropy footprint, where chunks are spread out across the partition. Therefore, the interval [0, 2] does not denote two adjacent chunks but rather two chunks separated by ar_block:get_replica_2_9_entropy_count() chunks.
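
As an illustration of how a client might consume GET /footprint_buckets, here is a hedged Erlang sketch. The response shape ({BucketSize, BucketMap}) is an assumption based on the description above, and the module is hypothetical; only the fact that the body is ETF-serialized (so binary_to_term can decode it) is taken as given.

```erlang
%% Hypothetical client helper, not part of this PR. Decodes an ETF-encoded
%% /footprint_buckets body and returns the bucket numbers for which the peer
%% reports at least MinShare percent of the data synced, best buckets first.
-module(footprint_buckets_client_sketch).
-export([pick_buckets/2]).

pick_buckets(Body, MinShare) when is_binary(Body) ->
    %% Assumed shape: {BucketSize, #{BucketNumber => PercentSynced}};
    %% [safe] guards against decoding unexpected terms from a remote peer.
    {_BucketSize, BucketMap} = binary_to_term(Body, [safe]),
    Sorted = lists:sort(fun({_, A}, {_, B}) -> A >= B end, maps:to_list(BucketMap)),
    [Bucket || {Bucket, Share} <- Sorted, Share >= MinShare].
```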

The node will query the above endpoints from peers and maintain a mapping of peer->footprint->chunk so that it can sync all chunks in a given footprint from the same peer. The node will then alternate between normal syncing (not footprint-aligned, and only from nodes advertising unpacked or spora_2_6 data) and footprint syncing (footprint-aligned, and only from nodes advertising replica_2_9 data).
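
For illustration, a minimal sketch of that mapping follows; the module name and exact data shapes are hypothetical. It keeps a nested map keyed by peer, then by footprint, and picks the peer advertising the most chunks for a given footprint, so that one footprint worth of entropy can unpack as many chunks as possible from a single peer.

```erlang
%% Hypothetical sketch of the peer -> footprint -> chunks bookkeeping.
-module(footprint_peers_sketch).
-export([record_peer_footprint/4, best_peer_for_footprint/2]).

%% Map shape: #{Peer => #{Footprint => [ChunkNumber]}}.
record_peer_footprint(Peer, Footprint, Chunks, Map) ->
    PeerMap = maps:get(Peer, Map, #{}),
    Map#{Peer => PeerMap#{Footprint => Chunks}}.

%% Choose the peer that advertises the most chunks for the given footprint.
best_peer_for_footprint(Footprint, Map) ->
    maps:fold(
        fun(Peer, PeerMap, Best) ->
            Count = length(maps:get(Footprint, PeerMap, [])),
            case Best of
                none when Count > 0 -> {Peer, Count};
                {_, BestCount} when Count > BestCount -> {Peer, Count};
                _ -> Best
            end
        end, none, Map).
```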

Entropy cache

The size allocated to the entropy cache is set via the new replica_2_9_entropy_cache_size_mb config option: the maximum cache size (in MiB) to allocate for replica 2.9 entropy. Each cached entropy is 256 MiB. The bigger the cache, the more replica 2.9 data can be synced concurrently.
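
As a rough illustration of the mechanics (not the actual cache implementation), the sketch below derives its capacity in footprints from the configured size in MiB divided by 256 and generates entropy only on a miss. The module name, FIFO eviction policy, and generator callback are assumptions.

```erlang
%% Hypothetical sketch of an entropy cache sized by replica_2_9_entropy_cache_size_mb.
-module(entropy_cache_sketch).
-export([new/1, get/3, put/3]).

%% Each cached footprint's entropy is 256 MiB.
-define(ENTROPY_SIZE_MB, 256).

%% Capacity (in footprints) is derived from the configured cache size in MiB.
new(CacheSizeMB) ->
    #{capacity => CacheSizeMB div ?ENTROPY_SIZE_MB, entries => #{}, order => queue:new()}.

%% Look up entropy for a footprint key; call the (expensive) generator only on
%% a miss and cache the result.
get(Key, GenerateFun, Cache) ->
    case maps:find(Key, maps:get(entries, Cache)) of
        {ok, Entropy} ->
            {Entropy, Cache};
        error ->
            Entropy = GenerateFun(Key),
            {Entropy, put(Key, Entropy, Cache)}
    end.

%% Insert, evicting the oldest entry once the capacity is exceeded (simple FIFO
%% eviction for the sketch; the real cache may use a different policy).
put(Key, Entropy, Cache = #{capacity := Cap, entries := Entries, order := Order}) ->
    Entries2 = maps:put(Key, Entropy, Entries),
    Order2 = queue:in(Key, Order),
    case maps:size(Entries2) > Cap of
        true ->
            {{value, Oldest}, Order3} = queue:out(Order2),
            Cache#{entries => maps:remove(Oldest, Entries2), order => Order3};
        false ->
            Cache#{entries => Entries2, order => Order2}
    end.
```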

Limiting how many footprints can be synced at once

The ar_data_sync_coordinator and ar_peer_worker modules now track how many footprints are being synced at once and will queue up additional sync requests once the maximum number is hit. The maximum number of concurrent footprints is determined by the size of the entropy cache.
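
A minimal sketch of that limiting logic is below; the module and return conventions are hypothetical and not the actual ar_data_sync_coordinator API. It assumes the concurrency limit has already been derived from the entropy cache size as described above.

```erlang
%% Hypothetical sketch: cap the number of footprints synced concurrently and
%% queue the overflow until a slot frees up.
-module(footprint_limiter_sketch).
-export([new/1, request/2, complete/1]).

new(MaxConcurrent) ->
    #{max => MaxConcurrent, active => 0, queue => queue:new()}.

%% Start syncing the footprint immediately if a slot is free, otherwise queue it.
request(Footprint, State = #{max := Max, active := Active, queue := Q}) ->
    case Active < Max of
        true -> {start, Footprint, State#{active => Active + 1}};
        false -> {queued, State#{queue => queue:in(Footprint, Q)}}
    end.

%% When a footprint finishes, hand its slot to the next queued footprint, if any.
complete(State = #{active := Active, queue := Q}) when Active > 0 ->
    case queue:out(Q) of
        {{value, Next}, Q2} -> {start, Next, State#{queue => Q2}};
        {empty, _} -> {idle, State#{active => Active - 1}}
    end.
```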

Other Changes

  • Introduce a new data_roots index that gets downloaded from peers in the background. With this index a node can validate a historical chunk even if it hasn't synced the block and transaction headers corresponding to that chunk. The data_roots index is much, much smaller than the full blockchain data.
  • ./bin/test and ./bin/e2e can now run individual tests and not just modules. e.g. ./bin/test ar_my_test_module:test1 ar_my_test_module:test2 ar_your_test_module ar_their_test_module:test4
  • Add some new metrics to help monitor data discovery and syncing: sync_tasks and data_discovery
  • Fix up ./bin/benchmark packing
  • Better handling of rebar3 symlinks. Track whether we're running from source (symlinks needed) or a pre-built release (real dirs present) and create or delete the symlinks as necessary when launching the node.

@ldmberman force-pushed the lb/sync-replica-2-9 branch from deed12b to f72704b (June 18, 2025 16:23)
@ldmberman (Member Author) commented:

A summary of the recent changes:

  • The footprint record is now maintained only for 256 KiB chunks before the strict threshold and all chunks after the threshold.
  • The footprint record is maintained only for replica 2.9 data.
  • The “normal” and “footprint” syncing procedures are scheduled in phases: first, the normal procedure completes one iteration over the partition range; then the footprint procedure completes one iteration; then the normal procedure runs again, and so on.
  • The “footprint” procedure syncs only replica 2.9 data by ignoring non-replica 2.9 packing returned by GET /footprints.
  • The “normal” procedure syncs only non-replica 2.9 data (replica 2.9 data is already excluded from the normal sync record on master).
  • Because both procedures are running, a two-phase deployment is no longer necessary.
  • Footprint records are initialized from the existing sync records on startup.
  • Added tests for ar_footprint_record and ar_peer_intervals.
  • Replica 2.9 syncing does not start before the entropy generation has been completed.
  • The server-side block on serving replica 2.9 chunks (along with the BLOCK_2_9_SYNCING flag) is removed.
  • Entropy generation is now guarded by a mutex to avoid redundant generation; entropy_generation_lock_collision is logged on collision (see the sketch after this list).
  • Added tracking and reporting of redundant entropy generation (the number of entropy generations per key in the last 30 minutes).
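
The mutex mentioned above could be sketched as follows. This is a simplified illustration that uses an ETS table as a per-key lock; the module, table name, and return values are hypothetical, not the actual implementation.

```erlang
%% Hypothetical sketch: serialize entropy generation per key with an ETS-based lock.
-module(entropy_lock_sketch).
-export([init/0, with_lock/2]).

init() ->
    ets:new(entropy_locks, [named_table, public, set]).

with_lock(Key, GenerateFun) ->
    case ets:insert_new(entropy_locks, {Key, self()}) of
        true ->
            try
                GenerateFun()
            after
                ets:delete(entropy_locks, Key)
            end;
        false ->
            %% Another process is already generating this entropy; this is the
            %% collision case the PR reports as entropy_generation_lock_collision.
            logger:warning("entropy_generation_lock_collision key=~p", [Key]),
            {error, collision}
    end.
```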

@ldmberman force-pushed the lb/sync-replica-2-9 branch 4 times, most recently from f2accdb to 391647e (July 23, 2025 13:44)
@ldmberman force-pushed the lb/sync-replica-2-9 branch 2 times, most recently from 66bd51b to 7587176 (July 29, 2025 12:38)
@ldmberman force-pushed the lb/sync-replica-2-9 branch 3 times, most recently from b9c681a to 64e57cd (August 7, 2025 21:05)
@ldmberman force-pushed the lb/sync-replica-2-9 branch 2 times, most recently from 7f282f1 to 6e05931 (August 14, 2025 18:08)
@ldmberman force-pushed the lb/sync-replica-2-9 branch 5 times, most recently from 2491973 to e25c853 (August 21, 2025 21:51)
JamesPiechota and others added 23 commits January 14, 2026 12:50
Also try to reduce some of the coordinator and peer_worker reductions by replacing gen_server:call with gen_server:cast and caching the formatted peer
…unk from each footprint

This prevents the node from syncing the same footprint 1024 times (once for each chunk)
also remove a bunch of debugging logs
This change makes `enqueue_task` synchronous, but should not impact performance. Previously enqueue_task was asynchronous, but process_queue was synchronous... meaning we lost the benefit of the asynchronous call *and* caused a race condition.

New update removes the async call but doesn't add any net new synchronous calls.