Description
We need to optimize the blockchain syncing process.
Expected behavior
Indexing Bitcoin should take at most 1 week.
Actual behavior
Indexing Bitcoin takes months!
Steps to reproduce the behavior
Set up a full Bitcoin node and follow the steps to sync the explorer with it.
Notes
This is a complex task that spans both infrastructure and backend work.
On the infra side, we need a load balancer in front of the bitcoind RPC API. Based on previous experience, the minimum requirements are:
- Each node has 8 CPUs, 8 GB of RAM, and an SSD.
- 3 bitcoind instances.
- The bitcoind config should be tuned to accept many concurrent RPC calls.
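To illustrate the load-balancing idea, here is a minimal round-robin sketch. Python is used purely for illustration, and the class name and host names are hypothetical. In practice a dedicated load balancer would likely sit in front of the nodes; this only shows how calls could be spread evenly across the three instances.

```python
import itertools
import threading


class RoundRobinEndpoints:
    """Hands out bitcoind RPC endpoints in rotation (thread-safe)."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)
        self._lock = threading.Lock()

    def next_endpoint(self):
        # The lock keeps concurrent workers from racing on the shared iterator.
        with self._lock:
            return next(self._cycle)


# Hypothetical addresses for the three bitcoind instances.
ENDPOINTS = [
    "http://bitcoind-1:8332",
    "http://bitcoind-2:8332",
    "http://bitcoind-3:8332",
]

balancer = RoundRobinEndpoints(ENDPOINTS)
picked = [balancer.next_endpoint() for _ in range(4)]
```

Each sync worker would call `next_endpoint()` before issuing an RPC request, so the fourth request wraps around to the first node again.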
On the explorer side:
- A large server with many CPUs (potentially at least 32 or 64).
- The postgres instance should be tuned accordingly. The ideal server capacity is still unknown, but it should handle 2 TB of data properly.
As for the approach, the syncing process should be done in several stages (this looks like a good candidate for akka-streams):
- Block headers (mandatory before any other stage).
- Transaction headers.
- Transaction outputs (depends on transaction headers).
- Transaction inputs (depends on the outputs).
- Block filter (depends on the outputs).
- TPoS contracts (depends on the outputs).
- Block rewards (depends on the outputs, though it could potentially be synced right after the block headers).
- Address balances (depends on the inputs).
- Address transaction details (depends on the inputs).
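The stage dependencies listed above form a small directed graph. The following sketch (Python for illustration only; the stage identifiers and the `plan_waves` helper are hypothetical) groups the stages into waves, where every stage in a wave depends only on stages from earlier waves and could therefore run in parallel with its wave-mates. It assumes transaction headers depend only on block headers, and uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
STAGE_DEPENDENCIES = {
    "block_headers": set(),
    "transaction_headers": {"block_headers"},
    "transaction_outputs": {"transaction_headers"},
    "transaction_inputs": {"transaction_outputs"},
    "block_filter": {"transaction_outputs"},
    "tpos_contracts": {"transaction_outputs"},
    "block_rewards": {"transaction_outputs"},
    "address_balances": {"transaction_inputs"},
    "address_transaction_details": {"transaction_inputs"},
}


def plan_waves(dependencies):
    """Groups stages into waves; stages in one wave do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # also raises CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = plan_waves(STAGE_DEPENDENCIES)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

With this graph, the fourth wave contains the block filter, block rewards, TPoS contracts, and transaction inputs, which suggests where most of the parallelism would come from. A streaming implementation would probably apply the same ordering per block range rather than per whole chain.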
As we don't require all of the data to be indexed, we should ideally be able to disable some stages to speed up the process and save space. These are good candidates (SQL tables):
- balances.
- tpos_contracts.
- block_rewards.
- address_transaction_details.
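One way to make stages optional is to validate the requested configuration against the dependency graph, so that a stage is never enabled while something it depends on is disabled. The sketch below is only an illustration in Python; the `enabled_stages` helper and the stage identifiers are hypothetical, with the optional ones named after the SQL tables above.

```python
# Stage -> stages it depends on (identifiers are illustrative).
DEPENDS_ON = {
    "block_headers": set(),
    "transaction_headers": {"block_headers"},
    "transaction_outputs": {"transaction_headers"},
    "transaction_inputs": {"transaction_outputs"},
    "block_filter": {"transaction_outputs"},
    "tpos_contracts": {"transaction_outputs"},
    "block_rewards": {"transaction_outputs"},
    "balances": {"transaction_inputs"},
    "address_transaction_details": {"transaction_inputs"},
}


def enabled_stages(depends_on, disabled):
    """Returns the stages to run, rejecting configs that break a dependency."""
    disabled = set(disabled)
    unknown = disabled - set(depends_on)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    enabled = set(depends_on) - disabled
    for stage in enabled:
        missing = depends_on[stage] & disabled
        if missing:
            raise ValueError(f"{stage} needs disabled stage(s): {sorted(missing)}")
    return enabled


# The candidates named in this issue are all leaves of the graph,
# so disabling them never breaks another stage.
OPTIONAL = {"balances", "tpos_contracts", "block_rewards", "address_transaction_details"}
to_run = enabled_stages(DEPENDS_ON, OPTIONAL)
```

Disabling all four candidates leaves five stages to run, whereas trying to disable something like `transaction_outputs` on its own is rejected because other enabled stages still depend on it.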
All of this would affect the exposed API, because we shouldn't return blocks that aren't fully synced. It's also important to consider potential rollbacks while syncing the data.
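A possible way to reason about both concerns is to track, per enabled stage, the highest block height it has finished, and to expose a block through the API only once every stage has reached it. The following is a rough sketch in Python for illustration; the `SyncProgress` class and its methods are hypothetical, and a real implementation would persist this state in the database.

```python
class SyncProgress:
    """Tracks the highest block height each enabled stage has finished."""

    def __init__(self, stages):
        stages = list(stages)
        if not stages:
            raise ValueError("at least one stage is required")
        # -1 means the stage has not synced any block yet (genesis is height 0).
        self._heights = {stage: -1 for stage in stages}

    def mark_synced(self, stage, height):
        self._heights[stage] = max(self._heights[stage], height)

    def fully_synced_height(self):
        # A block is safe to expose only once every stage has processed it.
        return min(self._heights.values())

    def is_visible(self, height):
        return 0 <= height <= self.fully_synced_height()

    def rollback_to(self, height):
        # After a reorg, no stage may claim progress past the common ancestor.
        for stage, synced in self._heights.items():
            self._heights[stage] = min(synced, height)


progress = SyncProgress(["block_headers", "transaction_outputs"])
progress.mark_synced("block_headers", 120)
progress.mark_synced("transaction_outputs", 100)
visible_before = progress.is_visible(110)  # outputs stage has not reached 110
progress.rollback_to(90)
height_after_rollback = progress.fully_synced_height()
```

Here block 110 is hidden because the outputs stage is only at height 100, and after a rollback to height 90 every stage reports at most 90, so the API would stop serving the orphaned blocks. Deleting the rolled-back rows themselves is outside the scope of this sketch.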