go/worker/storage/committee: Refactor diff sync#6451

Draft
martintomazic wants to merge 1 commit into master from martin/state-sync-refactor

@martintomazic martintomazic commented Feb 3, 2026

Improve worker performance and make it more readable, idiomatic, testable and benchmarkable. The first goal of this PR is to assess whether the refactor is worth it.

Looking at the 25.6 and 25.9 logs, task queuing was not optimal. In practice, even after this change we are still blocked by data availability when syncing older storage diffs (e.g. genesis sync) and/or when pruning at the same time**. So in practice performance has not changed (already benchmarked), except possibly when syncing recent state (roughly 50% faster, but this needs to be reproduced to confirm).

Things I like:
Got rid of two redundant structures, the syncing rounds and the summary cache, together with the corresponding locking. Given that this is an IO-bound task and the work is primarily about concurrency (not parallelism), I believe moving from mutexes to channels is desirable.
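
To illustrate the mutex-to-channel point, here is a minimal sketch, not the code in this PR. It assumes a hypothetical `fetch` function and a hypothetical `diff` type: a bounded pool of workers fetches rounds concurrently, and a single goroutine applies the results in round order. Because only that goroutine touches the `pending` map, no lock is needed.

```go
package main

import (
	"fmt"
	"sync"
)

// diff is a hypothetical stand-in for the fetched storage diff of one round.
type diff struct {
	round uint64
	data  string
}

// fetchFunc is a hypothetical fetcher; a real one would query peers or a local DB.
type fetchFunc func(round uint64) diff

// syncRounds fetches rounds [first, last] with a bounded number of workers
// and returns the rounds in the order in which they were applied.
func syncRounds(first, last uint64, workers int, fetch fetchFunc) []uint64 {
	rounds := make(chan uint64)
	fetched := make(chan diff)

	// Fetch workers run concurrently, so results may arrive in any order.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range rounds {
				fetched <- fetch(r)
			}
		}()
	}

	// Producer: enqueue all rounds, then close the result channel once
	// every worker has finished.
	go func() {
		for r := first; r <= last; r++ {
			rounds <- r
		}
		close(rounds)
		wg.Wait()
		close(fetched)
	}()

	// Only this goroutine owns the pending map, so it needs no mutex.
	pending := make(map[uint64]diff)
	var applied []uint64
	next := first
	for d := range fetched {
		pending[d.round] = d
		// Apply every consecutive round that is already available.
		for {
			p, ok := pending[next]
			if !ok {
				break
			}
			delete(pending, next)
			applied = append(applied, p.round) // apply and finalize would happen here
			next++
		}
	}
	return applied
}

func main() {
	fetch := func(r uint64) diff {
		return diff{round: r, data: fmt.Sprintf("diff-%d", r)}
	}
	fmt.Println(syncRounds(1, 5, 3, fetch))
}
```

The sketch leaves `pending` unbounded for brevity; a real worker would likely cap how far fetching may run ahead of the last applied round.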

Things I dislike: I will probably get rid of the new diffsync nested package, and overall it still feels too complex. Writing another fetcher that reads from another local DB for benchmarking purposes will probably clarify the design.

** As pointed out, and benchmarked again, runtime state pruning is incredibly slow even with only a few versions stored locally. Because ndb.Finalize and ndb.Prune are completely sequential due to the metadata lock, and the pruning critical section easily lasts 0.1-0.5s, syncing interleaved with pruning is bounded at 1-5 rounds/s, assuming all other operations are negligible. Since they are not, and there is likely significant write amplification/compaction load on the DB, in practice we are closer to 1-2 rounds/s.
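
A back-of-the-envelope sketch of that bound: if the lock-holding steps of a round run strictly one after another, throughput cannot exceed one round per total lock-held time. The helper `maxRoundsPerSecond` is hypothetical, and the example assumes (this is not stated above) that finalize holds the lock for about as long as prune does, which is one reading that lands in the 1-5 rounds/s range.

```go
package main

import "fmt"

// maxRoundsPerSecond returns the throughput ceiling for a round whose
// lock-holding steps run strictly one after another. Each argument is the
// time (in seconds) one step holds the lock; all other work is assumed free.
func maxRoundsPerSecond(lockHeldSeconds ...float64) float64 {
	total := 0.0
	for _, s := range lockHeldSeconds {
		total += s
	}
	if total <= 0 {
		return 0 // no lock time given, so there is no meaningful bound
	}
	return 1 / total
}

func main() {
	// Assumption: finalize holds the metadata lock about as long as prune.
	for _, prune := range []float64{0.1, 0.5} {
		finalize := prune
		fmt.Printf("prune=%.1fs finalize=%.1fs -> at most %.1f rounds/s\n",
			prune, finalize, maxRoundsPerSecond(finalize, prune))
	}
}
```

Any work outside the lock (fetching, applying, compaction) only lowers the achievable rate further, which is consistent with the observed 1-2 rounds/s.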

Improve worker performance and make it more readable,
idiomatic, testable and benchmarkable.

netlify bot commented Feb 3, 2026

Deploy Preview for oasisprotocol-oasis-core canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 228c363 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/oasisprotocol-oasis-core/deploys/698231978758bc0008a69a41 |

Development

Successfully merging this pull request may close these issues.

Refactor runtime storage committee worker into smaller and independent workers