@khorolets

This PR is a continuation of @kobayurii's work in #394, which we rejected because we could not write (index) the data fast enough to support that idea.

After a bunch of different experiments and other ideas for improving the schema, we came back to that schema and optimized it enough to raise the write throughput on our side to the point where we can index the tip of the network or backfill.

TL;DR

Average blocks per second:

  • Before: 0.1 BPS
  • After: 5.0-12.0 BPS

Key changes:

  • state_changes_{family} tables have to be converted into state_changes_{family}_compact:
    • block_height changes to block_height_from representing the moment in time when the particular record becomes "active"
    • a new field block_height_to, together with block_height_from, defines the range in which a particular record is "active"
    • Altogether, this dramatically shrinks the lookup range for PostgreSQL, which speeds up reads of all state-change data in ReadRPC (state, access keys, account, contract)
  • Inserts and updates (including the new logic that supports block_height_to) are refactored to run in parallel on a per-partition basis.
    • A get_text_partition function is added on the PostgreSQL side to map an account_id to its partition
  • And a cherry on top: we need to migrate the block_height_from and block_height_to column types from numeric(20, 0) to bigint. These correspond to u64 and i64, respectively. Block height is a u64 (which is why we used numeric(20, 0)), but it will still fit in an i64 for a long time to come. The bigint type is much more performant on the PostgreSQL side for our inserts and updates.
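
To make the block_height_from / block_height_to idea concrete, here is a minimal in-memory sketch in Rust. It is not the code from this PR: the `CompactRecord` struct, the helper names, the use of `None` for a still-open range, and the "from inclusive, to exclusive" boundary are all assumptions made for illustration. It also shows the checked u64 to i64 conversion that a bigint column implies.

```rust
/// One row of a hypothetical `state_changes_{family}_compact` table.
#[derive(Debug, Clone, PartialEq)]
struct CompactRecord {
    /// Height at which this value became active.
    block_height_from: i64,
    /// Height at which this value was replaced; `None` means it is still
    /// active (an assumption of this sketch, not taken from the PR).
    block_height_to: Option<i64>,
    value: Vec<u8>,
}

/// Block heights are u64 on chain, while a bigint column holds an i64,
/// so convert with a check instead of a silent cast.
fn to_bigint(block_height: u64) -> Option<i64> {
    i64::try_from(block_height).ok()
}

/// Record a new value at `block_height` and close the previously active
/// record. Assumes changes arrive in ascending block height order.
fn apply_change(
    history: &mut Vec<CompactRecord>,
    block_height: u64,
    value: Vec<u8>,
) -> Option<()> {
    let height = to_bigint(block_height)?;
    if let Some(last) = history.last_mut() {
        if last.block_height_to.is_none() {
            last.block_height_to = Some(height);
        }
    }
    history.push(CompactRecord {
        block_height_from: height,
        block_height_to: None,
        value,
    });
    Some(())
}

/// Find the record that is active at `block_height`, treating the range as
/// from-inclusive and to-exclusive (again an assumption of this sketch).
fn active_at(history: &[CompactRecord], block_height: u64) -> Option<&CompactRecord> {
    let height = to_bigint(block_height)?;
    history.iter().find(|record| {
        record.block_height_from <= height
            && record.block_height_to.map_or(true, |to| height < to)
    })
}
```

With ranges stored this way, a read at a given height only has to match the one record whose range contains that height, rather than scanning every earlier change for the newest one.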

NB! Before releasing this we need to fix/update the migrations to ensure we create the proper schema. We also need to migrate all of our databases to the new schema.
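
The per-partition parallel writes listed under Key changes can be pictured roughly as follows. This is only a sketch: `partition_for` is a hypothetical stand-in for the PostgreSQL-side get_text_partition (whose actual hashing is not shown in this PR description), `PARTITION_COUNT` is an invented value, and plain scoped threads stand in for whatever concurrency the indexer really uses.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Hypothetical partition count, chosen only for this example.
const PARTITION_COUNT: u64 = 4;

/// Hypothetical stand-in for `get_text_partition`: an FNV-1a hash of the
/// account id, reduced modulo the partition count.
fn partition_for(account_id: &str) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in account_id.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash % PARTITION_COUNT
}

/// Group account ids by partition so each partition gets one batch.
fn group_by_partition<'a>(account_ids: &[&'a str]) -> BTreeMap<u64, Vec<&'a str>> {
    let mut groups = BTreeMap::new();
    for &id in account_ids {
        groups
            .entry(partition_for(id))
            .or_insert_with(Vec::new)
            .push(id);
    }
    groups
}

/// Handle every partition's batch on its own thread and return
/// `(partition, rows handled)` for each one.
fn write_in_parallel(groups: &BTreeMap<u64, Vec<&str>>) -> Vec<(u64, usize)> {
    thread::scope(|scope| {
        // Spawn all workers first so the batches actually overlap.
        let handles: Vec<_> = groups
            .iter()
            .map(|(&partition, batch)| {
                scope.spawn(move || {
                    // A real indexer would issue one batched INSERT/UPDATE
                    // against this partition here.
                    (partition, batch.len())
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Because every account id maps to exactly one partition, the batches touch disjoint partitions, which is what makes running them concurrently reasonable.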

@khorolets khorolets requested a review from kobayurii July 30, 2025 06:53
@khorolets khorolets added enhancement New feature or request performance labels Jul 30, 2025
@khorolets khorolets force-pushed the refactor/read-optimized-schema branch from d5cf306 to b9bfcdd Compare July 30, 2025 07:06
@khorolets khorolets changed the base branch from main to develop July 30, 2025 07:25
@kobayurii kobayurii force-pushed the refactor/read-optimized-schema branch 2 times, most recently from 2491ff4 to 41a471b Compare July 31, 2025 06:30
khorolets and others added 9 commits August 6, 2025 14:43
…ndexer into files. Remove redundant block_hash from handle_state_changes method in logic-state-indexer
… long state indexer writes take time and how many partitions touched
…artition number. Switch CTE to unnest for updates
…eights to biging (i64) to speed inserts and updates up
@kobayurii kobayurii force-pushed the refactor/read-optimized-schema branch from 4af4ca4 to 1bcc8c6 Compare August 6, 2025 11:43
@kobayurii kobayurii marked this pull request as ready for review August 11, 2025 15:37

@kobayurii kobayurii left a comment

Great! Thank you!

@kobayurii kobayurii merged commit 591ab22 into develop Aug 12, 2025
3 checks passed