refactor(database, state-indexer): State schema improvements for reads and updates #410
Merged
Conversation
Force-pushed from d5cf306 to b9bfcdd.
Force-pushed from 2491ff4 to 41a471b.
Commits:
- …changes that is more efficient to read from
- …ndexer into files. Remove redundant block_hash from handle_state_changes method in logic-state-indexer
- … long state indexer writes take time and how many partitions touched
- …artition number. Switch CTE to unnest for updates
- …eights to bigint (i64) to speed inserts and updates up
- … to use i64 instead of BigDecimal
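The "Switch CTE to unnest for updates" commit above roughly corresponds to passing the whole batch of changes as parallel arrays and expanding them server-side, instead of building a large CTE of VALUES rows. The sketch below is a hypothetical illustration only; the table name, column names, and exact statement used by the indexer are assumptions.

```sql
-- Hypothetical sketch: close a batch of records by setting block_height_to,
-- using unnest() over parallel arrays instead of a CTE of VALUES rows.
-- Table and column names are illustrative, not the PR's exact schema.
UPDATE state_changes_data_compact AS s
SET    block_height_to = u.block_height_to
FROM   unnest(
           $1::text[],   -- account_ids of the records to close
           $2::text[],   -- data keys of the records to close
           $3::bigint[]  -- heights at which each record stops being "active"
       ) AS u(account_id, data_key, block_height_to)
WHERE  s.account_id = u.account_id
  AND  s.data_key   = u.data_key
  AND  s.block_height_to IS NULL;
```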
Force-pushed from 4af4ca4 to 1bcc8c6.
kobayurii (Member) approved these changes on Aug 11, 2025 and left a comment:
Great! Thank you!
This PR is a continuation of @kobayurii's work in #394, which we rejected because we could not write (index) the data fast enough to support that idea.
After a number of experiments with other ideas for improving the schema, we came back to that schema and optimized it enough to increase write throughput on our side so that we can index the tip of the network or run a backfill.
TL;DR
Average blocks per second:
Key changes:
- `state_changes_{family}` tables have to be converted into `state_changes_{family}_compact`:
  - `block_height` changes to `block_height_from`, representing the moment in time when the particular record becomes "active"
  - `block_height_to` together with `block_height_from` builds up the range of the "active" period for a particular record
- Updates (setting `block_height_to`) are refactored to be done in parallel on a per-partition basis (see the sketch below).
- A `get_text_partition` function is added on the PostgreSQL side that allows mapping an `account_id` to its partition.
- `block_height_from` and `block_height_to` column types change from `numeric(20, 0)` to `bigint`. Correspondingly, they are represented as `u64` and `i64` in code. While block height is `u64` (the reason we used `numeric(20, 0)`), we can still fit it in `i64` for a long time from now, and the `bigint` field is much more performant on the PostgreSQL side for our inserts and updates.

NB! Before releasing this we need to fix/update the migrations to ensure we create the proper schema. We also need to migrate our entire databases to the new schema.
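To make the shape of these changes concrete, here is a minimal, hypothetical sketch of the compact layout, the partition helper, and a read over the "active" range. The table name, value columns, partitioning scheme, and the hash used inside `get_text_partition` are assumptions for illustration; the actual definitions live in this PR's migrations.

```sql
-- Hypothetical sketch of a state_changes_{family}_compact table.
-- Real column names, value columns and partition count may differ.
CREATE TABLE state_changes_data_compact (
    account_id        text   NOT NULL,
    data_key          text   NOT NULL,
    block_height_from bigint NOT NULL, -- height at which the record becomes "active"
    block_height_to   bigint,          -- NULL while the record is still the latest one
    data_value        bytea
) PARTITION BY HASH (account_id);
-- (individual hash partitions are created separately in the migration)

-- Illustrative helper mapping an account_id to a partition number so that
-- updates can be grouped and executed in parallel per partition. The real
-- get_text_partition must agree with however the tables are actually partitioned.
CREATE FUNCTION get_text_partition(acc text, partitions int)
RETURNS int
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT ((hashtext(acc) % partitions) + partitions) % partitions;
$$;

-- With the [block_height_from, block_height_to) range in place, reading the
-- value that was active at a given block height becomes a simple range lookup:
SELECT data_value
FROM   state_changes_data_compact
WHERE  account_id = $1
  AND  data_key   = $2
  AND  block_height_from <= $3
  AND  (block_height_to IS NULL OR block_height_to > $3)
ORDER BY block_height_from DESC
LIMIT  1;
```

The range lookup above is the read-side payoff of the compact schema: instead of scanning every historical change row for a key, a single record answers "what was the value at height H" once its active period covers H.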