Conversation

@duyquang6
Contributor

Summary

  • Optimizes append_history_index

Profiling append_history_index:

[profiling flamegraph of append_history_index]
  • The Vec collect, delete_current, and IntegerList::new_presorted cursor operations dominate append_history_index

Changes

  • Replace seek_exact + delete + insert with seek_exact + upsert, removing the delete_current operation from the hot path
  • Fast-path optimization: when the appended indices fit within the current shard (≤2000 indices), upsert directly without collecting into a Vec, avoiding the allocation, clone, and chunking overhead (a toy sketch follows below)
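
For illustration, a self-contained toy sketch of the fast path (not the reth implementation; `SHARD_SIZE`, `Shard`, and the `BTreeMap` below stand in for `sharded_key::NUM_OF_INDICES_IN_SHARD`, `BlockNumberList`, and the MDBX table):

```rust
use std::collections::BTreeMap;

// Stand-in for sharded_key::NUM_OF_INDICES_IN_SHARD (2000 in reth).
const SHARD_SIZE: usize = 4;

type Shard = Vec<u64>;
// Key = highest block number covered by a shard; the open (last) shard is keyed by u64::MAX.
type Db = BTreeMap<u64, Shard>;

/// Returns true if the fast path applied.
fn append_fast_path(db: &mut Db, new_indices: &[u64]) -> bool {
    let last = db.entry(u64::MAX).or_default();
    if last.len() + new_indices.len() <= SHARD_SIZE {
        // Fast path: extend the open shard in place. In MDBX terms this is a single
        // `upsert` under the same key, with no intermediate Vec and no `delete_current`.
        last.extend_from_slice(new_indices);
        true
    } else {
        // Slow path: the merged indices would be re-chunked into full shards here.
        false
    }
}

fn main() {
    let mut db = Db::new();
    assert!(append_fast_path(&mut db, &[1, 2]));
    assert!(append_fast_path(&mut db, &[3, 4]));
    assert!(!append_fast_path(&mut db, &[5])); // would overflow the open shard
    assert_eq!(db[&u64::MAX], vec![1, 2, 3, 4]);
}
```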

Before

[flamegraph: before the change]

After

[flamegraph: after the change]

Profile result:
[profiling comparison]

@mediocregopher
Collaborator

This is actually a footgun with the MDBX implementation: DUPSORT tables do not support upsert as you would expect. When upsert is called on a DUPSORT table, it will actually only insert and leave any previous entry in place.

/// For a DUPSORT table, `upsert` will not actually update-or-insert. If the key already exists,
/// it will append the value to the subkey, even if the subkeys are the same. So if you want
/// to properly upsert, you'll need to `seek_exact` & `delete_current` if the key+subkey was
/// found, before calling `upsert`.

So unless this PR only affects upsert on non-DUPSORT tables we cannot merge it.
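
For reference, a minimal sketch of the workaround that doc comment describes, written against a hypothetical cursor trait rather than reth's actual cursor API:

```rust
/// Hypothetical stand-in for a DUPSORT cursor; reth's real trait and signatures differ.
trait DupCursor<K: Clone, V> {
    type Err;
    fn seek_exact(&mut self, key: K) -> Result<Option<(K, V)>, Self::Err>;
    fn delete_current(&mut self) -> Result<(), Self::Err>;
    fn upsert(&mut self, key: K, value: V) -> Result<(), Self::Err>;
}

/// A true update-or-insert on a DUPSORT table: delete the existing entry first,
/// otherwise `upsert` just appends another value under the same key.
/// (The real check also matches the subkey; omitted here for brevity.)
fn true_upsert<K: Clone, V, C: DupCursor<K, V>>(
    cursor: &mut C,
    key: K,
    value: V,
) -> Result<(), C::Err> {
    if cursor.seek_exact(key.clone())?.is_some() {
        cursor.delete_current()?;
    }
    cursor.upsert(key, value)
}
```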

@duyquang6
Contributor Author

> This is actually a footgun with the MDBX implementation: DUPSORT tables do not support upsert as you would expect. When upsert is called on a DUPSORT table, it will actually only insert and leave any previous entry in place.
>
> /// For a DUPSORT table, `upsert` will not actually update-or-insert. If the key already exists,
> /// it will append the value to the subkey, even if the subkeys are the same. So if you want
> /// to properly upsert, you'll need to `seek_exact` & `delete_current` if the key+subkey was
> /// found, before calling `upsert`.
>
> So unless this PR only affects upsert on non-DUPSORT tables we cannot merge it.

Hi sir, nice catch!

I've added a runtime assertion to prevent append_history_index from being used with DUPSORT tables. As for current usage, this function is only used for two tables, AccountsHistory and StorageHistory, and both are non-DUPSORT.
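
A minimal, self-contained sketch of the kind of guard described here (the `TableMeta` trait and `DUPSORT` flag are stand-ins; reth's real table metadata API differs):

```rust
/// Stand-in table descriptor; reth's real `Table` trait differs.
trait TableMeta {
    const NAME: &'static str;
    const DUPSORT: bool;
}

struct AccountsHistory; // the history tables are not DUPSORT
impl TableMeta for AccountsHistory {
    const NAME: &'static str = "AccountsHistory";
    const DUPSORT: bool = false;
}

/// Fail fast if the upsert-based fast path were ever used with a DUPSORT table,
/// where `upsert` appends instead of replacing.
fn assert_not_dupsort<T: TableMeta>() {
    assert!(!T::DUPSORT, "append_history_index is not supported for DUPSORT table {}", T::NAME);
}

fn main() {
    assert_not_dupsort::<AccountsHistory>(); // passes: AccountsHistory is non-DUPSORT
}
```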

@mediocregopher
Collaborator

> Hi sir, nice catch!
>
> I've added a runtime assertion to prevent append_history_index from being used with DUPSORT tables. As for current usage, this function is only used for two tables, AccountsHistory and StorageHistory, and both are non-DUPSORT.

Ah nice, I thought the history tables were dupsort but you're right, they are not, my bad. Thanks for adding the assert 👍

Collaborator

@mediocregopher left a comment


This LGTM but would definitely like a sanity check from @shekhirin or @joshieDo

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Nov 20, 2025
This reverts commit 6a16d17.
@shekhirin
Collaborator

shekhirin commented Nov 21, 2025

^ committed the above to the wrong branch, sorry -.-

@shekhirin
Collaborator

I see, this makes sense! I think we can even improve it further, like this, to avoid collecting last_shard into a vector first.

diff --git a/crates/storage/provider/src/providers/database/provider.rs b/crates/storage/provider/src/providers/database/provider.rs
index ca0564bbfa..b0baef3cd4 100644
--- a/crates/storage/provider/src/providers/database/provider.rs
+++ b/crates/storage/provider/src/providers/database/provider.rs
@@ -833,21 +833,19 @@ impl<TX: DbTxMut + DbTx + 'static, N: NodeTypes> DatabaseProvider<TX, N> {
             }
 
             // slow path: rechunk into multiple shards
-            let all_indices: Vec<u64> = last_shard.iter().collect();
-            let mut chunks = all_indices.chunks(sharded_key::NUM_OF_INDICES_IN_SHARD).peekable();
+            let chunks = last_shard.iter().chunks(sharded_key::NUM_OF_INDICES_IN_SHARD);
+            let mut chunks_peekable = chunks.into_iter().peekable();
 
-            while let Some(list) = chunks.next() {
-                let highest_block_number = if chunks.peek().is_some() {
-                    *list.last().expect("`chunks` does not return empty list")
+            while let Some(chunk) = chunks_peekable.next() {
+                let shard = BlockNumberList::new_pre_sorted(chunk);
+                let highest_block_number = if chunks_peekable.peek().is_some() {
+                    shard.iter().next_back().expect("`chunks` does not return empty list")
                 } else {
                     // Insert last list with `u64::MAX`.
                     u64::MAX
                 };
 
-                cursor.upsert(
-                    sharded_key_factory(partial_key, highest_block_number),
-                    &BlockNumberList::new_pre_sorted(list.iter().copied()),
-                )?;
+                cursor.upsert(sharded_key_factory(partial_key, highest_block_number), &shard)?;
             }
         }

@duyquang6
Contributor Author

duyquang6 commented Nov 21, 2025

> I see, this makes sense! I think we can even improve it further, like this, to avoid collecting last_shard into a vector first.

Thanks! This should be more efficient since we avoid the intermediate allocation. Updated.

@emhane emhane added C-perf A change motivated by improving speed, memory usage or disk footprint A-db Related to the database labels Nov 24, 2025