
Conversation

@yongkangc
Member

@yongkangc yongkangc commented Dec 21, 2025

Summary

Extract the history lookup algorithm into a reusable function that can be shared between MDBX and RocksDB backends.

This is PR 1/3 of a stacked PR series that splits #20412 for easier review. Closes #20388

Review focus

  • Verify find_changeset_block_from_index correctly captures the original algorithm
  • Check documentation on HistoryInfo variants is accurate

Changes

  • Add HistoryInfo enum with detailed documentation explaining each variant
  • Extract find_changeset_block_from_index() function containing the core rank/select algorithm
  • Refactor HistoricalStateProviderRef::history_info() to use the shared function
  • Move the function below the impl block for better code organization
  • Export HistoryInfo and find_changeset_block_from_index from providers module
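For readers unfamiliar with the rank/select step named above, here is a minimal sketch of the idea. It assumes a shard can be modelled as an ascending slice of block numbers; the function name `rank_select` and the slice representation are hypothetical and are not the types used in this PR.

```rust
/// Hypothetical stand-in for one history-index shard: the ascending block
/// numbers at which a key was written.
///
/// Returns `(rank, found_block)`, where `rank` counts the writes strictly
/// before `target` and `found_block` is the first write at or after `target`.
pub fn rank_select(shard: &[u64], target: u64) -> (usize, Option<u64>) {
    // `partition_point` performs a binary search on the sorted slice and
    // returns the index of the first element for which the predicate is false.
    let rank = shard.partition_point(|&block| block < target);
    // "Select" the entry at that rank, if the shard has one.
    (rank, shard.get(rank).copied())
}
```

With a shard of `[5, 10, 20]`, a target of 3 gives rank 0 and block 5, while a target of 21 gives rank 3 and no block, which corresponds to the case where the lookup falls through to plain state.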

Stack

@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Dec 21, 2025
@yongkangc yongkangc changed the title refactor(storage): extract shared history_info_from_shard algorithm refactor(storage): extract shared history_info_from_shard algorithm [1/3] Dec 21, 2025
@yongkangc yongkangc self-assigned this Dec 22, 2025
@yongkangc yongkangc marked this pull request as ready for review December 22, 2025 03:03
@yongkangc yongkangc force-pushed the yk/pr1-history-info-abstraction branch from 6f8341e to 244c864 Compare December 22, 2025 03:04
@yongkangc yongkangc changed the title refactor(storage): extract shared history_info_from_shard algorithm [1/3] refactor(storage): extract shared find_changeset_block_from_index algorithm [1/3] Dec 22, 2025
}
if let Some(chunk) = cursor.seek(key)?.filter(|(key, _)| key_filter(key)).map(|x| x.1) {
// Check if there's a previous shard for the same key
let has_previous_shard = cursor.prev()?.is_some_and(|(key, _)| key_filter(&key));
Collaborator

@joshieDo joshieDo Dec 22, 2025

// This check is worth it, the cursor.prev() check is rarely triggered (the if will
// short-circuit) and when it passes we save a full seek into the changeset/plain state
// table.

Before, this line was rarely triggered, but now it always will be.

maybe we should make this a closure instead and pass that if we don't want to be passing the cursor?
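A rough sketch of what the closure approach could look like, assuming the shared function receives the rank and found block directly. The function name `is_before_first_write` and the `has_previous_shard` parameter are hypothetical; the closure stands in for the `cursor.prev()` lookup so the shared code never sees the cursor.

```rust
/// Decides whether the target block lies before the first recorded write.
///
/// `has_previous_shard` is a hypothetical closure wrapping the expensive
/// `cursor.prev()` lookup. It is invoked only when both cheap checks pass.
pub fn is_before_first_write<E>(
    rank: u64,
    found_block: Option<u64>,
    target: u64,
    has_previous_shard: impl FnOnce() -> Result<bool, E>,
) -> Result<bool, E> {
    // `&&` short-circuits left to right, so the closure is skipped unless
    // `rank == 0` and the found block differs from the target.
    Ok(rank == 0 && found_block != Some(target) && !has_previous_shard()?)
}
```

The caller would pass something like a closure that calls `cursor.prev()` and applies the key filter, which keeps the lookup lazy as it was before the refactor.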

Member Author

@yongkangc yongkangc Dec 23, 2025

Ahh, before, it only triggered when the target block was before the first write in the shard, thanks to the `&&` short-circuit.

good point, there might be a more elegant way to refactor this

@yongkangc yongkangc force-pushed the yk/pr1-history-info-abstraction branch from 7b5ad36 to ebb67cf Compare December 23, 2025 04:44
Add HistoryInfo enum with documentation and extract the shared
rank/select algorithm into history_info_from_shard function.

This prepares for RocksDB integration by making the history lookup
algorithm reusable across different storage backends (MDBX, RocksDB).

Changes:
- Add HistoryInfo enum with detailed doc comments
- Add history_info_from_shard function with the core algorithm
- Refactor history_info method to use the shared function
- Export HistoryInfo and history_info_from_shard from providers module
…_index

The new name better describes the function's purpose: given a history
index shard, find which changeset block contains the historical value
for a target block.
@yongkangc yongkangc force-pushed the yk/pr1-history-info-abstraction branch 4 times, most recently from 82d60a2 to 43961c9 Compare December 23, 2025 06:13
Split the history lookup logic:
- `needs_prev_shard_check`: pure helper to check if cursor.prev() is needed
- `find_changeset_block_from_index`: pure const fn for final decision

The cursor.prev() call uses && short-circuit at the call site, so it's
only executed in the rare case where rank==0 and found_block != target.

Also reorganized file to follow ordering convention:
enums -> structs -> functions
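
The split described in this commit message can be sketched roughly as follows. The branch logic mirrors the removed lines quoted later in this thread, but the exact signatures are assumptions: `u64` stands in for the block number type, and only the three `HistoryInfo` variants visible in this conversation are modelled.

```rust
/// Where to find the historical value (only the variants seen in this thread).
#[derive(Debug, Eq, PartialEq)]
pub enum HistoryInfo {
    NotYetWritten,
    InChangeset(u64),
    InPlainState,
}

/// Cheap pre-check: is the `cursor.prev()` lookup needed at all?
/// (Assumed signature; not `const` because `Option` comparison is not const.)
pub fn needs_prev_shard_check(rank: u64, found_block: Option<u64>, block_number: u64) -> bool {
    rank == 0 && found_block != Some(block_number)
}

/// Final decision, with no cursor or I/O involved.
pub const fn find_changeset_block_from_index(
    found_block: Option<u64>,
    is_before_first_write: bool,
    lowest_available: Option<u64>,
) -> HistoryInfo {
    if is_before_first_write {
        match (lowest_available, found_block) {
            // History may have been pruned, so a changeset lookup is needed.
            (Some(_), Some(block)) => HistoryInfo::InChangeset(block),
            // The key is written to, but only after the target block.
            _ => HistoryInfo::NotYetWritten,
        }
    } else {
        match found_block {
            // The shard has a write at or after the target block.
            Some(block) => HistoryInfo::InChangeset(block),
            // No later write in the last shard: read the plain state.
            None => HistoryInfo::InPlainState,
        }
    }
}
```

At the call site, `is_before_first_write` would be computed as `needs_prev_shard_check(..) && !has_previous_shard`, so the `cursor.prev()` call stays behind the short-circuit.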
@yongkangc yongkangc force-pushed the yk/pr1-history-info-abstraction branch from 43961c9 to 03e7abb Compare December 23, 2025 08:20
// happen if this is the last chunk and so we need to look in the plain state.
Ok(HistoryInfo::InPlainState)
}
let is_before_first_write =
Member Author

@joshieDo instead of going with a closure, I extracted it into a pure method, which I thought was simpler

@yongkangc yongkangc requested a review from joshieDo December 23, 2025 08:25
@yongkangc
Member Author

@joshieDo just addressed your comments :)

Comment on lines -194 to -210
if let (Some(_), Some(block_number)) = (lowest_available_block_number, block_number)
{
// The key may have been written, but due to pruning we may not have changesets
// and history, so we need to make a changeset lookup.
Ok(HistoryInfo::InChangeset(block_number))
} else {
// The key is written to, but only after our block.
Ok(HistoryInfo::NotYetWritten)
}
} else if let Some(block_number) = block_number {
// The chunk contains an entry for a write after our block, return it.
Ok(HistoryInfo::InChangeset(block_number))
} else {
// The chunk does not contain an entry for a write after our block. This can only
// happen if this is the last chunk and so we need to look in the plain state.
Ok(HistoryInfo::InPlainState)
}
Member Author

This becomes part of `find_changeset_block_from_index`.

Comment on lines -190 to -191
if rank == 0 &&
block_number != Some(self.block_number) &&
Member Author

These two conditions go into `needs_prev_shard_check`; imo it's easier to read and understand.

What I didn't like about the previous code was that it had too many conditionals and was hard to hold in my head.

/// should be computed as: `rank == 0 && found_block != Some(block_number) && !has_previous_shard`
/// where `has_previous_shard` comes from a lazy `cursor.prev()` check.
/// * `lowest_available` - Lowest block where history is available (pruning boundary)
pub const fn find_changeset_block_from_index(
Collaborator

maybe as impl HistoryInfo?

Collaborator

HistoryInfo::from_lookup

Member Author

Ah yes that's much nicer

/// Indicates where to find the historical value for a given key at a specific block.
#[derive(Debug, Eq, PartialEq)]
pub enum HistoryInfo {
/// The key is written to, but only after our block (not yet written at the target block).
Collaborator

Suggested change
/// The key is written to, but only after our block (not yet written at the target block).
/// The key is written to, but only after our block (not yet written at the target block). Or it has never been written.

i think

Member Author

Thanks for the note

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Dec 23, 2025

Labels

None yet

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Use EitherReader/EitherWriter in DatabaseProvider and HistoricalStateProvider

3 participants