Skip to content

Conversation

@yongkangc
Copy link
Member

@yongkangc yongkangc commented Dec 21, 2025

Add RocksDB infrastructure for history lookups without changing any behavior.

This is PR 2/3 of a stacked PR series that splits #20412 for easier review. Closes #20388

Depends on: #20542

Review focus:

  • Correctness of RocksTx history lookup methods
  • RocksDBBatch ownership change is safe

Changes

rocksdb/provider.rs

  • Change RocksDBBatch to own Arc<RocksDBProviderInner> instead of borrowing
    • Allows storing RocksDBBatch in DatabaseProvider without lifetime issues
  • Add RocksTx::account_history_info() method

Stack

Did some testing and benchmark here: https://gist.github.com/yongkangc/6dfedca18b842cd41ea2864f248c1a07
MDBX wins by 2x but we are still using Rocksdb for the optmised writes.

@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Dec 21, 2025
@yongkangc yongkangc changed the title feat(storage): add RocksDB history lookup methods and owned batch type feat(storage): add RocksDB history lookup methods and owned batch type [2/3] Dec 21, 2025
@yongkangc yongkangc force-pushed the yk/pr2-rocksdb-infrastructure branch from e45acf9 to 40f84f2 Compare December 21, 2025 02:12
@yongkangc yongkangc force-pushed the yk/pr2-rocksdb-infrastructure branch from 7b937ec to 9aefeba Compare December 22, 2025 03:05
@yongkangc yongkangc self-assigned this Dec 22, 2025
@yongkangc yongkangc moved this from Backlog to In Review in Reth Tracker Dec 22, 2025
@yongkangc yongkangc marked this pull request as ready for review December 22, 2025 04:50
pub struct RocksDBBatch<'a> {
provider: &'a RocksDBProvider,
pub struct RocksDBBatch {
provider: Arc<RocksDBProviderInner>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If RocksDBBatch<'a> borrows &'a RocksDBProvider, then DatabaseProvider would need to:

  1. Store both the RocksDBProvider and the RocksDBBatch<'a> that borrows from it
  2. This creates a self-referential struct

Copy link
Collaborator

@joshieDo joshieDo Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isnt this just RocksDBProvider? RocksDBProvider is already an arc over an inner.

I think we might have to revisit this struct in another PR, because it feels odd

Above we specify

Uses WriteBatchWithTransaction for atomic writes without full transaction overhead.

but in fact RocksDbProviderInner holds a Transaction.

And if we're implementing read, maybe the best option would be to use the optimistic TX directly? unsure if we need to read through our writes here, but seems like a footgun if we do now or in the future

@yongkangc yongkangc force-pushed the yk/pr2-rocksdb-infrastructure branch from e885749 to 6b3de83 Compare December 22, 2025 04:55
@yongkangc yongkangc force-pushed the yk/pr2-rocksdb-infrastructure branch 4 times, most recently from fad81ba to 59af41c Compare December 23, 2025 04:45
@yongkangc yongkangc force-pushed the yk/pr1-history-info-abstraction branch 2 times, most recently from be6d534 to bb4782b Compare December 23, 2025 05:10
Split the history lookup logic:
- `needs_prev_shard_check`: pure helper to check if cursor.prev() is needed
- `find_changeset_block_from_index`: pure const fn for final decision

The cursor.prev() call uses && short-circuit at the call site, so it's
only executed in the rare case where rank==0 and found_block != target.

Also reorganized file to follow ordering convention:
enums -> structs -> functions
…_index

The new name better describes the function's purpose: given a history
index shard, find which changeset block contains the historical value
for a target block.

Also moved the function below the impl block for better code organization.
Enhance RocksDB provider infrastructure for history lookups:

- Change RocksDBBatch to own Arc<RocksDBProviderInner> instead of borrowing
  This allows storing RocksDBBatch in DatabaseProvider without lifetime issues.

- Add RocksTx::account_history_info() and storage_history_info() methods
  These use the shared history_info_from_shard algorithm to lookup history
  in RocksDB with the same semantics as MDBX.

- Update rocksdb_stub.rs with matching API (batch(), commit(), len(), is_empty())

This prepares the infrastructure for PR3 which will wire RocksDB
into HistoricalStateProvider and DatabaseProvider HistoryWriter.
- Introduced `raw_iterator_cf` method for bidirectional traversal in RocksDB.
- Updated `account_history_info` and `storage_history_info` methods to utilize raw iterators, enabling efficient previous shard detection.
- Improved error handling and decoding for key/value retrieval in history lookups.
- Added comprehensive tests for various scenarios, including single and multiple shard lookups, and edge cases for history queries.
- Enhanced `raw_iterator_cf` method to support bidirectional traversal and added `raw_iter_status_ok` for error checking.
- Updated `account_history_info` and `storage_history_info` methods to utilize the new iterator features for efficient history lookups.
- Improved documentation for clarity on how history lookups work with sharded data.
- Added tests to validate the behavior of the new iterator methods and ensure correct handling of edge cases.
Change EitherWriter::RocksDB to borrow the batch instead of owning it.
This allows callers to retain ownership and access the batch after the
writer is dropped, which is needed for history lookups that must peek
at pending writes.

- RocksBatchArg changes from owned to `&'a mut RocksDBBatch`
- Remove `into_raw_rocksdb_batch()` - no longer needed with borrowing
- Update tests to use scoped writer pattern
Change EitherWriter::RocksDB to borrow the batch instead of owning it.
This allows callers to retain ownership and access the batch after the
writer is dropped, which is needed for history lookups that must peek
at pending writes.

- RocksBatchArg changes from owned to `&'a mut RocksDBBatch`
- Remove `into_raw_rocksdb_batch()` - no longer needed with borrowing
- Update tests to use scoped writer pattern
… RocksDBBatch

Address review feedback: RocksDBProvider is already an Arc wrapper, so use it
directly instead of duplicating the Arc access pattern.
Adapt RocksDB history functions to use the new closure-based
has_previous_shard parameter from find_changeset_block_from_index.
This aligns with the PR1 change to make cursor.prev() lazy.
@yongkangc yongkangc force-pushed the yk/pr2-rocksdb-infrastructure branch from 59af41c to f3021e2 Compare December 23, 2025 06:17
@yongkangc yongkangc force-pushed the yk/pr1-history-info-abstraction branch from 43961c9 to 03e7abb Compare December 23, 2025 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

3 participants