
Conversation


@yongkangc (Member) commented Dec 23, 2025

Addresses #20392 - Initialize RocksDB provider and add consistency check on startup.

Changes

  • Add RocksDB check_consistency() call in node launch with correct ordering (sketched below):
    1. StaticFileProvider::check_file_consistency() - heals interrupted writes
    2. RocksDB::check_consistency() - uses static file data for tx hash pruning
    3. StaticFileProvider::check_consistency() - compares with MDBX checkpoints
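
A minimal sketch of that ordering as it would appear around node launch; the `factory` binding, return handling, and error propagation are simplified for illustration, while the two `check_consistency` calls mirror the diff quoted further down:

```rust
// Step 1: heal interrupted static file writes (file-level only, no pruning yet).
let file_unwind = factory.static_file_provider().check_file_consistency()?;

// Step 2: RocksDB consistency check, which still needs static file tx data.
let rocksdb_unwind =
    factory.rocksdb_provider().check_consistency(&factory.database_provider_ro()?)?;

// Step 3: static file checkpoint consistency against MDBX (may prune).
let static_file_unwind =
    factory.static_file_provider().check_consistency(&factory.provider()?)?;
```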

Note for future:

Related

Fix the ordering of consistency checks on node startup to ensure RocksDB
can access static file transaction data before static files potentially
prune it.

Changes:
- Add StaticFileProvider::check_file_consistency() for file healing
- Update node launch with correct 3-step ordering:
  1. File healing (no pruning)
  2. RocksDB check (needs SF tx data)
  3. Static file checkpoint check (may prune)
- Fix StoragesHistory behind-checkpoint detection
- Add checkpoint > 0 check when MDBX empty but SF has data (both conditions are sketched below)
- Add missing stub implementations for non-Unix platforms
- Remove unnecessary bound from Cli::configure

The ChainSpec bound removal was unrelated to the consistency check ordering fix.
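
To make the StoragesHistory behind-checkpoint detection and the checkpoint > 0 guard concrete, here is a minimal, self-contained sketch of the two conditions. The function and parameter names are hypothetical and only illustrate the logic, not the actual implementation in `rocksdb/invariants.rs`:

```rust
/// Hypothetical illustration of the two invariant conditions described above.
fn is_inconsistent(
    last_entry_block: Option<u64>, // highest block present in the RocksDB table, if any
    stage_checkpoint: u64,         // corresponding MDBX stage checkpoint
    static_files_have_data: bool,
) -> bool {
    match last_entry_block {
        // Behind-checkpoint detection: the table stops before the stage checkpoint.
        Some(last) => last < stage_checkpoint,
        // MDBX side is empty while static files already hold data: only an
        // inconsistency if the checkpoint claims progress was made (checkpoint > 0).
        None => static_files_have_data && stage_checkpoint > 0,
    }
}

fn main() {
    assert!(is_inconsistent(Some(90), 100, true)); // table is behind the checkpoint
    assert!(is_inconsistent(None, 100, true));     // empty table, but checkpoint > 0
    assert!(!is_inconsistent(None, 0, true));      // fresh node, nothing to flag
}
```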
Copilot AI left a comment


Pull request overview

This PR fixes the consistency check ordering during node startup to prevent data inconsistencies across MDBX, static files, and RocksDB storage layers. The fix addresses issue #20392 by implementing a three-phase consistency check process that ensures proper sequencing of healing and validation operations.

  • Adds a new check_file_consistency() method that heals NippyJar file-level inconsistencies without pruning data
  • Implements RocksDB consistency checks that validate TransactionHashNumbers and StoragesHistory tables against stage checkpoints
  • Updates node launch to orchestrate all three consistency checks in the correct order with combined unwind target calculation
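
The combined unwind target is simply the lowest block requested by any of the three checks, since unwinding to the earliest block also satisfies every later request. A small self-contained illustration (the helper name is hypothetical):

```rust
// Keep the earliest (lowest) requested unwind block across all checks.
fn combine_unwind_targets(targets: [Option<u64>; 3]) -> Option<u64> {
    targets.into_iter().flatten().min()
}

fn main() {
    let file_unwind = None;             // static file healing found nothing
    let rocksdb_unwind = Some(1_200);   // RocksDB is behind MDBX
    let static_file_unwind = Some(900); // static files are behind MDBX
    assert_eq!(
        combine_unwind_targets([file_unwind, rocksdb_unwind, static_file_unwind]),
        Some(900)
    );
}
```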

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| crates/storage/provider/src/providers/static_file/manager.rs | Adds check_file_consistency() method to heal file-level inconsistencies across all segments before other checks depend on static file data |
| crates/storage/provider/src/providers/rocksdb_stub.rs | Adds stub implementations for new RocksDB methods (batch, first, last, iter, check_consistency) to support non-Unix platforms |
| crates/storage/provider/src/providers/rocksdb/invariants.rs | Enhances RocksDB consistency checks with additional unwind conditions when checkpoint/MDBX state mismatches are detected |
| crates/node/builder/src/launch/common.rs | Implements three-phase consistency check ordering with combined unwind target calculation and enhanced error reporting |


Eliminate redundant warnings and checks related to checkpoint consistency when MDBX is empty. This simplifies the logic in the RocksDBProvider by focusing on essential conditions for pruning and unwinding.
Comment on lines +521 to +526
```rust
factory.rocksdb_provider().check_consistency(&factory.database_provider_ro()?)?;

// Step 3: Static file checkpoint consistency (may prune)
let static_file_unwind = factory
    .static_file_provider()
    .check_consistency(&factory.provider()?)?
```
Collaborator

nit, can use the same provider_ro i think
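
A rough sketch of that suggestion, assuming both checks can accept the same read-only provider (untested, for illustration only):

```rust
// Open one read-only provider and reuse it for both checks
// instead of creating a fresh one per call.
let provider_ro = factory.database_provider_ro()?;
let rocksdb_unwind = factory.rocksdb_provider().check_consistency(&provider_ro)?;

// Step 3: Static file checkpoint consistency (may prune)
let static_file_unwind =
    factory.static_file_provider().check_consistency(&provider_ro)?;
```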

"Setting unwind target."
);
let target = highest_block.unwrap_or_default();
unwind_target = Some(unwind_target.map_or(target, |t| t.min(target)));
@joshieDo (Collaborator) commented Dec 23, 2025

I understand this is the current behaviour, but I'm unsure if we need to return an unwind_target at all. target should always be >= stage checkpoint, which then gets healed on check_consistency

Comment on lines 537 to +554
```diff
 // Highly unlikely to happen, and given its destructive nature, it's better to panic
-// instead.
+// instead. Unwinding to 0 would leave MDBX with a huge free list size.
+let inconsistency_source = match (file_unwind, rocksdb_unwind, static_file_unwind) {
+    (Some(_), Some(_), Some(_)) => {
+        "static file healing, RocksDB <> MDBX, and static file <> MDBX"
+    }
+    (Some(_), Some(_), None) => "static file healing and RocksDB <> MDBX",
+    (Some(_), None, Some(_)) => "static file healing and static file <> MDBX",
+    (None, Some(_), Some(_)) => "RocksDB <> MDBX and static file <> MDBX",
+    (Some(_), None, None) => "static file healing",
+    (None, Some(_), None) => "RocksDB <> MDBX",
+    (None, None, Some(_)) => "static file <> MDBX",
+    (None, None, None) => unreachable!(),
+};
 assert_ne!(
-    unwind_target,
-    PipelineTarget::Unwind(0),
-    "A static file <> database inconsistency was found that would trigger an unwind to block 0"
+    unwind_block,
+    0,
+    "A {inconsistency_source} inconsistency was found that would trigger an unwind to block 0"
```
@joshieDo (Collaborator) commented Dec 23, 2025

can we move this to a helper, and maybe just increment the err str or smth, because this 3-way match is ugly

Member Author

Good note, initially thought that it might be a shallow function
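
For reference, a minimal self-contained sketch of the kind of helper being suggested: collect the names of whichever sources requested an unwind and join them, instead of matching on every combination. Names and output format are hypothetical, not the merged code:

```rust
/// Builds a human-readable list of the inconsistency sources that requested an
/// unwind, e.g. "static file healing and RocksDB <> MDBX".
fn inconsistency_sources(
    file_unwind: Option<u64>,
    rocksdb_unwind: Option<u64>,
    static_file_unwind: Option<u64>,
) -> String {
    // Pair each potential unwind request with its description and keep only the present ones.
    let sources: Vec<&str> = [
        (file_unwind, "static file healing"),
        (rocksdb_unwind, "RocksDB <> MDBX"),
        (static_file_unwind, "static file <> MDBX"),
    ]
    .into_iter()
    .filter_map(|(unwind, name)| unwind.map(|_| name))
    .collect();

    match sources.as_slice() {
        [] => unreachable!("called without any unwind request"),
        [single] => (*single).to_string(),
        [rest @ .., last] => format!("{} and {}", rest.join(", "), last),
    }
}

fn main() {
    assert_eq!(
        inconsistency_sources(Some(1), None, Some(2)),
        "static file healing and static file <> MDBX"
    );
    assert_eq!(inconsistency_sources(None, Some(5), None), "RocksDB <> MDBX");
}
```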

