feat: check checkpoint index and block height to identify reorg earlier#7266
feat: check checkpoint index and block height to identify reorg earlier#7266
Conversation
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7266 +/- ##
=====================================
Coverage 0.00% 0.00%
=====================================
Files 1 1
Lines 14 14
=====================================
Misses 14 14
🚀 New features to boost your workflow:
|
📝 WalkthroughWalkthroughThis PR introduces checkpoint persistence and reorg detection mechanisms. A new Changes
Sequence Diagram(s)sequenceDiagram
participant Validator
participant ValidateCP as validate_checkpoint()
participant Database
participant ReorgHandler as panic_with_reorg()
Validator->>Database: retrieve_latest_checkpoint_info()
Database-->>Validator: CheckpointInfo {local_idx, height}
Validator->>ValidateCP: latest_checkpoint (new)
ValidateCP->>ValidateCP: Compare indices & heights
alt Index increased (normal)
ValidateCP-->>Validator: Valid, proceed
else Index same or lower (reorg detected)
ValidateCP-->>Validator: Reorg event
Validator->>ReorgHandler: Latest checkpoint + block_height
ReorgHandler->>Database: store_latest_checkpoint_info()
ReorgHandler->>ReorgHandler: Log + report_reorg_with_checkpoint()
ReorgHandler->>Validator: Panic with ReorgEvent
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Areas requiring close attention:
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rust/main/agents/validator/src/submit.rs (1)
586-591: Compile-time type mismatch: tree_exceeds_checkpoint expects Checkpoint, callers pass CheckpointAtBlockThis won’t build as-is. Either change the helper to take CheckpointAtBlock or pass .checkpoint at call sites. Minimal change below.
Apply:
- if tree_exceeds_checkpoint(&latest_checkpoint, &tree) { + if tree_exceeds_checkpoint(&latest_checkpoint.checkpoint, &tree) {- assert!( - !tree_exceeds_checkpoint(correctness_checkpoint, tree), + assert!( + !tree_exceeds_checkpoint(&correctness_checkpoint.checkpoint, tree),Alternatively, change the helper signature to take &CheckpointAtBlock and read .index.
Also applies to: 174-187, 234-243
🧹 Nitpick comments (7)
rust/main/hyperlane-core/src/types/checkpoint.rs (2)
111-120: Binary encoding order is index-then-height; please call it out or align fieldsYou serialize checkpoint_index first and block_height second. Decode matches, so it’s correct, but the struct lists block_height first. Either:
- add a short comment noting the binary order, or
- reorder struct fields to match the encoding to avoid head‑scratching later.
Your call—just keep it tidy like a well‑kept swamp.
pub struct CheckpointInfo { - /// block height of checkpoint - pub block_height: u64, - /// index of checkpoint - pub checkpoint_index: u32, + /// index of checkpoint (serialized first) + pub checkpoint_index: u32, + /// block height of checkpoint (serialized second) + pub block_height: u64, }
122-135: Consider a quick round‑trip test for Encode/DecodeTiny ask: add a test asserting that encoding then decoding yields the same CheckpointInfo, and that the written size equals size_of::() + size_of::(). Helps catch future drift.
rust/main/hyperlane-base/src/settings/checkpoint_syncer.rs (1)
210-219: Add a mismatch test to exercise the new indicesYou set both indices to 56. Consider adding a case where local != canonical to prove we surface/propagate the right values end‑to‑end. Keeps the mud off future regressions.
rust/main/agents/validator/src/submit.rs (2)
198-221: Wording nit in warn; also OK to persist the new canonical after reorgMessage says “higher index” without checking it. Tweak copy to match the condition you check (lower block height).
- tracing::warn!( + tracing::warn!( ?latest_checkpoint, ?latest_seen_checkpoint, - "Receive a checkpoint with a higher index, but lower block height" + "Received a checkpoint with lower block height than previously seen" );Persisting latest_seen_checkpoint here is reasonable (helps converge after a reorg), and the extra store error log is good.
331-365: validate_checkpoint logic reads right; consider documenting the None-height caseRules cover higher/same index, lower index with old height, and lower index with same/new height (reorg). Consider a short doc comment noting: “If block_height is None, we accept the checkpoint.”
rust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs (1)
43-44: Version the storage key to future-proof migrationsOther keys use versioned suffixes (e.g., v2/v3). Do the same here to avoid schema ambiguity later.
-const LATEST_CHECKPOINT_INFO: &str = "latest_checkpoint_info"; +const LATEST_CHECKPOINT_INFO: &str = "latest_checkpoint_info_v1";If data already exists in environments, gate this behind a migration toggle.
rust/main/agents/validator/src/submit/tests.rs (1)
583-635: Great coverage of validate_checkpoint scenariosHigher/same/lower index cases behave as intended. One more edge case worth adding: when latest_checkpoint.block_height is None, validate_checkpoint should accept regardless of indices; add a test to lock this in.
Happy to draft that test if you want it in this PR.
Also applies to: 637-681, 683-731, 732-777
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
rust/main/agents/relayer/src/msg/db_loader.rs(2 hunks)rust/main/agents/relayer/src/msg/pending_message.rs(1 hunks)rust/main/agents/validator/src/submit.rs(6 hunks)rust/main/agents/validator/src/submit/tests.rs(10 hunks)rust/main/hyperlane-base/src/db/mod.rs(2 hunks)rust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs(3 hunks)rust/main/hyperlane-base/src/settings/checkpoint_syncer.rs(1 hunks)rust/main/hyperlane-core/src/types/checkpoint.rs(2 hunks)rust/main/hyperlane-core/src/types/reorg.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
rust/main/{hyperlane-core,hyperlane-base}/**/src/**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
Keep shared Rust core crates in rust/main/{hyperlane-core,hyperlane-base}
Files:
rust/main/hyperlane-core/src/types/checkpoint.rsrust/main/hyperlane-base/src/db/mod.rsrust/main/hyperlane-core/src/types/reorg.rsrust/main/hyperlane-base/src/settings/checkpoint_syncer.rsrust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs
rust/main/**/src/**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
Run cargo clippy for Rust code linting
Files:
rust/main/hyperlane-core/src/types/checkpoint.rsrust/main/agents/relayer/src/msg/pending_message.rsrust/main/hyperlane-base/src/db/mod.rsrust/main/agents/validator/src/submit.rsrust/main/hyperlane-core/src/types/reorg.rsrust/main/hyperlane-base/src/settings/checkpoint_syncer.rsrust/main/hyperlane-base/src/db/rocks/hyperlane_db.rsrust/main/agents/validator/src/submit/tests.rsrust/main/agents/relayer/src/msg/db_loader.rs
rust/main/agents/relayer/**/src/**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain relayer agent Rust sources under rust/main/agents/relayer
Files:
rust/main/agents/relayer/src/msg/pending_message.rsrust/main/agents/relayer/src/msg/db_loader.rs
rust/main/agents/validator/**/src/**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain validator agent Rust sources under rust/main/agents/validator
Files:
rust/main/agents/validator/src/submit.rsrust/main/agents/validator/src/submit/tests.rs
🧬 Code graph analysis (8)
rust/main/hyperlane-core/src/types/checkpoint.rs (1)
rust/main/chains/hyperlane-ethereum/src/contracts/merkle_tree_hook.rs (1)
block_height(326-331)
rust/main/agents/relayer/src/msg/pending_message.rs (2)
rust/main/hyperlane-base/src/db/mod.rs (2)
store_latest_checkpoint_info(178-178)retrieve_latest_checkpoint_info(180-180)rust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs (2)
store_latest_checkpoint_info(703-705)retrieve_latest_checkpoint_info(706-708)
rust/main/hyperlane-base/src/db/mod.rs (1)
rust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs (2)
store_latest_checkpoint_info(703-705)retrieve_latest_checkpoint_info(706-708)
rust/main/agents/validator/src/submit.rs (3)
rust/main/chains/hyperlane-starknet/src/merkle_tree_hook.rs (3)
latest_checkpoint(67-91)tree(95-130)tree(107-111)rust/main/chains/hyperlane-ethereum/src/contracts/merkle_tree_hook.rs (3)
latest_checkpoint(253-277)tree(301-311)block_height(326-331)rust/main/hyperlane-core/src/traits/merkle_tree_hook.rs (2)
latest_checkpoint(49-50)tree(37-37)
rust/main/hyperlane-base/src/settings/checkpoint_syncer.rs (1)
rust/main/hyperlane-core/src/chain.rs (1)
from_blocks(53-57)
rust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs (1)
rust/main/hyperlane-base/src/db/mod.rs (2)
store_latest_checkpoint_info(178-178)retrieve_latest_checkpoint_info(180-180)
rust/main/agents/validator/src/submit/tests.rs (5)
rust/main/hyperlane-core/src/test_utils.rs (1)
dummy_domain(44-53)rust/main/hyperlane-base/src/db/mod.rs (2)
store_latest_checkpoint_info(178-178)retrieve_latest_checkpoint_info(180-180)rust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs (3)
store_latest_checkpoint_info(703-705)retrieve_latest_checkpoint_info(706-708)new(74-76)rust/main/agents/validator/src/submit.rs (3)
new(43-67)new(602-616)checkpoint_at_block(78-85)rust/main/hyperlane-core/src/chain.rs (1)
from_blocks(53-57)
rust/main/agents/relayer/src/msg/db_loader.rs (2)
rust/main/hyperlane-base/src/db/mod.rs (2)
store_latest_checkpoint_info(178-178)retrieve_latest_checkpoint_info(180-180)rust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs (2)
store_latest_checkpoint_info(703-705)retrieve_latest_checkpoint_info(706-708)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (53)
- GitHub Check: infra-test
- GitHub Check: cli-evm-e2e-matrix (warp-read)
- GitHub Check: cli-evm-e2e-matrix (warp-extend-basic)
- GitHub Check: cli-evm-e2e-matrix (core-init)
- GitHub Check: cli-evm-e2e-matrix (warp-extend-recovery)
- GitHub Check: cli-evm-e2e-matrix (warp-check-3)
- GitHub Check: cli-evm-e2e-matrix (warp-init)
- GitHub Check: cli-evm-e2e-matrix (warp-send)
- GitHub Check: cli-evm-e2e-matrix (warp-apply-ism-updates)
- GitHub Check: cli-evm-e2e-matrix (warp-rebalancer)
- GitHub Check: cli-evm-e2e-matrix (warp-extend-config)
- GitHub Check: cli-evm-e2e-matrix (warp-check-1)
- GitHub Check: cli-evm-e2e-matrix (core-deploy)
- GitHub Check: cli-evm-e2e-matrix (relay)
- GitHub Check: cli-evm-e2e-matrix (warp-deploy)
- GitHub Check: cli-evm-e2e-matrix (warp-bridge-2)
- GitHub Check: cli-evm-e2e-matrix (warp-apply-2)
- GitHub Check: cli-evm-e2e-matrix (warp-apply-1)
- GitHub Check: cli-evm-e2e-matrix (warp-apply-submitters)
- GitHub Check: cli-evm-e2e-matrix (warp-check-2)
- GitHub Check: cli-evm-e2e-matrix (warp-bridge-1)
- GitHub Check: cli-evm-e2e-matrix (core-read)
- GitHub Check: cli-evm-e2e-matrix (core-apply)
- GitHub Check: env-test-matrix (mainnet3, inevm, core)
- GitHub Check: cli-evm-e2e-matrix (core-check)
- GitHub Check: env-test-matrix (testnet4, sepolia, core)
- GitHub Check: env-test-matrix (mainnet3, arbitrum, igp)
- GitHub Check: env-test-matrix (mainnet3, optimism, igp)
- GitHub Check: env-test-matrix (mainnet3, ethereum, igp)
- GitHub Check: env-test-matrix (mainnet3, inevm, igp)
- GitHub Check: env-test-matrix (mainnet3, optimism, core)
- GitHub Check: env-test-matrix (mainnet3, arbitrum, core)
- GitHub Check: cli-cosmos-e2e-matrix (warp-read)
- GitHub Check: env-test-matrix (mainnet3, ethereum, core)
- GitHub Check: cli-cosmos-e2e-matrix (core-read)
- GitHub Check: cli-cosmos-e2e-matrix (core-apply)
- GitHub Check: cli-cosmos-e2e-matrix (warp-deploy)
- GitHub Check: cli-cosmos-e2e-matrix (core-deploy)
- GitHub Check: cli-cosmos-e2e-matrix (core-check)
- GitHub Check: cli-install-test-run
- GitHub Check: cli-cross-chain-e2e-matrix (warp-apply)
- GitHub Check: cli-cross-chain-e2e-matrix (warp-deploy)
- GitHub Check: cosmos-sdk-e2e-run
- GitHub Check: build-and-push-to-gcr
- GitHub Check: e2e-matrix (starknet)
- GitHub Check: e2e-matrix (sealevel)
- GitHub Check: e2e-matrix (radix)
- GitHub Check: e2e-matrix (cosmosnative)
- GitHub Check: e2e-matrix (evm)
- GitHub Check: e2e-matrix (cosmwasm)
- GitHub Check: test-rs
- GitHub Check: lint-rs
- GitHub Check: lander-coverage
🔇 Additional comments (16)
rust/main/hyperlane-core/src/types/checkpoint.rs (1)
102-109: New CheckpointInfo type reads cleanShape and visibility look good; defaults won’t bite. No blockers from me.
rust/main/agents/relayer/src/msg/db_loader.rs (2)
425-428: Import looks rightPulling in CheckpointInfo here keeps tests in lockstep with the core type. All good.
740-742: Mock trait parity maintainedstore_latest_checkpoint_info / retrieve_latest_checkpoint_info signatures match the base trait. Nice and simple.
rust/main/agents/relayer/src/msg/pending_message.rs (1)
1206-1208: Mocks up to dateLatest checkpoint_info methods added to the mock trait—keeps things consistent with the DB API. Ship it.
rust/main/hyperlane-base/src/db/mod.rs (2)
6-9: Import of CheckpointInfo is appropriateBrings the right type into scope for the new API surface. No issues.
177-180: All implementors properly updated — breaking change successfully handledThe trait expansion for
store_latest_checkpoint_infoandretrieve_latest_checkpoint_infohas been comprehensively addressed across the codebase. HyperlaneRocksDB contains concrete implementations (lines 703-707), and all three test mocks in validator and relayer modules have been adapted with the required trait method declarations. The methods are actively used in production code (validator/src/submit.rs).rust/main/agents/validator/src/submit.rs (5)
19-21: New imports look goodNeeded for the new validation/reporting helpers.
136-161: Guard is fine, but there’s a type mismatch downstreamControl flow reads well. Note: calls to tree_exceeds_checkpoint currently pass CheckpointAtBlock; the helper expects Checkpoint. See fix below.
299-307: Reorg panic helper usage is consistentCorrect canonical vs local args; good to crash after persisting details and reporting.
367-401: panic_with_reorg builds ReorgEvent correctlyFields (local/canonical roots and indices) and timestamp look good.
403-414: Reporting path handles both height-aware and period-only chainsLooks fine; nice fallback when height isn’t available.
rust/main/hyperlane-base/src/db/rocks/hyperlane_db.rs (2)
8-12: Import of CheckpointInfo is correctRequired for the new persistence API.
703-708: Store/retrieve implementations are consistent with existing patternsUsing bool::default() as the singleton key matches highest_seen_message_nonce. Looks good.
rust/main/agents/validator/src/submit/tests.rs (3)
12-17: Mocks and imports updated for CheckpointInfoMock HyperlaneDb includes the new methods; imports are aligned.
Also applies to: 134-136
224-235: ReorgEvent assertions updated to new index fieldsGood checks on canonical/local indices after the struct change.
243-357: End-to-end reorg path test remains solidNice coverage: reorg detected, event persisted, and reporter invoked at block height.
| let mut latest_seen_checkpoint = self | ||
| .db | ||
| .retrieve_latest_checkpoint_info() | ||
| .unwrap_or_default() | ||
| .unwrap_or_default(); | ||
| loop { |
There was a problem hiding this comment.
Don’t swallow DB errors when loading latest_seen_checkpoint
Using unwrap_or_default twice hides real DB failures and can silently disable early reorg detection (we reset to 0/0 and accept lower indices as “fine”). Log and handle explicitly instead.
Apply:
- let mut latest_seen_checkpoint = self
- .db
- .retrieve_latest_checkpoint_info()
- .unwrap_or_default()
- .unwrap_or_default();
+ let mut latest_seen_checkpoint = match self.db.retrieve_latest_checkpoint_info() {
+ Ok(Some(info)) => info,
+ Ok(None) => CheckpointInfo::default(),
+ Err(err) => {
+ tracing::error!(
+ ?err,
+ "Failed to retrieve latest checkpoint info; starting with defaults"
+ );
+ CheckpointInfo::default()
+ }
+ };🤖 Prompt for AI Agents
In rust/main/agents/validator/src/submit.rs around lines 122 to 127, the double
unwrap_or_default on retrieve_latest_checkpoint_info() swallows DB errors and
silently resets latest_seen_checkpoint to defaults; replace this with explicit
error handling: match the Result/Option returned, log the DB error (including
the error details) and propagate or return an Err instead of defaulting to 0/0,
or if you must continue, use a clearly logged fallback branch that preserves
previous state; do not use unwrap_or_default twice—handle Result with ? or match
and handle Option explicitly so real DB failures are not ignored.
| /// the latest local checkpoint index | ||
| pub local_checkpoint_index: u32, | ||
| /// the latest onchain checkpoint index | ||
| pub canonical_checkpoint_index: u32, |
There was a problem hiding this comment.
Fail‑closed or add backward‑compatible deserialization for reorg_flag.json
Old files won’t have the new fields. With derive(Deserialize), parsing the old shape fails, and build_and_validate logs then proceeds “assuming no reorg.” That can let a node start despite a previously flagged reorg. Not great in the swamp.
Two options (prefer 1):
- Backward‑compatible Deserialize: accept either the old single checkpoint_index or the new pair, mapping old -> both indices.
- Fail‑closed in the syncer: if the reorg file exists but can’t be parsed, return a fatal error instead of assuming no reorg.
Here’s a compact approach for (1), replacing the derive(Deserialize) and accepting both shapes:
-use serde::{Deserialize, Serialize};
+use serde::{Deserialize, Deserializer, Serialize};
#[derive(Debug, Clone, Serialize, new, PartialEq, Default)]
pub struct ReorgEvent {
pub local_merkle_root: H256,
pub canonical_merkle_root: H256,
/// the latest local checkpoint index
pub local_checkpoint_index: u32,
/// the latest onchain checkpoint index
pub canonical_checkpoint_index: u32,
pub unix_timestamp: u64,
pub reorg_period: ReorgPeriod,
}
+#[derive(Deserialize)]
+#[serde(untagged)]
+enum ReorgEventCompat {
+ Old {
+ local_merkle_root: H256,
+ canonical_merkle_root: H256,
+ checkpoint_index: u32,
+ unix_timestamp: u64,
+ reorg_period: ReorgPeriod,
+ },
+ New {
+ local_merkle_root: H256,
+ canonical_merkle_root: H256,
+ local_checkpoint_index: u32,
+ canonical_checkpoint_index: u32,
+ unix_timestamp: u64,
+ reorg_period: ReorgPeriod,
+ },
+}
+
+impl<'de> Deserialize<'de> for ReorgEvent {
+ fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+ where
+ D: Deserializer<'de>,
+ {
+ let v = ReorgEventCompat::deserialize(deserializer)?;
+ Ok(match v {
+ ReorgEventCompat::New {
+ local_merkle_root,
+ canonical_merkle_root,
+ local_checkpoint_index,
+ canonical_checkpoint_index,
+ unix_timestamp,
+ reorg_period,
+ } => ReorgEvent {
+ local_merkle_root,
+ canonical_merkle_root,
+ local_checkpoint_index,
+ canonical_checkpoint_index,
+ unix_timestamp,
+ reorg_period,
+ },
+ ReorgEventCompat::Old {
+ local_merkle_root,
+ canonical_merkle_root,
+ checkpoint_index,
+ unix_timestamp,
+ reorg_period,
+ } => ReorgEvent {
+ local_merkle_root,
+ canonical_merkle_root,
+ local_checkpoint_index: checkpoint_index,
+ canonical_checkpoint_index: checkpoint_index,
+ unix_timestamp,
+ reorg_period,
+ },
+ })
+ }
+}If you’d rather do (2), flip build_and_validate to error if the status file exists but deserialization fails. I can sketch that too.
Committable suggestion skipped: line range outside the PR's diff.
Description
Related issues
Backward compatibility
reorg_flag.jsoncontent has changedTesting
Summary by CodeRabbit