perf(l1): inverted address log index to narrow eth_getLogs candidate blocks #6884
ElFantasma wants to merge 7 commits into
🤖 Kimi Code Review

This is a well-structured PR that introduces a background-built inverted log index to optimize

Issues and suggestions:

1. Silent data corruption in

```rust
// Suggested:
pub fn decode_offsets(bytes: &[u8]) -> Result<Vec<u16>, StoreError> {
    if bytes.len() % 2 != 0 {
        return Err(StoreError::Custom("corrupted log index: odd length".into()));
    }
    Ok(bytes.chunks_exact(2).map(|c| u16::from_be_bytes([c[0], c[1]])).collect())
}
```

2. Transaction overhead in indexer (

3. Arithmetic overflow safety (

// Use satur
---
*Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt* |
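To make the odd-length check suggested in the review above concrete, here is a minimal round-trip sketch of the value encoding (concatenated big-endian `u16` offsets). It is an illustration under stated assumptions, not the PR's code: `DecodeError` and `encode_offsets` are hypothetical names introduced here, and the real helper returns `StoreError`.

```rust
/// Hypothetical error type for this sketch (the PR itself uses `StoreError`).
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    /// The value had an odd number of bytes, so it cannot be a list of u16s.
    OddLength(usize),
}

/// Encode in-section block offsets as concatenated big-endian u16 values.
/// (Hypothetical helper name; the PR's encoder may be shaped differently.)
pub fn encode_offsets(offsets: &[u16]) -> Vec<u8> {
    offsets.iter().flat_map(|o| o.to_be_bytes()).collect()
}

/// Decode a stored value, rejecting a trailing odd byte instead of
/// silently dropping it (which `chunks_exact` alone would do).
pub fn decode_offsets(bytes: &[u8]) -> Result<Vec<u16>, DecodeError> {
    if bytes.len() % 2 != 0 {
        return Err(DecodeError::OddLength(bytes.len()));
    }
    Ok(bytes
        .chunks_exact(2)
        .map(|c| u16::from_be_bytes([c[0], c[1]]))
        .collect())
}
```

Without the length check, a value truncated by one byte would decode successfully with its last offset missing, which on the indexed path would look like a block that never emitted the log.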
🤖 Claude Code ReviewHere is the review of PR #6884: PR #6884 —
Greptile Summary

This PR introduces an inverted address → blocks log index to make

Confidence Score: 3/5

The query path always falls back to a full scan when the index cannot help, so existing behaviour is never regressed. The background indexer has a stall risk that would silently degrade `eth_getLogs` back to scan mode on affected nodes.

The core indexing logic, key encoding, and query narrowing are all sound. The concern is in `index_pending_log_sections`: if `get_canonical_block_hash_sync` returns `None` for any block in a section, the function returns an error, the section counter is never advanced, and every subsequent poll retries and fails the same section forever, permanently blocking the index for all later sections. Any node where old canonical hashes are absent would silently regress to full-scan mode with no indicator beyond a repeating WARN log.

crates/storage/store.rs: specifically the `index_pending_log_sections` loop and the two separate writes that commit section data and the section counter.
| Filename | Overview |
|---|---|
| crates/storage/log_index.rs | New module implementing the section-based address to block log index; encoding/decoding helpers are correct and well-tested. |
| crates/storage/store.rs | Adds indexer logic and query helpers; indexer can permanently stall on a missing canonical hash, and section entries and counter are committed non-atomically. |
| crates/networking/rpc/eth/logs.rs | Query path correctly falls back to full-range scan when the index returns None; bloom prefilter and exact topic/address filtering still apply on all visited blocks. |
| cmd/ethrex/initializers.rs | Background indexer spawned off the block-import path with cancellation support; 256-block confirmation depth is well above the 128-block reorg limit. |
| crates/storage/api/tables.rs | New LOG_ADDRESS_INDEX CF added to TABLES array; key schema is clearly documented. |
| crates/storage/lib.rs | Trivial change exposing the new log_index module. |
Sequence Diagram

```mermaid
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Client
    participant RPC as fetch_logs_with_filter
    participant Store
    participant Index as LOG_ADDRESS_INDEX
    participant Receipts as RECEIPTS_V2
    Note over Store: Background: run_log_indexer every 12s
    Store->>Store: index_pending_log_sections depth 256
    Store->>Receipts: get_receipts_for_block_sync per block in section
    Store->>Index: write_batch address to offsets per section
    Store->>Store: set_indexed_log_sections section plus 1
    Client->>RPC: eth_getLogs address fromBlock toBlock
    RPC->>Store: get_candidate_blocks_by_address addresses from to
    alt range is indexed
        Store->>Index: read address section entries
        Store-->>RPC: Some candidate block numbers
    else range not indexed or no address filter
        Store-->>RPC: None fall back to full range
    end
    loop for each candidate block
        RPC->>Store: get_block_header bloom prefilter
        RPC->>Store: get_receipts_for_block exact filter
        RPC-->>Client: matching RpcLog entries
    end
```
### Issue 1 of 3
crates/storage/store.rs:1343-1367
**Permanent indexer stall on missing canonical hash**
If `get_canonical_block_hash_sync` returns `None` for any block within a section (e.g., a node that snap-synced and does not retain old canonical hashes, or any DB inconsistency), the `?` propagates the error out of the entire function. `set_indexed_log_sections` is never called, so the section counter stays put. Every subsequent poll (every 12 s) will retry and fail the same section, permanently blocking the index for that section and every section after it — the indexer is stuck forever and every `eth_getLogs` call will fall back to a full scan for those ranges.
Consider catching this error per-block (log + skip the block or the whole section) so the indexer can make forward progress rather than looping indefinitely.
### Issue 2 of 3
crates/storage/store.rs:1354-1360
**Missing `receipt.succeeded` filter in indexer**
`fetch_logs_with_filter` only collects logs from successful transactions (`if receipt.succeeded`), but `index_pending_log_sections` indexes address offsets from ALL receipt logs regardless of transaction status. Ethereum's protocol guarantee (reverted transactions have empty `receipt.logs`) makes this harmless in practice, but it contradicts the module docstring's claim of "no false positives" and creates an inconsistency that could matter on custom / pre-EIP-658 networks or if receipt handling ever diverges.
```suggestion
for receipt in &receipts {
if !receipt.succeeded {
continue;
}
for log in &receipt.logs {
if seen.insert(log.address) {
address_offsets.entry(log.address).or_default().push(offset);
}
}
}
```
### Issue 3 of 3
crates/storage/store.rs:1362-1364
**Non-atomic commit of section entries and section counter**
`write_batch(LOG_ADDRESS_INDEX, entries)` and `set_indexed_log_sections(section + 1)` are two independent writes to two different column families (`LOG_ADDRESS_INDEX` and `MISC_VALUES`). A crash between them leaves the section data written but the counter un-advanced; on restart the section would be re-indexed (writing identical data — idempotent) before the counter catches up. This is safe and self-healing, but if both writes could be issued as a single RocksDB write batch it would avoid the transient inconsistency entirely.
Reviews (1): Last reviewed commit: "perf(l1): background log indexer + use a..."
```rust
        let block_hash = self.get_canonical_block_hash_sync(block_number)?.ok_or_else(
            || {
                StoreError::Custom(format!(
                    "log index: missing canonical hash for block {block_number}"
                ))
            },
        )?;
        let receipts = self.get_receipts_for_block_sync(&block_hash)?;
        let offset = log_index::offset_in_section(block_number);
        // Each address contributes this block's offset at most once.
        let mut seen = HashSet::new();
        for receipt in &receipts {
            for log in &receipt.logs {
                if seen.insert(log.address) {
                    address_offsets.entry(log.address).or_default().push(offset);
                }
            }
        }
    }
    let entries = log_index::build_section_entries(section, address_offsets);
    self.write_batch(LOG_ADDRESS_INDEX, entries)?;
    self.set_indexed_log_sections(section + 1)?;
    section += 1;
    newly_indexed += 1;
}
```
**Permanent indexer stall on missing canonical hash**
If `get_canonical_block_hash_sync` returns `None` for any block within a section (e.g., a node that snap-synced and does not retain old canonical hashes, or any DB inconsistency), the `?` propagates the error out of the entire function. `set_indexed_log_sections` is never called, so the section counter stays put. Every subsequent poll (every 12 s) will retry and fail the same section, permanently blocking the index for that section and every section after it: the indexer is stuck forever and every `eth_getLogs` call will fall back to a full scan for those ranges.
Consider catching this error per-block (log + skip the block or the whole section) so the indexer can make forward progress rather than looping indefinitely.
Fixed in 9d21113. A missing canonical hash for a buried block now makes the indexer defer the section — it breaks out of the loop, logs a warn!, and leaves that section (and everything after it) unindexed rather than propagating an error. Queries fall back to scanning the un-indexed range, so results stay correct; the next poll retries in case it was transient. The section is never marked indexed unless it was built in full, so we can't silently drop a block's logs (which would be a false negative on the indexed path).
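The deferral behaviour described in this reply can be sketched as a small, self-contained loop. This is a simplified model, not the code from 9d21113: the `canonical_hash` closure is a hypothetical stand-in for the store lookup, the function name and signature are invented for illustration, and "buried" is assumed to mean that the section's last block is at least `CONFIRMATION_DEPTH` below the head.

```rust
const SECTION_SIZE: u64 = 4096;
const CONFIRMATION_DEPTH: u64 = 256;

/// Returns the new number of fully indexed sections after one poll.
/// `canonical_hash` is a hypothetical stand-in for the canonical-hash lookup.
pub fn index_pending_sections(
    mut indexed_sections: u64,
    head: u64,
    canonical_hash: impl Fn(u64) -> Option<[u8; 32]>,
) -> u64 {
    'sections: loop {
        let first = indexed_sections * SECTION_SIZE;
        let last = first + SECTION_SIZE - 1;
        // Assumption: a section is eligible once its last block is buried
        // at least CONFIRMATION_DEPTH blocks below the head.
        if last.saturating_add(CONFIRMATION_DEPTH) > head {
            break;
        }
        for block in first..=last {
            if canonical_hash(block).is_none() {
                // Defer: leave this section and everything after it
                // unindexed; the next poll retries. No error is returned.
                break 'sections;
            }
            // Real code would load receipts here and collect address offsets.
        }
        // Only a section that was built in full advances the counter.
        indexed_sections += 1;
    }
    indexed_sections
}
```

The key property is that the counter never moves past a section with a gap, so a query over that range takes the scan path instead of trusting an incomplete index entry.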
```rust
for receipt in &receipts {
    for log in &receipt.logs {
        if seen.insert(log.address) {
            address_offsets.entry(log.address).or_default().push(offset);
        }
    }
}
```
**Missing `receipt.succeeded` filter in indexer**
`fetch_logs_with_filter` only collects logs from successful transactions (`if receipt.succeeded`), but `index_pending_log_sections` indexes address offsets from ALL receipt logs regardless of transaction status. Ethereum's protocol guarantee (reverted transactions have empty `receipt.logs`) makes this harmless in practice, but it contradicts the module docstring's claim of "no false positives" and creates an inconsistency that could matter on custom / pre-EIP-658 networks or if receipt handling ever diverges.
```diff
-for receipt in &receipts {
-    for log in &receipt.logs {
-        if seen.insert(log.address) {
-            address_offsets.entry(log.address).or_default().push(offset);
-        }
-    }
-}
+for receipt in &receipts {
+    if !receipt.succeeded {
+        continue;
+    }
+    for log in &receipt.logs {
+        if seen.insert(log.address) {
+            address_offsets.entry(log.address).or_default().push(offset);
+        }
+    }
+}
```
Fixed in 9d21113 — added if !receipt.succeeded { continue; } so the indexer mirrors the query path exactly. As you note it's a no-op on EIP-658 networks (reverted txs carry no logs), but it keeps the index exact everywhere and makes the docstring's "no false positives" claim hold unconditionally.
```rust
let entries = log_index::build_section_entries(section, address_offsets);
self.write_batch(LOG_ADDRESS_INDEX, entries)?;
self.set_indexed_log_sections(section + 1)?;
```
**Non-atomic commit of section entries and section counter**
`write_batch(LOG_ADDRESS_INDEX, entries)` and `set_indexed_log_sections(section + 1)` are two independent writes to two different column families (`LOG_ADDRESS_INDEX` and `MISC_VALUES`). A crash between them leaves the section data written but the counter un-advanced; on restart the section would be re-indexed (writing identical data, so the rewrite is idempotent) before the counter catches up. This is safe and self-healing, but if both writes could be issued as a single RocksDB write batch it would avoid the transient inconsistency entirely.
Fixed in 9d21113. Section entries and the counter now commit in a single write transaction (commit_indexed_log_section): one begin_write → put_batch(LOG_ADDRESS_INDEX, …) + put(MISC_VALUES, counter) → commit. RocksDB write batches span column families, so the two writes are now atomic and there's no crash window at all.
🤖 Codex Code Review

No consensus, gas-accounting, or EVM opcode logic is touched here; the main risks are around RPC correctness and operational safety. I did not run tests, and I don’t see coverage in this diff for partial-history stores or “indexed head + unindexed tail” log queries.

Automated review by OpenAI Codex · gpt-5.4 · custom prompt
Thanks for the reviews. Addressed in

Fixed

Acknowledged, not changing

Test coverage (Claude): the pure helpers are unit-tested and the storage methods are exercised end-to-end by the mainnet benchmark (6184 sections indexed, query results verified). A Store-level integration test for the partial-coverage (
```diff
-for block_num in from..=to {
+// For each candidate block, load its receipts (bulk) and collect matching logs.
+for block_num in block_numbers {
```
This hides the error in the case where `from` is lower than the earliest block we have. Previously we would have tried to obtain the body and failed doing so.
Fixed in 8ec9842. The index now records the first section it actually covers (log_index_start_section), and get_candidate_blocks_by_address returns None (→ full scan) when the query's from is below that covered start. So a range beginning below the node's retained history takes the original scan path — surfacing the missing-body error as before — instead of the index silently reporting no logs for it.
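The coverage rule in this reply amounts to a small decision function: serve from the index only when the whole range lies inside what the index actually covers, otherwise return `None` so the caller scans. The sketch below is a simplified all-or-nothing model; `covered_sections`, `start_section`, and `indexed_sections` are hypothetical names, and the real `get_candidate_blocks_by_address` may treat partially covered ranges differently.

```rust
const SECTION_SIZE: u64 = 4096;

/// Decide whether a block range can be answered from the index.
/// Returns the sections to read, or `None` to fall back to the full scan.
///
/// Hypothetical parameters for this sketch:
/// - `start_section`: first section the index actually covers
/// - `indexed_sections`: one past the last fully indexed section
pub fn covered_sections(
    from: u64,
    to: u64,
    start_section: u64,
    indexed_sections: u64,
) -> Option<std::ops::RangeInclusive<u64>> {
    if from > to {
        return None;
    }
    let first = from / SECTION_SIZE;
    let last = to / SECTION_SIZE;
    // Below the covered start: the node may not retain those blocks, so
    // take the scan path and let it surface the missing-body error.
    // At or past the indexed frontier: those sections are not built yet.
    if first < start_section || last >= indexed_sections {
        return None;
    }
    Some(first..=last)
}
```

Returning `None` instead of an empty candidate list is the important design point: an empty list would read as "no logs in this range", while `None` means "the index cannot say".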
9d21113 to 1057030
…tomic commit, corruption + overflow guards)
1057030 to 8ec9842
Motivation

`eth_getLogs` was 175–340× slower than geth (issue #6785). Two prior steps closed most of the constant-factor gap but left the algorithmic one:

Both stay O(blocks-in-range): every block in the range is still visited. For popular signatures (e.g. the ERC-20 `Transfer` topic) the 2048-bit header bloom saturates, so nearly every block looks like a candidate and the prefilter can't narrow the range. The remaining gap to indexed clients is stage-1 candidate selection, which needs a real index.

Description
Adds an inverted address → blocks log index and uses it to narrow `eth_getLogs` to candidate blocks.

- `log_index` storage module + `LOG_ADDRESS_INDEX` column family. Key `address (20B) || section (8B BE)`, value = big-endian `u16` in-section block offsets where that address emitted a log. Blocks are grouped into fixed `SECTION_SIZE = 4096`-block sections.
- Background indexer (`run_log_indexer`, spawned in `init_l1`). Polls periodically and indexes any section once it is buried `CONFIRMATION_DEPTH = 256` blocks, well clear of the 128-block reorg limit, so indexed sections are immutable and need no reorg invalidation. Runs off the block-import path (`spawn_blocking`); backfills on first run, then moves forward. Never touches the hot write path.
- Query path (`fetch_logs_with_filter`): when the filter has an address and the range is covered by the index, narrow to candidate blocks via `get_candidate_blocks_by_address`; otherwise fall back to the full-range scan (the header-bloom prefilter still applies). Every visited block is still exact-filtered, so extra or missing index candidates can only ever cost a little time; results are identical.
- No `STORE_SCHEMA_VERSION` bump. The CF is registered in `TABLES` and auto-creates; on a DB without the index, or for a range that isn't indexed yet, the query returns `None` and falls back to the existing scan. The most recent <4096 blocks (the current, not-yet-buried section) always serve from the scan path until they're indexed (minutes).

Part of #6785. Stacks on #6852 → #6813; will retarget to `main` as those merge.

Benchmarks
ethrex-mainnet-5, fully-synced mainnet, LUSD `Transfer` over recent ranges, single-call median latency (same harness/methodology as the earlier runs).

`eth_getLogs` is competitive with an indexed client: faster than geth at 100 & 1,000 blocks, on par at 10,000.

Under load (vegeta rate-sweep, indexed ranges), `eth_getLogs` previously collapsed (30 s timeouts at 100 rps; 1k/10k saturated even at 10 rps). Now 100% success, no timeouts: 100-blk sustains the full 1,000 rps; 1,000-blk ~400 rps; 10,000-blk ~40 rps (the remaining ceiling is CPU-bound stage-2 receipt reads on the matched blocks).

Checklist
No `Store` re-sync required: new CF auto-creates; absence / unindexed range → scan fallback.
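For readers who want the key layout from the Description in concrete form, the following is a minimal sketch of the section arithmetic and key encoding (`address (20B) || section (8B BE)`). It is written from the description only: `offset_in_section` is named in the diff, but its signature here is assumed, and `section_of`, `index_key`, and `candidate_blocks` are hypothetical helper names.

```rust
const SECTION_SIZE: u64 = 4096;

/// Section that contains `block` (hypothetical helper name).
pub fn section_of(block: u64) -> u64 {
    block / SECTION_SIZE
}

/// Offset of `block` inside its section; always below 4096, so it fits in a u16.
/// (Name taken from the diff; this signature is an assumption.)
pub fn offset_in_section(block: u64) -> u16 {
    (block % SECTION_SIZE) as u16
}

/// Key: 20-byte address followed by the 8-byte big-endian section number.
pub fn index_key(address: &[u8; 20], section: u64) -> [u8; 28] {
    let mut key = [0u8; 28];
    key[..20].copy_from_slice(address);
    key[20..].copy_from_slice(&section.to_be_bytes());
    key
}

/// Turn one section's stored offsets back into absolute block numbers,
/// keeping only those inside the queried range `from..=to`.
pub fn candidate_blocks(section: u64, offsets: &[u16], from: u64, to: u64) -> Vec<u64> {
    offsets
        .iter()
        .map(|&o| section * SECTION_SIZE + u64::from(o))
        .filter(|b| (from..=to).contains(b))
        .collect()
}
```

One consequence of the big-endian section suffix is that keys for a single address sort in section order under bytewise comparison, so consecutive sections for an address are adjacent in a sorted key-value store.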