Skip to content

perf(l1): inverted address log index to narrow eth_getLogs candidate blocks#6884

Open
ElFantasma wants to merge 7 commits into
mainfrom
perf/getlogs-inverted-index
Open

perf(l1): inverted address log index to narrow eth_getLogs candidate blocks#6884
ElFantasma wants to merge 7 commits into
mainfrom
perf/getlogs-inverted-index

Conversation

@ElFantasma

Copy link
Copy Markdown
Contributor

Motivation

eth_getLogs was 175–340× slower than geth (issue #6785). Two prior steps closed most of the constant-factor gap but left the algorithmic one:

Both stay O(blocks-in-range): every block in the range is still visited. For popular signatures (e.g. the ERC-20 Transfer topic) the 2048-bit header bloom saturates — nearly every block looks like a candidate — so the prefilter can't narrow the range. The remaining gap to indexed clients is stage-1 candidate selection, which needs a real index.

Description

Adds an inverted address → blocks log index and uses it to narrow eth_getLogs to candidate blocks.

  • New log_index storage module + LOG_ADDRESS_INDEX column family. Key address (20B) || section (8B BE), value = big-endian u16 in-section block offsets where that address emitted a log. Blocks are grouped into fixed SECTION_SIZE = 4096-block sections.
  • Background indexer (run_log_indexer, spawned in init_l1). Polls periodically and indexes any section once it is buried CONFIRMATION_DEPTH = 256 blocks — well clear of the 128-block reorg limit, so indexed sections are immutable and need no reorg invalidation. Runs off the block-import path (spawn_blocking); backfills on first run, then moves forward. Never touches the hot write path.
  • Query path (fetch_logs_with_filter): when the filter has an address and the range is covered by the index, narrow to candidate blocks via get_candidate_blocks_by_address; otherwise fall back to the full-range scan (the header-bloom prefilter still applies). Every visited block is still exact-filtered, so extra or missing index candidates can only ever cost a little time — results are identical.
  • No migration / no STORE_SCHEMA_VERSION bump. The CF is registered in TABLES and auto-creates; on a DB without the index, or for a range that isn't indexed yet, the query returns None and falls back to the existing scan. The most recent <4096 blocks (the current, not-yet-buried section) always serve from the scan path until they're indexed (minutes).

Part of #6785. Stacks on #6852#6813; will retarget to main as those merge.

Benchmarks

ethrex-mainnet-5, fully-synced mainnet, LUSD Transfer over recent ranges, single-call median latency (same harness/methodology as the earlier runs).

range main baseline #6813 prefilter #6852 bulk receipts this PR (indexed) geth (ref)
100 blk 618 ms 415 ms 80 ms 0.3 ms 2.5 ms
1,000 blk 6,131 ms 3,529 ms 613 ms 14.4 ms 35 ms
10,000 blk 70,441 ms 31,857 ms 4,272 ms 195.7 ms 204 ms
  • ~270× / 43× / 22× vs the previous best (bulk receipts); ~360–2000× vs the original baseline.
  • First time ethrex eth_getLogs is competitive with an indexed client — faster than geth at 100 & 1,000 blocks, on par at 10,000.

Under load (vegeta rate-sweep, indexed ranges), eth_getLogs previously collapsed (30 s timeouts at 100 rps; 1k/10k saturated even at 10 rps). Now 100% success, no timeouts: 100-blk sustains the full 1,000 rps; 1,000-blk ~400 rps; 10,000-blk ~40 rps (the remaining ceiling is CPU-bound stage-2 receipt reads on the matched blocks).

Checklist

  • Background indexer runs off the block-import path; only finalized (256-block-buried) sections are indexed → no reorg invalidation.
  • Correctness preserved: every visited block is exact-filtered; index only selects candidates, with full-scan fallback.
  • No Store re-sync required: new CF auto-creates; absence / unindexed range → scan fallback.

@ElFantasma ElFantasma requested a review from a team as a code owner June 17, 2026 15:11
@ElFantasma ElFantasma added performance Block execution throughput and performance in general L1 Ethereum client labels Jun 17, 2026
@ethrex-project-sync ethrex-project-sync Bot moved this to In Review in ethrex_l1 Jun 17, 2026
@github-actions

Copy link
Copy Markdown

🤖 Kimi Code Review

This is a well-structured PR that introduces a background-built inverted log index to optimize eth_getLogs. The design correctly avoids reorg handling by only indexing buried sections and integrates cleanly with the existing bloom prefilter.

Issues and suggestions:

1. Silent data corruption in decode_offsets (crates/storage/log_index.rs:46)
If the stored value has an odd length (corruption), the last byte is silently dropped by chunks_exact(2).

// Suggested:
pub fn decode_offsets(bytes: &[u8]) -> Result<Vec<u16>, StoreError> {
    if bytes.len() % 2 != 0 {
        return Err(StoreError::Custom("corrupted log index: odd length".into()));
    }
    Ok(bytes.chunks_exact(2).map(|c| u16::from_be_bytes([c[0], c[1]])).collect())
}

2. Transaction overhead in indexer (crates/storage/store.rs:1316)
get_receipts_for_block_sync creates a new read transaction for every block (up to 4096 per section). Consider reusing the transaction across the section loop or documenting this as a future optimization.

3. Arithmetic overflow safety (crates/storage/store.rs:1294)
While unlikely in practice, last_block + confirmation_depth could overflow if latest is near u64::MAX.

// Use satur


---
*Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt*

@github-actions

Copy link
Copy Markdown

🤖 Claude Code Review

Here is the review of PR #6884:


PR #6884perf(l1): inverted address log index to narrow eth_getLogs candidate blocks

Overview

This PR adds a background-built inverted index (address → block offsets) stored in a new LOG_ADDRESS_INDEX RocksDB column family. The query path in eth_getLogs uses it to jump directly to candidate blocks instead of scanning the full range. Benchmark numbers are impressive and the design is sound overall (confirmation-depth guarantee, scan fallback, exact re-filtering at every visited block). Below are the findings.


Correctness Issues

Non-atomic section commit (store.rs, index_pending_log_sections)

self.write_batch(LOG_ADDRESS_INDEX, entries)?;
self.set_indexed_log_sections(section + 1)?;  // separate write

If the process crashes between these two writes, the next run will reprocess the same section (the counter was not bumped). Because the offsets are sorted/deduped and the entries are written again with the same keys and values, this is idempotent — correctness is preserved — but it is worth calling out. A single write_batch that also includes the counter key would make recovery cleaner and eliminate any window.


Indexer can get stuck on a missing canonical hash (store.rs, lines ~1330–1340)

let block_hash = self.get_canonical_block_hash_sync(block_number)?.ok_or_else(
    || StoreError::Custom(format!(
        "log index: missing canonical hash for block {block_number}"
    )),
)?;

If a block inside a section lacks a canonical hash (e.g., at the boundary of the retained range, or due to a DB inconsistency), index_pending_log_sections returns an error. The caller in run_log_indexer logs a warn! and retries next poll — but section was never incremented, so it retries the same section forever. Consider either:

  • Skipping the section and emitting a higher-severity log, or
  • Treating a missing block as "not buried yet" and breaking early rather than returning Err.

decode_offsets silently discards a trailing byte (log_index.rs, line ~52)

bytes.chunks_exact(2).map(|c| u16::from_be_bytes([c[0], c[1]])).collect()

chunks_exact(2) ignores a lone trailing byte. If a stored value is corrupted and has an odd length, the truncation is silent. At minimum, a debug_assert_eq!(bytes.len() % 2, 0) would make corruption detectable in debug builds. In production, logging a warning on odd lengths would aid diagnosis.


Design / Behaviour Notes

from >= indexed_until uses u64 comparison — block 0 edge case

In get_candidate_blocks_by_address:

let indexed_until = self.get_indexed_log_sections()?.saturating_mul(log_index::SECTION_SIZE);
if from >= indexed_until { return Ok(None); }

When indexed_sections == 0, indexed_until == 0, and from >= 0 is trivially true for any u64, so None is correctly returned. The subsequent indexed_until - 1 subtraction is never reached. This is correct, but the reasoning is non-obvious; a comment explaining why saturating_mul and the >= guard together prevent underflow would help future readers.


Large address lists result in N × M index reads

In get_candidate_blocks_by_address, for each (address, section) pair a separate DB read is issued. With N addresses and M sections in the query range this is N × M reads. For typical eth_getLogs usage (small address sets) this is fine. For callers that pass large address arrays (the spec allows it), this can become many reads. A comment in the docstring noting the scaling would set reader expectations.


get_receipts_for_block_sync has a redundant guard (store.rs, ~line 1295)

if !k.starts_with(prefix) { break; }

After a prefix iterator the backend should never return keys outside the prefix. The guard is defensive and harmless, but it slightly obscures that the iterator is already bounded. A brief inline comment ("defensive: prefix_iterator should guarantee this") would clarify intent.


Missing Test Coverage

The unit tests in log_index.rs cover all the pure helper functions well. However, there are no tests for:

  • index_pending_log_sections — the core indexing logic (receipts → entries → counter update).
  • get_candidate_blocks_by_address — particularly the mixed indexed+unindexed tail case and the edge where from falls mid-section.

Integration tests for these two methods would significantly raise confidence, especially for the partial-coverage path (from < indexed_until <= to).


Minor

  • CONFIRMATION_DEPTH is defined in initializers.rs rather than alongside SECTION_SIZE in log_index.rs (or passed as a named constant from there). Moving or re-exporting it from the storage module would make the policy discoverable from one place.
  • build_section_entries takes impl IntoIterator<Item = (Address, Vec<u16>)> and does sort+dedup internally. The caller already uses a HashMap, so duplicates per address within a block are impossible by construction. The dedup is still correct as a guard but the docstring could note this to prevent confusion.

Summary

The overall design is solid: the confirmation-depth guarantee makes the index append-only (no reorg invalidation), the scan fallback preserves correctness for unindexed ranges, and the benchmark numbers speak for themselves. The two items that need the most attention before merge are the stuck-indexer-on-missing-hash (Item 2) and the non-atomic section commit (Item 1). The silent decode truncation (Item 3) is low severity but easy to fix. Test coverage for the storage methods would meaningfully reduce risk.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown

Lines of code report

Total lines added: 287
Total lines removed: 0
Total lines changed: 287

Detailed view
+------------------------------------------+-------+------+
| File                                     | Lines | Diff |
+------------------------------------------+-------+------+
| ethrex/cmd/ethrex/initializers.rs        | 824   | +26  |
+------------------------------------------+-------+------+
| ethrex/crates/networking/rpc/eth/logs.rs | 390   | +10  |
+------------------------------------------+-------+------+
| ethrex/crates/storage/api/tables.rs      | 25    | +1   |
+------------------------------------------+-------+------+
| ethrex/crates/storage/lib.rs             | 17    | +1   |
+------------------------------------------+-------+------+
| ethrex/crates/storage/log_index.rs       | 105   | +105 |
+------------------------------------------+-------+------+
| ethrex/crates/storage/store.rs           | 3086  | +144 |
+------------------------------------------+-------+------+

@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Copy link
Copy Markdown

Greptile Summary

This PR introduces an inverted address → blocks log index to make eth_getLogs competitive with indexed clients like geth. A background indexer writes one section (4096 blocks) at a time into a new LOG_ADDRESS_INDEX RocksDB column family once each section is buried 256 blocks deep; the query path narrows the block scan to only candidate blocks, falling back to the existing bloom-prefilter scan when the range is not indexed.

  • New log_index module encodes (address, section) → sorted u16 in-section block offsets; keys are address (20B) || section (8B BE).
  • Background indexer (run_log_indexer) polls every 12 s and indexes newly buried sections via spawn_blocking, never touching the block-import path; the new CF auto-creates so no migration is needed.
  • Query path in fetch_logs_with_filter calls get_candidate_blocks_by_address and falls back to the full range scan when the index cannot help; every returned candidate still goes through bloom + exact filtering.

Confidence Score: 3/5

The query path always falls back to a full scan when the index cannot help, so existing behaviour is never regressed. The background indexer has a stall risk that would silently degrade eth_getLogs back to scan mode on affected nodes.

The core indexing logic, key encoding, and query narrowing are all sound. The concern is in index_pending_log_sections: if get_canonical_block_hash_sync returns None for any block in a section, the function returns an error, the section counter is never advanced, and every subsequent poll retries and fails the same section forever, permanently blocking the index for all later sections. Any node where old canonical hashes are absent would silently regress to full-scan mode with no indicator beyond a repeating WARN log.

crates/storage/store.rs — specifically the index_pending_log_sections loop and the two separate writes that commit section data and the section counter.

Important Files Changed

Filename Overview
crates/storage/log_index.rs New module implementing the section-based address to block log index; encoding/decoding helpers are correct and well-tested.
crates/storage/store.rs Adds indexer logic and query helpers; indexer can permanently stall on a missing canonical hash, and section entries and counter are committed non-atomically.
crates/networking/rpc/eth/logs.rs Query path correctly falls back to full-range scan when the index returns None; bloom prefilter and exact topic/address filtering still apply on all visited blocks.
cmd/ethrex/initializers.rs Background indexer spawned off the block-import path with cancellation support; 256-block confirmation depth is well above the 128-block reorg limit.
crates/storage/api/tables.rs New LOG_ADDRESS_INDEX CF added to TABLES array; key schema is clearly documented.
crates/storage/lib.rs Trivial change exposing the new log_index module.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Client
    participant RPC as fetch_logs_with_filter
    participant Store
    participant Index as LOG_ADDRESS_INDEX
    participant Receipts as RECEIPTS_V2

    Note over Store: Background: run_log_indexer every 12s
    Store->>Store: index_pending_log_sections depth 256
    Store->>Receipts: get_receipts_for_block_sync per block in section
    Store->>Index: write_batch address to offsets per section
    Store->>Store: set_indexed_log_sections section plus 1

    Client->>RPC: eth_getLogs address fromBlock toBlock
    RPC->>Store: get_candidate_blocks_by_address addresses from to
    alt range is indexed
        Store->>Index: read address section entries
        Store-->>RPC: Some candidate block numbers
    else range not indexed or no address filter
        Store-->>RPC: None fall back to full range
    end
    loop for each candidate block
        RPC->>Store: get_block_header bloom prefilter
        RPC->>Store: get_receipts_for_block exact filter
        RPC-->>Client: matching RpcLog entries
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant Client
    participant RPC as fetch_logs_with_filter
    participant Store
    participant Index as LOG_ADDRESS_INDEX
    participant Receipts as RECEIPTS_V2

    Note over Store: Background: run_log_indexer every 12s
    Store->>Store: index_pending_log_sections depth 256
    Store->>Receipts: get_receipts_for_block_sync per block in section
    Store->>Index: write_batch address to offsets per section
    Store->>Store: set_indexed_log_sections section plus 1

    Client->>RPC: eth_getLogs address fromBlock toBlock
    RPC->>Store: get_candidate_blocks_by_address addresses from to
    alt range is indexed
        Store->>Index: read address section entries
        Store-->>RPC: Some candidate block numbers
    else range not indexed or no address filter
        Store-->>RPC: None fall back to full range
    end
    loop for each candidate block
        RPC->>Store: get_block_header bloom prefilter
        RPC->>Store: get_receipts_for_block exact filter
        RPC-->>Client: matching RpcLog entries
    end
Loading
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
crates/storage/store.rs:1343-1367
**Permanent indexer stall on missing canonical hash**

If `get_canonical_block_hash_sync` returns `None` for any block within a section (e.g., a node that snap-synced and does not retain old canonical hashes, or any DB inconsistency), the `?` propagates the error out of the entire function. `set_indexed_log_sections` is never called, so the section counter stays put. Every subsequent poll (every 12 s) will retry and fail the same section, permanently blocking the index for that section and every section after it — the indexer is stuck forever and every `eth_getLogs` call will fall back to a full scan for those ranges.

Consider catching this error per-block (log + skip the block or the whole section) so the indexer can make forward progress rather than looping indefinitely.

### Issue 2 of 3
crates/storage/store.rs:1354-1360
**Missing `receipt.succeeded` filter in indexer**

`fetch_logs_with_filter` only collects logs from successful transactions (`if receipt.succeeded`), but `index_pending_log_sections` indexes address offsets from ALL receipt logs regardless of transaction status. Ethereum's protocol guarantee (reverted transactions have empty `receipt.logs`) makes this harmless in practice, but it contradicts the module docstring's claim of "no false positives" and creates an inconsistency that could matter on custom / pre-EIP-658 networks or if receipt handling ever diverges.

```suggestion
                for receipt in &receipts {
                    if !receipt.succeeded {
                        continue;
                    }
                    for log in &receipt.logs {
                        if seen.insert(log.address) {
                            address_offsets.entry(log.address).or_default().push(offset);
                        }
                    }
                }
```

### Issue 3 of 3
crates/storage/store.rs:1362-1364
**Non-atomic commit of section entries and section counter**

`write_batch(LOG_ADDRESS_INDEX, entries)` and `set_indexed_log_sections(section + 1)` are two independent writes to two different column families (`LOG_ADDRESS_INDEX` and `MISC_VALUES`). A crash between them leaves the section data written but the counter un-advanced; on restart the section would be re-indexed (writing identical data — idempotent) before the counter catches up. This is safe and self-healing, but if both writes could be issued as a single RocksDB write batch it would avoid the transient inconsistency entirely.

Reviews (1): Last reviewed commit: "perf(l1): background log indexer + use a..." | Re-trigger Greptile

Comment thread crates/storage/store.rs Outdated
Comment on lines +1343 to +1367
let block_hash = self.get_canonical_block_hash_sync(block_number)?.ok_or_else(
|| {
StoreError::Custom(format!(
"log index: missing canonical hash for block {block_number}"
))
},
)?;
let receipts = self.get_receipts_for_block_sync(&block_hash)?;
let offset = log_index::offset_in_section(block_number);
// Each address contributes this block's offset at most once.
let mut seen = HashSet::new();
for receipt in &receipts {
for log in &receipt.logs {
if seen.insert(log.address) {
address_offsets.entry(log.address).or_default().push(offset);
}
}
}
}
let entries = log_index::build_section_entries(section, address_offsets);
self.write_batch(LOG_ADDRESS_INDEX, entries)?;
self.set_indexed_log_sections(section + 1)?;
section += 1;
newly_indexed += 1;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Permanent indexer stall on missing canonical hash

If get_canonical_block_hash_sync returns None for any block within a section (e.g., a node that snap-synced and does not retain old canonical hashes, or any DB inconsistency), the ? propagates the error out of the entire function. set_indexed_log_sections is never called, so the section counter stays put. Every subsequent poll (every 12 s) will retry and fail the same section, permanently blocking the index for that section and every section after it — the indexer is stuck forever and every eth_getLogs call will fall back to a full scan for those ranges.

Consider catching this error per-block (log + skip the block or the whole section) so the indexer can make forward progress rather than looping indefinitely.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/store.rs
Line: 1343-1367

Comment:
**Permanent indexer stall on missing canonical hash**

If `get_canonical_block_hash_sync` returns `None` for any block within a section (e.g., a node that snap-synced and does not retain old canonical hashes, or any DB inconsistency), the `?` propagates the error out of the entire function. `set_indexed_log_sections` is never called, so the section counter stays put. Every subsequent poll (every 12 s) will retry and fail the same section, permanently blocking the index for that section and every section after it — the indexer is stuck forever and every `eth_getLogs` call will fall back to a full scan for those ranges.

Consider catching this error per-block (log + skip the block or the whole section) so the indexer can make forward progress rather than looping indefinitely.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 9d21113. A missing canonical hash for a buried block now makes the indexer defer the section — it breaks out of the loop, logs a warn!, and leaves that section (and everything after it) unindexed rather than propagating an error. Queries fall back to scanning the un-indexed range, so results stay correct; the next poll retries in case it was transient. The section is never marked indexed unless it was built in full, so we can't silently drop a block's logs (which would be a false negative on the indexed path).

Comment thread crates/storage/store.rs
Comment on lines +1354 to +1360
for receipt in &receipts {
for log in &receipt.logs {
if seen.insert(log.address) {
address_offsets.entry(log.address).or_default().push(offset);
}
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing receipt.succeeded filter in indexer

fetch_logs_with_filter only collects logs from successful transactions (if receipt.succeeded), but index_pending_log_sections indexes address offsets from ALL receipt logs regardless of transaction status. Ethereum's protocol guarantee (reverted transactions have empty receipt.logs) makes this harmless in practice, but it contradicts the module docstring's claim of "no false positives" and creates an inconsistency that could matter on custom / pre-EIP-658 networks or if receipt handling ever diverges.

Suggested change
for receipt in &receipts {
for log in &receipt.logs {
if seen.insert(log.address) {
address_offsets.entry(log.address).or_default().push(offset);
}
}
}
for receipt in &receipts {
if !receipt.succeeded {
continue;
}
for log in &receipt.logs {
if seen.insert(log.address) {
address_offsets.entry(log.address).or_default().push(offset);
}
}
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/store.rs
Line: 1354-1360

Comment:
**Missing `receipt.succeeded` filter in indexer**

`fetch_logs_with_filter` only collects logs from successful transactions (`if receipt.succeeded`), but `index_pending_log_sections` indexes address offsets from ALL receipt logs regardless of transaction status. Ethereum's protocol guarantee (reverted transactions have empty `receipt.logs`) makes this harmless in practice, but it contradicts the module docstring's claim of "no false positives" and creates an inconsistency that could matter on custom / pre-EIP-658 networks or if receipt handling ever diverges.

```suggestion
                for receipt in &receipts {
                    if !receipt.succeeded {
                        continue;
                    }
                    for log in &receipt.logs {
                        if seen.insert(log.address) {
                            address_offsets.entry(log.address).or_default().push(offset);
                        }
                    }
                }
```

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 9d21113 — added if !receipt.succeeded { continue; } so the indexer mirrors the query path exactly. As you note it's a no-op on EIP-658 networks (reverted txs carry no logs), but it keeps the index exact everywhere and makes the docstring's "no false positives" claim hold unconditionally.

Comment thread crates/storage/store.rs Outdated
Comment on lines +1362 to +1364
let entries = log_index::build_section_entries(section, address_offsets);
self.write_batch(LOG_ADDRESS_INDEX, entries)?;
self.set_indexed_log_sections(section + 1)?;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Non-atomic commit of section entries and section counter

write_batch(LOG_ADDRESS_INDEX, entries) and set_indexed_log_sections(section + 1) are two independent writes to two different column families (LOG_ADDRESS_INDEX and MISC_VALUES). A crash between them leaves the section data written but the counter un-advanced; on restart the section would be re-indexed (writing identical data — idempotent) before the counter catches up. This is safe and self-healing, but if both writes could be issued as a single RocksDB write batch it would avoid the transient inconsistency entirely.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/storage/store.rs
Line: 1362-1364

Comment:
**Non-atomic commit of section entries and section counter**

`write_batch(LOG_ADDRESS_INDEX, entries)` and `set_indexed_log_sections(section + 1)` are two independent writes to two different column families (`LOG_ADDRESS_INDEX` and `MISC_VALUES`). A crash between them leaves the section data written but the counter un-advanced; on restart the section would be re-indexed (writing identical data — idempotent) before the counter catches up. This is safe and self-healing, but if both writes could be issued as a single RocksDB write batch it would avoid the transient inconsistency entirely.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 9d21113. Section entries and the counter now commit in a single write transaction (commit_indexed_log_section): one begin_writeput_batch(LOG_ADDRESS_INDEX, …) + put(MISC_VALUES, counter)commit. RocksDB write batches span column families, so the two writes are now atomic and there's no crash window at all.

@github-actions

Copy link
Copy Markdown

🤖 Codex Code Review

  1. [crates/storage/store.rs:1328](/home/runner/work/ethrex/ethrex/crates/storage/store.rs:1328) can get permanently stuck on partial-history/pruned stores. The indexer always starts from section 0 and aborts on the first missing canonical hash (block_number inside the section). This repo already models pruned history via earliest_block_number in [crates/storage/store.rs:1078](/home/runner/work/ethrex/ethrex/crates/storage/store.rs:1078) and the CLI handles earliest > 0 explicitly in [cmd/ethrex/cli.rs:1205](/home/runner/work/ethrex/ethrex/cmd/ethrex/cli.rs:1205). On such a node, the background task will warn every 12s and never index any later retained section. The fix is to seed/advance from the earliest retained section, or skip pre-history holes instead of treating them as fatal.

  2. [crates/storage/store.rs:1379](/home/runner/work/ethrex/ethrex/crates/storage/store.rs:1379) introduces an RPC-facing memory/perf regression: it builds a BTreeSet of every candidate block, then collects it into a Vec, including the entire unindexed tail (from.max(indexed_until)..=to). For large eth_getLogs ranges that only partially overlap indexed history, this changes the old streaming scan into O(range) heap growth that a remote caller can trigger with one address-filtered query. Suggest returning/consuming a streaming iterator instead of materializing the full union up front.

  3. [crates/storage/log_index.rs:55](/home/runner/work/ethrex/ethrex/crates/storage/log_index.rs:55) is too permissive for an index that is later treated as exact. decode_offsets() uses chunks_exact(2) and silently ignores any trailing byte, so malformed auxiliary data would cause [crates/storage/store.rs:1406](/home/runner/work/ethrex/ethrex/crates/storage/store.rs:1406) to miss candidate blocks and under-return logs instead of falling back to a safe full scan. This should be fallible, with invalid entries treated as “index unusable/rebuild needed”.

No consensus, gas-accounting, or EVM opcode logic is touched here; the main risks are around RPC correctness and operational safety. I did not run tests, and I don’t see coverage in this diff for partial-history stores or “indexed head + unindexed tail” log queries.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown

Benchmark Block Execution Results Comparison Against Main

Command Mean [s] Min [s] Max [s] Relative
base 58.853 ± 0.264 58.582 59.495 1.00
head 58.961 ± 0.235 58.630 59.311 1.00 ± 0.01

@ElFantasma

Copy link
Copy Markdown
Contributor Author

Thanks for the reviews. Addressed in 9d2111315 (on top of the index commits):

Fixed

  • Indexer stall on missing canonical hash (greptile P1 / Claude chore: create project structure #2) — now defers the section gracefully instead of erroring; unindexed ranges fall back to scan, so results stay correct.
  • decode_offsets silent truncation (Kimi docs: add milestones #1 / Claude build: add Github actions boilerplate #3) — now returns an error on an odd-length value instead of dropping the trailing byte; added a unit test.
  • Non-atomic section commit (greptile P2 / Claude docs: add milestones #1) — entries + counter now commit in a single cross-CF write transaction.
  • Missing receipt.succeeded filter (greptile P2) — indexer now skips reverted receipts, matching the query path.
  • Overflow in last_block + confirmation_depth (Kimi build: add Github actions boilerplate #3) — switched to saturating_add.
  • Block-0 / underflow reasoning (Claude design note) — added a comment explaining why indexed_until - 1 can't underflow.

Acknowledged, not changing

  • New read txn per block in get_receipts_for_block_sync (Kimi chore: create project structure #2) — backfill is one-time and steady-state is one section per 12 s poll, off the hot path, so this hasn't shown up as a cost; can batch a section's reads into one txn later if profiling says otherwise.
  • N×M index reads for large address lists (Claude) — fine for typical small address sets; pathological cases still get the correct (if unoptimized) scan fallback.
  • Defensive starts_with(prefix) guard (Claude) — intentional belt-and-suspenders around the prefix iterator; harmless.
  • CONFIRMATION_DEPTH location (Claude) — kept in initializers.rs alongside the poll cadence since it's indexer-task policy, whereas SECTION_SIZE is the on-disk layout constant in log_index.

Test coverage (Claude) — the pure helpers are unit-tested and the storage methods are exercised end-to-end by the mainnet benchmark (6184 sections indexed, query results verified). A Store-level integration test for the partial-coverage (from < indexed_until <= to) path needs a block+receipt fixture harness that doesn't exist in this crate yet; happy to add it as a follow-up if you'd like it in this PR.

Comment on lines -151 to +163
for block_num in from..=to {
// For each candidate block, load its receipts (bulk) and collect matching logs.
for block_num in block_numbers {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hides the error in the case where from is lower than the earliest block we have. Previously we would have tried to obtain the body and failed doing so.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 8ec9842. The index now records the first section it actually covers (log_index_start_section), and get_candidate_blocks_by_address returns None (→ full scan) when the query's from is below that covered start. So a range beginning below the node's retained history takes the original scan path — surfacing the missing-body error as before — instead of the index silently reporting no logs for it.

@github-project-automation github-project-automation Bot moved this from In Review to In Progress in ethrex_l1 Jun 18, 2026
Base automatically changed from perf/getlogs-bulk-receipts to main June 18, 2026 20:30
@ElFantasma ElFantasma force-pushed the perf/getlogs-inverted-index branch from 9d21113 to 1057030 Compare June 18, 2026 21:38
@ElFantasma ElFantasma force-pushed the perf/getlogs-inverted-index branch from 1057030 to 8ec9842 Compare June 18, 2026 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

L1 Ethereum client performance Block execution throughput and performance in general

Projects

Status: In Progress
Status: Todo

Development

Successfully merging this pull request may close these issues.

2 participants