
Conversation

@m-Peter m-Peter (Collaborator) commented Dec 18, 2025

Closes: #937

Description

Currently, as part of the EVM event ingestion engine, we replay each transaction and also generate a basic set of traces, which are stored in a dedicated storage index. This adds overhead to the ingestion engine, and the traces we generate are not used very frequently. We can drop this step entirely to increase the throughput of the ingestion engine.

Plus, QN, which hosts the public endpoints, has a caching layer for the debug_trace* endpoints, so this functionality was somewhat redundant anyway.
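
To illustrate the shape of the change, here is a minimal sketch with made-up types; it is not the gateway's actual ingestion code, just the before/after pattern:

```go
package main

// Minimal illustrative sketch; Transaction, Trace, and the functions
// below are hypothetical stand-ins, not the gateway's actual API.

type Transaction struct{ Hash string }
type Trace struct{ Data []byte }

func replay(tx Transaction)             {}
func collectTrace(tx Transaction) Trace { return Trace{} }
func storeTrace(hash string, t Trace)   {}

// Before: every replayed transaction also produced a trace that was
// written to a dedicated storage index (extra per-tx work, rarely read).
func indexBlockBefore(txs []Transaction) {
	for _, tx := range txs {
		replay(tx)
		storeTrace(tx.Hash, collectTrace(tx))
	}
}

// After: replay only; trace generation and the storage index are gone.
func indexBlockAfter(txs []Transaction) {
	for _, tx := range txs {
		replay(tx)
	}
}

func main() {
	indexBlockBefore([]Transaction{{Hash: "0xabc"}})
	indexBlockAfter([]Transaction{{Hash: "0xabc"}})
}
```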


For contributor use:

  • Targeted PR against master branch
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the Github PR explorer
  • Added appropriate labels

Summary by CodeRabbit

  • Refactor
    • Removed call tracing integration and related tracer collection mechanisms
    • Eliminated database-backed trace shortcuts for transaction and block operations
    • Removed tracer field dependencies from BlocksProvider and event ingestion engine


@coderabbitai coderabbitai bot (Contributor) commented Dec 18, 2025

Walkthrough

This PR removes the transaction trace generation and storage pipeline from the EVM event ingestion engine: it eliminates the CallTracerCollector, the trace-indexing logic, and the tracer dependencies across the replayer, and it simplifies the debug API's default tracer handling.

Changes

  • Trace storage and collection removal (services/ingestion/engine.go, services/ingestion/engine_test.go):
    Removed the traces storage field from Engine, eliminated the trace collection and storage loop in transaction indexing, and updated replayer initialization to pass nil instead of a tracer collector.
  • CallTracerCollector and config cleanup (services/replayer/call_tracer_collector.go, services/replayer/config.go):
    Deleted the entire call_tracer_collector.go file, containing the CallTracerCollector struct, the EVMTracer interface, DefaultCallTracer, and NewSafeTxTracer; removed the CallTracerCollector field from the replayer Config struct.
  • BlocksProvider tracer removal (services/replayer/blocks_provider.go, services/replayer/blocks_provider_test.go):
    Removed the tracer field and parameter from BlocksProvider, updated the NewBlocksProvider signature to drop the tracer argument, and updated BlockContext initialization to pass nil for the tracer (see the sketch after this table).
  • Bootstrap and engine initialization (bootstrap/bootstrap.go):
    Removed the CallTracerCollector creation, removed the Traces storage parameter from the ingestion engine constructor, and updated the BlocksProvider instantiation to omit the tracer parameter.
  • Debug API simplification (api/debug.go):
    Removed the isDefaultCallTracer helper function and the associated database-backed short-circuit path in traceTransaction and traceBlockByNumber; removed imports for errors, replayer, and related error type checks.
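
As referenced in the BlocksProvider row above, the constructor change has roughly the following shape; this is a sketch with placeholder types, not the actual signatures from the repo:

```go
package replayer // sketch only, not the actual package contents

type Blocks interface{} // placeholder for the blocks storage interface
type ChainID string     // placeholder for the chain ID type

// BlocksProvider no longer carries a tracer field.
type BlocksProvider struct {
	blocks  Blocks
	chainID ChainID
}

// Before (roughly): NewBlocksProvider(blocks, chainID, tracer)
// After: the tracer parameter is dropped, and BlockContext
// initialization passes nil for the tracer instead.
func NewBlocksProvider(blocks Blocks, chainID ChainID) *BlocksProvider {
	return &BlocksProvider{blocks: blocks, chainID: chainID}
}
```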

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

  • Key areas requiring attention:
    • Verification that removal of trace collection doesn't break any dependent functionality or queries
    • Confirmation that debug API tracing still functions correctly without the default tracer short-circuit
    • Review of BlocksProvider and replayer integration changes to ensure nil tracer handling is complete and correct
    • Validation that public API signature changes (parameter removals from NewBlocksProvider, NewEventIngestionEngine) are reflected across all call sites

Suggested labels

Technical Debt, EVM

Suggested reviewers

  • zhangchiqing
  • janezpodhostnik
  • peterargue

Poem

🐰 A tracer once hopped through each block,
Collecting each call with a tick-tock,
But overhead grew, the engine moved slow,
So we traced it right out—let the ingestion flow!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately and concisely summarizes the primary change, removing tx trace generation and storage from the ingestion engine, matching the changeset scope.
  • Linked Issues Check: ✅ Passed. The PR fully addresses issue #937 by eliminating transaction trace generation/storage from the ingestion engine, removing the CallTracerCollector integration, and disabling database-backed trace lookups.
  • Out of Scope Changes Check: ✅ Passed. All changes are directly related to removing tx trace generation and storage; no unrelated modifications were detected across the modified files.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b961d2f and 2621d8e.

📒 Files selected for processing (8)
  • api/debug.go (0 hunks)
  • bootstrap/bootstrap.go (1 hunks)
  • services/ingestion/engine.go (1 hunks)
  • services/ingestion/engine_test.go (6 hunks)
  • services/replayer/blocks_provider.go (1 hunks)
  • services/replayer/blocks_provider_test.go (11 hunks)
  • services/replayer/call_tracer_collector.go (0 hunks)
  • services/replayer/config.go (1 hunks)
💤 Files with no reviewable changes (2)
  • api/debug.go
  • services/replayer/call_tracer_collector.go
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-01-24T20:15:10.908Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 682
File: services/ingestion/event_subscriber.go:85-90
Timestamp: 2025-01-24T20:15:10.908Z
Learning: When calculating blocks to catch up in Flow, initialize the counter to 0 and only set it when the latest chain height is greater than the current height (latestOnChainHeight > currentHeight). There's no need to handle the else case as 0 blocks to catch up is the correct default when we're at the same height or ahead.

Applied to files:

  • services/replayer/blocks_provider_test.go
📚 Learning: 2024-10-17T18:04:04.165Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 615
File: bootstrap/bootstrap.go:167-197
Timestamp: 2024-10-17T18:04:04.165Z
Learning: In the `flow-evm-gateway` Go project, the validation ensuring that `startHeight` is less than or equal to `endHeight` is performed before the `StartTraceDownloader` method in `bootstrap/bootstrap.go`, so additional checks in this method are unnecessary.

Applied to files:

  • services/replayer/blocks_provider_test.go
📚 Learning: 2024-10-17T18:04:41.062Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 615
File: services/traces/engine.go:190-190
Timestamp: 2024-10-17T18:04:41.062Z
Learning: In `services/traces/engine.go`, the `e.indexBlockTraces` function limits the number of goroutines to a maximum of 5 and blocks until all are complete, so during backfill, concurrency is already managed to prevent resource exhaustion.

Applied to files:

  • services/ingestion/engine.go
📚 Learning: 2025-10-06T10:14:49.706Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: services/ingestion/event_subscriber.go:235-239
Timestamp: 2025-10-06T10:14:49.706Z
Learning: In services/ingestion/event_subscriber.go, when reconnecting after disconnect errors (DeadlineExceeded, Internal, Unavailable), the subscription should reconnect at lastReceivedHeight rather than lastReceivedHeight+1. This avoids errors when the next height doesn't exist yet, and duplicate event processing is safe because the ingestion engine is explicitly designed to be idempotent (storage uses batch.Set() which overwrites existing entries).

Applied to files:

  • services/ingestion/engine_test.go
🧬 Code graph analysis (2)
services/replayer/blocks_provider_test.go (1)
services/replayer/blocks_provider.go (1)
  • NewBlocksProvider (58-66)
services/ingestion/engine_test.go (2)
services/replayer/blocks_provider.go (1)
  • NewBlocksProvider (58-66)
services/ingestion/engine.go (1)
  • NewEventIngestionEngine (58-90)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test
🔇 Additional comments (12)
services/replayer/config.go (1)

7-11: LGTM!

The simplified Config struct correctly removes the CallTracerCollector field, aligning with the PR objective to eliminate trace generation overhead from the ingestion pipeline.
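
For reference, the post-change struct might look roughly like this; apart from ValidateResults and the removed CallTracerCollector field, the field names are illustrative guesses:

```go
package replayer // sketch only

// Config after the change: the CallTracerCollector field is gone.
type Config struct {
	ChainID         uint64 // illustrative type
	RootAddr        string // illustrative type
	ValidateResults bool   // replay still validates results during ingestion
	// CallTracerCollector: removed in this PR
}
```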

services/ingestion/engine_test.go (3)

64-77: LGTM!

The NewEventIngestionEngine call correctly removes the traces parameter and uses the simplified NewBlocksProvider constructor. The test setup aligns with the updated API.


346-348: LGTM!

Discarding the unused transaction return value with _ is appropriate here since this test ("ingest block first and then transaction even if received out-of-order") only needs res.TxHash for block creation validation.


594-600: LGTM!

The simplified defaultReplayerConfig() function correctly reflects the removal of the CallTracerCollector field from replayer.Config.

bootstrap/bootstrap.go (3)

171-179: LGTM!

The simplified BlocksProvider and replayerConfig initialization correctly removes the tracer-related parameters. The ValidateResults: true setting ensures that transaction replay still validates results without generating traces during ingestion.


182-195: LGTM!

The NewEventIngestionEngine call correctly omits the traces parameter, aligning with the PR objective to remove trace generation from the ingestion pipeline.


343-353: Traces storage retained for debug API - verify this is intentional.

The b.storages.Traces is still passed to NewDebugAPI. This appears intentional since the debug_trace* endpoints may still need to store/retrieve traces on-demand. However, since trace generation during ingestion is removed, this storage will remain empty unless the debug API populates it.

Please confirm this is the intended behavior - that traces storage exists for on-demand debug API usage rather than being pre-populated during ingestion.

services/replayer/blocks_provider_test.go (2)

26-68: LGTM!

The TestOnBlockReceived tests are correctly updated to use the simplified NewBlocksProvider constructor without the tracer parameter.


70-96: LGTM!

The TestBlockContext test correctly validates that BlockContext returns expected values. The test no longer asserts tracer-related behavior, which aligns with the removal of tracer functionality.

services/replayer/blocks_provider.go (2)

50-66: LGTM!

The BlocksProvider struct and NewBlocksProvider constructor are correctly simplified by removing the tracer field. The struct now only maintains the essential state needed for block snapshot functionality.


21-41: Clean implementation - nil tracer is appropriate for the replayer module.

The BlocksProvider in services/replayer/blocks_provider.go intentionally omits the tracer field, making nil the correct parameter. Unlike the requester and executor modules, which maintain tracer state, the replayer serves as a read-only block replay mechanism where tracing is not needed. This design choice cleanly separates concerns and removes trace-collection overhead from block-context creation in the replay flow.

services/ingestion/engine.go (1)

236-244: Verify nil tracer handling in sync.NewReplayer before merging.

The change from passing e.replayerConfig.CallTracerCollector.TxTracer() to nil correctly disables trace collection during replay and aligns with the PR's performance objectives. However, the external flow-go package documentation for sync.NewReplayer doesn't appear to explicitly document nil support for the tracer parameter. Confirm that:

  • The tracer parameter in sync.NewReplayer is designed to safely handle nil values without causing nil pointer dereferences during replay execution
  • No nil checks or safety mechanisms are relied upon downstream that expect a valid tracer instance
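
As a generic illustration of the nil-safety property being asked about, geth-style tracing hooks are typically guarded on both the hooks pointer and each function field; this is a sketch of that pattern, not the actual sync.NewReplayer internals:

```go
package main

// Sketch of nil-tolerant tracer hooks in the style of go-ethereum's
// tracing hooks; not the actual flow-go replayer code.

type Hooks struct {
	OnTxStart func(txHash string)
	OnTxEnd   func(err error)
}

func replayTx(hooks *Hooks, txHash string) error {
	// A nil *Hooks (or a nil function field) simply disables tracing;
	// every invocation is guarded, so passing nil is safe by design.
	if hooks != nil && hooks.OnTxStart != nil {
		hooks.OnTxStart(txHash)
	}
	err := execute(txHash)
	if hooks != nil && hooks.OnTxEnd != nil {
		hooks.OnTxEnd(err)
	}
	return err
}

func execute(txHash string) error { return nil }

func main() {
	_ = replayTx(nil, "0xabc") // tracing disabled, no panic
}
```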


@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
services/replayer/blocks_provider_test.go (1)

21-24: Remove unused native tracers import.

This blank import was previously needed when BlocksProvider used tracer functionality, but no references to tracers.DefaultDirectory or any tracer-related code exist in this file. The import can be safely removed.
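
For clarity, the leftover is a blank import of geth's native tracers, which registers them as an init side effect; it likely looks something like this (the exact import line in the test file may differ):

```go
import (
	// Registers geth's built-in tracers (e.g. callTracer) into
	// tracers.DefaultDirectory as an init side effect; with the tracer
	// code removed from BlocksProvider, nothing in this file needs it.
	_ "github.com/ethereum/go-ethereum/eth/tracers/native"
)
```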


@peterargue peterargue (Contributor) left a comment

looks good. would it make sense to add an in-memory LRU cache, or do you think it's uncommon enough that it's not worth it?

@m-Peter m-Peter (Collaborator, Author) commented Dec 18, 2025

looks good. would it make sense to add an in-memory LRU cache, or do you think it's uncommon enough that it's not worth it?

@peterargue It turns out that this tracer config is actually pretty limiting. Most tools need to see all the calls (not only the top-level call), and in some cases they also want to see the logs. Take a look at the default tracer that Blockscout uses: https://evm.flowscan.io/tx/0x448e487c9408f7b3699d8989aa660c0498b84c7bc97fba2eb32ffb652bcac7d8?tab=raw_trace . So it's not worth adding an in-memory LRU cache just for this specific tracer config. And for the one Blockscout uses, it's also not worth it, as the response is rather large, so we are better off relying on QN's caching layer.
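
For context on why the config matters: callTracer accepts per-request options on debug_traceTransaction, notably onlyTopCall (restrict output to the top-level call) and withLog (include emitted logs). A sketch of requesting the fuller trace with go-ethereum's RPC client follows; the endpoint URL is an assumption, and the tx hash is the one from the Blockscout link above:

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	"github.com/ethereum/go-ethereum/rpc"
)

func main() {
	// Assumed public Flow EVM endpoint; substitute your own RPC URL.
	client, err := rpc.Dial("https://mainnet.evm.nodes.onflow.org")
	if err != nil {
		log.Fatal(err)
	}
	var result json.RawMessage
	err = client.CallContext(context.Background(), &result,
		"debug_traceTransaction",
		"0x448e487c9408f7b3699d8989aa660c0498b84c7bc97fba2eb32ffb652bcac7d8",
		map[string]any{
			"tracer": "callTracer",
			// The full call tree plus logs, which tools like Blockscout
			// need; a limited config would set onlyTopCall to true.
			"tracerConfig": map[string]any{
				"onlyTopCall": false,
				"withLog":     true,
			},
		})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(result))
}
```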


Development

Successfully merging this pull request may close these issues.

Remove tx trace generation & storage from ingestion engine
