ChainIndexor is a high-performance, modular blockchain log indexer and event processor for Ethereum and EVM-compatible chains. It enables fast, reliable, and scalable indexing of smart contract events, making it easy to build analytics, dashboards, and backend services on top of blockchain data.
ChainIndexor is designed to:
- Efficiently fetch, filter, and store blockchain logs and events.
- Support custom indexers for any contract/event type.
- Handle large-scale data, reorgs, and RPC limitations robustly.
- Provide a flexible foundation for explorers, analytics, and DeFi backends.
Key features:
- Modular Indexer Framework: Easily add custom indexers for any contract/event.
- Recursive Log Fetching: Automatically splits queries to handle RPC "too many results" errors.
- Reorg Detection & Recovery: Detects chain reorganizations and safely rolls back indexed data.
- Configurable Database Backend: Uses SQLite with connection pooling, PRAGMA tuning, and schema migrations.
- Batch & Chunked Downloading: Efficiently downloads logs in configurable block ranges.
- Comprehensive Test Suite: Includes unit and integration tests for all major components.
- Example Indexers: Production-grade ERC20 token indexer included as a template.
ChainIndexor is optimized for:
- Fast initial syncs and incremental updates.
- Minimal RPC calls via batching and chunking.
- Safe operation under RPC rate limits and large data volumes.
- Multi-indexer support with independent start blocks and schemas.
Quick start:
- Configure: Edit `config.example.yaml` to specify RPC endpoints, indexers, and database settings.
- Run Migrations: Ensure database schemas are up to date (automatic on startup).
- Implement Indexers: Use the provided interface to add custom event processors.
- Start Indexing: Run the downloader to begin fetching and indexing logs.
Example:
Run the test in `examples/run_test.go` to test ChainIndexor.
ChainIndexor supports YAML, JSON, and TOML configuration formats. Below is a comprehensive guide to all configuration options.
```yaml
downloader:
  # ... downloader settings
indexers:
  # ... indexer settings
```

The downloader is responsible for fetching logs from the blockchain and coordinating indexers.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `rpc_url` | string | Yes | - | Ethereum RPC endpoint URL (HTTP/HTTPS/WebSocket) |
| `chunk_size` | uint64 | No | 5000 | Number of blocks to fetch per `eth_getLogs` call. Adjust based on RPC limits |
| `finality` | string | No | "finalized" | Block finality mode: "finalized", "safe", or "latest" |
| `finalized_lag` | uint64 | No | 0 | Blocks behind head to consider finalized (only used when `finality: "latest"`) |
| `db` | object | Yes | - | Database configuration for the downloader |
| `retention_policy` | object | No | - | Optional log retention policy configuration |
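For example, a downloader that tracks the chain head more closely can combine `finality: "latest"` with a `finalized_lag`. This is only a sketch; the endpoint and lag value are placeholders, not recommendations:

```yaml
downloader:
  rpc_url: "https://example-rpc.invalid"  # placeholder RPC endpoint
  chunk_size: 5000
  finality: "latest"   # follow the chain head instead of finalized blocks
  finalized_lag: 64    # treat blocks this far behind head as final (illustrative value)
  db:
    path: "./data/downloader.sqlite"  # database settings are described below
```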
SQLite database settings for optimal performance:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `path` | string | Yes | - | File path to the SQLite database |
| `journal_mode` | string | No | "WAL" | SQLite journal mode: "WAL", "DELETE", "TRUNCATE", "PERSIST", "MEMORY". WAL recommended for concurrency |
| `synchronous` | string | No | "NORMAL" | Synchronization level: "FULL", "NORMAL", "OFF". NORMAL balances safety and performance |
| `busy_timeout` | int | No | 5000 | Milliseconds to wait when the database is locked |
| `cache_size` | int | No | 10000 | Page cache size (negative = KB, positive = pages). Higher values improve performance |
| `max_open_connections` | int | No | 25 | Maximum number of open database connections |
| `max_idle_connections` | int | No | 5 | Maximum number of idle connections in the pool |
| `enable_foreign_keys` | bool | No | false | Enable foreign key constraint enforcement |
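The `cache_size` sign convention is easy to miss: positive values are page counts, negative values are kilobytes. A minimal sketch of both forms (values are illustrative):

```yaml
db:
  path: "./data/downloader.sqlite"
  journal_mode: WAL
  synchronous: NORMAL
  busy_timeout: 5000
  cache_size: 10000     # positive: 10,000 pages
  # cache_size: -65536  # negative: roughly 64 MB of page cache
  max_open_connections: 25
  max_idle_connections: 5
  enable_foreign_keys: false
```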
Optional configuration to automatically prune old logs and manage database size:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `max_db_size_mb` | uint64 | No | 0 | Maximum database size in megabytes. 0 = unlimited. Triggers pruning when exceeded |
| `max_blocks` | uint64 | No | 0 | Maximum number of blocks to retain from the finalized block. 0 = keep all blocks |
How Retention Works:
- When `max_blocks` is set, blocks older than `(newest_block - max_blocks)` are pruned
- When `max_db_size_mb` is set, the oldest blocks are pruned when the database exceeds the size limit
- Both policies can be used together; the more aggressive threshold applies (see the sketch below)
- Pruning runs automatically after log ingestion and includes WAL-aware vacuuming
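A sketch of a retention block that uses both limits together; the numbers are illustrative and should be sized to the deployment's storage budget:

```yaml
retention_policy:
  max_db_size_mb: 2048  # prune oldest blocks once the database exceeds ~2 GB
  max_blocks: 50000     # also prune blocks older than (newest_block - 50000)
```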
Optional configuration for automated database maintenance tasks (WAL checkpoints and VACUUM operations):
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `enabled` | bool | No | false | Enable background maintenance tasks |
| `check_interval` | string | No | "30m" | How often to run maintenance (e.g., "5m", "30m", "1h") |
| `vacuum_on_startup` | bool | No | false | Run maintenance immediately on startup before indexing begins |
| `wal_checkpoint_mode` | string | No | "TRUNCATE" | WAL checkpoint mode: "PASSIVE", "FULL", "RESTART", "TRUNCATE" |
Maintenance Operations:
- WAL Checkpoint: Moves data from Write-Ahead Log (WAL) file back to main database file
- VACUUM: Reclaims fragmented space and optimizes database structure
- Both operations coordinate with active indexing operations to avoid conflicts
Checkpoint Modes:
- `PASSIVE`: Non-blocking, skips pages if busy (least aggressive)
- `FULL`: Waits for transactions, checkpoints all pages
- `RESTART`: Like `FULL` but also resets the WAL file
- `TRUNCATE`: Most aggressive - resets and truncates the WAL file (recommended for production)
When to Enable:
- Essential for long-running indexers to prevent WAL file growth
- Recommended for production deployments
- Disable for short-lived or test environments
- Works seamlessly with retention policies for optimal disk usage (see the sketch below)
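As an illustrative variant of the full example later in this document, a deployment that prefers non-blocking checkpoints could use the `PASSIVE` mode with a longer interval; all values here are placeholders:

```yaml
maintenance:
  enabled: true
  check_interval: "1h"            # one maintenance pass per hour
  vacuum_on_startup: false        # skip the startup pass for faster restarts
  wal_checkpoint_mode: "PASSIVE"  # non-blocking; "TRUNCATE" reclaims the most space
```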
Configure one or more indexers to process specific events:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `name` | string | Yes | - | Unique identifier for this indexer |
| `start_block` | uint64 | No | 0 | Block number to start indexing from. 0 = genesis |
| `db` | object | Yes | - | Database configuration for the indexer (same format as the downloader `db`) |
| `contracts` | array | Yes | - | List of contracts and events to index |
Each contract specifies which events to monitor:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `address` | string | Yes | - | Ethereum contract address (hex format with 0x prefix) |
| `events` | array | Yes | - | List of event signatures to index |
Event Signature Format: `EventName(type1,type2,...)` (canonical types, no spaces, no parameter names).

Examples:
- `Transfer(address,address,uint256)` - ERC20 Transfer
- `Approval(address,address,uint256)` - ERC20 Approval
- `Swap(address,uint256,uint256,uint256,uint256,address)` - Uniswap Swap
A complete example configuration:

```yaml
# YAML anchor for reusable database config
common_db: &common_db
  journal_mode: WAL
  synchronous: NORMAL
  busy_timeout: 5000
  cache_size: 10000
  max_open_connections: 25
  max_idle_connections: 5
  enable_foreign_keys: true

downloader:
  rpc_url: "https://mainnet.infura.io/v3/YOUR_API_KEY"
  chunk_size: 5000
  finality: "finalized"
  db:
    <<: *common_db
    path: "./data/downloader.sqlite"
  retention_policy:
    max_db_size_mb: 1000  # Keep database under 1GB
    max_blocks: 10000     # Retain last 10k blocks
  maintenance:
    enabled: true
    check_interval: "30m"            # Run maintenance every 30 minutes
    vacuum_on_startup: true          # Clean database on startup
    wal_checkpoint_mode: "TRUNCATE"  # Aggressive WAL reclamation

indexers:
  - name: "ERC20Indexer"
    start_block: 12000000
    db:
      <<: *common_db
      path: "./data/erc20.sqlite"
    contracts:
      - address: "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"
        events:
          - "Transfer(address,address,uint256)"
          - "Approval(address,address,uint256)"
      - address: "0xdAC17F958D2ee523a2206206994597C13D831ec7"
        events:
          - "Transfer(address,address,uint256)"
  - name: "UniswapV2Indexer"
    start_block: 10000835
    db:
      <<: *common_db
      path: "./data/uniswap.sqlite"
    contracts:
      - address: "0x5C69bEe701ef814a2B6a3EDD4B1652CB9cc5aA6f"
        events:
          - "PairCreated(address,address,address,uint256)"
```

Performance Tuning:
- Increase `chunk_size` for faster syncing if the RPC allows it (watch for "query returned more than X results" errors); see the sketch after this list
- Use WAL mode (`journal_mode: WAL`) for better concurrent read/write performance
- Increase `cache_size` in memory-rich environments
- Use `finality: "latest"` with an appropriate `finalized_lag` for faster indexing (less safe against reorgs)
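As an illustration only (actual limits vary by provider), a rate-limited public endpoint usually wants a smaller `chunk_size`, while a dedicated node can handle larger ranges; recursive splitting still applies if a range returns too many results:

```yaml
downloader:
  rpc_url: "https://example-public-rpc.invalid"  # placeholder endpoint
  chunk_size: 1000     # conservative range for a rate-limited provider (illustrative)
  # chunk_size: 20000  # larger range for a dedicated or archive node (illustrative)
```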
Production Settings:
- Use `finality: "finalized"` for maximum safety against reorgs
- Enable a `retention_policy` to prevent unbounded database growth
- Set a reasonable `max_db_size_mb` based on available storage
- Monitor `max_blocks` to balance data retention needs with performance
- Enable `maintenance` with an appropriate `check_interval` (e.g., "30m" or "1h")
- Use `wal_checkpoint_mode: "TRUNCATE"` for maximum space reclamation
- Enable `vacuum_on_startup: true` for fresh starts after crashes
- Configure logging levels per component for production monitoring
Development Settings:
- Use `finality: "latest"` for faster local testing
- Disable the retention policy or set high limits to keep all data
- Use a smaller `chunk_size` to test the recursive splitting logic
- Enable `logging.development: true` for detailed debug output with stack traces (see the sketch after this list)
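Putting those suggestions together, a local development setup might look like the following sketch; the endpoint and values are placeholders, and it assumes `logging` sits at the top level of the config, as in the logging examples later in this document:

```yaml
downloader:
  rpc_url: "http://localhost:8545"  # local dev node (placeholder)
  chunk_size: 100                   # small ranges to exercise recursive splitting
  finality: "latest"
  db:
    path: "./data/dev.sqlite"

logging:
  default_level: "debug"
  development: true  # stack traces and colored console output
```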
Multi-Indexer Best Practices:
- Each indexer gets its own database for isolation
- Set an appropriate `start_block` per indexer to avoid unnecessary syncing
- Use descriptive names for easier monitoring and debugging
ChainIndexor provides structured logging with per-component log level configuration, allowing you to fine-tune verbosity for different parts of the system.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `default_level` | string | No | "info" | Default log level for all components: "debug", "info", "warn", "error" |
| `development` | bool | No | false | Enable development mode (stack traces, colored console output) |
| `component_levels` | map | No | {} | Per-component log level overrides |
| Component | Description |
|---|---|
| `downloader` | Main download orchestration and indexer coordination |
| `log-fetcher` | Blockchain log fetching and RPC interaction |
| `sync-manager` | Sync state management and checkpoint persistence |
| `reorg-detector` | Blockchain reorganization detection |
| `log-store` | Log storage layer and database operations |
| `maintenance` | Database maintenance operations (WAL checkpoint, VACUUM) |
Basic configuration:

```yaml
logging:
  default_level: "info"
  development: false
```

Per-component levels:

```yaml
logging:
  default_level: "info"
  development: false
  component_levels:
    downloader: "info"
    log-fetcher: "debug"    # verbose RPC logging
    sync-manager: "info"
    reorg-detector: "warn"  # only warnings and errors
    log-store: "info"
    maintenance: "debug"    # detailed maintenance logs
```

Development mode:

```yaml
logging:
  default_level: "debug"
  development: true  # enables stack traces and colored output
  component_levels:
    log-fetcher: "debug"
    maintenance: "debug"
```

Production Monitoring:

```yaml
logging:
  default_level: "info"
  development: false
  component_levels:
    reorg-detector: "warn"  # reduce noise from normal operations
    maintenance: "info"     # track maintenance operations
```

Debugging RPC Issues:

```yaml
logging:
  default_level: "info"
  component_levels:
    log-fetcher: "debug"  # detailed RPC request/response logging
```

Debugging Performance:

```yaml
logging:
  default_level: "info"
  component_levels:
    downloader: "debug"    # indexing throughput
    sync-manager: "debug"  # checkpoint frequency
    log-store: "debug"     # database operation timing
```

Minimal Logging (High-Performance):

```yaml
logging:
  default_level: "warn"  # only warnings and errors
  development: false
```

Log Levels:
- debug: Verbose output including internal state, timing, and detailed operations. Use for troubleshooting.
- info: Normal operational messages. Good default for production.
- warn: Unexpected conditions that don't prevent operation. Alerts for potential issues.
- error: Errors that require attention but may allow continued operation.
Clone the repo and build:
```sh
git clone https://github.com/goran-ethernal/ChainIndexor.git
cd ChainIndexor
go build ./...
```

Adding custom indexers:
- Add new indexers in `examples/indexers/`.
examples/indexers/. - Use the ERC20 indexer as a template for custom event processing.
- Register indexers in your config and main application.
Run all tests and coverage:
```sh
make test
make test-coverage
```

Contributions are welcome! Please open issues and pull requests for bug fixes, features, and documentation.
ChainIndexor is Apache-2.0 licensed. See LICENSE for details.
Built on top of go-ethereum, testify, and other great open source projects.
For questions, support, or collaboration, open an issue or reach out via GitHub Discussions.