Fosurero/ShardLake-Toolkit

🔷 ShardLake Toolkit

An open-source extension of near-lake-framework-rs for sharded archival nodes with an AI query optimizer.



Pre-Grant PoC – February 2026

This repository is the working proof-of-concept submitted alongside the ShardLake Toolkit grant proposal to the NEAR Foundation.

Motivation

From the NEAR Ecosystem 2025 Year in Review:

"Sharded archival nodes represent one of the most significant infrastructure improvements in 2025. By allowing archival operators to store and serve data for individual shards rather than the entire chain, NEAR dramatically reduces the hardware requirements for running archival infrastructure, making it feasible for a much broader set of participants to contribute to the network's data availability layer."

ShardLake Toolkit builds on this foundation by providing a developer-friendly Rust library that:

  1. Filters NEAR Lake streams by shard: only the shards you need ever touch your memory or disk.
  2. Pre-indexes data for AI-agent queries: a bloom-filter + inverted-index engine that delivers 4–11× faster pattern matching than a vanilla full scan.
  3. Ships a ready-to-use CLI for quick experimentation and benchmarking.
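
To make the first point concrete, here is a minimal sketch of shard filtering over simplified block data. `DemoBlock`, `DemoShard`, `filter_shards`, and `demo_block` are hypothetical names invented for this illustration; they are not types from near-lake-framework-rs or from this crate's public API, and the real streamer may be structured differently.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one shard's slice of a block (illustration only).
#[derive(Debug, Clone, PartialEq)]
struct DemoShard {
    shard_id: u64,
    receipts: Vec<String>,
}

/// Hypothetical stand-in for a streamed block (illustration only).
#[derive(Debug, Clone, PartialEq)]
struct DemoBlock {
    height: u64,
    shards: Vec<DemoShard>,
}

/// Keep only the wanted shards, dropping the rest before anything is stored or indexed.
fn filter_shards(block: DemoBlock, wanted: &HashSet<u64>) -> DemoBlock {
    DemoBlock {
        height: block.height,
        shards: block
            .shards
            .into_iter()
            .filter(|s| wanted.contains(&s.shard_id))
            .collect(),
    }
}

/// Build a fake four-shard block with one receipt per shard.
fn demo_block(height: u64) -> DemoBlock {
    DemoBlock {
        height,
        shards: (0..4)
            .map(|id| DemoShard {
                shard_id: id,
                receipts: vec![format!("receipt-{}-{}", height, id)],
            })
            .collect(),
    }
}

fn main() {
    let wanted = HashSet::from([0u64, 3]);
    let filtered = filter_shards(demo_block(100_000_000), &wanted);
    for shard in &filtered.shards {
        println!(
            "block {} shard {} receipts {:?}",
            filtered.height, shard.shard_id, shard.receipts
        );
    }
}
```

Filtering by value (consuming the block) means the unwanted shard data is dropped as soon as the filter runs, which matches the goal of keeping it out of memory and off disk.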

Features

Feature                       Description
🧩 Shard-filtered streaming   Stream only selected shards from NEAR Lake, cutting resource usage proportionally
🤖 AI Query Optimizer         Pre-indexed search for method names (ft_transfer, nft_mint), event logs, and AI-agent intent patterns
⏱️ Temporal queries           Restrict any query to a block-height window
🔍 Regex support              ft_transfer|nft_mint: combine patterns in a single pass
📊 Built-in benchmarks        Every CLI run shows optimizer vs. vanilla speedup
🐳 Docker image               One command to build and run
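
As an illustration of how pre-indexed search and temporal queries can fit together, the sketch below builds a tiny inverted index keyed by term and then by block height. `DemoIndex` and its methods are hypothetical names for this example only; the actual layout in optimizer.rs is not documented here and may differ.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical inverted index: term -> (block height -> number of hits).
/// The BTreeMap keeps heights sorted, so a block-height window is a range scan.
#[derive(Default)]
struct DemoIndex {
    postings: HashMap<String, BTreeMap<u64, usize>>,
}

impl DemoIndex {
    /// Record that `term` occurred once in the block at `height`.
    fn insert(&mut self, term: &str, height: u64) {
        *self
            .postings
            .entry(term.to_string())
            .or_default()
            .entry(height)
            .or_insert(0) += 1;
    }

    /// Count hits for `term` within the inclusive height window [from, to].
    fn count_in_window(&self, term: &str, from: u64, to: u64) -> usize {
        self.postings
            .get(term)
            .map(|heights| heights.range(from..=to).map(|(_, n)| *n).sum::<usize>())
            .unwrap_or(0)
    }
}

fn main() {
    let mut index = DemoIndex::default();
    index.insert("ft_transfer", 100_000_010);
    index.insert("ft_transfer", 100_000_010);
    index.insert("ft_transfer", 100_000_500);
    index.insert("nft_mint", 100_000_020);

    // Only the two hits at height 100,000,010 fall inside this window.
    let hits = index.count_in_window("ft_transfer", 100_000_000, 100_000_100);
    println!("ft_transfer hits in window: {}", hits);
}
```

A lookup touches only the postings for the queried term and only the heights inside the window, which is where an index can beat a full scan of every receipt.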

Installation

From source (recommended)

git clone https://github.com/Fosurero/ShardLake-Toolkit.git
cd ShardLake-Toolkit
cargo build --release

Via cargo install

cargo install --git https://github.com/Fosurero/ShardLake-Toolkit.git

Quick Start

# Run the default demo: shards 0,3 • 1,000 blocks • query "intent"
cargo run -- run --network testnet --start-block 100000000 --shards 0,3 --query "intent"

Expected Output

  ╔═══════════════════════════════════════════════════════╗
  ║   🔷  ShardLake Toolkit  v0.1.0                      ║
  ╚═══════════════════════════════════════════════════════╝

  Network:      testnet
  Start Block:  100,000,000
  Shards:       [0, 3]
  Query:        "intent"
  Mode:         Demo (1,000 simulated blocks per shard)

  ⏳ Generating demo data …  done (14.2ms)
  ⏳ Building optimizer index …  done (11.8ms, 12,847 entries)

  📊 Results
  ┌─────────┬────────────────┬──────────┬──────────────┐
  │ Shard   │ Blocks         │ Matches  │ Query Time   │
  ├─────────┼────────────────┼──────────┼──────────────┤
  │ 0       │ 1,000          │ 168      │ 0.4ms        │
  │ 3       │ 1,000          │ 152      │ 0.4ms        │
  ├─────────┼────────────────┼──────────┼──────────────┤
  │ Total   │ 2,000          │ 320      │ 0.8ms        │
  └─────────┴────────────────┴──────────┴──────────────┘

  ⚡ ShardLake Optimizer: 0.82ms  (filtered 2 of 4 shards)
  🐌 Vanilla full scan:   3.91ms  (all 4 shards, no index)
  🚀 Speedup:             4.8× faster

  🔍 Match breakdown:
      intent        192 hits
      event          88 hits
      method         40 hits

  📋 Sample matches (first 5):
      Block 100,000,042 │ Shard 0 │ [intent] intent:swap NEAR->USDT amount=50
      Block 100,000,107 │ Shard 3 │ [intent] intent:bridge ETH->NEAR
      Block 100,000,218 │ Shard 0 │ [intent] intent:delegate stake=100
      Block 100,000,319 │ Shard 3 │ [event]  EVENT_JSON:{"standard":"nep141"...
      Block 100,000,455 │ Shard 0 │ [intent] intent:query price_feed NEAR/USD

  ✅ Complete.  320 events matched query "intent" across 2 shards in 42.1ms.
     Indexed 2,000 blocks, 7,412 receipts, 10 unique methods.
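
The speedup line in the sample output appears to be the ratio of the vanilla scan time to the optimizer time. The sketch below reproduces that arithmetic from the printed timings; `speedup` is a hypothetical helper written for this illustration, not a function exported by the crate.

```rust
/// Ratio of baseline time to optimized time (both in milliseconds).
fn speedup(vanilla_ms: f64, optimized_ms: f64) -> f64 {
    vanilla_ms / optimized_ms
}

fn main() {
    // Timings copied from the sample output: 3.91 ms vanilla, 0.82 ms optimized.
    let ratio = speedup(3.91, 0.82);
    // Rounded to one decimal place, as in the CLI's summary line.
    println!("Speedup: {:.1}x faster", ratio);
}
```

With these inputs the ratio is about 4.77, which rounds to the 4.8× reported above.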

More Examples

# Token transfers only
cargo run -- run --query "ft_transfer" --num-blocks 5000

# Regex: transfers OR mints, all shards
cargo run -- run --query "ft_transfer|nft_mint" --shards 0,1,2,3

# Run the library examples
cargo run --example sharded_archival_streamer
cargo run --example ai_query_demo
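
The `ft_transfer|nft_mint` query above combines two patterns with alternation. As a rough, dependency-free sketch of that idea, the example below splits the query on `|` and checks each alternative as a plain substring. This is a deliberate simplification, not a regex engine and not the toolkit's matcher; `matches_any` is a hypothetical helper.

```rust
/// Return true if `text` contains any of the `|`-separated alternatives in `query`.
/// Plain substring matching only: no anchors, character classes, or escapes.
fn matches_any(query: &str, text: &str) -> bool {
    query
        .split('|')
        .filter(|alt| !alt.is_empty())
        .any(|alt| text.contains(alt))
}

fn main() {
    let logs = ["ft_transfer amount=50", "nft_mint token_id=7", "storage_deposit"];
    // Keep only the lines that match at least one alternative.
    let hits: Vec<&str> = logs
        .iter()
        .copied()
        .filter(|line| matches_any("ft_transfer|nft_mint", line))
        .collect();
    println!("{:?}", hits);
}
```

Each line is tested against all alternatives in one traversal of the data, which is the "single pass" property the feature list describes.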

Project Structure

ShardLake-Toolkit/
├── Cargo.toml                          # Dependencies & build config
├── src/
│   ├── lib.rs                          # Core types & module re-exports
│   ├── streamer.rs                     # Shard-filtered NEAR Lake streamer
│   ├── optimizer.rs                    # AI query optimizer (index + bloom)
│   └── main.rs                         # CLI binary (clap)
├── examples/
│   ├── sharded_archival_streamer.rs    # Streamer example
│   └── ai_query_demo.rs                # Optimizer example
├── benches/
│   └── optimizer_bench.rs              # Criterion benchmarks
├── docs/
│   └── usage.md                        # Detailed usage guide
├── benchmarks/
│   └── results.md                      # Performance results
├── .github/workflows/
│   └── ci.yml                          # GitHub Actions CI
├── Dockerfile                          # Multi-stage Docker build
├── milestones.md                       # M1 / M2 / M3 roadmap
├── LICENSE                             # MIT
└── README.md                           # ← You are here

Architecture

                    ┌─────────────────────────────────────┐
                    │        NEAR Lake S3 Bucket          │
                    │  (testnet / mainnet block data)     │
                    └──────────────┬──────────────────────┘
                                   │
                    ┌──────────────▼──────────────────────┐
                    │      near-lake-framework-rs         │
                    │   (StreamerMessage per block)       │
                    └──────────────┬──────────────────────┘
                                   │
              ┌────────────────────▼────────────────────────┐
              │          ShardLake Toolkit                  │
              │                                             │
              │  ┌─────────────────┐  ┌──────────────────┐  │
              │  │ ShardedStreamer │  │  QueryOptimizer  │  │
              │  │                 │  │                  │  │
              │  │ • shard filter  │  │ • inverted index │  │
              │  │ • config build  │  │ • bloom filter   │  │
              │  │ • demo gen      │  │ • regex engine   │  │
              │  │ • type convert  │  │ • temporal query │  │
              │  └────────┬────────┘  └────────┬─────────┘  │
              │           │                    │            │
              │           └────────┬───────────┘            │
              │                    │                        │
              │           ┌────────▼───────┐                │
              │           │   CLI / API    │                │
              │           │  (shardlake)   │                │
              │           └────────────────┘                │
              └─────────────────────────────────────────────┘
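
The QueryOptimizer box lists a bloom filter alongside the inverted index. For readers unfamiliar with the structure, here is a minimal sketch of one: it can cheaply answer "definitely absent" before a more expensive lookup is attempted. `DemoBloom`, its sizing, and its use of the standard library's `DefaultHasher` with a seed are assumptions made for this illustration; the crate's actual filter may use different hashing and parameters.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical minimal bloom filter: a fixed bit array probed by `k` seeded hashes.
struct DemoBloom {
    bits: Vec<bool>,
    k: u64,
}

impl DemoBloom {
    fn new(num_bits: usize, k: u64) -> Self {
        DemoBloom { bits: vec![false; num_bits], k }
    }

    /// Bit position for `item` under hash number `seed`.
    fn slot(&self, item: &str, seed: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        item.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    /// Set all `k` bits for `item`.
    fn insert(&mut self, item: &str) {
        for seed in 0..self.k {
            let i = self.slot(item, seed);
            self.bits[i] = true;
        }
    }

    /// false => definitely absent; true => possibly present (false positives can occur).
    fn may_contain(&self, item: &str) -> bool {
        (0..self.k).all(|seed| self.bits[self.slot(item, seed)])
    }
}

fn main() {
    let mut bloom = DemoBloom::new(1024, 3);
    for method in ["ft_transfer", "nft_mint"] {
        bloom.insert(method);
    }
    println!("ft_transfer: {}", bloom.may_contain("ft_transfer"));
    println!("unknown_method: {}", bloom.may_contain("unknown_method"));
}
```

An inserted item always reports `true`; an item that was never inserted usually reports `false`, but may occasionally report `true`, so a positive answer still needs confirming against the index.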

Running Tests

cargo test                        # All tests
cargo test -- --nocapture         # With stdout
cargo test optimizer              # Only optimizer tests

Benchmarks

cargo bench                       # Criterion benchmarks

See benchmarks/results.md for detailed numbers.


Docker

docker build -t shardlake-toolkit .
docker run --rm shardlake-toolkit
docker run --rm shardlake-toolkit run --query "ft_transfer" --num-blocks 2000

Roadmap

Milestone                        Target         Status
M1 Core PoC                      Feb 2026       ✅ Complete
M2 Live NEAR Lake integration    Mar 2026       🔲 Planned
M3 Production & AI-agent SDK     Apr–May 2026   🔲 Planned

See milestones.md for full details.


Grant Proposal

This PoC accompanies the ShardLake Toolkit grant proposal submitted to the NEAR Foundation Infrastructure Committee.

📄 Proposal link: ShardLake Toolkit – NEAR Grant Proposal


Contributing

Contributions are welcome! Please open an issue first to discuss what you'd like to change.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-thing)
  3. Commit your changes (git commit -m 'Add amazing thing')
  4. Push to the branch (git push origin feature/amazing-thing)
  5. Open a Pull Request

License

MIT © 2026 ShardLake Contributors

About

A lightweight open-source toolkit that extends the NEAR Lake Framework to simplify deploying sharded archival nodes, with a built-in real-time query optimizer tailored to AI-agent patterns (event filtering, temporal indexing, cross-shard queries).
