
Commit 6e38087

feat(scripts): add witness caching for multi and cost_estimator (#776)
* feat(witness-cache): add caching for witness generation

Add witness caching to reduce 2-hour bottleneck in multi.rs and cost_estimator scripts. Cache WitnessData (not SP1Stdin) to disk with keys (chain_id, start_block, end_block).

New features:
- Witness cache module with save/load functions using rkyv serialization
- cfg_if conditional compilation for EigenDA support
- Three cache flags: --cache (default), --use-cache (load only), --save-cache (force regenerate)
- Multi-script integration with cache status messages
- Cost estimator parallel cache loads per batch range
- Comprehensive documentation on cache usage and management

Cache location: data/{chain_id}/witness-cache/{start_block}-{end_block}.bin
DA compatibility: Ethereum/Celestia compatible, EigenDA separate cache

Files changed:
- utils/host/src/witness_cache.rs (NEW)
- utils/host/src/lib.rs, Cargo.toml
- scripts/utils/src/lib.rs
- scripts/prove/bin/multi.rs
- scripts/utils/bin/cost_estimator.rs
- book/advanced/cost-estimation-tools.md (renamed from cost-estimator.md)
- book/SUMMARY.md

* fix(witness-cache): use SP1Stdin caching for cost_estimator to fix CI

- Add SP1Stdin cache functions using bincode (DA-agnostic)
- cost_estimator.rs now caches SP1Stdin instead of WitnessData
- Remove WitnessDataType type constraint that caused CI failures
- Fix race condition in multi.rs by using match pattern with graceful fallback

SP1Stdin is the same type regardless of which DA witness generator produced it, so it works with generic host types. This fixes the CI type mismatch error when running with --features celestia/eigenda.

* refactor(witness-cache): unify on SP1Stdin caching for DA-agnosticism

Switch both multi.rs and cost_estimator.rs to use SP1Stdin caching instead of WitnessData caching. This fixes CI failures when running with different DA feature flags (celestia, eigenda).

SP1Stdin is DA-agnostic: it's the same type regardless of which witness generator produced it. This means cache files now work across all DA types (Ethereum, Celestia, EigenDA).

Changes:
- Update multi.rs to cache SP1Stdin using bincode
- Simplify witness_cache.rs to only contain SP1Stdin functions
- Remove eigenda feature flag from utils/host (no longer needed)
- Update documentation to reflect SP1Stdin caching

* refactor(witness-cache): address PR review comments

- Use tracing macros instead of println/eprintln in multi.rs
- Fix DA compatibility docs: clarify that cache files are compatible between Ethereum ↔ Celestia, but NOT with EigenDA
- Simplify cache flags: remove --use-cache and --save-cache, keep only --cache for simpler UX

(cherry picked from commit 7bfa643)
1 parent ca33cee commit 6e38087

File tree

10 files changed: +289 / -74 lines changed

Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default.

book/SUMMARY.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@
   - [EigenDA DA](./fault_proofs/experimental/eigenda.md)
 
 - [Advanced](./advanced/intro.md)
-  - [Cost Estimator](./advanced/cost-estimator.md)
+  - [Cost Estimation Tools](./advanced/cost-estimation-tools.md)
   - [Reproduce Binaries](./advanced/verify-binaries.md)
   - [Node Setup](./advanced/node-setup.md)
 - [FAQ](./faq.md)
book/advanced/cost-estimation-tools.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
# Cost Estimation Tools

This guide covers the scripts for estimating proving costs and testing execution: `multi` and `cost-estimator`.

## Setup

Before running these scripts, set up a `.env` file in the project root:

```bash
L1_RPC=<YOUR_L1_RPC_ENDPOINT>
L2_RPC=<YOUR_L2_RPC_ENDPOINT>
L2_NODE_RPC=<YOUR_L2_NODE_RPC_ENDPOINT>
```

## Multi Script

The `multi` script executes the OP Succinct range proof program for a block range. Use it to test proof generation or to generate actual proofs.

### Usage

```bash
# Execute without proving (generates an execution report)
cargo run --bin multi -- --start 1000 --end 1020

# Generate compressed proofs
cargo run --bin multi -- --start 1000 --end 1020 --prove
```

### Output

- **Execution mode**: Prints execution stats and saves them to `execution-reports/multi/{chain_id}/{start}-{end}.csv`
- **Prove mode**: Saves the proof to `data/{chain_id}/proofs/{start}-{end}.bin`

## Cost Estimator

The `cost-estimator` estimates proving costs without generating proofs. It splits large ranges into batches and runs them in parallel.

### Usage

```bash
cargo run --bin cost-estimator -- \
  --start 2000000 \
  --end 2001800 \
  --batch-size 300
```

With `--batch-size 300`, the 1,800-block range above is split into six batches that execute in parallel. For the most accurate estimate, use a range larger than the batcher interval and set the batch size equal to the range.

### Output

The execution report is saved to `execution-reports/{chain_id}/{start}-{end}-report.csv` and includes metrics such as:
- Total instruction count
- Oracle verification / derivation / block execution costs
- SP1 gas usage
- Transaction counts and EVM gas
- Precompile cycles (BN pair, add, mul, KZG eval, etc.)
## Witness Caching

Both scripts support witness caching to skip the time-consuming witness generation step on subsequent runs.

### Why Cache?

The proving pipeline has two stages:

```
host.run() → WitnessData → get_sp1_stdin() → SP1Stdin
  [hours]                  [milliseconds]
```

Witness generation (`host.run()`) fetches L1/L2 data and executes blocks, which can take **hours** for large ranges. Caching avoids repeating this step by saving the resulting `SP1Stdin` to disk.

We cache `SP1Stdin` because:
1. It skips the hours-long `host.run()` bottleneck
2. `SP1Stdin` implements serde serialization via bincode
3. The cache is compatible across Ethereum and Celestia DA (both use the same witness format)

Note: The `get_sp1_stdin()` conversion takes milliseconds, so caching after this step adds negligible overhead.
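Both scripts follow the same load-or-generate pattern around the two helpers in `op_succinct_host_utils::witness_cache`. The sketch below is illustrative only: the `stdin_for_range` wrapper and its `generate` closure are not part of the codebase, and the helper signatures (a cache miss returns `Ok(None)`, a save returns the written path) are inferred from how `multi.rs` calls them in this commit.

```rust
use anyhow::Result;
use op_succinct_host_utils::witness_cache::{load_stdin_from_cache, save_stdin_to_cache};
use sp1_sdk::SP1Stdin;
use std::future::Future;

/// Load SP1Stdin from the cache if present; otherwise generate it and persist it.
/// `generate` stands in for the expensive host.run() + get_sp1_stdin() path.
async fn stdin_for_range<F, Fut>(
    chain_id: u64,
    start: u64,
    end: u64,
    cache: bool,
    generate: F,
) -> Result<SP1Stdin>
where
    F: FnOnce() -> Fut,
    Fut: Future<Output = Result<SP1Stdin>>,
{
    if cache {
        // Cache hit: skip the hours-long witness generation entirely.
        if let Ok(Some(stdin)) = load_stdin_from_cache(chain_id, start, end) {
            return Ok(stdin);
        }
    }
    // Cache miss (or caching disabled): generate, then persist for the next run.
    let stdin = generate().await?;
    if cache {
        let path = save_stdin_to_cache(chain_id, start, end, &stdin)?;
        println!("Saved stdin to cache: {}", path.display());
    }
    Ok(stdin)
}
```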
### Cache Flag

Use `--cache` to enable caching. If a cache file exists for the block range, it will be loaded. Otherwise, witness generation runs and the result is saved to cache.

### Examples

```bash
# First run: generates witness and saves to cache
cargo run --bin multi -- --start 1000 --end 1020 --cache

# Second run: loads from cache (instant), then proves
cargo run --bin multi -- --start 1000 --end 1020 --cache --prove

# Force regenerate by deleting cache first
rm data/{chain_id}/witness-cache/1000-1020-stdin.bin
cargo run --bin multi -- --start 1000 --end 1020 --cache

# Cost estimator with caching
cargo run --bin cost-estimator -- --start 1000 --end 1100 --batch-size 10 --cache
```

### Cache Location

```
data/{chain_id}/witness-cache/{start_block}-{end_block}-stdin.bin
```

Example: `data/8453/witness-cache/1000-1020-stdin.bin` for Base.
### DA Compatibility

| DA Type | Compatible With |
|---------|-----------------|
| Ethereum (default) | Celestia |
| Celestia | Ethereum |
| EigenDA | EigenDA only |

Cache files are compatible between Ethereum and Celestia (both use `DefaultWitnessData`), but **not** with EigenDA (uses `EigenDAWitnessData`). Don't mix cache files across incompatible DA types.

### Cache Management

```bash
# Clear all cache for a chain
rm -rf data/{chain_id}/witness-cache/

# Clear specific range
rm data/{chain_id}/witness-cache/{start}-{end}-stdin.bin
```

Cache files are typically 100MB-1GB per range.
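To check how much disk the cache is using before clearing it, standard shell tools are enough (the Base chain ID `8453` below is just an example):

```bash
# Total cache size and per-range cache files for one chain
du -sh data/8453/witness-cache/
ls -lh data/8453/witness-cache/
```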

book/advanced/cost-estimator.md

Lines changed: 0 additions & 53 deletions
This file was deleted.

scripts/prove/bin/multi.rs

Lines changed: 51 additions & 14 deletions
@@ -1,15 +1,23 @@
 use anyhow::{Context, Result};
 use clap::Parser;
 use op_succinct_host_utils::{
-    block_range::get_validated_block_range, fetcher::OPSuccinctDataFetcher, host::OPSuccinctHost,
-    stats::ExecutionStats, witness_generation::WitnessGenerator,
+    block_range::get_validated_block_range,
+    fetcher::OPSuccinctDataFetcher,
+    host::OPSuccinctHost,
+    stats::ExecutionStats,
+    witness_cache::{load_stdin_from_cache, save_stdin_to_cache},
+    witness_generation::WitnessGenerator,
 };
 use op_succinct_proof_utils::{get_range_elf_embedded, initialize_host};
 use op_succinct_prove::execute_multi;
 use op_succinct_scripts::HostExecutorArgs;
 use sp1_sdk::{utils, ProverClient};
-use std::{fs, sync::Arc, time::Instant};
-use tracing::debug;
+use std::{
+    fs,
+    sync::Arc,
+    time::{Duration, Instant},
+};
+use tracing::{debug, info, warn};
 
 /// Execute the OP Succinct program for multiple blocks.
 #[tokio::main]
@@ -35,16 +43,47 @@ async fn main() -> Result<()> {
     )
     .await?;
 
-    let host_args = host.fetch(l2_start_block, l2_end_block, None, args.safe_db_fallback).await?;
+    let l2_chain_id = data_fetcher.get_l2_chain_id().await?;
+
+    // Helper closure to generate stdin (runs witness generation and converts to SP1Stdin)
+    let generate_stdin = || async {
+        let host_args =
+            host.fetch(l2_start_block, l2_end_block, None, args.safe_db_fallback).await?;
+        debug!("Host args: {:?}", host_args);
 
-    debug!("Host args: {:?}", host_args);
+        let start_time = Instant::now();
+        let witness = host.run(&host_args).await?;
+        let duration = start_time.elapsed();
 
-    let start_time = Instant::now();
-    let witness_data = host.run(&host_args).await?;
-    let witness_generation_duration = start_time.elapsed();
+        // Convert witness to SP1Stdin
+        let stdin = host.witness_generator().get_sp1_stdin(witness)?;
 
-    // Get the stdin for the block.
-    let sp1_stdin = host.witness_generator().get_sp1_stdin(witness_data)?;
+        // Save to cache if enabled
+        if args.cache {
+            let cache_path =
+                save_stdin_to_cache(l2_chain_id, l2_start_block, l2_end_block, &stdin)?;
+            info!("Saved stdin to cache: {}", cache_path.display());
+        }
+
+        Ok::<_, anyhow::Error>((stdin, duration))
+    };
+
+    // Check cache first if enabled (with graceful fallback)
+    let (sp1_stdin, witness_generation_duration) = if args.cache {
+        match load_stdin_from_cache(l2_chain_id, l2_start_block, l2_end_block) {
+            Ok(Some(stdin)) => {
+                info!("Loaded stdin from cache");
+                (stdin, Duration::ZERO)
+            }
+            Ok(None) => generate_stdin().await?,
+            Err(e) => {
+                warn!("Failed to load cache: {e}, regenerating...");
+                generate_stdin().await?
+            }
+        }
+    } else {
+        generate_stdin().await?
+    };
 
     let prover = ProverClient::from_env();
 
@@ -55,7 +94,7 @@ async fn main() -> Result<()> {
         let proof = prover.prove(&pk, &sp1_stdin).compressed().run().unwrap();
 
         // Create a proof directory for the chain ID if it doesn't exist.
-        let proof_dir = format!("data/{}/proofs", data_fetcher.get_l2_chain_id().await.unwrap());
+        let proof_dir = format!("data/{}/proofs", l2_chain_id);
         if !std::path::Path::new(&proof_dir).exists() {
             fs::create_dir_all(&proof_dir).unwrap();
         }
@@ -64,8 +103,6 @@ async fn main() -> Result<()> {
             .save(format!("{proof_dir}/{l2_start_block}-{l2_end_block}.bin"))
             .expect("saving proof failed");
     } else {
-        let l2_chain_id = data_fetcher.get_l2_chain_id().await?;
-
         let (block_data, report, execution_duration) =
             execute_multi(&data_fetcher, sp1_stdin, l2_start_block, l2_end_block).await?;
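The new `utils/host/src/witness_cache.rs` module itself is not rendered in this view. A minimal sketch of what its two helpers could look like, consistent with the calls above and with the bincode-based `SP1Stdin` caching described in the commit message (the actual module may differ in details):

```rust
use std::{
    fs,
    path::{Path, PathBuf},
};

use anyhow::{Context, Result};
use sp1_sdk::SP1Stdin;

/// Path of the cache file for a given chain and block range.
fn cache_path(chain_id: u64, start: u64, end: u64) -> PathBuf {
    PathBuf::from(format!("data/{chain_id}/witness-cache/{start}-{end}-stdin.bin"))
}

/// Serialize the SP1Stdin with bincode and write it under data/{chain_id}/witness-cache/.
pub fn save_stdin_to_cache(
    chain_id: u64,
    start: u64,
    end: u64,
    stdin: &SP1Stdin,
) -> Result<PathBuf> {
    let path = cache_path(chain_id, start, end);
    if let Some(dir) = path.parent() {
        fs::create_dir_all(dir).context("failed to create witness cache directory")?;
    }
    let bytes = bincode::serialize(stdin).context("failed to serialize SP1Stdin")?;
    fs::write(&path, bytes).context("failed to write witness cache file")?;
    Ok(path)
}

/// Load a cached SP1Stdin, returning Ok(None) when no cache file exists for the range.
pub fn load_stdin_from_cache(chain_id: u64, start: u64, end: u64) -> Result<Option<SP1Stdin>> {
    let path = cache_path(chain_id, start, end);
    if !Path::new(&path).exists() {
        return Ok(None);
    }
    let bytes = fs::read(&path).context("failed to read witness cache file")?;
    let stdin = bincode::deserialize(&bytes).context("failed to deserialize SP1Stdin")?;
    Ok(Some(stdin))
}
```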

scripts/utils/bin/cost_estimator.rs

Lines changed: 36 additions & 4 deletions
@@ -10,6 +10,7 @@ use op_succinct_host_utils::{
     fetcher::OPSuccinctDataFetcher,
     host::OPSuccinctHost,
     stats::ExecutionStats,
+    witness_cache::{load_stdin_from_cache, save_stdin_to_cache},
     witness_generation::WitnessGenerator,
 };
 use op_succinct_proof_utils::{get_range_elf_embedded, initialize_host};
@@ -26,14 +27,18 @@ use std::{
 
 /// Run the zkVM execution process for each split range in parallel. Writes the execution stats for
 /// each block range to a CSV file after each execution completes (not guaranteed to be in order).
-async fn execute_blocks_and_write_stats_csv<H: OPSuccinctHost>(
+async fn execute_blocks_and_write_stats_csv<H>(
     host: Arc<H>,
     host_args: &[H::Args],
     ranges: Vec<SpanBatchRange>,
     l2_chain_id: u64,
    start: u64,
     end: u64,
-) -> Result<()> {
+    cache_enabled: bool,
+) -> Result<()>
+where
+    H: OPSuccinctHost,
+{
     let data_fetcher = OPSuccinctDataFetcher::new_with_rollup_config().await?;
 
     // Fetch all of the execution stats block ranges in parallel.
@@ -67,12 +72,38 @@ async fn execute_blocks_and_write_stats_csv<H: OPSuccinctHost>(
     let prover = ProverClient::builder().cpu().build();
 
     // Run the host tasks in parallel using join_all
-    let handles = host_args.iter().map(|host_args| {
+    let handles = host_args.iter().zip(ranges.iter()).map(|(host_args, range)| {
         let host_args = host_args.clone();
         let host = host.clone();
+        let start = range.start;
+        let end = range.end;
         tokio::spawn(async move {
+            // Try loading SP1Stdin from cache
+            if cache_enabled {
+                match load_stdin_from_cache(l2_chain_id, start, end) {
+                    Ok(Some(stdin)) => {
+                        info!("Loaded stdin from cache for range {}-{}", start, end);
+                        return stdin;
+                    }
+                    Ok(None) => {} // No cache, generate below
+                    Err(e) => {
+                        log::warn!("Failed to load stdin cache for range {}-{}: {e}", start, end);
+                    }
+                }
+            }
+
+            // Generate witness and convert to SP1Stdin
             let witness_data = host.run(&host_args).await.unwrap();
-            host.witness_generator().get_sp1_stdin(witness_data).unwrap()
+            let stdin = host.witness_generator().get_sp1_stdin(witness_data).unwrap();
+
+            // Save SP1Stdin to cache
+            if cache_enabled {
+                if let Ok(cache_path) = save_stdin_to_cache(l2_chain_id, start, end, &stdin) {
+                    info!("Saved stdin to cache: {}", cache_path.display());
+                }
+            }
+
+            stdin
         })
     });
 
@@ -238,6 +269,7 @@
         l2_chain_id,
         l2_start_block,
         l2_end_block,
+        args.cache,
     )
     .await?;
 

scripts/utils/src/lib.rs

Lines changed: 2 additions & 2 deletions
@@ -15,9 +15,9 @@ pub struct HostExecutorArgs {
     /// The number of blocks to execute in a single batch.
     #[arg(long, default_value = "10")]
     pub batch_size: u64,
-    /// Use cached witness generation.
+    /// Enable caching: load from cache if available, save to cache if not.
     #[arg(long)]
-    pub use_cache: bool,
+    pub cache: bool,
     /// Use a fixed recent range.
     #[arg(long)]
     pub rolling: bool,
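With clap's derive API, a `bool` field annotated with `#[arg(long)]` becomes a presence flag, so this rename is what turns `--use-cache` into `--cache` on the command line. A trimmed, standalone sketch (one field only, not the real `HostExecutorArgs`):

```rust
use clap::Parser;

/// Illustration of the renamed flag; the real struct carries many more fields.
#[derive(Parser, Debug)]
struct Args {
    /// Enable caching: load from cache if available, save to cache if not.
    #[arg(long)]
    cache: bool,
}

fn main() {
    // Passing `--cache` sets the field to true; omitting it leaves it false.
    let args = Args::parse();
    println!("cache enabled: {}", args.cache);
}
```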

utils/host/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -50,6 +50,7 @@ alloy-sol-types.workspace = true
 # general
 anyhow.workspace = true
 async-trait.workspace = true
+bincode.workspace = true
 cfg-if.workspace = true
 c-kzg.workspace = true
 futures.workspace = true
