Context
SCFuzzBench benchmarks smart contract fuzzers (Echidna, Medusa, Foundry, Recon) against real DeFi protocols (Aave v4, Superform, Liquity). Current results show Foundry finding 0-3 bugs vs Echidna's 10+ on the same targets with the same time budget.
Analysis of the Foundry invariant testing engine identified several root causes, ranging from architectural blockers to corpus/coverage quality issues. This issue tracks the phased improvement plan.
Benchmark Baseline (Aave v4, 24h, SCFuzzBench)
| Fuzzer | Bugs Found | AUC (norm) | Coverage Proxy |
| --- | --- | --- | --- |
| Echidna 2.3.1 | 10 [9,10] | 0.816 | 456k |
| Medusa 1.4.1 | 10 [9,10] | 0.754 | 5.2k |
| Foundry v1.6.0-rc1 | 3 [3,3] | 0.272 | 374 |
Validation
Use SCFuzzBench for end-to-end validation:
- Target: aave-v4-scfuzzbench (v0.5.6-recon)
- Metric: unique broken invariants at 1h and 24h timeouts
- Baseline: Echidna finds 10+; current Foundry finds 3
- Each phase should be benchmarked independently with multiple seeds
cargo build --release --bin forge
FOUNDRY_INVARIANT_TIMEOUT=300 forge test --mc CryticToFoundry -vv --fuzz-seed 42
Phase 1: Architectural Blockers (prerequisite)
These must land first — without them, the fuzzer uses ~4% of its time budget because trivially-findable assertion failures stop exploration immediately.
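The cost of stopping at the first failure can be illustrated with a small sketch. `RunOutcome`, `run_campaign`, and the `solvency` invariant name are hypothetical and do not correspond to Foundry types; only the `continuous_run` flag name and the `assert_canary` canary are taken from this issue.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome of one invariant run (not a Foundry type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum RunOutcome {
    Passed,
    /// Name of the invariant or assertion that broke during the run.
    Broken(&'static str),
}

/// Hypothetical campaign driver. With `continuous_run == false` it stops at
/// the first failure; with `true` it records the failure and keeps going.
/// Returns the number of runs executed and the set of broken invariants.
fn run_campaign(
    outcomes: &[RunOutcome],
    continuous_run: bool,
) -> (usize, BTreeSet<&'static str>) {
    let mut broken = BTreeSet::new();
    let mut runs_executed = 0;
    for outcome in outcomes {
        runs_executed += 1;
        if let RunOutcome::Broken(name) = outcome {
            broken.insert(*name);
            if !continuous_run {
                break; // the rest of the budget is never spent
            }
        }
    }
    (runs_executed, broken)
}

fn main() {
    let outcomes = [
        RunOutcome::Passed,
        RunOutcome::Broken("assert_canary"),
        RunOutcome::Passed,
        RunOutcome::Broken("solvency"), // hypothetical deeper bug
        RunOutcome::Passed,
    ];
    for continuous_run in [false, true] {
        let (runs, found) = run_campaign(&outcomes, continuous_run);
        println!("continuous_run={continuous_run}: {runs} runs, broken={found:?}");
    }
}
```

In this toy campaign the stop-on-first variant never reaches the second failure, which mirrors how a trivially broken canary can hide every deeper bug behind it.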
- `fail_on_assert` config: Gate assertion-failure detection behind a config flag so users can opt in to treating Solidity `assert()` panics in target functions as invariant breaks.
  - Ref: #14275
- `continuous_run` mode: Continue running after finding an invariant or assertion failure. Record it and keep searching for more failures. Currently, the first `assert_canary(0)` kills every invariant test in <100 runs.
  - Ref: #12587 (`assert_all`; open, review required)
- Preflight check in `continuous_run` mode: When `continuous_run = true`, the preflight invariant check (which verifies that invariants hold before fuzzing starts) should record failures but not abort the campaign. Currently an always-failing canary invariant causes `"failed to set up invariant testing environment"`.
  - File: `crates/evm/evm/src/executors/invariant/mod.rs` (`prepare_test` → preflight check)
  - Ref: #12587 (`assert_all`)
Phase 2: Dictionary Quality
The fuzz dictionary collects values from storage, logs, call results, and push bytes. Currently, all values learned during an invariant run are thrown away at the end of each run via `FuzzDictionary::revert()`. Values from coverage-producing runs should persist to compound across runs.
- Selective dictionary persistence: When an invariant run produces new coverage, promote collected values to the persistent baseline (advance the revert watermark) instead of discarding them. Bounded at ~4096 values / ~512 addresses to prevent unbounded growth. Same pattern as the existing `persistent_values` mechanism for sancov trace-cmp.
  - File: `crates/evm/fuzz/src/strategies/state.rs` (add `promote_ephemeral_values()`)
  - File: `crates/evm/evm/src/executors/invariant/mod.rs` (call in `end_run()` when `new_coverage`)
- Collect data from reverted calls (filtered): Currently only successful calls feed the dictionary (`if !call_result.reverted`). Reverted calls contain boundary values that help explore nearby valid inputs. This needs filtering to avoid noise from panics: only collect from "soft" reverts (`require` failures), not assertion panics.
  - File: `crates/evm/evm/src/executors/invariant/mod.rs`, line ~530
Phase 3: Corpus Selection & Mutation
The corpus-guided mutation engine has structural weaknesses that limit exploration depth.
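As one concrete instance of these weaknesses, the weighted-selection item in this phase could be approached roughly as follows. This is a minimal sketch, not the code in `corpus.rs`: `CorpusEntry`, its fields, `XorShift64`, `pick_weighted`, and the weight of 4 are all assumptions made for illustration (the issue only asks for favored entries to be picked 3-5× more often, with "favored" meaning new coverage more than 30% of the time).

```rust
/// Hypothetical corpus entry; field names are illustrative, not Foundry's.
struct CorpusEntry {
    id: u32,
    /// Mutations of this entry that produced new coverage.
    new_coverage_hits: u32,
    /// Total number of times this entry was mutated.
    total_mutations: u32,
}

impl CorpusEntry {
    /// Favored means new coverage in more than 30% of mutations.
    fn is_favored(&self) -> bool {
        self.total_mutations > 0
            && u64::from(self.new_coverage_hits) * 10 > u64::from(self.total_mutations) * 3
    }
}

/// Tiny xorshift64 PRNG so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Assumed multiplier for favored entries (the issue suggests 3-5x).
const FAVORED_WEIGHT: u64 = 4;

fn weight(entry: &CorpusEntry) -> u64 {
    if entry.is_favored() { FAVORED_WEIGHT } else { 1 }
}

/// Picks one entry with probability proportional to its weight.
fn pick_weighted<'a>(corpus: &'a [CorpusEntry], rng: &mut XorShift64) -> Option<&'a CorpusEntry> {
    let total: u64 = corpus.iter().map(weight).sum();
    if total == 0 {
        return None; // empty corpus
    }
    let mut ticket = rng.next() % total;
    for entry in corpus {
        let w = weight(entry);
        if ticket < w {
            return Some(entry);
        }
        ticket -= w;
    }
    None // not reached: ticket < total
}

fn main() {
    let corpus = vec![
        CorpusEntry { id: 0, new_coverage_hits: 5, total_mutations: 10 }, // favored
        CorpusEntry { id: 1, new_coverage_hits: 1, total_mutations: 10 },
        CorpusEntry { id: 2, new_coverage_hits: 0, total_mutations: 0 },
    ];
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15);
    let mut counts = [0u32; 3];
    for _ in 0..10_000 {
        let entry = pick_weighted(&corpus, &mut rng).expect("corpus is non-empty");
        counts[entry.id as usize] += 1;
    }
    println!("picks per entry: {counts:?}");
}
```

A linear scan over weights is enough for small corpora; a cumulative-weight table or alias method would only matter if selection became a measurable cost.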
- Weighted corpus selection: Corpus entries are selected uniformly at random. Favored entries (those producing new coverage >30% of the time) should be selected 3-5× more often, not just protected from eviction. Simple weighted sampling is sufficient.
  - File: `crates/evm/evm/src/executors/corpus.rs` (`new_inputs()`, `new_input()`)
- Add Insert/Delete/Swap mutations: The current mutation set (Splice, Repeat, Interleave, Prefix, Suffix, Abi) is missing critical structural mutations (Insert, Delete, Swap).
  - File: `crates/evm/evm/src/executors/corpus.rs` (`MutationType` enum, `new_inputs()`)
- Adaptive mutation scheduling: All 6 mutation types have equal weight. Track which mutation types produce new coverage and bias toward productive ones.
  - File: `crates/evm/evm/src/executors/corpus.rs` (`mutation_generator`)
- Adaptive random injection rate: There is a hardcoded 10% chance of replacing a corpus call with a fully random one. The rate should start higher when the corpus is small and decay as corpus quality improves.
  - File: `crates/evm/evm/src/executors/corpus.rs` (`generate_next_input()`)
Phase 4: Coverage Signal Quality
The coverage map tracks EVM JUMPI edge pairs but misses state-dependent behavior.
- State-sensitive coverage: Hash key storage slot deltas into the coverage map so different state transitions count as different coverage. Highest-impact long-term improvement.
  - Files: `crates/evm/evm/src/executors/corpus.rs`, `crates/evm/evm/src/executors/mod.rs`
- Function-pair coverage tracking: Track which pairs/triples of Solidity functions have been called in sequence as an additional coverage dimension.
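The state-sensitive idea can be sketched as below, under several assumptions: `Feature`, `CoverageMap`, and `delta_feature` are hypothetical names, slot values are narrowed to `u128` for brevity (EVM words are 256 bits), and bucketing deltas by direction and bit length is one possible way to keep the number of features bounded, not something the issue prescribes. A `HashSet` stands in for the real coverage map.

```rust
use std::collections::HashSet;

/// Hypothetical coverage feature; a HashSet of these stands in for the map.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Feature {
    /// Control-flow edge between two program counters (the existing signal).
    Edge { from_pc: u32, to_pc: u32 },
    /// Bucketed change of a tracked storage slot (the proposed extra signal).
    StateDelta { slot: u64, increased: bool, magnitude_bucket: u32 },
}

/// Maps a storage write to a coarse feature. Bucketing by direction and bit
/// length of the delta is an assumption that avoids one feature per value.
fn delta_feature(slot: u64, before: u128, after: u128) -> Option<Feature> {
    if before == after {
        return None; // no state change, no feature
    }
    let diff = before.abs_diff(after);
    Some(Feature::StateDelta {
        slot,
        increased: after > before,
        magnitude_bucket: 128 - diff.leading_zeros(),
    })
}

#[derive(Default)]
struct CoverageMap {
    seen: HashSet<Feature>,
}

impl CoverageMap {
    /// Records the features of one call and reports whether any was new.
    /// `writes` holds (slot, value before, value after) triples.
    fn record(&mut self, edges: &[(u32, u32)], writes: &[(u64, u128, u128)]) -> bool {
        let mut new_coverage = false;
        for &(from_pc, to_pc) in edges {
            new_coverage |= self.seen.insert(Feature::Edge { from_pc, to_pc });
        }
        for &(slot, before, after) in writes {
            if let Some(feature) = delta_feature(slot, before, after) {
                new_coverage |= self.seen.insert(feature);
            }
        }
        new_coverage
    }
}

fn main() {
    let mut map = CoverageMap::default();
    let edges = [(10, 42), (42, 77)];
    // New edges plus a small increase of slot 0.
    println!("first call new?  {}", map.record(&edges, &[(0, 100, 101)]));
    // Same edges, same delta bucket: edge-only and state-aware maps agree.
    println!("second call new? {}", map.record(&edges, &[(0, 101, 102)]));
    // Same edges, but a large decrease: only the state feature is new.
    println!("third call new?  {}", map.record(&edges, &[(0, 102, 2)]));
}
```

The third call is the case an edge-only map would discard, since it exercises no new branches while still moving the contract into a different region of state.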