Invariant Fuzzer: Phased improvement plan to close gap with Echidna/Medusa (SCFuzzBench) #14437

Description

@grandizzy

Context

SCFuzzBench benchmarks smart contract fuzzers (Echidna, Medusa, Foundry, Recon) against real DeFi protocols (Aave v4, Superform, Liquity). Current results show Foundry finding 0-3 bugs vs Echidna's 10+ on the same targets with the same time budget.

Analysis of the Foundry invariant testing engine identified several root causes, ranging from architectural blockers to corpus/coverage quality issues. This issue tracks the phased improvement plan.

Benchmark Baseline (Aave v4, 24h, SCFuzzBench)

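| Fuzzer             | Bugs Found | AUC (norm) | Coverage Proxy |
|--------------------|------------|------------|----------------|
| Echidna 2.3.1      | 10 [9,10]  | 0.816      | 456k           |
| Medusa 1.4.1       | 10 [9,10]  | 0.754      | 5.2k           |
| Foundry v1.6.0-rc1 | 3 [3,3]    | 0.272      | 374            |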

Phase 1: Architectural Blockers (prerequisite)

These must land first — without them, the fuzzer uses ~4% of its time budget because trivially-findable assertion failures stop exploration immediately.

  • fail_on_assert config — Gate assertion-failure detection behind a config flag so users can opt in to treating Solidity assert() panics in target functions as invariant breaks.

  • continuous_run mode — Continue running after finding an invariant/assertion failure. Record it and keep searching for more failures. Currently, the first assert_canary(0) kills every invariant test in <100 runs.

  • Preflight check in continuous_run mode — When continuous_run = true, the preflight invariant check (which verifies invariants hold before fuzzing starts) should record failures but not abort the campaign. Currently an always-failing canary invariant causes "failed to set up invariant testing environment".
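
To make the interaction of the two flags concrete, here is a minimal sketch of a campaign loop. It is not Foundry's executor: `CampaignConfig`, `RunOutcome`, `CampaignReport`, and `run_campaign` are hypothetical names invented for illustration, and only the flag names `fail_on_assert` and `continuous_run` come from this issue.

```rust
use std::collections::BTreeSet;

/// Hypothetical config holding the two flags proposed in Phase 1.
#[derive(Clone, Copy, Debug)]
pub struct CampaignConfig {
    pub fail_on_assert: bool,
    pub continuous_run: bool,
}

/// Hypothetical outcome of a single fuzz run.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum RunOutcome {
    Ok,
    /// An invariant function returned false or reverted.
    InvariantBroken(String),
    /// A Solidity assert() panic inside a target function.
    AssertPanic(String),
}

/// Unique failures recorded and the number of runs actually executed.
#[derive(Debug, Default)]
pub struct CampaignReport {
    pub failures: BTreeSet<String>,
    pub runs_executed: usize,
}

/// Consume run outcomes, recording failures according to the config.
pub fn run_campaign<I>(config: CampaignConfig, outcomes: I) -> CampaignReport
where
    I: IntoIterator<Item = RunOutcome>,
{
    let mut report = CampaignReport::default();
    for outcome in outcomes {
        report.runs_executed += 1;
        let failure = match outcome {
            RunOutcome::Ok => None,
            RunOutcome::InvariantBroken(name) => Some(name),
            // Assert panics only count when the user opted in.
            RunOutcome::AssertPanic(name) if config.fail_on_assert => Some(name),
            RunOutcome::AssertPanic(_) => None,
        };
        if let Some(name) = failure {
            report.failures.insert(name);
            // Without continuous_run the first failure ends the campaign.
            if !config.continuous_run {
                break;
            }
        }
    }
    report
}
```

With `continuous_run = false` the loop stops at the first recorded failure, which is the behavior that currently wastes most of the time budget. With `continuous_run = true` the same failure is stored once and exploration continues.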


Phase 2: Dictionary Quality

The fuzz dictionary collects values from storage, logs, call results, and push bytes. Currently, all values learned during an invariant run are thrown away at the end of each run via FuzzDictionary::revert(). Values from coverage-producing runs should persist to compound across runs.

  • Selective dictionary persistence — When an invariant run produces new coverage, promote collected values to the persistent baseline (advance the revert watermark) instead of discarding them. Bounded at ~4096 values / ~512 addresses to prevent unbounded growth. Same pattern as the existing persistent_values mechanism for sancov trace-cmp.

    • Location: crates/evm/fuzz/src/strategies/state.rs (add promote_ephemeral_values())
    • Location: crates/evm/evm/src/executors/invariant/mod.rs (call in end_run() when new_coverage)
  • Collect data from reverted calls (filtered) — Currently only successful calls feed the dictionary (if !call_result.reverted). Reverted calls contain boundary values that help explore nearby valid inputs. Needs filtering to avoid noise from panics — only collect from "soft" reverts (require failures), not assertion panics.

    • Location: crates/evm/evm/src/executors/invariant/mod.rs line ~530
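
The watermark idea above can be sketched as follows. This is a simplified stand-in, not the real `FuzzDictionary`: the `Dictionary` struct, its fields, `end_run`, `CallStatus`, and `feeds_dictionary` are assumptions made for illustration. Only the names `revert()` and `promote_ephemeral_values()` and the idea of a bounded persistent baseline come from this issue.

```rust
/// Hypothetical, simplified dictionary: a flat list of 32-byte words.
pub struct Dictionary {
    values: Vec<[u8; 32]>,
    /// Entries below this index survive `revert()`.
    watermark: usize,
    /// Upper bound on how many entries may become persistent.
    max_persistent: usize,
}

impl Dictionary {
    pub fn new(max_persistent: usize) -> Self {
        Self { values: Vec::new(), watermark: 0, max_persistent }
    }

    /// Add a value learned during the current run (deduplicated).
    pub fn insert(&mut self, value: [u8; 32]) {
        if !self.values.contains(&value) {
            self.values.push(value);
        }
    }

    /// Drop everything learned since the last promotion.
    pub fn revert(&mut self) {
        self.values.truncate(self.watermark);
    }

    /// Advance the watermark over the current run's values, up to the cap.
    pub fn promote_ephemeral_values(&mut self) {
        self.watermark = self.values.len().min(self.max_persistent);
    }

    /// End-of-run hook: keep values only if the run found new coverage.
    pub fn end_run(&mut self, new_coverage: bool) {
        if new_coverage {
            self.promote_ephemeral_values();
        }
        self.revert();
    }

    pub fn len(&self) -> usize {
        self.values.len()
    }
}

/// Hypothetical classification of a call result.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CallStatus {
    Success,
    /// A require-style revert.
    SoftRevert,
    /// An assertion panic.
    Panic,
}

/// Filter for dictionary collection: skip panics, keep soft reverts.
pub fn feeds_dictionary(status: CallStatus) -> bool {
    !matches!(status, CallStatus::Panic)
}
```

Because `revert()` always truncates to the watermark, values beyond the cap are discarded even after a coverage-producing run, which keeps growth bounded.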

Phase 3: Corpus Selection & Mutation

The corpus-guided mutation engine has structural weaknesses that limit exploration depth.

  • Weighted corpus selection — Corpus entries are selected uniformly at random. Favored entries (those producing new coverage >30% of the time) should be selected 3-5× more often, not just protected from eviction. Simple weighted sampling.

    • Location: crates/evm/evm/src/executors/corpus.rs (new_inputs(), new_input())
  • Add Insert/Delete/Swap mutations — The current mutation set (Splice, Repeat, Interleave, Prefix, Suffix, Abi) is missing critical structural mutations:

    • Insert: add a new random call at a random position
    • Delete/Trim: remove calls to find shorter interesting sequences
    • Swap: reorder two calls in a sequence
    • Location: crates/evm/evm/src/executors/corpus.rs (MutationType enum, new_inputs())
  • Adaptive mutation scheduling — All 6 mutation types have equal weight. Track which mutation types produce new coverage and bias toward productive ones.

    • Location: crates/evm/evm/src/executors/corpus.rs (mutation_generator)
  • Adaptive random injection rate — Hardcoded 10% chance of replacing a corpus call with a fully random one. Should start higher when corpus is small, decay as quality improves.

    • Location: crates/evm/evm/src/executors/corpus.rs (generate_next_input())
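
As an illustration of the proposed Insert/Delete/Swap mutations, the sketch below applies them to a generic call sequence. The `StructuralMutation` enum and `apply` function are hypothetical and do not mirror the existing `MutationType` enum. Positions are passed in explicitly so the behavior is deterministic; a real implementation would draw them from the fuzzer's RNG.

```rust
/// Hypothetical structural mutations over a call sequence of type `C`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum StructuralMutation<C> {
    /// Insert `call` at position `at` (clamped to the sequence length).
    Insert { at: usize, call: C },
    /// Remove the call at position `at`.
    Delete { at: usize },
    /// Exchange the calls at positions `a` and `b`.
    Swap { a: usize, b: usize },
}

/// Return a mutated copy of `sequence`; out-of-range positions are no-ops.
pub fn apply<C: Clone>(sequence: &[C], mutation: StructuralMutation<C>) -> Vec<C> {
    let mut out = sequence.to_vec();
    match mutation {
        StructuralMutation::Insert { at, call } => {
            let at = at.min(out.len());
            out.insert(at, call);
        }
        StructuralMutation::Delete { at } => {
            // Never shrink below one call, so the sequence stays executable.
            if out.len() > 1 && at < out.len() {
                out.remove(at);
            }
        }
        StructuralMutation::Swap { a, b } => {
            if a < out.len() && b < out.len() {
                out.swap(a, b);
            }
        }
    }
    out
}
```

Returning a copy rather than mutating in place keeps the parent corpus entry intact, so the mutated sequence is only stored if it produces new coverage.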

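The weighting ideas in Phase 3 (favored-entry boost, adaptive mutation scheduling, decaying random injection) can be sketched with plain weighted sampling. Everything here is hypothetical: `weighted_pick`, `corpus_weights`, `MutationStats`, and `random_injection_rate` are illustrative names, the Laplace smoothing is one possible scheduling rule, and the 0.50 starting rate and decay constant are assumed values. Only the 10% figure comes from this issue. The uniform sample `u` is supplied by the caller.

```rust
/// Pick an index proportionally to `weights`, given a uniform sample `u` in [0, 1).
pub fn weighted_pick(weights: &[f64], u: f64) -> Option<usize> {
    let total: f64 = weights.iter().sum();
    if weights.is_empty() || total <= 0.0 {
        return None;
    }
    let mut target = u.clamp(0.0, 1.0) * total;
    for (i, w) in weights.iter().enumerate() {
        if target < *w {
            return Some(i);
        }
        target -= *w;
    }
    // Floating-point rounding can leave a tiny remainder; use the last entry.
    Some(weights.len() - 1)
}

/// Give favored corpus entries `boost` times the weight of ordinary ones.
pub fn corpus_weights(favored: &[bool], boost: f64) -> Vec<f64> {
    favored.iter().map(|&f| if f { boost } else { 1.0 }).collect()
}

/// Per-mutation-type counters used to bias toward productive mutations.
pub struct MutationStats {
    tried: Vec<u32>,
    productive: Vec<u32>,
}

impl MutationStats {
    pub fn new(mutation_count: usize) -> Self {
        Self { tried: vec![0; mutation_count], productive: vec![0; mutation_count] }
    }

    /// Record one use of a mutation type and whether it found new coverage.
    pub fn record(&mut self, mutation: usize, new_coverage: bool) {
        if let Some(t) = self.tried.get_mut(mutation) {
            *t += 1;
            if new_coverage {
                self.productive[mutation] += 1;
            }
        }
    }

    /// Laplace-smoothed success rates, so untried mutations keep a nonzero weight.
    pub fn weights(&self) -> Vec<f64> {
        self.tried
            .iter()
            .zip(&self.productive)
            .map(|(&t, &p)| (p as f64 + 1.0) / (t as f64 + 2.0))
            .collect()
    }
}

/// Random-injection probability: 0.50 for an empty corpus, decaying toward 0.10.
pub fn random_injection_rate(corpus_len: usize) -> f64 {
    0.10 + 0.40 / (1.0 + corpus_len as f64 / 16.0)
}
```

For example, `weighted_pick(&corpus_weights(&favored, 4.0), u)` selects a favored entry four times as often as an ordinary one, which falls inside the 3-5× range suggested above.
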
Phase 4: Coverage Signal Quality

The coverage map tracks EVM JUMPI edge pairs but misses state-dependent behavior.

  • State-sensitive coverage — Hash key storage slot deltas into the coverage map so different state transitions count as different coverage. Highest-impact long-term improvement.

    • Location: crates/evm/evm/src/executors/corpus.rs, crates/evm/evm/src/executors/mod.rs
  • Function-pair coverage tracking — Track which pairs/triples of Solidity functions have been called in sequence as an additional coverage dimension.
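
One way to picture both extra coverage dimensions is a single hashed hit map that accepts edges, storage-slot transitions, and call pairs. This is a sketch under assumptions, not the existing coverage map: `CoverageMap`, `MAP_SIZE`, and the method names are hypothetical, and bucketing values by bit length is an assumed way to stop every distinct stored value from counting as new coverage.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical map size; collisions are tolerated, as in AFL-style maps.
pub const MAP_SIZE: usize = 1 << 16;

pub struct CoverageMap {
    hits: Vec<u8>,
}

impl CoverageMap {
    pub fn new() -> Self {
        Self { hits: vec![0; MAP_SIZE] }
    }

    /// Hash any key into the map; returns true if the cell was previously empty.
    fn mark(&mut self, key: impl Hash) -> bool {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() as usize) % MAP_SIZE;
        let is_new = self.hits[idx] == 0;
        self.hits[idx] = self.hits[idx].saturating_add(1);
        is_new
    }

    /// Classic edge coverage: a (from, to) program-counter pair.
    pub fn record_edge(&mut self, from_pc: u32, to_pc: u32) -> bool {
        self.mark(("edge", from_pc, to_pc))
    }

    /// State-sensitive coverage: a slot moving between value buckets.
    pub fn record_state_delta(&mut self, slot: u64, old: u64, new: u64) -> bool {
        self.mark(("slot", slot, bucket(old), bucket(new)))
    }

    /// Function-pair coverage: two selectors called back to back.
    pub fn record_call_pair(&mut self, prev: [u8; 4], next: [u8; 4]) -> bool {
        self.mark(("pair", prev, next))
    }
}

/// Bit length of the value (0 for zero), so nearby magnitudes share a bucket.
fn bucket(value: u64) -> u32 {
    64 - value.leading_zeros()
}
```

Tagging each key with its kind (`"edge"`, `"slot"`, `"pair"`) keeps the three dimensions from aliasing each other except through ordinary hash collisions. Real storage values are 256-bit words, so `u64` is a simplification here.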


Validation

Use SCFuzzBench for end-to-end validation:

  • Target: aave-v4-scfuzzbench (v0.5.6-recon)
  • Metric: unique broken invariants at 1h and 24h timeouts
  • Baseline (Aave v4, 24h): Echidna finds 10 [9,10]; current Foundry finds 3 [3,3]
  • Each phase should be benchmarked independently with multiple seeds

    cargo build --release --bin forge
    FOUNDRY_INVARIANT_TIMEOUT=300 forge test --mc CryticToFoundry -vv --fuzz-seed 42
