Invariant Fuzzer: Phased improvement plan to close gap with Echidna/Medusa (SCFuzzBench) #14437

## Context

[SCFuzzBench](https://scfuzzbench.com/) benchmarks smart contract fuzzers (Echidna, Medusa, Foundry, Recon) against real DeFi protocols (Aave v4, Superform, Liquity). Current results show Foundry finding **0-3 bugs** vs Echidna's **10+** on the same targets with the same time budget.

Analysis of the Foundry invariant testing engine identified several root causes, ranging from architectural blockers to corpus/coverage quality issues. This issue tracks the phased improvement plan.

### Benchmark Baseline (Aave v4, 24h, SCFuzzBench)

| Fuzzer | Bugs Found | AUC (norm) | Coverage Proxy |
|--------|-----------|------------|---------------|
| Echidna 2.3.1 | 10 [9,10] | 0.816 | 456k |
| Medusa 1.4.1 | 10 [9,10] | 0.754 | 5.2k |
| **Foundry v1.6.0-rc1** | **3 [3,3]** | **0.272** | **374** |

---

## Phase 1: Architectural Blockers (prerequisite)

These must land first — without them, the fuzzer uses ~4% of its time budget because trivially-findable assertion failures stop exploration immediately.

- [x] **`fail_on_assert` config** — Gate assertion-failure detection behind a config flag so users can opt in to treating Solidity `assert()` panics in target functions as invariant breaks.
  - Merged: 419f787e5 (`#14275`)

- [ ] **`continuous_run` mode** — Continue running after finding an invariant/assertion failure. Record it and keep searching for more failures. Currently, the first `assert_canary(0)` kills every invariant test in <100 runs.
  - PR: #12587 (open, review required)
  - Impact: 0-3 bugs → 6 bugs (all found in <15s)

- [ ] **Preflight check in `continuous_run` mode** — When `continuous_run = true`, the preflight invariant check (which verifies invariants hold before fuzzing starts) should record failures but not abort the campaign. Currently an always-failing canary invariant causes `"failed to set up invariant testing environment"`.
  - Location: `crates/evm/evm/src/executors/invariant/mod.rs` (`prepare_test` → preflight check)
  - Should be part of #12587
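
The `continuous_run` behaviour described above can be pictured as a campaign loop that records failures instead of returning on the first one. The following is a minimal sketch, not Foundry's executor: `Config`, `Failure`, `run_campaign`, and the `check` callback are hypothetical names introduced only for illustration.

```rust
use std::collections::BTreeMap;

/// First observed break of one invariant (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Failure {
    pub invariant: String,
    pub run: u32,
}

/// Hypothetical campaign settings mirroring the proposed `continuous_run` flag.
pub struct Config {
    pub runs: u32,
    pub continuous_run: bool,
}

/// Runs up to `cfg.runs` iterations. `check(run)` stands in for executing one
/// call sequence and returns the names of the invariants it broke.
/// Without `continuous_run` the loop stops at the first failing run; with it,
/// every distinct failure is recorded once and the campaign keeps going.
pub fn run_campaign<F>(cfg: &Config, mut check: F) -> BTreeMap<String, Failure>
where
    F: FnMut(u32) -> Vec<String>,
{
    let mut failures = BTreeMap::new();
    for run in 0..cfg.runs {
        for name in check(run) {
            // Keep only the first occurrence of each broken invariant.
            failures
                .entry(name.clone())
                .or_insert(Failure { invariant: name, run });
        }
        if !failures.is_empty() && !cfg.continuous_run {
            break;
        }
    }
    failures
}
```

With an always-failing canary, the non-continuous loop returns after run 0 with a single entry, while the continuous loop also reaches failures that only show up in later runs. A preflight check could plausibly feed the same recording path rather than aborting setup, which is the behaviour the third item asks for.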

---

## Phase 2: Dictionary Quality

The fuzz dictionary collects values from storage, logs, call results, and push bytes. Currently, **all values learned during an invariant run are thrown away** at the end of each run via `FuzzDictionary::revert()`. Values from coverage-producing runs should persist to compound across runs.

- [ ] **Selective dictionary persistence** — When an invariant run produces new coverage, promote collected values to the persistent baseline (advance the revert watermark) instead of discarding them. Bounded at ~4096 values / ~512 addresses to prevent unbounded growth. Same pattern as the existing `persistent_values` mechanism for sancov trace-cmp.
  - Location: `crates/evm/fuzz/src/strategies/state.rs` (add `promote_ephemeral_values()`)
  - Location: `crates/evm/evm/src/executors/invariant/mod.rs` (call in `end_run()` when `new_coverage`)

- [ ] **Collect data from reverted calls (filtered)** — Currently only successful calls feed the dictionary (`if !call_result.reverted`). Reverted calls contain boundary values that help explore nearby valid inputs. Needs filtering to avoid noise from panics — only collect from "soft" reverts (require failures), not assertion panics.
  - Location: `crates/evm/evm/src/executors/invariant/mod.rs` line ~530
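
The watermark idea behind selective persistence can be sketched as follows. This is a simplified model under stated assumptions: the `Dictionary` type here is hypothetical, it stores `u64` values instead of 32-byte words, and only `revert()` and `promote_ephemeral_values()` correspond to names used in this issue (the latter is still a proposal).

```rust
/// Hypothetical stand-in for the fuzz dictionary: an append-only list plus a
/// watermark that separates the persistent baseline from per-run values.
pub struct Dictionary {
    values: Vec<u64>,
    /// Everything below this index survives `revert()`.
    baseline_len: usize,
    /// Upper bound on the persistent baseline (the plan suggests ~4096).
    max_persistent: usize,
}

impl Dictionary {
    pub fn new(seed: Vec<u64>, max_persistent: usize) -> Self {
        let baseline_len = seed.len();
        Self { values: seed, baseline_len, max_persistent }
    }

    /// Adds a value learned during the current run, skipping duplicates.
    pub fn insert(&mut self, v: u64) {
        if !self.values.contains(&v) {
            self.values.push(v);
        }
    }

    /// Discards everything learned during the current run.
    pub fn revert(&mut self) {
        self.values.truncate(self.baseline_len);
    }

    /// Advances the watermark over this run's values, up to the cap.
    /// Values beyond the cap are dropped, so growth stays bounded.
    pub fn promote_ephemeral_values(&mut self) {
        let cap = self.max_persistent.max(self.baseline_len);
        self.baseline_len = self.values.len().min(cap);
        self.values.truncate(self.baseline_len);
    }

    /// End-of-run hook: keep values only if the run found new coverage.
    pub fn end_run(&mut self, new_coverage: bool) {
        if new_coverage {
            self.promote_ephemeral_values();
        } else {
            self.revert();
        }
    }

    pub fn len(&self) -> usize {
        self.values.len()
    }
}
```

Truncating at the cap is the simplest bounding policy; an eviction scheme that prefers recently useful values would be a possible refinement, but nothing in this issue specifies one.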

---

## Phase 3: Corpus Selection & Mutation

The corpus-guided mutation engine has structural weaknesses that limit exploration depth.

- [ ] **Weighted corpus selection** — Corpus entries are selected uniformly at random. Favored entries (those producing new coverage >30% of the time) should be selected 3-5× more often, not just protected from eviction. Simple weighted sampling.
  - Location: `crates/evm/evm/src/executors/corpus.rs` (`new_inputs()`, `new_input()`)

- [ ] **Add Insert/Delete/Swap mutations** — The current mutation set (Splice, Repeat, Interleave, Prefix, Suffix, Abi) is missing critical structural mutations:
  - **Insert**: add a new random call at a random position
  - **Delete/Trim**: remove calls to find shorter interesting sequences
  - **Swap**: reorder two calls in a sequence
  - Location: `crates/evm/evm/src/executors/corpus.rs` (`MutationType` enum, `new_inputs()`)

- [ ] **Adaptive mutation scheduling** — All 6 mutation types have equal weight. Track which mutation types produce new coverage and bias toward productive ones.
  - Location: `crates/evm/evm/src/executors/corpus.rs` (`mutation_generator`)

- [ ] **Adaptive random injection rate** — Hardcoded 10% chance of replacing a corpus call with a fully random one. Should start higher when corpus is small, decay as quality improves.
  - Location: `crates/evm/evm/src/executors/corpus.rs` (`generate_next_input()`)
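
Two of the items above, weighted selection and the Insert/Delete/Swap mutations, are small enough to sketch directly. Everything below is illustrative: `Rng`, `Entry`, `FAVORED_WEIGHT`, `select_weighted`, `Mutation`, and `mutate` are hypothetical names, calls are modelled as plain `u32` values, and the 4x weight is just one point in the suggested 3-5x range.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
pub struct Rng(u64);

impl Rng {
    pub fn new(seed: u64) -> Self {
        Rng(seed.max(1)) // xorshift state must be non-zero
    }
    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
    /// Value in `0..n`; `n` must be non-zero. Modulo bias is fine for a sketch.
    pub fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Hypothetical corpus entry: a call sequence plus its "favored" flag.
pub struct Entry {
    pub calls: Vec<u32>,
    pub favored: bool,
}

/// Assumed weight for favored entries.
pub const FAVORED_WEIGHT: usize = 4;

fn weight(e: &Entry) -> usize {
    if e.favored { FAVORED_WEIGHT } else { 1 }
}

/// Picks an index with probability proportional to its weight.
pub fn select_weighted(corpus: &[Entry], rng: &mut Rng) -> Option<usize> {
    let total: usize = corpus.iter().map(weight).sum();
    if total == 0 {
        return None;
    }
    let mut ticket = rng.below(total);
    for (i, e) in corpus.iter().enumerate() {
        let w = weight(e);
        if ticket < w {
            return Some(i);
        }
        ticket -= w;
    }
    None
}

/// The three structural mutations proposed above.
pub enum Mutation {
    Insert,
    Delete,
    Swap,
}

pub fn mutate(calls: &mut Vec<u32>, kind: Mutation, rng: &mut Rng) {
    match kind {
        // Add a new random call at a random position (including the end).
        Mutation::Insert => {
            let pos = rng.below(calls.len() + 1);
            let call = rng.next_u64() as u32;
            calls.insert(pos, call);
        }
        // Remove one call, but never empty the sequence.
        Mutation::Delete => {
            if calls.len() > 1 {
                let pos = rng.below(calls.len());
                calls.remove(pos);
            }
        }
        // Reorder two calls; picking the same index twice is a no-op.
        Mutation::Swap => {
            if calls.len() > 1 {
                let a = rng.below(calls.len());
                let b = rng.below(calls.len());
                calls.swap(a, b);
            }
        }
    }
}
```

Adaptive mutation scheduling could reuse the same `select_weighted` shape, with per-mutation weights updated from coverage feedback instead of a fixed constant.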

---

## Phase 4: Coverage Signal Quality

The coverage map tracks EVM JUMPI edge pairs but misses state-dependent behavior.

- [ ] **State-sensitive coverage** — Hash key storage slot deltas into the coverage map so different state transitions count as different coverage. Highest-impact long-term improvement.
  - Location: `crates/evm/evm/src/executors/corpus.rs`, `crates/evm/evm/src/executors/mod.rs`

- [ ] **Function-pair coverage tracking** — Track which pairs/triples of Solidity functions have been called in sequence as an additional coverage dimension.
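
One way to picture state-sensitive coverage is a hit map that accepts storage transitions alongside jump edges. The sketch below is an assumption-heavy illustration, not the existing coverage map: `CoverageMap`, `MAP_SIZE`, and both `record_*` methods are hypothetical, values are `u64` rather than 256-bit words, and bucketing values by bit length is a design choice made here so that not every distinct value counts as new coverage.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed map size; collisions are tolerated, as in AFL-style hit maps.
pub const MAP_SIZE: usize = 1 << 16;

pub struct CoverageMap {
    hits: Vec<u8>,
}

impl CoverageMap {
    pub fn new() -> Self {
        Self { hits: vec![0; MAP_SIZE] }
    }

    /// Hashes `key` into a bucket, bumps its counter, and reports whether
    /// the bucket was empty before (i.e. whether this is new coverage).
    fn touch(&mut self, key: impl Hash) -> bool {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let idx = (hasher.finish() as usize) % MAP_SIZE;
        let is_new = self.hits[idx] == 0;
        self.hits[idx] = self.hits[idx].saturating_add(1);
        is_new
    }

    /// Classic edge coverage: a (source, destination) pair of a taken jump.
    pub fn record_edge(&mut self, from_pc: u32, to_pc: u32) -> bool {
        self.touch(("edge", from_pc, to_pc))
    }

    /// State-sensitive coverage: a storage write keyed by slot and by the
    /// magnitude of the old and new values (0 for zero, else bit length).
    pub fn record_storage_delta(&mut self, slot: u64, old: u64, new: u64) -> bool {
        let bucket = |v: u64| 64 - v.leading_zeros();
        self.touch(("slot", slot, bucket(old), bucket(new)))
    }
}
```

Under this scheme, writing 5 and then 7 to a previously zero slot counts once, while a zero to non-zero or non-zero to zero transition is distinct coverage. Function-pair tracking could go through the same `touch` path with a key built from consecutive selectors.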

---

## Validation

Use [SCFuzzBench](https://github.com/Recon-Fuzz/scfuzzbench) for end-to-end validation:
- Target: [aave-v4-scfuzzbench](https://github.com/Recon-Fuzz/aave-v4-scfuzzbench) (`v0.5.6-recon`)
- Metric: unique broken invariants at 1h and 24h timeouts
- Baseline (24h, from the table above): Echidna and Medusa find 10 [9,10]; Foundry v1.6.0-rc1 finds 3 [3,3]
- Each phase should be benchmarked independently with multiple seeds

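```bash
# Build forge from the Foundry checkout; ensure this binary is the `forge` on PATH.
cargo build --release --bin forge
# In the benchmark repo: run the invariant suite with a 300 s timeout and a fixed seed.
FOUNDRY_INVARIANT_TIMEOUT=300 forge test --mc CryticToFoundry -vv --fuzz-seed 42
```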
