Description
I'm trying to compare Hybrid-Echidna with Echidna and the Forge fuzzer on several benchmark contracts.
To make the comparison as fair as possible, I've created a benchmark generator that automatically generates challenging contracts. The benchmarks intentionally use a limited subset of Solidity to avoid language features that could be handled differently by different tools. Each contract contains ~50 assertions (some can fail, but others cannot due to infeasible path conditions). (If you're curious, you can find one of the benchmarks here. The benchmark-generation approach is inspired by the Fuzzle benchmark generator for C-based fuzzers.) To find the assertions that can fail, a fuzzer needs to generate up to ~15 transactions and satisfy some input constraints for each transaction.
Since I'm not deeply familiar with Hybrid-Echidna, I'd like to check if there are any potential issues with my benchmark setup before sharing results.
Since Hybrid-Echidna does not support limiting the execution time (see issue #101), I'm repeatedly running the fuzzer for shorter periods until the time limit shared by all fuzzers is reached (for instance, 1 hour per contract). For each of these shorter fuzzing campaigns, I'm using the following settings that deviate from the defaults:
- `seq-len`: 100 (instead of 10)
- `test-limit`: 50000 for the first short campaign and 500 for all subsequent ones (instead of 50000)
- `max-iters`: 3 for the first short campaign and 1 for all subsequent ones (instead of leaving the option unspecified)
- `no-incremental`: false for the first short campaign and true for all subsequent ones (instead of false)
- `codeSize` (Echidna): 0xc00000 (instead of 0x6000)
I increased `seq-len` to 100 since some assertions may require up to ~15 transactions, and some generated transactions may fail. Echidna uses 100 by default.
I observed very high memory consumption when leaving `max-iters` unspecified or using larger values. For this reason, I bound the number of iterations.
I reduced `test-limit` to make sure that the short campaigns terminate reasonably fast. I also observed increased memory consumption for larger values.
I enable `no-incremental` for subsequent short campaigns since the first campaign will have already performed incremental seeding once.
I also increased the `codeSize` setting to handle larger contracts, if necessary. Currently, all benchmark contracts are below the EVM code-size limit when compiled with the solc optimizer enabled (solc 0.8.19).
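To make the campaign schedule concrete, here is a minimal Python sketch of how I think of the driver loop. It is not my actual harness: `campaign_settings`, `run_until_budget`, and the `run_campaign` callback are hypothetical names, and how the settings are passed to Hybrid-Echidna (flags or config file) is deliberately left to the callback. `codeSize` is omitted because it is an Echidna setting that stays the same across campaigns.

```python
import time
from typing import Callable, Dict, Union

Settings = Dict[str, Union[int, bool]]


def campaign_settings(index: int) -> Settings:
    """Return the non-default settings for the index-th short campaign (0 = first)."""
    first = index == 0
    return {
        "seq-len": 100,                          # same for every campaign
        "test-limit": 50000 if first else 500,   # shorter follow-up campaigns
        "max-iters": 3 if first else 1,          # bound iterations to limit memory use
        "no-incremental": not first,             # incremental seeding only once
    }


def run_until_budget(
    run_campaign: Callable[[Settings], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Start short campaigns back to back until the time budget is used up.

    run_campaign is a hypothetical callback that launches one fuzzing
    campaign with the given settings and blocks until it terminates.
    Returns the number of campaigns that were started.
    """
    deadline = clock() + budget_seconds
    count = 0
    while clock() < deadline:
        run_campaign(campaign_settings(count))
        count += 1
    return count
```

Note that in this sketch a campaign that starts just before the deadline is allowed to finish, so the total time can overshoot the budget by up to one short campaign.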
Somewhat surprisingly, Echidna performs much better on these benchmarks than Hybrid-Echidna. It would be great to understand why. For instance, I tried setting a solver timeout, but this did not have a noticeable effect on the fuzzing performance.
Please let me know if you see any potential issues with this setup.