Description
I created this ticket to open a discussion on possible improvements to EVM execution.
I've been digging into the EVM to see where its performance could be improved.
There isn't much room left for optimization, but here are my findings.
Running a basic contract that loops (32000 times) over an addition:
contract Loop {
    function big_loop(uint256 target) pure public returns (uint256) {
        uint256 number = 0;
        for (uint i = 0; i < target; i++) {
            number += 1;
        }
        return number;
    }
}
I used callgrind (valgrind) to retrieve as much info as I could (the recording starts right before the contract call):
callgrind.out.txt (to read with qcachegrind)
perf.data.txt (to rename to perf.data and run with perf report --no-inline)
Here are screenshots of the interesting parts:
(This is the map of the execution cycles for the execution of this contract)
This is the same data as a list (filtered to evm functions only):
Some important parts:
- tracing (including memory(), position(), return_value(), and gasometer tracing): ~50% (this build was compiled with the "tracing" feature)
- run loop: 25% (excluding child calls); I think this is what generates the 3.7M calls to memory, position, inspect, return_value, ...
- return_value: 19% (called 3.7M times); this is mostly due to 256-bit number conversion.
- gasometer: ~5% (snapshot called 3M times)
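As a rough sketch of one possible direction for the tracing share: the per-step data (memory, position, and so on) could be gathered only when a listener is actually attached, so an untraced run never pays for it. Everything below is hypothetical and for illustration only (Machine, StepEvent, Tracer, CountingTracer, and run are made-up names, not the real crate's API), and whether this applies depends on how the tracing hooks are actually wired.

```rust
/// Hypothetical per-step event handed to a tracer (illustrative only).
pub struct StepEvent {
    pub position: usize,
    pub memory: Vec<u8>,
}

/// Hypothetical listener interface; not the crate's real tracing API.
pub trait Tracer {
    fn on_step(&mut self, event: StepEvent);
}

/// Hypothetical stand-in for the interpreter state.
pub struct Machine {
    memory: Vec<u8>,
    position: usize,
    /// Number of snapshots built so far, kept only to make the cost visible.
    pub snapshots_built: usize,
}

impl Machine {
    pub fn new(memory_size: usize) -> Self {
        Machine { memory: vec![0; memory_size], position: 0, snapshots_built: 0 }
    }

    /// The expensive part: copies the whole memory for the listener.
    fn snapshot(&mut self) -> StepEvent {
        self.snapshots_built += 1;
        StepEvent { position: self.position, memory: self.memory.clone() }
    }

    /// Executes `steps` dummy steps. The snapshot is built only when a
    /// tracer is attached, so an untraced run pays nothing for it.
    pub fn run(&mut self, steps: usize, mut tracer: Option<&mut dyn Tracer>) {
        for _ in 0..steps {
            if let Some(t) = tracer.as_mut() {
                let event = self.snapshot();
                t.on_step(event);
            }
            self.position += 1;
        }
    }
}

/// Example listener that just records what it saw.
#[derive(Default)]
pub struct CountingTracer {
    pub steps_seen: usize,
    pub last_position: usize,
    pub bytes_seen: usize,
}

impl Tracer for CountingTracer {
    fn on_step(&mut self, event: StepEvent) {
        self.steps_seen += 1;
        self.last_position = event.position;
        self.bytes_seen += event.memory.len();
    }
}
```

Calling run(steps, None) leaves snapshots_built at zero, while run(steps, Some(&mut tracer)) builds one snapshot per step. The design choice is simply to move the cost behind the check instead of computing the values unconditionally.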
I've dug into return_value to see if there was anything special. One third of its time is spent in U256::partial_cmp and U256::From; the rest is inherent to the function:
(Nothing in it really surprises me, but I suspect that being called 3.7M times makes any operation expensive.)
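To illustrate why comparison and conversion can add up, here is a minimal sketch. It assumes the cost comes from range checks that build a 256-bit bound and compare against it; that is an assumption, not something the profile confirms. ToyU256 and the helper functions are hypothetical stand-ins, not the real U256 type. The sketch contrasts the general comparison with a cheaper check that only inspects the upper limbs.

```rust
use std::cmp::Ordering;

/// Toy 256-bit unsigned integer: four 64-bit limbs, least significant first.
/// A stand-in for illustration, not the real `U256` type.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct ToyU256(pub [u64; 4]);

impl From<u64> for ToyU256 {
    fn from(value: u64) -> Self {
        ToyU256([value, 0, 0, 0])
    }
}

impl Ord for ToyU256 {
    fn cmp(&self, other: &Self) -> Ordering {
        // Compare limb by limb, starting from the most significant one.
        for i in (0..4).rev() {
            match self.0[i].cmp(&other.0[i]) {
                Ordering::Equal => continue,
                ord => return ord,
            }
        }
        Ordering::Equal
    }
}

impl PartialOrd for ToyU256 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// General path: build a 256-bit bound, then do a full limb-wise comparison.
pub fn fits_u64_general(value: ToyU256) -> bool {
    value <= ToyU256::from(u64::MAX)
}

/// Fast path: the value fits in a u64 exactly when the upper limbs are zero.
pub fn fits_u64_fast(value: ToyU256) -> bool {
    value.0[1] == 0 && value.0[2] == 0 && value.0[3] == 0
}

/// Converts to u64 when possible, using the fast check.
pub fn to_u64_checked(value: ToyU256) -> Option<u64> {
    if fits_u64_fast(value) {
        Some(value.0[0])
    } else {
        None
    }
}
```

Both checks give the same answer for every input; the fast one just avoids constructing the bound and walking all four limbs. On its own the difference is tiny, which fits the observation that the call count matters more than any single operation.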