@elopez
Collaborator

@elopez elopez commented Sep 24, 2025

Description

This is a WIP PR to prepare for when nixpkgs is ready to build hevm with GHC 9.10.

We would need the following to land on nixpkgs-unstable or be resolved before we can make the switch:

Checklist

  • tested locally
  • added automated tests
  • updated the docs
  • updated the changelog

@elopez
Collaborator Author

elopez commented Sep 24, 2025

Performance seems to be either unchanged or, in some cases, consistently worse (roughly the same percentage across sizes). I wonder if FFI calls have a higher cost on the new version; my first intuition is that the slower tests are the ones that compute keccak hashes.

Benchmark
All
  loop
    2:     OK
      89.1 μs ± 7.9 μs,       same as baseline
    4:     OK
      131  μs ± 8.3 μs,       same as baseline
    8:     OK
      212  μs ± 7.1 μs,       same as baseline
    16:    OK
      376  μs ±  31 μs,       same as baseline
    32:    OK
      692  μs ±  58 μs,       same as baseline
    64:    OK
      1.37 ms ± 124 μs,       same as baseline
    128:   OK
      2.61 ms ± 215 μs,       same as baseline
    256:   OK
      5.21 ms ± 470 μs,       same as baseline
    512:   OK
      10.4 ms ± 932 μs,       same as baseline
    1024:  OK
      20.4 ms ± 1.8 ms,       same as baseline
    2048:  OK
      40.9 ms ± 4.0 ms,       same as baseline
    4096:  OK
      81.6 ms ± 7.5 ms,       same as baseline
    8192:  OK
      164  ms ±  11 ms,       same as baseline
    16384: OK
      325  ms ±  16 ms,       same as baseline
  primes
    2:     OK
      279  μs ±  23 μs,       same as baseline
    4:     OK
      429  μs ±  28 μs,  8% more than baseline
    8:     OK
      642  μs ±  61 μs,       same as baseline
    16:    OK
      1.23 ms ±  48 μs,       same as baseline
    32:    OK
      2.72 ms ± 135 μs,       same as baseline
    64:    OK
      5.90 ms ± 442 μs,       same as baseline
    128:   OK
      13.3 ms ± 916 μs,       same as baseline
    256:   OK
      31.4 ms ± 2.4 ms,       same as baseline
    512:   OK
      75.0 ms ± 4.1 ms,       same as baseline
    1024:  OK
      180  ms ±  18 ms,       same as baseline
    2048:  OK
      444  ms ±  19 ms,       same as baseline
    4096:  OK
      1.105 s ±  44 ms,       same as baseline
    8192:  OK
      2.769 s ± 176 ms,       same as baseline
    16384: OK
      7.014 s ± 333 ms,       same as baseline
  hashes
    2:     OK
      127  μs ± 4.1 μs, 19% more than baseline
    4:     OK
      205  μs ±  17 μs, 19% more than baseline
    8:     OK
      358  μs ±  35 μs, 23% more than baseline
    16:    OK
      666  μs ±  56 μs, 23% more than baseline
    32:    OK
      1.28 ms ± 126 μs, 21% more than baseline
    64:    OK
      2.56 ms ± 176 μs, 26% more than baseline
    128:   OK
      5.03 ms ± 235 μs, 24% more than baseline
    256:   OK
      10.1 ms ± 543 μs, 23% more than baseline
    512:   OK
      20.3 ms ± 1.9 ms, 23% more than baseline
    1024:  OK
      40.7 ms ± 2.1 ms, 25% more than baseline
    2048:  OK
      80.5 ms ± 7.7 ms, 24% more than baseline
    4096:  OK
      161  ms ± 8.7 ms, 23% more than baseline
    8192:  OK
      317  ms ±  23 ms, 23% more than baseline
    16384: OK
      637  ms ±  60 ms, 25% more than baseline
  hashmem
    2:     OK
      208  μs ±  14 μs, 17% more than baseline
    4:     OK
      327  μs ±  32 μs, 24% more than baseline
    8:     OK
      562  μs ±  55 μs, 22% more than baseline
    16:    OK
      1.04 ms ±  54 μs, 21% more than baseline
    32:    OK
      2.05 ms ± 172 μs, 31% more than baseline
    64:    OK
      4.03 ms ± 287 μs, 28% more than baseline
    128:   OK
      7.88 ms ± 506 μs, 25% more than baseline
    256:   OK
      16.1 ms ± 1.2 ms, 25% more than baseline
    512:   OK
      32.2 ms ± 2.6 ms, 26% more than baseline
    1024:  OK
      64.5 ms ± 5.3 ms, 28% more than baseline
    2048:  OK
      128  ms ±  13 ms, 25% more than baseline
    4096:  OK
      257  ms ±  15 ms, 28% more than baseline
    8192:  OK
      512  ms ±  37 ms, 26% more than baseline
    16384: OK
      1.025 s ±  60 ms, 27% more than baseline
  balanceTransfer
    2:     OK
      5.18 ms ± 433 μs,       same as baseline
    4:     OK
      5.49 ms ± 248 μs, 11% more than baseline
    8:     OK
      5.44 ms ± 498 μs,       same as baseline
    16:    OK
      5.57 ms ± 552 μs,       same as baseline
    32:    OK
      5.86 ms ± 481 μs,       same as baseline
    64:    OK
      6.45 ms ± 508 μs,       same as baseline
    128:   OK
      8.18 ms ± 515 μs,  8% more than baseline
    256:   OK
      11.3 ms ± 1.0 ms,       same as baseline
    512:   OK
      18.6 ms ± 852 μs,       same as baseline
    1024:  OK
      31.8 ms ± 2.1 ms,       same as baseline
    2048:  OK
      56.1 ms ± 3.9 ms,       same as baseline
    4096:  OK
      109  ms ± 9.2 ms,       same as baseline
    8192:  OK
      218  ms ±  21 ms,       same as baseline
    16384: OK
      429  ms ±  38 ms,       same as baseline
  funcCall
    2:     OK
      151  μs ± 9.0 μs,       same as baseline
    4:     OK
      208  μs ±  11 μs,       same as baseline
    8:     OK
      332  μs ±  23 μs,       same as baseline
    16:    OK
      564  μs ±  23 μs,       same as baseline
    32:    OK
      1.06 ms ±  98 μs,       same as baseline
    64:    OK
      1.97 ms ± 146 μs,       same as baseline
    128:   OK
      3.82 ms ± 224 μs,       same as baseline
    256:   OK
      7.42 ms ± 640 μs,       same as baseline
    512:   OK
      14.7 ms ± 868 μs,       same as baseline
    1024:  OK
      29.3 ms ± 1.8 ms,       same as baseline
    2048:  OK
      57.7 ms ± 4.0 ms,       same as baseline
    4096:  OK
      116  ms ±  10 ms,       same as baseline
    8192:  OK
      235  ms ±  19 ms,       same as baseline
    16384: OK
      464  ms ±  37 ms,       same as baseline
  contractCreation
    2:     OK
      245  μs ±  18 μs, 16% more than baseline
    4:     OK
      397  μs ±  16 μs, 16% more than baseline
    8:     OK
      720  μs ±  68 μs, 18% more than baseline
    16:    OK
      1.38 ms ±  82 μs, 29% more than baseline
    32:    OK
      2.73 ms ± 125 μs, 52% more than baseline
    64:    OK
      6.25 ms ± 276 μs, 28% more than baseline
    128:   OK
      13.8 ms ± 974 μs, 22% more than baseline
    256:   OK
      30.5 ms ± 1.9 ms, 14% more than baseline
    512:   OK
      66.8 ms ± 2.4 ms, 16% more than baseline
    1024:  OK
      140  ms ±  10 ms, 13% more than baseline
    2048:  OK
      292  ms ±  12 ms, 15% more than baseline
    4096:  OK
      644  ms ±  40 ms, 19% more than baseline
    8192:  OK
      1.310 s ±  69 ms, 20% more than baseline
    16384: OK
      2.579 s ±  63 ms,       same as baseline
  contractCreationMem
    2:     OK
      1.06 ms ±  69 μs, 23% more than baseline
    4:     OK
      1.93 ms ± 140 μs, 26% more than baseline
    8:     OK
      4.14 ms ± 298 μs, 34% more than baseline
    16:    OK
      9.04 ms ± 490 μs, 25% more than baseline
    32:    OK
      19.0 ms ± 977 μs, 22% more than baseline
    64:    OK
      38.0 ms ± 1.8 ms, 20% more than baseline
    128:   OK
      77.3 ms ± 6.2 ms, 18% more than baseline
    256:   OK
      157  ms ± 7.0 ms, 19% more than baseline
    512:   OK
      322  ms ±  11 ms, 19% more than baseline
    1024:  OK
      669  ms ±  65 ms, 21% more than baseline
    2048:  OK
      1.277 s ±  77 ms, 15% more than baseline
    4096:  OK
      2.681 s ± 256 ms, 20% more than baseline
    8192:  OK
      5.426 s ± 348 ms, 21% more than baseline
    16384: OK
      10.282 s ± 106 ms, 14% more than baseline
  arrayCreationMem
    2:     OK
      517  μs ±  44 μs, 31% more than baseline
    4:     OK
      1.52 ms ±  99 μs, 56% more than baseline
    8:     OK
      5.39 ms ± 439 μs, 62% more than baseline
    16:    OK
      19.7 ms ± 1.3 ms, 64% more than baseline
    32:    OK
      78.2 ms ± 7.3 ms, 67% more than baseline
    64:    OK
      305  ms ±  16 ms, 66% more than baseline
    128:   OK
      1.222 s ±  43 ms, 70% more than baseline
    256:   OK
      4.863 s ±  88 ms, 69% more than baseline
    512:   OK
      19.337 s ±  36 ms, 69% more than baseline
  mapStorage
    2:     OK
      217  μs ± 8.0 μs, 23% more than baseline
    4:     OK
      369  μs ±  25 μs, 26% more than baseline
    8:     OK
      665  μs ±  54 μs, 27% more than baseline
    16:    OK
      1.29 ms ± 112 μs, 35% more than baseline
    32:    OK
      2.52 ms ± 164 μs, 29% more than baseline
    64:    OK
      5.09 ms ± 431 μs, 32% more than baseline
    128:   OK
      10.2 ms ± 830 μs, 31% more than baseline
    256:   OK
      19.9 ms ± 1.3 ms, 22% more than baseline
    512:   OK
      41.3 ms ± 3.5 ms, 28% more than baseline
    1024:  OK
      83.4 ms ± 5.9 ms, 31% more than baseline
    2048:  OK
      166  ms ±  16 ms, 32% more than baseline
    4096:  OK
      330  ms ±  17 ms, 29% more than baseline
    8192:  OK
      664  ms ±  40 ms, 28% more than baseline
    16384: OK
      1.321 s ±  32 ms, 29% more than baseline
  swapOperations
    2:     OK
      480  μs ±  21 μs,       same as baseline
    4:     OK
      527  μs ±  43 μs,       same as baseline
    8:     OK
      742  μs ±  27 μs,       same as baseline
    16:    OK
      1.05 ms ±  59 μs,       same as baseline
    32:    OK
      1.67 ms ± 123 μs,       same as baseline
    64:    OK
      2.96 ms ± 226 μs,       same as baseline
    128:   OK
      5.41 ms ± 467 μs,       same as baseline
    256:   OK
      10.1 ms ± 610 μs,       same as baseline
    512:   OK
      19.3 ms ± 1.2 ms,       same as baseline
    1024:  OK
      38.2 ms ± 2.4 ms,       same as baseline
    2048:  OK
      76.2 ms ± 4.0 ms,       same as baseline
    4096:  OK
      153  ms ± 7.3 ms,       same as baseline
    8192:  OK
      303  ms ±  16 ms,       same as baseline
    16384: OK
      605  ms ±  20 ms,       same as baseline

All 149 tests passed (438.09s)
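
To check the "roughly the same percentage across sizes" observation, the output above can be summarized per benchmark group. The following is a minimal Python sketch and not part of the PR. It assumes the report keeps the layout shown above (a group name indented by two spaces, followed by `size: OK` and a timing line), and the name `summarize` is made up for illustration. Lines reading "same as baseline" are counted as 0%.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches the trailing comparison on a timing line, e.g. "19% more than baseline".
DELTA = re.compile(r"(\d+)% (more|less) than baseline")
# A group header is indented exactly two spaces and has no colon, e.g. "  hashes".
GROUP = re.compile(r"^ {2}(\w+)$")


def summarize(report: str) -> dict[str, tuple[float, int, int]]:
    """Return {group: (mean % change, min %, max %)} for a report in the layout above."""
    deltas = defaultdict(list)
    group = None
    for line in report.splitlines():
        header = GROUP.match(line)
        if header:
            group = header.group(1)
            continue
        if group is None or "baseline" not in line:
            continue
        m = DELTA.search(line)
        if m:
            sign = 1 if m.group(2) == "more" else -1
            deltas[group].append(sign * int(m.group(1)))
        else:
            # "same as baseline": no reported difference, treated as 0%.
            deltas[group].append(0)
    return {g: (mean(v), min(v), max(v)) for g, v in deltas.items()}


# A short excerpt of the output above, used as sample input.
SAMPLE = """All
  loop
    2:     OK
      89.1 μs ± 7.9 μs,       same as baseline
    4:     OK
      131  μs ± 8.3 μs,       same as baseline
  hashes
    2:     OK
      127  μs ± 4.1 μs, 19% more than baseline
    4:     OK
      205  μs ±  17 μs, 19% more than baseline
    8:     OK
      358  μs ±  35 μs, 23% more than baseline
"""

if __name__ == "__main__":
    for name, (avg, lo, hi) in summarize(SAMPLE).items():
        print(f"{name:10s} mean {avg:5.1f}%  range {lo}%..{hi}%")
```

Feeding the full report instead of `SAMPLE` gives one line per group. A narrow range within a group would support the "constant relative overhead" reading, while a wide range (as `contractCreation` appears to show) would suggest something size-dependent.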

@blishko
Collaborator

blishko commented Sep 24, 2025

> Performance seems to be either unchanged or, in some cases, consistently worse (roughly the same percentage across sizes). I wonder if FFI calls have a higher cost on the new version; my first intuition is that the slower tests are the ones that compute keccak hashes.

This is quite unfortunate.
Should we try to identify the regression and see if GHC can do something about it?
Not sure we want to update under these circumstances.

@elopez
Collaborator Author

elopez commented Sep 24, 2025

Yeah, I think it'd be good to try and figure out what the extra time is spent on.
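
One rough way to decide where to look first is to turn the reported percentages back into absolute time. The Python sketch below is an illustration only: it assumes "N% more than baseline" means the current time is (1 + N/100) times the baseline time, and the helper name `baseline_and_extra` is hypothetical. Because the reported percentages are rounded, the results are approximate.

```python
def baseline_and_extra(current: float, pct_more: float) -> tuple[float, float]:
    """Given a current time and 'pct_more % more than baseline', return (baseline, extra).

    Assumes the percentage is relative to the baseline time.
    """
    baseline = current / (1 + pct_more / 100)
    return baseline, current - baseline


# (group, size, current time in seconds, reported % more than baseline),
# copied from the largest run of a few of the slower groups in the benchmark output.
LARGEST_RUNS = [
    ("hashes", 16384, 0.637, 25),
    ("hashmem", 16384, 1.025, 27),
    ("mapStorage", 16384, 1.321, 29),
    ("arrayCreationMem", 512, 19.337, 69),
]

if __name__ == "__main__":
    for group, size, current, pct in LARGEST_RUNS:
        base, extra = baseline_and_extra(current, pct)
        print(f"{group}/{size}: baseline ~{base:.3f} s, extra ~{extra:.3f} s")
```

Ranking groups by extra seconds rather than by percentage may help pick which benchmark to profile first.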

@elopez elopez changed the title from "flake: update inputs, fix build with GHC 9.8" to "flake: update inputs, fix build with GHC 9.10" Sep 24, 2025
@gustavo-grieco
Collaborator

I would also be in favor of waiting until we have a clear picture of what is causing this regression.
