Verilator simulations can be made an order of magnitude faster #2000

@mayyxeng

Description

Background Work

Feature Description

For simple tests where we only need to run an assembly program, we can make Verilator simulations more than an order of magnitude faster by removing some functionality from the test harness.

Motivating Example

On an AMD EPYC 9554 (3.75 GHz) processor, a single-threaded Verilator simulation of a single-core RocketChip runs at about 10 kHz. By stripping down the harness, we could make it run at about 270 kHz (still single-threaded, on the same EPYC), i.e., 27x faster.

Here is how I achieved a 27x speedup:

  1. Remove TL monitors (WithoutTLMonitors), as stated in the documentation.
  2. Directly load the program as a hex file into the simulated RAM ($readmemh), see here.
  3. Exclusively handle +verbose in Verilog, using a simplified all-Verilog harness.

The last step has the most significant effect. It seems that Verilator really struggles with how verbose printing is handled through $c(...) PLI calls: even when the simulation is non-verbose, there is a huge performance impact.
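
To illustrate step 2 above, here is a minimal sketch of how a flat binary image could be turned into a text file that the Verilog `$readmemh` system task can load. It is not the script used for the measurements in this issue. The function name `bin_to_readmemh`, the 8-byte memory word, and the little-endian byte order are all assumptions; they have to match the width and layout of the simulated RAM in your harness.

```python
def bin_to_readmemh(image: bytes, word_bytes: int = 8) -> str:
    """Render a flat binary image as $readmemh text, one memory word per line.

    Assumptions (not taken from the issue): the RAM is an array of
    `word_bytes`-wide words and the image is little-endian.
    """
    if word_bytes <= 0:
        raise ValueError("word_bytes must be positive")

    # Zero-pad so the image is a whole number of memory words.
    padding = (-len(image)) % word_bytes
    image = image + b"\x00" * padding

    lines = []
    for offset in range(0, len(image), word_bytes):
        # Byte 0 of each chunk becomes the least significant byte of the word.
        word = int.from_bytes(image[offset:offset + word_bytes], "little")
        # Fixed-width hex, two digits per byte, no 0x prefix.
        lines.append(f"{word:0{word_bytes * 2}x}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Two 32-bit RISC-V instructions (nop, then a jump-to-self) as raw bytes.
    program = bytes.fromhex("13000000" "6f000000")
    print(bin_to_readmemh(program), end="")
```

On the Verilog side, the resulting file would then be loaded with something like `$readmemh("program.hex", mem);` in an `initial` block, where `program.hex` and `mem` are placeholder names for the generated file and the RAM array.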
