A Benchmark for Evaluating Efficiency of LLM-Generated Hardware Code.
Pluto is an LLM evaluation benchmark dataset for PPA-aware (Power, Performance, Area) hardware design optimization using Verilog. It consists of a curated set of digital design problems, each paired with three implementations optimized for area, delay, and power, respectively.
The dataset is published and maintained at:
🤗 Pluto@HF
You can download the dataset directly from Hugging Face using the datasets library:
from datasets import load_dataset
For the dataset synthesized with Yosys and the Sky130 PDK:
data = load_dataset("scale-lab/Pluto", name="yosys-sky130", split="eval")
For the dataset synthesized with Yosys and the TSMC65 PDK:
data = load_dataset("scale-lab/Pluto", name="yosys-tsmc65", split="eval")
For the dataset synthesized with Cadence Genus and the Sky130 PDK:
data = load_dataset("scale-lab/Pluto", name="genus-sky130", split="eval")
For the dataset synthesized with Cadence Genus and the TSMC65 PDK:
data = load_dataset("scale-lab/Pluto", name="genus-tsmc65", split="eval")
Prerequisites:
You must have a synthesis tool installed. The flow currently supports Yosys & OpenSTA or Cadence Genus.
Then, install the Python dependencies using:
pip install -r requirements.txt
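With the dependencies installed, the four dataset configurations listed earlier can also be selected programmatically. The following is a minimal sketch, assuming the Hugging Face datasets package is available in the environment; the helper names config_name and load_pluto are illustrative and are not part of the Pluto repository.

```python
from itertools import product

# Tool and PDK names taken from the four configurations listed in this README.
TOOLS = ("yosys", "genus")
PDKS = ("sky130", "tsmc65")


def config_name(tool: str, pdk: str) -> str:
    """Return the dataset configuration name for a synthesis tool and PDK."""
    if tool not in TOOLS or pdk not in PDKS:
        raise ValueError(f"unsupported combination: {tool!r}, {pdk!r}")
    return f"{tool}-{pdk}"


def load_pluto(tool: str, pdk: str):
    """Load the eval split of one configuration (hypothetical helper; needs network)."""
    # Imported lazily so the name helper works without the package installed.
    from datasets import load_dataset
    return load_dataset("scale-lab/Pluto", name=config_name(tool, pdk), split="eval")


# All four configuration names, in tool-major order.
ALL_CONFIGS = [config_name(t, p) for t, p in product(TOOLS, PDKS)]
```

Calling load_pluto("genus", "tsmc65") would then be equivalent to the last load_dataset call shown earlier.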
To evaluate a model on the benchmark, run:
python core/eval/evaluator.py --model <model_name> --num_samples <num_samples> --batch_size <batch_size> --p <problem_formulation>
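When several evaluations need to be launched, the command above can be assembled in a small wrapper. This is only a sketch: the flag names are the ones shown in the command, while the wrapper functions are hypothetical and the accepted values for each flag are not documented here, so any values passed in are placeholders.

```python
import shlex
import subprocess


def evaluator_command(model, num_samples, batch_size, formulation):
    """Build the argument list for core/eval/evaluator.py."""
    return [
        "python", "core/eval/evaluator.py",
        "--model", model,
        "--num_samples", str(num_samples),
        "--batch_size", str(batch_size),
        "--p", formulation,
    ]


def run_evaluation(model, num_samples, batch_size, formulation):
    """Run the evaluator from the repository root; raises on a non-zero exit."""
    cmd = evaluator_command(model, num_samples, batch_size, formulation)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a model name contains spaces or special characters.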
The synthesis script reports post-synthesis metrics using either Cadence Genus or Yosys.
To report PPA for a specific design, run:
python core/synth/synth.py --design <path-to-rtl-code> --tool <yosys/genus> --liberty <path-to-liberty-file>
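To report PPA for every design in a directory, the same command can be generated once per RTL file. The sketch below assumes the designs are Verilog files with a .v extension stored in a single directory; the function synth_commands and any directory or liberty paths passed to it are hypothetical.

```python
from pathlib import Path

# Tool names accepted by the --tool flag, as shown in the command above.
SUPPORTED_TOOLS = ("yosys", "genus")


def synth_commands(rtl_dir, tool, liberty):
    """Return one synth.py argument list per .v file in rtl_dir, sorted by name."""
    if tool not in SUPPORTED_TOOLS:
        raise ValueError(f"tool must be one of {SUPPORTED_TOOLS}, got {tool!r}")
    return [
        [
            "python", "core/synth/synth.py",
            "--design", str(design),
            "--tool", tool,
            "--liberty", str(liberty),
        ]
        for design in sorted(Path(rtl_dir).glob("*.v"))
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the repository root.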
@INPROCEEDINGS{abdelatty2025pluto,
author={M. {Abdelatty} and M. {Nouh} and J. {Rosenstein} and S. {Reda}},
booktitle={Preprint},
title={Pluto: A Benchmark for Evaluating Efficiency of LLM-generated Hardware Code},
year={2026},
volume={},
number={},
}
BSD 3-Clause License. See the LICENSE file.
