A repository aiming to create a benchmarking utility for any model on HuggingFace's Hub that supports Optimum's inference & training, optimizations & quantizations, on different backends & hardware (OnnxRuntime, Intel Neural Compressor, OpenVINO, Habana Gaudi Processor (HPU), etc.).
Experiment management and tracking are handled by hydra from the command line, with minimal configuration changes and maximum flexibility (inspired by tune).
- Many users would want to know how their chosen model performs (latency & throughput) before deploying it to production.
- Many hardware vendors would want to know how their hardware performs on different models and how it compares to others.
- Optimum offers many optimizations that can be applied to models to improve their performance, but it's hard to know which ones to use without deep knowledge of your hardware. It's also hard to estimate how much these optimizations will improve performance before training a model, or downloading it from the Hub and optimizing it.
- Benchmarks depend heavily on many factors, like the machine, hardware, OS, library releases, etc., but most of this information is not reported alongside the results, which makes most of the benchmarks available today not very useful for decision making.
- [...]
General:
- Latency and throughput tracking (default behavior)
- Peak memory tracking (`benchmark.memory=true`)
- Symbolic Profiling (`benchmark.profile=true`)
- Input shapes control (e.g. `benchmark.input_shapes.batch_size=8`)
- Random weights initialization (`backend.no_weights=true`; support depends on backend)
Inference:
- Pytorch backend for CPU
- Pytorch backend for CUDA
- Pytorch backend for Habana Gaudi Processor (HPU)
- OnnxRuntime backend for CPUExecutionProvider
- OnnxRuntime backend for CUDAExecutionProvider
- Intel Neural Compressor backend for CPU
- OpenVINO backend for CPU
Optimizations:
- Pytorch's Automatic Mixed Precision
- Optimum's BetterTransformer
- Optimum's Optimization and AutoOptimization
- Optimum's Quantization and AutoQuantization
- Optimum's Calibration for Static Quantization
- BitsAndBytes' quantization
Start by installing the required dependencies for your hardware and the backends you want to use. For example, if you're going to run GPU benchmarks, you can install the requirements with:
```sh
python -m pip install -r gpu_requirements.txt
```
Then install the package:
```sh
python -m pip install -e .
```
You can now run a benchmark using the command line by specifying the configuration directory and the configuration name.
Both arguments are mandatory. The `--config-dir` is the directory where the configuration files are stored and the `--config-name` is the name of the configuration file without the `.yaml` extension.
```sh
optimum-benchmark --config-dir examples/ --config-name pytorch
```
This will run the benchmark using the configuration in `examples/pytorch.yaml` and store the results in `runs/pytorch`.

The result files are `inference_results.csv`, the program's logs `main.log` and the configuration that's been used `hydra_config.yaml`.

The directory for storing these results can be changed using `hydra.run.dir` (and/or `hydra.sweep.dir` in case of a multirun) in the command line or in the config file (see `base_config.yaml`).
It's easy to override the default behavior of a benchmark from the command line.
```sh
optimum-benchmark --config-dir examples/ --config-name pytorch model=gpt2 device=cuda:1
```
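The same override mechanism works for the benchmark options listed in the features section above; for example, a sketch enabling peak memory tracking and setting the batch size:

```sh
# benchmark.memory and benchmark.input_shapes.batch_size are the keys shown in the features list
optimum-benchmark --config-dir examples/ --config-name pytorch benchmark.memory=true benchmark.input_shapes.batch_size=8
```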
You can easily run configuration sweeps using the `-m` or `--multirun` option. By default, configurations will be executed serially, but other kinds of executions are supported with hydra's launcher plugins: `hydra/launcher=submitit`, `hydra/launcher=ray`, etc.
```sh
optimum-benchmark --config-dir examples --config-name pytorch -m device=cpu,cuda
```
Also, for integer parameters like `batch_size`, one can specify a range of values to sweep over:
```sh
optimum-benchmark --config-dir examples --config-name pytorch -m device=cpu,cuda benchmark.input_shapes.batch_size='range(1,10,step=2)'
```
To aggregate the results of a benchmark (run(s) or sweep(s)), you can use the `optimum-report` command.
```sh
optimum-report --experiments {experiments_folder_1} {experiments_folder_2} --baseline {baseline_folder} --report-name {report_name}
```
This will create a report in the `reports` folder with the name `{report_name}`. The report will contain the results of the experiments in `{experiments_folder_1}` and `{experiments_folder_2}` compared to the results of the baseline in `{baseline_folder}` in the form of a `.csv` file, an `.svg` rich table and (a) `.png` plot(s).
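For instance, a sketch using hypothetical folder names (substitute the run directories produced by your own experiments and sweeps):

```sh
# "runs/pytorch", "runs/onnxruntime" and "runs/baseline" are placeholder paths
optimum-report --experiments runs/pytorch runs/onnxruntime --baseline runs/baseline --report-name pytorch_vs_onnxruntime
```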
You can create custom configuration files following the examples here.
The easiest way to do so is by using `hydra`'s composition with a base configuration `examples/base_config.yaml`.
To create a configuration that uses a `wav2vec2` model and `onnxruntime` backend, it's as easy as:
```yaml
defaults:
  - base_config
  - _self_
  - override backend: onnxruntime

experiment_name: onnxruntime_wav2vec2
model: bookbot/distil-wav2vec2-adult-child-cls-37m
device: cpu
```
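Assuming the file above is saved as `onnxruntime_wav2vec2.yaml` next to `base_config.yaml` in the `examples/` directory, it can be run the same way as the earlier examples:

```sh
# the file name (without .yaml) is passed as the config name
optimum-benchmark --config-dir examples/ --config-name onnxruntime_wav2vec2
```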
Some examples are provided in the `tests/configs` folder for different backends and models.
- Add support for any kind of input (text, audio, image, etc.)
- Add support for onnxruntime backend
- Add support for optimum quantization
- Add support for optimum graph optimizations
- Add support for static quantization + calibration.
- Add support for profiling nodes/kernels execution time.
- Add experiments aggregator to report on data from different runs/sweeps.
- Add support for sweepers for latency optimization (optuna, nevergrad, etc.)
- Add support for more metrics (memory usage, node execution time, etc.)
- Migrate configuration management to be handled solely by config store.
- Add Dana client to send results to the dashboard (WIP)
- Make a consistent reporting utility.
- Add Pydantic for schema validation.
- Add support for sparse inputs.
- ...