You can use the run subtool to compare a model between different frameworks.
In the simplest case, you can supply a model, and one or more framework flags.
By default, it will generate synthetic input data, run inference with each of
the specified frameworks, and then compare their outputs.
In this example, we'll outline various common use-cases for the run subtool:
- Comparing TensorRT And ONNX-Runtime Outputs
- Comparing TensorRT Precisions
- Changing Tolerances
- Changing Comparison Metrics
- Comparing Per-Layer Outputs Between ONNX-Runtime And TensorRT
To run the model in Polygraphy with both frameworks and perform an output comparison:
polygraphy run dynamic_identity.onnx --trt --onnxrt

The dynamic_identity.onnx model has dynamic input shapes. By default,
Polygraphy will override any dynamic input dimensions in the model to
constants.DEFAULT_SHAPE_VALUE (defined as 1) and warn you:
[W] Input tensor: X (dtype=DataType.FLOAT, shape=(1, 2, -1, -1)) | No shapes provided; Will use shape: [1, 2, 1, 1] for min/opt/max in profile.
[W] This will cause the tensor to have a static shape. If this is incorrect, please set the range of shapes for this input tensor.
In order to suppress this message and explicitly provide input shapes to
Polygraphy, use the --input-shapes option:
polygraphy run dynamic_identity.onnx --trt --onnxrt \
--input-shapes X:[1,2,4,4]
To build a TensorRT engine with reduced precision layers for comparison against
ONNX-Runtime, use one of the supported precision flags (--tf32, --fp16, or
--int8). For example:
polygraphy run dynamic_identity.onnx --trt --fp16 --onnxrt \
--input-shapes X:[1,2,4,4]
⚠️ Getting acceptable accuracy with INT8 precision typically requires an additional calibration step: see the developer guide and instructions on how to do calibration with Polygraphy on the command line.
The default tolerances used by run are usually appropriate for FP32 precision
but may not be appropriate for reduced precisions. In order to relax tolerances,
you can use the --atol and --rtol options to set absolute and relative
tolerance respectively.
You can use the --check-error-stat option to change the metric used for
comparison. By default, Polygraphy uses an "elementwise" metric
(--check-error-stat elemwise).
Other possible metrics for --check-error-stat are mean, median, and max, which
compare the mean, median, and maximum absolute/relative error across the tensor, respectively.
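As a rough sketch of what these aggregate metrics measure (plain Python for illustration only; Polygraphy's actual implementation operates on whole tensors), each one reduces the per-element error to a single statistic that is then checked against the tolerance. The output values here are hypothetical, not taken from a real run:

```python
import statistics

# Hypothetical outputs from two frameworks (illustration only).
out0 = [1.00, 2.00, 3.00, 4.00]
out1 = [1.01, 2.00, 2.90, 4.00]

# Elementwise absolute error between the two outputs.
absdiff = [abs(a - b) for a, b in zip(out0, out1)]

# Statistics roughly corresponding to --check-error-stat mean / median / max.
mean_err = statistics.mean(absdiff)    # average error across the tensor
median_err = statistics.median(absdiff)  # middle value of the sorted errors
max_err = max(absdiff)                 # single worst element
```

A comparison using `max` is the strictest of the three, since a single outlier element is enough to exceed the tolerance, while `mean` and `median` smooth over isolated outliers.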
To better understand this, suppose we are
comparing two outputs out0 and out1. Polygraphy takes
the elementwise absolute and relative difference of these tensors:
absdiff = abs(out0 - out1)
reldiff = absdiff / abs(out1)
Then, for each index i in the output, Polygraphy checks whether
absdiff[i] > atol and reldiff[i] > rtol. If any index satisfies this,
then the comparison will fail. This is less stringent than comparing the maximum
absolute and relative error across the entire tensor (--check-error-stat max) since if
different indices i and j satisfy absdiff[i] > atol and reldiff[j] > rtol,
then the max comparison will fail but the elemwise comparison may
pass.
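To make the distinction concrete, here is a small pure-Python sketch (an illustration of the logic described above, not Polygraphy's implementation) where the elemwise check passes but the max check fails, because the largest absolute and relative errors occur at different indices:

```python
def elemwise_check(out0, out1, atol, rtol):
    # Fails only if some single index exceeds BOTH tolerances.
    for a, b in zip(out0, out1):
        absdiff = abs(a - b)
        reldiff = absdiff / abs(b)
        if absdiff > atol and reldiff > rtol:
            return False  # mismatch at this index
    return True

def max_check(out0, out1, atol, rtol):
    # Fails if the maximum absolute OR maximum relative error
    # (possibly at different indices) exceeds its tolerance.
    absdiffs = [abs(a - b) for a, b in zip(out0, out1)]
    reldiffs = [d / abs(b) for d, b in zip(absdiffs, out1)]
    return max(absdiffs) <= atol and max(reldiffs) <= rtol

# Index 0: large absolute error but tiny relative error.
# Index 1: large relative error but tiny absolute error.
out0 = [1000.2, 0.0010]
out1 = [1000.0, 0.0008]
atol, rtol = 0.1, 0.1

elemwise_ok = elemwise_check(out0, out1, atol, rtol)  # True
max_ok = max_check(out0, out1, atol, rtol)            # False
```

No single index exceeds both tolerances, so the elemwise comparison passes, while the max comparison fails on both the absolute error at index 0 and the relative error at index 1.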
Putting it all together, the below example runs a median comparison between
TensorRT using FP16 and ONNX-Runtime, using absolute and relative tolerances of 0.001:
polygraphy run dynamic_identity.onnx --trt --fp16 --onnxrt \
--input-shapes X:[1,2,4,4] \
--atol 0.001 --rtol 0.001 --check-error-stat median

You can also specify per-output values for
--atol/--rtol/--check-error-stat. See the help output of the run subtool for more information.
When network outputs do not match, it can be useful to compare per-layer outputs
to see where the error is introduced. To do so, you can use the --trt-outputs
and --onnx-outputs options respectively. These options accept one or more
output names as their arguments. The special value mark all indicates that all
tensors in the model should be compared:
polygraphy run dynamic_identity.onnx --trt --onnxrt \
--trt-outputs mark all \
--onnx-outputs mark all

To find the first mismatched output more easily, you can use the --fail-fast
option which will cause the tool to exit after the first mismatch between
outputs.
Note that use of --trt-outputs mark all can sometimes perturb the generated
engine due to differences in timing, layer fusion choices, and format
constraints, which can hide the failure. In that case, you may have to use a
more sophisticated approach to bisect the failing model and generate a reduced
test case that reproduces the error. See Reducing Failing ONNX
Models for a tutorial on
how to do this with Polygraphy.
- In some cases you may need to do comparisons across multiple Polygraphy runs (for example, when comparing the output of a pre-built TensorRT engine or Polygraphy network script against ONNX-Runtime). See Comparing Across Runs for a tutorial on how to accomplish this.
- For more details on working with dynamic shapes in TensorRT:
  - See Dynamic Shapes in TensorRT for how to specify optimization profiles for use with the engine using the Polygraphy CLI
  - See TensorRT and Dynamic Shapes for details on how to do this with the Polygraphy API
- For details on how to supply real input data, see Comparing with Custom Input Data.
- See Debugging TensorRT Accuracy Issues for a broader tutorial on how to debug accuracy failures using Polygraphy.