
[Bug Report] ONNX export failed on adaptive_avg_pool2d at tensorrt micro bench. #352

Open
@LeiWang1999

Description

I am currently working in the superbench/superbench:v0.4.0-cuda11.1.1 Docker workspace to run benchmarks.

To benchmark different models with TensorRT, I customized superbenchmark/examples/benchmarks/tensorrt_inference_performance.py as below:

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

"""Micro benchmark example for TensorRT inference performance.

Commands to run:
    python3 examples/benchmarks/tensorrt_inference_performance.py
"""
import sys
from superbench.benchmarks import BenchmarkRegistry, Platform
from superbench.common.utils import logger

if __name__ == '__main__':
    # Command-line arguments: batch size, model name, precision (e.g. fp32/fp16).
    batch = int(sys.argv[1])
    model = sys.argv[2]
    precision = sys.argv[3]
    parameters = '--batch_size {0} --pytorch_models {1} --precision {2} --seq_length 8 --iterations 105'.format(
        batch, model, precision
    )

    context = BenchmarkRegistry.create_benchmark_context('tensorrt-inference', platform=Platform.CUDA, parameters=parameters)
    benchmark = BenchmarkRegistry.launch_benchmark(context)
    if benchmark:
        logger.info(
            'benchmark: {}, return code: {}, result: {}'.format(
                benchmark.name, benchmark.return_code, benchmark.result
            )
        )

Execution:

nvprof --log-file benches/TensorRT/vgg11/fp32_batch_1_prof.txt /opt/conda/bin/python /opt/superbench/examples/benchmarks/tensorrt_inference_performance.py 1 vgg11 fp32 | tee benches/TensorRT/vgg11/fp32_batch_1_time.txt

Log:

root@616b67a69ab7:/opt/superbench# nvprof --log-file benches/TensorRT/vgg11/fp32_batch_1_prof.txt /opt/conda/bin/python /opt/superbench/examples/benchmarks/tensorrt_inference_performance.py 1 vgg11 fp32 | tee benches/TensorRT/vgg11/fp32_batch_1_time.txt
/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py:256: UserWarning: `add_node_names' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `add_node_names` argument will be ignored.
warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py:256: UserWarning: `do_constant_folding' can be set to True only when 'operator_export_type' is `ONNX`. Since 'operator_export_type' is not set to 'ONNX', `do_constant_folding` argument will be ignored.
warnings.warn("`{}' can be set to True only when 'operator_export_type' is "
/opt/conda/lib/python3.8/site-packages/torch/onnx/symbolic_helper.py:182: UserWarning: ONNX export failed on adaptive_avg_pool2d because input size not accessible not supported
warnings.warn("ONNX export failed on " + op + " because " + msg + " not supported")
[2022-05-06 12:33:25,995 616b67a69ab7:18330][micro_base.py:167][INFO] Execute command - round: 0, benchmark: tensorrt-inference, command: /opt/tensorrt/bin/trtexec --onnx=/root/.cache/torch/hub/onnx/vgg11.onnx --explicitBatch --optShapes=input:1x3x224x224 --workspace=8192 --iterations=105 --percentile=99.
[2022-05-06 12:33:40,844 616b67a69ab7:18330][micro_base.py:176][ERROR] Microbenchmark execution failed - round: 0, benchmark: tensorrt-inference, error message: &&&& RUNNING TensorRT.trtexec # /opt/tensorrt/bin/trtexec --onnx=/root/.cache/torch/hub/onnx/vgg11.onnx --explicitBatch --optShapes=input:1x3x224x224 --workspace=8192 --iterations=105 --percentile=99
[05/06/2022-12:33:26] [I] === Model Options ===
[05/06/2022-12:33:26] [I] Format: ONNX
[05/06/2022-12:33:26] [I] Model: /root/.cache/torch/hub/onnx/vgg11.onnx
[05/06/2022-12:33:26] [I] Output:
[05/06/2022-12:33:26] [I] === Build Options ===
[05/06/2022-12:33:26] [I] Max batch: explicit
[05/06/2022-12:33:26] [I] Workspace: 8192 MiB
[05/06/2022-12:33:26] [I] minTiming: 1
[05/06/2022-12:33:26] [I] avgTiming: 8
[05/06/2022-12:33:26] [I] Precision: FP32
[05/06/2022-12:33:26] [I] Calibration:
[05/06/2022-12:33:26] [I] Refit: Disabled
[05/06/2022-12:33:26] [I] Safe mode: Disabled
[05/06/2022-12:33:26] [I] Save engine:
[05/06/2022-12:33:26] [I] Load engine:
[05/06/2022-12:33:26] [I] Builder Cache: Enabled
[05/06/2022-12:33:26] [I] NVTX verbosity: 0
[05/06/2022-12:33:26] [I] Tactic sources: Using default tactic sources
[05/06/2022-12:33:26] [I] Input(s)s format: fp32:CHW
[05/06/2022-12:33:26] [I] Output(s)s format: fp32:CHW
[05/06/2022-12:33:26] [I] Input build shape: input=1x3x224x224+1x3x224x224+1x3x224x224
[05/06/2022-12:33:26] [I] Input calibration shapes: model
[05/06/2022-12:33:26] [I] === System Options ===
[05/06/2022-12:33:26] [I] Device: 0
[05/06/2022-12:33:26] [I] DLACore:
[05/06/2022-12:33:26] [I] Plugins:
[05/06/2022-12:33:26] [I] === Inference Options ===
[05/06/2022-12:33:26] [I] Batch: Explicit
[05/06/2022-12:33:26] [I] Input inference shape: input=1x3x224x224
[05/06/2022-12:33:26] [I] Iterations: 105
[05/06/2022-12:33:26] [I] Duration: 3s (+ 200ms warm up)
[05/06/2022-12:33:26] [I] Sleep time: 0ms
[05/06/2022-12:33:26] [I] Streams: 1
[05/06/2022-12:33:26] [I] ExposeDMA: Disabled
[05/06/2022-12:33:26] [I] Data transfers: Enabled
[05/06/2022-12:33:26] [I] Spin-wait: Disabled
[05/06/2022-12:33:26] [I] Multithreading: Disabled
[05/06/2022-12:33:26] [I] CUDA Graph: Disabled
[05/06/2022-12:33:26] [I] Separate profiling: Disabled
[05/06/2022-12:33:26] [I] Skip inference: Disabled
[05/06/2022-12:33:26] [I] Inputs:
[05/06/2022-12:33:26] [I] === Reporting Options ===
[05/06/2022-12:33:26] [I] Verbose: Disabled
[05/06/2022-12:33:26] [I] Averages: 10 inferences
[05/06/2022-12:33:26] [I] Percentile: 99
[05/06/2022-12:33:26] [I] Dump refittable layers:Disabled
[05/06/2022-12:33:26] [I] Dump output: Disabled
[05/06/2022-12:33:26] [I] Profile: Disabled
[05/06/2022-12:33:26] [I] Export timing to JSON file:
[05/06/2022-12:33:26] [I] Export output to JSON file:
[05/06/2022-12:33:26] [I] Export profile to JSON file:
[05/06/2022-12:33:26] [I]
[05/06/2022-12:33:26] [I] === Device Information ===
[05/06/2022-12:33:26] [I] Selected Device: NVIDIA Tesla V100-PCIE-16GB
[05/06/2022-12:33:26] [I] Compute Capability: 7.0
[05/06/2022-12:33:26] [I] SMs: 80
[05/06/2022-12:33:26] [I] Compute Clock Rate: 1.38 GHz
[05/06/2022-12:33:26] [I] Device Global Memory: 16160 MiB
[05/06/2022-12:33:26] [I] Shared Memory per SM: 96 KiB
[05/06/2022-12:33:26] [I] Memory Bus Width: 4096 bits (ECC enabled)
[05/06/2022-12:33:26] [I] Memory Clock Rate: 0.877 GHz
[05/06/2022-12:33:26] [I]
----------------------------------------------------------------
Input filename: /root/.cache/torch/hub/onnx/vgg11.onnx
ONNX IR version: 0.0.6
Opset version: 10
Producer name: pytorch
Producer version: 1.8
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[05/06/2022-12:33:40] [W] [TRT] /workspace/TensorRT/parsers/onnx/onnx2trt_utils.cpp:218: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/06/2022-12:33:40] [I] [TRT] /workspace/TensorRT/parsers/onnx/ModelImporter.cpp:139: No importer registered for op: adaptive_avg_pool2d. Attempting to import as plugin.
[05/06/2022-12:33:40] [I] [TRT] /workspace/TensorRT/parsers/onnx/builtin_op_importers.cpp:3716: Searching for plugin: adaptive_avg_pool2d, plugin_version: 1, plugin_namespace:
[05/06/2022-12:33:40] [E] [TRT] INVALID_ARGUMENT: getPluginCreator could not find plugin adaptive_avg_pool2d version 1
While parsing node number 22 [adaptive_avg_pool2d]:
ERROR: /workspace/TensorRT/parsers/onnx/builtin_op_importers.cpp:3718 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[05/06/2022-12:33:40] [E] Failed to parse onnx file
[05/06/2022-12:33:40] [E] Parsing model failed
[05/06/2022-12:33:40] [E] Engine creation failed
[05/06/2022-12:33:40] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # /opt/tensorrt/bin/trtexec --onnx=/root/.cache/torch/hub/onnx/vgg11.onnx --explicitBatch --optShapes=input:1x3x224x224 --workspace=8192 --iterations=105 --percentile=99
.
[2022-05-06 12:33:40,844 616b67a69ab7:18330][tensorrt_inference_performance.py:23][INFO] benchmark: tensorrt-inference, return code: 32, result: {'return_code': [32]}
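
To confirm which ops actually ended up in the exported graph, the file can be inspected with the onnx package (a minimal sketch on my side; the path matches the one passed to trtexec):

# Sketch: list the distinct op types in the exported graph; an
# 'adaptive_avg_pool2d' entry would confirm the ATen fallback node.
import onnx

onnx_model = onnx.load('/root/.cache/torch/hub/onnx/vgg11.onnx')
print(sorted({node.op_type for node in onnx_model.graph.node}))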

It seems that the TensorRT ONNX importer does not support the adaptive_avg_pool2d op?
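
One possible workaround (an untested sketch on my side): for a fixed 224x224 input, the feature map reaching torchvision VGG's AdaptiveAvgPool2d((7, 7)) is already 7x7, so the layer is effectively a no-op and could be replaced before export to avoid the unsupported op entirely. The output filename below is just an example.

# Sketch of a workaround: replace the adaptive pool with an identity
# before exporting, since it does not change a 7x7 feature map anyway.
import torch
import torchvision

model = torchvision.models.vgg11(pretrained=False).eval()
model.avgpool = torch.nn.Identity()  # no-op for 224x224 inputs: 512x7x7 stays 512x7x7

dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model, dummy_input, 'vgg11_no_adaptive_pool.onnx',
    input_names=['input'], output_names=['output'],
    opset_version=11,
)

The resulting file could then be fed to trtexec directly to check whether parsing succeeds.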

Please cc.
