
TensorRT fails with calibration #660

@esp-vt

Description

cm run script --tags=run-mlperf,inference,_r4.1-dev,_all-scenarios \
              --model=retinanet \
              --implementation=nvidia \
              --framework=tensorrt \
              --category=datacenter \
              --server_target_qps=50 \
              --execution_mode=valid \
              --device=cuda \
              --repro \
              --quiet

I ran into an issue during calibration. The same error occurs with a different inference version (v5.0).
How can I solve this issue?

make calibrate RUN_ARGS=' --benchmarks=retinanet --scenarios=server  --test_mode=PerformanceOnly  --server_target_qps=50 --gpu_copy_streams=2 --gpu_inference_streams=2 --gpu_batch_size=8 --use_deque_limit --no_audit_verify  '
[2025-06-12 15:13:32,657 main.py:229 INFO] Detected system ID: KnownSystem.Nvidia_5e0adc8add39
[2025-06-12 15:13:32,814 calibrate.py:45 INFO] Generating calibration cache for Benchmark "retinanet"
[06/12/2025-15:13:34] [TRT] [I] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 364, GPU 924 (MiB)
[06/12/2025-15:13:41] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +4417, GPU +1160, now: CPU 4917, GPU 2084 (MiB)
[2025-06-12 15:13:42,059 retinanet_graphsurgeon.py:159 INFO] Renaming layers...
[2025-06-12 15:13:42,060 retinanet_graphsurgeon.py:237 INFO] Renamed 225 layers.
[2025-06-12 15:13:42,060 retinanet_graphsurgeon.py:243 INFO] Renaming tensors to match layer names
[2025-06-12 15:13:42,063 retinanet_graphsurgeon.py:264 INFO] Adding NMS layer nmsopt to the graph...
[06/12/2025-15:13:42] [TRT] [I] No checker registered for op: NMS_OPT_TRT. Attempting to check as plugin.
[06/12/2025-15:13:42] [TRT] [I] No importer registered for op: NMS_OPT_TRT. Attempting to import as plugin.
[06/12/2025-15:13:42] [TRT] [I] Searching for plugin: NMS_OPT_TRT, plugin_version: 2, plugin_namespace: 
[06/12/2025-15:13:42] [TRT] [W] builtin_op_importers.cpp:5677: Attribute permuteBeforeReshape not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[06/12/2025-15:13:42] [TRT] [W] builtin_op_importers.cpp:5677: Attribute concatInputs not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[06/12/2025-15:13:42] [TRT] [I] Successfully created plugin: NMS_OPT_TRT
/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /home/cmuser/CM/repos/local/cache/ac4a8632ea8a437d/pytorch/aten/src/ATen/native/TensorShape.cpp:3516.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Traceback (most recent call last):
  File "/root/CM/repos/local/cache/db6b5d49a322435e/repo/closed/NVIDIA/code/actionhandler/base.py", line 78, in run
    success = self.handle()
  File "/root/CM/repos/local/cache/db6b5d49a322435e/repo/closed/NVIDIA/code/actionhandler/calibrate.py", line 62, in handle
    b.calibrate()
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/nvidia/builder.py", line 594, in calibrate
    self.mitten_builder.run(self.legacy_scratch, None)
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
    raise exc_info[1]
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
    retval = obj(*args, **kwargs)
  File "/root/CM/repos/local/cache/db6b5d49a322435e/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 379, in run
    network = self.create_network(self.builder, subnetwork_name=subnet_name)
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
    raise exc_info[1]
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
    retval = obj(*args, **kwargs)
  File "/root/CM/repos/local/cache/db6b5d49a322435e/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 239, in create_network
    self.apply_subnetwork_io_types(network, subnetwork_name)
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
    raise exc_info[1]
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
    retval = obj(*args, **kwargs)
  File "/root/CM/repos/local/cache/db6b5d49a322435e/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 289, in apply_subnetwork_io_types
    self._set_tensor_format(tensor_in, use_dla=self.dla_enabled)
  File "/root/CM/repos/local/cache/db6b5d49a322435e/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 356, in _set_tensor_format
    tensor.allowed_formats = 1 << int(tensor_format)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
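For reference, the failure mode in the last frame is straightforward to reproduce in isolation: `int(None)` raises exactly this `TypeError`, which suggests `tensor_format` is `None` by the time `_set_tensor_format` runs, i.e. whatever lookup maps the subnetwork's I/O tensor to a TensorRT format found no entry. A minimal sketch of that failure path, where the helper name and the mapping are hypothetical illustrations, not the actual NVIDIA code:

```python
# Hypothetical sketch of the failing pattern in _set_tensor_format.
# The real code assigns: tensor.allowed_formats = 1 << int(tensor_format)
# If tensor_format is None (no format registered for this tensor name),
# int(None) raises the TypeError shown in the traceback.

def allowed_formats_bitmask(tensor_name, format_map):
    """Compute the allowed-formats bitmask for a named tensor."""
    tensor_format = format_map.get(tensor_name)  # None when the name is missing
    if tensor_format is None:
        # Guard the unpatched code lacks: fail with an actionable message
        raise KeyError(f"no TensorRT format registered for tensor {tensor_name!r}")
    return 1 << int(tensor_format)

# Example mapping (value 0 stands in for e.g. trt.TensorFormat.LINEAR)
formats = {"input": 0}
print(allowed_formats_bitmask("input", formats))  # prints 1

try:
    # Simulates the crash: int(None) in the unguarded assignment
    1 << int(formats.get("unknown_tensor"))
except TypeError as exc:
    print(type(exc).__name__)  # prints TypeError
```

Under that assumption, the tensor name produced by the graph surgeon's renaming pass may not match what `apply_subnetwork_io_types` expects, so checking the renamed tensor names against the subnetwork I/O table would be a reasonable first debugging step.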
