MLPerf Inference: Errors across ONNX Runtime, PyTorch, and TensorRT Backends #661

@esp-vt

Description

I'm trying to run the MLPerf Inference benchmark suite (v5.0-dev) for the RetinaNet model with three framework backends (ONNX Runtime, PyTorch, TensorRT), and each one fails with a different error.

Below are the details for each backend, with their respective stack traces and my understanding so far.

I'd really appreciate any help, hints, or confirmation on what might be wrong 🙏

ONNX

Command:

mlcr run-mlperf,inference,_full,_r5.0-dev \
   --model=retinanet \
   --implementation=reference \
   --framework=onnxruntime \
   --category=datacenter \
   --scenario=Offline \
   --execution_mode=valid \
   --device=cuda

Error:

3.13/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:121: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
  warnings.warn(

My Thoughts:
The onnxruntime build that gets imported apparently lacks CUDA support. I installed via pip, but the warning persists. As the system info below shows, both onnxruntime and onnxruntime-gpu 1.22.0 are installed in the same site-packages, and since both wheels unpack into the same onnxruntime module path, the CPU-only build may be shadowing the GPU one.
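
A quick way to confirm which build is actually being imported (a minimal sketch; "retinanet.onnx" is a placeholder path, not the file the benchmark resolves):

import onnxruntime as ort

# A CPU-only wheel prints only ['AzureExecutionProvider', 'CPUExecutionProvider'],
# matching the warning above; a working GPU build also lists 'CUDAExecutionProvider'.
print(ort.get_available_providers())

# Request CUDA explicitly, with CPU as a fallback.
sess = ort.InferenceSession(
    "retinanet.onnx",  # placeholder path
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())  # the providers the session actually enabled

If only the CPU providers show up, uninstalling both wheels and reinstalling only onnxruntime-gpu into the virtualenv the benchmark uses should fix the import.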

PyTorch

Command:

mlcr run-mlperf,inference,_full,_r5.0-dev \
   --model=retinanet \
   --implementation=reference \
   --framework=pytorch \
   --category=datacenter \
   --scenario=Offline \
   --execution_mode=valid \
   --device=cuda

Error:

Traceback (most recent call last):
  File "/home/esp/CM/repos/local/cache/e53f5bcba86f4a9e/inference/vision/classification_and_detection/python/main.py", line 624, in <module>
    main()
    ~~~~^^
  File "/home/esp/CM/repos/local/cache/e53f5bcba86f4a9e/inference/vision/classification_and_detection/python/main.py", line 503, in main
    model = backend.load(args.model, inputs=args.inputs, outputs=args.outputs)
  File "/home/esp/CM/repos/local/cache/e53f5bcba86f4a9e/inference/vision/classification_and_detection/python/backend_pytorch_native.py", line 27, in load
    self.model = torch.load(model_path)
                 ~~~~~~~~~~^^^^^^^^^^^^
  File "/home/esp/CM/repos/local/cache/0c7c8e7dc1564794/mlperf/lib/python3.13/site-packages/torch/serialization.py", line 1495, in load
    raise RuntimeError(
    ...<2 lines>...
    )
RuntimeError: Cannot use ``weights_only=True`` with TorchScript archives passed to ``torch.load``. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.

CM error: Portable CM script failed (name = benchmark-program, return code = 256)
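
The error message itself points at the fix: PyTorch 2.6 flipped the default of weights_only in torch.load from False to True, and the reference backend (backend_pytorch_native.py line 27 in the traceback) calls torch.load(model_path) with no override. A minimal sketch of the two workarounds, assuming the checkpoint is trusted; model_path is a placeholder here:

import torch

model_path = "retinanet_model.pt"  # placeholder; the benchmark resolves the real path

# Option 1: the error says the file is a TorchScript archive, so torch.jit.load
# is the loader intended for it and avoids the weights_only pickle path entirely.
model = torch.jit.load(model_path)

# Option 2: restore the pre-2.6 behaviour. This can execute arbitrary code while
# unpickling, so only use it for checkpoints from a trusted source.
model = torch.load(model_path, weights_only=False)

I haven't checked whether a newer commit of the reference implementation already patches this.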

TensorRT

Command:

mlcr run-mlperf,inference,_full,_r5.0-dev \
   --model=retinanet \
   --implementation=nvidia \
   --framework=tensorrt \
   --category=datacenter \
   --scenario=Offline \
   --execution_mode=valid \
   --device=cuda

Error:

tensorrt 5.0
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/actionhandler/base.py", line 189, in subprocess_target
    return self.action_handler.handle()
  File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/actionhandler/generate_engines.py", line 176, in handle
    total_engine_build_time += self.build_engine(job)
  File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/actionhandler/generate_engines.py", line 167, in build_engine
    builder.build_engines()
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/nvidia/builder.py", line 579, in build_engines
    self.mitten_builder.run(self.legacy_scratch, None)
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
    raise exc_info[1]
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
    retval = obj(*args, **kwargs)
  File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 379, in run
    network = self.create_network(self.builder, subnetwork_name=subnet_name)
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
    raise exc_info[1]
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
    retval = obj(*args, **kwargs)
  File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 239, in create_network
    self.apply_subnetwork_io_types(network, subnetwork_name)
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
    raise exc_info[1]
  File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
    retval = obj(*args, **kwargs)
  File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 289, in apply_subnetwork_io_types
    self._set_tensor_format(tensor_in, use_dla=self.dla_enabled)
  File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 356, in _set_tensor_format
    tensor.allowed_formats = 1 << int(tensor_format)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

My Thoughts:
tensor_format is None here, so int(None) fails at Retinanet.py:356. That most likely means get_tensor_format() (or whatever resolves the format for this subnetwork input) returned None — note that _set_tensor_format is called with use_dla=self.dla_enabled, and the H100 has no DLA, so a DLA misconfiguration or an unsupported tensor type seems plausible.
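
For what it's worth, a guarded version of the failing line (Retinanet.py line 356 in the traceback) would at least turn this into a readable error. This is only a sketch of the guard, not NVIDIA's actual code:

import tensorrt as trt

def set_tensor_format(tensor: "trt.ITensor", tensor_format: "trt.TensorFormat") -> None:
    # allowed_formats is a bitmask over trt.TensorFormat enum values, so
    # tensor_format must be an actual enum member before it can be shifted.
    if tensor_format is None:
        raise RuntimeError(
            f"No tensor format resolved for '{tensor.name}'; "
            "check whether DLA is being enabled on a GPU that has none."
        )
    tensor.allowed_formats = 1 << int(tensor_format)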

System Info

  • OS: Ubuntu 22.04
  • Python: 3.10
  • CUDA: 12.4
  • GPU: H100
  • MLPerf Inference: v5.0-dev
  • Framework:
Name: onnxruntime-gpu
Version: 1.22.0
Summary: ONNX Runtime is a runtime accelerator for Machine Learning models
Home-page: https://onnxruntime.ai
Author: Microsoft Corporation
Author-email: [email protected]
License: MIT License
Location: /home/esp/mlc/lib/python3.10/site-packages
Requires: coloredlogs, flatbuffers, numpy, packaging, protobuf, sympy
Required-by: 
Name: onnxruntime
Version: 1.22.0
Summary: ONNX Runtime is a runtime accelerator for Machine Learning models
Home-page: https://onnxruntime.ai
Author: Microsoft Corporation
Author-email: [email protected]
License: MIT License
Location: /home/esp/mlc/lib/python3.10/site-packages
Requires: coloredlogs, flatbuffers, numpy, packaging, protobuf, sympy
Required-by: 
>>> print(torch.__version__)
2.5.1+cu124
>>> print(torch.cuda.is_available())
True
>>> print(torch.version.cuda)
12.4
