Description
I'm currently trying to run the MLPerf Inference benchmark suite (v5.0-dev) for the RetinaNet model under various framework backends (ONNX Runtime, PyTorch, TensorRT), but I'm running into critical errors on each.
Below are the details for each backend, with their respective stack traces and my understanding so far.
I'd really appreciate any help, hints, or confirmation on what might be wrong 🙏
ONNX
Command:
mlcr run-mlperf,inference,_full,_r5.0-dev \
--model=retinanet \
--implementation=reference \
--framework=onnxruntime \
--category=datacenter \
--scenario=Offline \
--execution_mode=valid \
--device=cuda
Error:
3.13/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:121: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
warnings.warn(
My Thoughts:
It seems the onnxruntime build being picked up was not compiled with GPU support. I installed onnxruntime-gpu via pip (see System Info below), but the warning still shows up and only the CPU provider is offered.
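A quick way to confirm which build is actually being imported (a minimal standalone check, not part of the benchmark flow):

import onnxruntime as ort

# Both calls are part of the public onnxruntime API: get_device() reports what
# the imported wheel was built for, get_available_providers() lists the
# execution providers it can actually offer to a session.
print(ort.__version__)
print(ort.get_device())                 # "GPU" is expected for the onnxruntime-gpu wheel
print(ort.get_available_providers())    # CUDAExecutionProvider should appear here

If this prints only CPUExecutionProvider, the plain onnxruntime package that is also installed (see System Info below) may be shadowing onnxruntime-gpu; uninstalling both and reinstalling only onnxruntime-gpu is probably worth trying.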
PyTorch
Command:
mlcr run-mlperf,inference,_full,_r5.0-dev \
--model=retinanet \
--implementation=reference \
--framework=pytorch \
--category=datacenter \
--scenario=Offline \
--execution_mode=valid \
--device=cuda
Error:
Traceback (most recent call last):
File "/home/esp/CM/repos/local/cache/e53f5bcba86f4a9e/inference/vision/classification_and_detection/python/main.py", line 624, in <module>
main()
~~~~^^
File "/home/esp/CM/repos/local/cache/e53f5bcba86f4a9e/inference/vision/classification_and_detection/python/main.py", line 503, in main
model = backend.load(args.model, inputs=args.inputs, outputs=args.outputs)
File "/home/esp/CM/repos/local/cache/e53f5bcba86f4a9e/inference/vision/classification_and_detection/python/backend_pytorch_native.py", line 27, in load
self.model = torch.load(model_path)
~~~~~~~~~~^^^^^^^^^^^^
File "/home/esp/CM/repos/local/cache/0c7c8e7dc1564794/mlperf/lib/python3.13/site-packages/torch/serialization.py", line 1495, in load
raise RuntimeError(
...<2 lines>...
)
RuntimeError: Cannot use ``weights_only=True`` with TorchScript archives passed to ``torch.load``. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
CM error: Portable CM script failed (name = benchmark-program, return code = 256)
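My Thoughts:
The model file appears to be a TorchScript archive, and the error message itself points at the torch.load default of weights_only=True introduced in PyTorch 2.6. A minimal local workaround sketch (model_path is a placeholder, not the real cache path; which of the two loaders the reference backend should use is exactly what I'm unsure about):

import torch

model_path = "/path/to/retinanet-model.pth"  # placeholder for the cached RetinaNet model

# Option 1: opt out of the new default; only reasonable because the model comes
# from the official MLPerf download, i.e. a trusted source.
model = torch.load(model_path, weights_only=False)

# Option 2: if the file really is a TorchScript archive, torch.jit.load is the
# dedicated loader and is not affected by the weights_only default.
model = torch.jit.load(model_path)

If this diagnosis is right, the fix presumably belongs in backend_pytorch_native.py (line 27 in the trace above) or in pinning the torch version used by the reference implementation, but I'd appreciate confirmation.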
TensorRT
Command:
mlcr run-mlperf,inference,_full,_r5.0-dev \
--model=retinanet \
--implementation=nvidia \
--framework=tensorrt \
--category=datacenter \
--scenario=Offline \
--execution_mode=valid \
--device=cuda
Error:
tensorrt 5.0
self.run()
File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/actionhandler/base.py", line 189, in subprocess_target
return self.action_handler.handle()
File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/actionhandler/generate_engines.py", line 176, in handle
total_engine_build_time += self.build_engine(job)
File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/actionhandler/generate_engines.py", line 167, in build_engine
builder.build_engines()
File "/usr/local/lib/python3.8/dist-packages/nvmitten/nvidia/builder.py", line 579, in build_engines
self.mitten_builder.run(self.legacy_scratch, None)
File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
raise exc_info[1]
File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
retval = obj(*args, **kwargs)
File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 379, in run
network = self.create_network(self.builder, subnetwork_name=subnet_name)
File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
raise exc_info[1]
File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
retval = obj(*args, **kwargs)
File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 239, in create_network
self.apply_subnetwork_io_types(network, subnetwork_name)
File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 258, in _wrapper
raise exc_info[1]
File "/usr/local/lib/python3.8/dist-packages/nvmitten/debug/debug_manager.py", line 245, in _wrapper
retval = obj(*args, **kwargs)
File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 289, in apply_subnetwork_io_types
self._set_tensor_format(tensor_in, use_dla=self.dla_enabled)
File "/root/MLC/repos/local/cache/get-git-repo_mlperf-inferenc_6db69b29/repo/closed/NVIDIA/code/retinanet/tensorrt/Retinanet.py", line 356, in _set_tensor_format
tensor.allowed_formats = 1 << int(tensor_format)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
My Thoughts:
The tensor_format seems to be None, which causes int(None) to fail. This likely comes from get_tensor_format() returning None, maybe due to DLA misconfiguration or unsupported tensor types.
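To narrow this down, a standalone check inside the same container the NVIDIA harness runs in might help: it only lists the TensorFormat members the installed TensorRT Python bindings expose, plus the version, since a missing member in whatever lookup produces tensor_format would explain the None. This is just a diagnostic sketch, not part of the harness:

import tensorrt as trt

# Print the TensorRT version and the tensor formats this build knows about.
# dir() on the enum includes dunder attributes, so filter those out.
print("TensorRT:", trt.__version__)
print(sorted(name for name in dir(trt.TensorFormat) if not name.startswith("_")))

If the lookup genuinely returns None for a valid configuration on this GPU, a guard just before Retinanet.py line 356 that raises with the offending tensor's name would at least make the failure self-explanatory.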
System Info
- OS: Ubuntu 22.04
- Python: 3.10
- CUDA: 12.4
- GPU: H100
- MLPerf Inference: v5.0-dev
- Framework:
Name: onnxruntime-gpu
Version: 1.22.0
Summary: ONNX Runtime is a runtime accelerator for Machine Learning models
Home-page: https://onnxruntime.ai
Author: Microsoft Corporation
Author-email: [email protected]
License: MIT License
Location: /home/esp/mlc/lib/python3.10/site-packages
Requires: coloredlogs, flatbuffers, numpy, packaging, protobuf, sympy
Required-by:
Name: onnxruntime
Version: 1.22.0
Summary: ONNX Runtime is a runtime accelerator for Machine Learning models
Home-page: https://onnxruntime.ai
Author: Microsoft Corporation
Author-email: [email protected]
License: MIT License
Location: /home/esp/mlc/lib/python3.10/site-packages
Requires: coloredlogs, flatbuffers, numpy, packaging, protobuf, sympy
Required-by:
>>> print(torch.__version__)
2.5.1+cu124
>>> print(torch.cuda.is_available())
True
>>> print(torch.version.cuda)
12.4