Description
trtexec has many arguments, but superbench exposes only a fraction of them, and some of trtexec's default values are not suitable for benchmarking. For example, when I use superbench to profile resnet50.onnx with the TensorRT backend, the command that superbench generates is:
/opt/tensorrt/bin/trtexec --onnx=/workspace/v-leiwang3/.torch/hub/onnx/resnet50.onnx --explicitBatch --optShapes=input:1x3x224x224 --workspace=8192 --iterations=105 --percentile=99.
However, I found that this command ran more than 200 iterations on our V100 GPU. The cause is the default value of --duration, which is 3: trtexec profiles the model for at least 3 seconds, but 100 iterations of resnet50 take only about 1.5 seconds. So --duration should be set to 0 by default, so that trtexec strictly executes the given number of iterations.
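As a rough sanity check of the numbers above, here is a minimal sketch of the arithmetic. It assumes trtexec keeps running until both the iteration count and the minimum duration are satisfied, and it ignores warm-up and any per-run overhead. The function name `estimated_iterations` is made up for illustration and is not part of trtexec or superbench.

```python
def estimated_iterations(requested_iterations, ms_per_iteration, min_duration_ms):
    """Estimate how many iterations run when both an iteration floor
    and a minimum-duration floor apply (simplified model, an assumption)."""
    # Ceiling division: iterations needed to fill the minimum duration.
    by_duration = -(-min_duration_ms // ms_per_iteration)
    return max(requested_iterations, by_duration)


# About 1.5 s for 100 iterations of resnet50 is roughly 15 ms per iteration.
MS_PER_ITERATION = 15

# With the default --duration=3, the duration floor dominates.
print(estimated_iterations(105, MS_PER_ITERATION, 3000))  # 200

# With --duration=0, only the requested iteration count matters.
print(estimated_iterations(105, MS_PER_ITERATION, 0))  # 105
```

Under this simplified model the default duration roughly doubles the requested iteration count, which is in line with the 200+ iterations observed.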
For warm-up, trtexec also provides the --warmUp option, so my expected command is:
/opt/tensorrt/bin/trtexec --onnx=/workspace/v-leiwang3/.torch/hub/onnx/resnet50.onnx --explicitBatch --optShapes=input:1x3x224x224 --workspace=8192 --fp16 --avgRuns=10 --warmUp=5 --iterations=100 --percentile=99. --duration=0
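To illustrate the request, here is a hedged sketch of how the command could be assembled with these arguments exposed as parameters. `build_trtexec_command` and its parameter names are hypothetical and are not superbench's actual implementation. The flags and default values are only the ones that appear in the commands above.

```python
import shlex


def build_trtexec_command(
    onnx_path,
    opt_shapes="input:1x3x224x224",
    workspace=8192,
    fp16=True,
    avg_runs=10,
    warmup=5,
    iterations=100,
    percentile=99,
    duration=0,
    trtexec_bin="/opt/tensorrt/bin/trtexec",
):
    """Return the trtexec argument list (hypothetical helper, for illustration)."""
    cmd = [
        trtexec_bin,
        f"--onnx={onnx_path}",
        "--explicitBatch",
        f"--optShapes={opt_shapes}",
        f"--workspace={workspace}",
    ]
    if fp16:
        cmd.append("--fp16")
    cmd += [
        f"--avgRuns={avg_runs}",
        f"--warmUp={warmup}",
        f"--iterations={iterations}",
        f"--percentile={percentile}",
        # duration=0 removes the minimum-duration floor, so the run is
        # bounded by --iterations instead of the 3 s default.
        f"--duration={duration}",
    ]
    return cmd


cmd = build_trtexec_command("/workspace/v-leiwang3/.torch/hub/onnx/resnet50.onnx")
print(shlex.join(cmd))
```

Returning a list rather than a single string means the arguments can be passed straight to `subprocess.run` without shell quoting issues. `shlex.join` (Python 3.8+) is used here only to print the command.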