This repository was archived by the owner on Feb 3, 2025. It is now read-only.
No improvement in GPU memory consumption during inference #328
Open
Description
I have converted the Matterport implementation of Mask R-CNN from a 32-bit (FP32) SavedModel to an FP16 TF-TRT optimized SavedModel. I see roughly a 100 ms improvement in inference time, but no reduction in GPU memory consumption. Since the original model is FP32 and the optimized model is FP16, I expected at least some reduction in GPU memory use during inference.
I used:
TensorFlow 2.10.0
TensorRT 7.2.2.1
Colab Pro+
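For reference, the conversion followed the standard TF-TRT flow, roughly like the sketch below (directory names are placeholders, not my actual paths):

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Placeholder paths for the original and optimized SavedModels.
INPUT_SAVED_MODEL_DIR = "mask_rcnn_saved_model"
OUTPUT_SAVED_MODEL_DIR = "mask_rcnn_trt_fp16"

# Convert the FP32 SavedModel into a TF-TRT graph with FP16 precision.
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=INPUT_SAVED_MODEL_DIR,
    precision_mode=trt.TrtPrecisionMode.FP16,
)
converter.convert()

# Save the optimized SavedModel. Note that TRT engines may still be
# built lazily at first inference unless build() is called beforehand
# with representative inputs.
converter.save(OUTPUT_SAVED_MODEL_DIR)
```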
I have not seen anyone discuss GPU memory consumption after optimization. Does TF-TRT only improve inference time?
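One thing that may matter for the measurement itself: TensorFlow pre-allocates nearly all GPU memory by default, so tools like nvidia-smi show the same footprint for the FP32 and FP16 models regardless of actual usage. A minimal sketch of checking TF's own allocation instead, assuming the optimized SavedModel path and input shape are placeholders:

```python
import tensorflow as tf

# Enable memory growth so the allocator only grabs memory as needed;
# this must run before any GPU has been initialized.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)

model = tf.saved_model.load("mask_rcnn_trt_fp16")  # placeholder path
infer = model.signatures["serving_default"]

# Placeholder input shape and signature argument name; both depend on
# how the Mask R-CNN SavedModel was exported.
dummy = tf.random.uniform([1, 1024, 1024, 3])
_ = infer(input_tensor=dummy)

# Report TF's view of current and peak GPU allocation in bytes.
info = tf.config.experimental.get_memory_info("GPU:0")
print(f"current: {info['current']}, peak: {info['peak']}")
```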