Description
Describe the issue
ONNX Runtime with the OpenVINO execution provider needs roughly double the memory compared with OpenVINO alone.
I suspect the ONNX model and the compiled OpenVINO model are kept in memory at the same time.
My model size: 330 MB
ONNX Runtime + OpenVINO memory usage (after inference): 1 GB
OpenVINO-only memory usage (after inference): 467 MB
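For comparison, the OpenVINO-only number was obtained by loading the same ONNX model directly with the OpenVINO 2.0 C++ API, roughly as sketched below (a minimal sketch; the model path and the omitted input handling are placeholders, not the original measurement code):

#include <openvino/openvino.hpp>

ov::Core core;
// Read the same ONNX model and compile it for the GPU device.
auto model = core.read_model("model.onnx");              // placeholder path
auto compiled_model = core.compile_model(model, "GPU");
auto infer_request = compiled_model.create_infer_request();
// Run one inference (input tensors omitted for brevity),
// then check process memory (~467 MB in this case).
infer_request.infer();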
To reproduce
#include <onnxruntime_cxx_api.h>

auto session_options = std::make_shared<Ort::SessionOptions>();

// Configure the OpenVINO execution provider.
auto ov_options = std::make_shared<OrtOpenVINOProviderOptions>();
ov_options->device_type = "GPU_FP32";
ov_options->enable_dynamic_shapes = false;
//ov_options->device_id = "0";
ov_options->num_of_threads = 1;
ov_options->cache_dir = "./cache";
//ov_options->context = "0x123456ff";
ov_options->enable_opencl_throttling = false;
session_options->AppendExecutionProvider_OpenVINO(*ov_options);
// The Ort::Session is then created with the 330 MB model and inference is run;
// memory usage is measured after inference completes.
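The memory numbers above were taken after inference had completed; a minimal way to read the process working set on Windows (assumed here for illustration, not taken from the original report) is:

#include <windows.h>
#include <psapi.h>   // link with Psapi.lib
#include <cstdio>

// Print the current process working set, comparable to the value in Task Manager.
void print_memory_usage() {
    PROCESS_MEMORY_COUNTERS pmc{};
    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc))) {
        std::printf("Working set: %zu MB\n", pmc.WorkingSetSize / (1024 * 1024));
    }
}

Calling this once after Run() in both setups gives directly comparable numbers.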
Urgency
No response
Platform
Windows
OS Version
10
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.17.1
ONNX Runtime API
C++
Architecture
X64
Execution Provider
OpenVINO
Execution Provider Library Version
No response
Model File
No response
Is this a quantized model?
No