Hello! Thanks for the great work.
I have been using the YuNet model and tried the quantized version to speed up inference, but I got slower results, both in my own code and with your demo. I checked the benchmarks and the quantized model is listed as slower there as well: is this expected behavior?
For context, I use the default backend and an Intel(R) Core(TM) i5-10300H CPU. To be clear, I loaded the int8 ONNX file from the GitHub repo and did not run the quantization script myself.
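
In case it helps reproduce, here is a minimal timing sketch of what I am measuring (assuming an OpenCV build with the `FaceDetectorYN` API; the model file names and the sample image path are placeholders for the fp32 and int8 ONNX files downloaded from the zoo):

```python
import time

import cv2 as cv

# Placeholder paths -- substitute the fp32 and int8 ONNX files you downloaded.
FP32_MODEL = "face_detection_yunet.onnx"
INT8_MODEL = "face_detection_yunet_int8.onnx"

def benchmark(model_path, image, runs=100):
    h, w = image.shape[:2]
    # Positional args: model, config, input_size, score_threshold,
    # nms_threshold, top_k, backend_id, target_id
    detector = cv.FaceDetectorYN.create(
        model_path, "", (w, h), 0.9, 0.3, 5000,
        cv.dnn.DNN_BACKEND_OPENCV,  # default backend
        cv.dnn.DNN_TARGET_CPU)      # CPU target
    detector.detect(image)  # warm-up run
    start = time.perf_counter()
    for _ in range(runs):
        detector.detect(image)
    return (time.perf_counter() - start) / runs * 1000  # ms per inference

img = cv.imread("sample.jpg")  # placeholder test image
print(f"fp32: {benchmark(FP32_MODEL, img):.2f} ms")
print(f"int8: {benchmark(INT8_MODEL, img):.2f} ms")
```

With this kind of comparison, the int8 model consistently comes out slower than the fp32 one on my machine.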