I used aimet-onnx to do mixed-precision quantization and found that moving roughly 15% of the activations to int16 meets my accuracy requirements.
Then I tried QAT to improve the accuracy further, but found that the mixed-precision results from aimet-onnx cannot be carried over into QAT.
Is there any way to achieve this? Alternatively, is there a way to configure the activations of specific layers to be int16, e.g. through the config_file?
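For context, this is roughly the kind of per-layer override I was hoping for. The sketch below is my assumption of how it might look with aimet_torch (1.x-style API), since that is where I would run QAT; the toy model, the `layers_to_int16` names, and the calibration function are placeholders, not something I have verified against the real API or my actual network:

```python
# Rough sketch (unverified): force int16 output activations on selected layers
# of an aimet_torch QuantizationSimModel before fine-tuning (QAT).
import torch
import torch.nn as nn
from aimet_torch.quantsim import QuantizationSimModel
from aimet_torch.qc_quantize_op import QcQuantizeWrapper

# Toy stand-in model; in practice this would be the real FP32 network.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(8, 8, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv2(self.relu(self.conv1(x))))

model = TinyNet().eval()
dummy_input = torch.randn(1, 3, 32, 32)

sim = QuantizationSimModel(model,
                           dummy_input=dummy_input,
                           default_output_bw=8,
                           default_param_bw=8)

# Layers whose activations should be int16 -- placeholder names that would
# come from the aimet-onnx mixed-precision result on my real model.
layers_to_int16 = {"conv2"}

for name, module in sim.model.named_modules():
    if isinstance(module, QcQuantizeWrapper) and name in layers_to_int16:
        for quantizer in module.output_quantizers:
            quantizer.bitwidth = 16  # bump this layer's activation to int16

def calibration_fn(quant_model, _):
    # Minimal calibration pass; a real run would use representative data.
    with torch.no_grad():
        quant_model(dummy_input)

sim.compute_encodings(calibration_fn, forward_pass_callback_args=None)
# ...then fine-tune sim.model with the normal training loop (QAT).
```

If there is an equivalent way to express this per-layer int16 selection in the config_file, or to import the aimet-onnx mixed-precision result directly, that would be even better.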
Thank you.