🐛 Bug
Hello, I am trying to quantize a model. I have done post-training static quantization following the tutorial. During the conversion, I:

- define my model: `mymodel = model(cfg)`
- load the state_dict: `mymodel.load_state_dict(torch.load('weights.pt'))`
- prepare the model, calibrate it, and convert it to its quantized version
- save the quantized version using `torch.save(mymodel_q.state_dict(), 'weights_q.pt')` (a rough sketch of the whole workflow is below)
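For reference, this is roughly what my conversion code looks like (just a sketch following the eager-mode quantization tutorial; `model`, `cfg`, and `calibration_loader` are placeholders for my actual code and data):

```python
import torch
import torch.quantization

# Rough sketch of the steps above (eager-mode post-training static quantization).
# `model`, `cfg`, and `calibration_loader` are placeholders for my own code/data.
mymodel = model(cfg)
mymodel.load_state_dict(torch.load('weights.pt'))
mymodel.eval()

# Attach a quantization config and insert observers
mymodel.qconfig = torch.quantization.get_default_qconfig('fbgemm')
torch.quantization.prepare(mymodel, inplace=True)

# Calibrate on a few representative batches
with torch.no_grad():
    for inputs, _ in calibration_loader:
        mymodel(inputs)

# Convert to the quantized model and save its state_dict
mymodel_q = torch.quantization.convert(mymodel)
torch.save(mymodel_q.state_dict(), 'weights_q.pt')
```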
When I load it, I use the same code as before, i.e. I define the model and then load the weights with `load_state_dict` (see the sketch below).
However, the quantized model carries information about scale and zero_point, and these entries seem to be missing from the original model definition. That was to be expected, since nothing in `model(cfg)` was changed after quantization. But how do I include the scale and zero_point information?
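To make the problem concrete, my loading code is roughly the following (again a sketch; the state_dict saved from the quantized model appears to contain scale/zero_point entries that the freshly-defined float model does not expect):

```python
import torch

# Re-define the float model exactly as before quantization
mymodel = model(cfg)

# This fails: the saved state_dict contains quantization entries
# (e.g. scale and zero_point) that the float modules do not have.
mymodel.load_state_dict(torch.load('weights_q.pt'))
```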
Am I saving the quantized model in the wrong way?
Thank you!
- PyTorch / torchvision version (e.g., 1.0 / 0.4.0): the latest available version
- OS (e.g., Linux): Ubuntu Linux
- How you installed PyTorch / torchvision (`conda`, `pip`, source): pip
- Build command you used (if compiling from source):
- Python version: 3.7
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information: