Description
When loading the quantized model (SmoothQuant) with
from neural_compressor.utils.pytorch import load
qmodel = load(qmodel_path, model_fp)
I get
RecursiveScriptModule(original_name=QuantizationDispatchModule)
I'd like to extract the quantized int8 weight matrices, together with the corresponding quantization parameters (scales, zero_points). How can I do that?
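To be concrete about what I am after: for each quantized layer I would like the triple (int8 weights, scales, zero_points) such that the float weights are approximately scale * (q - zero_point). The sketch below only illustrates that relationship on a toy matrix in plain Python. It assumes a generic per-output-channel affine scheme; I do not know whether neural_compressor uses exactly this layout, and the helper names quantize_per_channel and dequantize are made up for illustration, not part of any library.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_per_channel(
    weight: Matrix, qmin: int = -128, qmax: int = 127
) -> Tuple[List[List[int]], List[float], List[int]]:
    """Hypothetical helper: affine int8 quantization, one (scale, zero_point) per row."""
    q_rows, scales, zero_points = [], [], []
    for row in weight:
        # The representable range must contain 0 so that 0.0 maps to an integer.
        lo, hi = min(min(row), 0.0), max(max(row), 0.0)
        scale = (hi - lo) / (qmax - qmin) or 1.0  # all-zero row: fall back to 1.0
        zp = max(qmin, min(qmax, int(round(qmin - lo / scale))))
        q = [max(qmin, min(qmax, int(round(x / scale)) + zp)) for x in row]
        q_rows.append(q)
        scales.append(scale)
        zero_points.append(zp)
    return q_rows, scales, zero_points


def dequantize(q_rows: List[List[int]], scales: List[float], zero_points: List[int]) -> Matrix:
    """Hypothetical helper: recover approximate float weights as scale * (q - zero_point)."""
    return [
        [s * (v - zp) for v in row]
        for row, s, zp in zip(q_rows, scales, zero_points)
    ]


# Toy weight matrix: 2 output channels, 3 input features.
w = [[0.5, -1.0, 0.25], [2.0, 0.0, 1.0]]
q, scales, zero_points = quantize_per_channel(w)
w_hat = dequantize(q, scales, zero_points)

for row, s, zp in zip(q, scales, zero_points):
    print(row, s, zp)
```

What I would like to read out of the loaded qmodel is the equivalent of q, scales and zero_points above for every quantized layer.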