Can't convert model to int8 precision with post_training_quantization #137

@jerrydyc

Hello, I built Bolt (tag: v1.5.1) for the linux-x86_64_avx512 target and converted an ONNX model to the PTQ version with X2bolt. I then tried to quantize it to int8 precision with post_training_quantization, following the doc at https://huawei-noah.github.io/bolt/docs/QUANTIZATION.html. With the procedure below, I get model_f32.bolt but cannot get model_int8_q.bolt. Am I missing something? How do I do int8 quantization and run an inference test? Thanks.

```
./post_training_quantization -p model.bolt
[INFO] thread 30247: environment variable BOLT_INT8_STORAGE_ERROR_THRESHOLD: 99999.000000
[INFO] thread 30247: Write bolt model to model_f32.bolt.
Post Training Quantization Succeeded!

./post_training_quantization -V -p model.bolt -i INT8_FP16
option is -i [inferencePrecision], value is: INT8_FP32
[ERROR] thread 30243: The inferPrecision is Not Supported

./post_training_quantization -V -p model.bolt -i INT8_FP16
option is -i [inferencePrecision], value is: INT8_FP16
[ERROR] thread 30244: The inferPrecision is Not Supported
```
