Static quantization but want only float output - is it possible? #366

@BmanClark

I have a model that I want to statically quantize for all the benefits that brings. However, quality suffered when I did so. I first excluded the sensitive layers, but that wasn't enough. It was then suggested that I keep float input/output and exclude the first and last x layers.
I have been trying to do this, but I cannot get rid of a Quantize op at the end of the graph, which then causes a crash at runtime when allocating output buffers.

Image

My recipe excludes all the last layers (plus more, unshown):

    # Exclude output/final layers (exact tensor names)
    rp_manager.add_quantization_config(
        regex='.*Linear_projector;1',
        operation_name=qtyping.TFLOperationName.ALL_SUPPORTED,
        algorithm_key=algorithm_manager.AlgorithmName.NO_QUANTIZE,
    )
    rp_manager.add_quantization_config(
        regex='.*WavLMForSequenceClassification;1',
        operation_name=qtyping.TFLOperationName.ALL_SUPPORTED,
        algorithm_key=algorithm_manager.AlgorithmName.NO_QUANTIZE,
    )
    rp_manager.add_quantization_config(
        regex='.*StatefulPartitionedCall.*',
        operation_name=qtyping.TFLOperationName.ALL_SUPPORTED,
        algorithm_key=algorithm_manager.AlgorithmName.NO_QUANTIZE,
    )
    rp_manager.add_quantization_config(
        regex='.*logits.*',
        operation_name=qtyping.TFLOperationName.ALL_SUPPORTED,
        algorithm_key=algorithm_manager.AlgorithmName.NO_QUANTIZE,
    )

But somehow this Quantize op is still there. What am I missing?
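
A quick way to check which tensor names the exclusion patterns actually match, independent of the quantizer, is to run them through Python's `re` module. This is only a sketch: the tensor names below are hypothetical placeholders (the real ones would come from the model), the helper `matching_patterns` is made up for illustration, and whether the quantizer applies search or full-match semantics is an assumption that should be verified against its source.

```python
import re

# Two of the exclusion patterns from the recipe above.
EXCLUDE_PATTERNS = [
    r".*Linear_projector;1",
    r".*WavLMForSequenceClassification;1",
]


def matching_patterns(name, patterns, full=False):
    """Return the patterns that match `name`.

    full=False uses re.search (match anywhere in the name);
    full=True uses re.fullmatch (the whole name must match).
    """
    match = re.fullmatch if full else re.search
    return [p for p in patterns if match(p, name)]


# Hypothetical tensor names, for illustration only.
names = [
    "model/Linear_projector;1",
    "model/Linear_projector;1/extra",
    "StatefulPartitionedCall:0",
]

for name in names:
    print(name)
    print("  search:   ", matching_patterns(name, EXCLUDE_PATTERNS))
    print("  fullmatch:", matching_patterns(name, EXCLUDE_PATTERNS, full=True))
```

Any name that comes back with an empty list under both modes is not covered by an exclusion rule, so it would still fall under the default `.*` static config.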

Before all my exclusions, my recipe starts with this:

    # First: Quantize ALL supported ops to static int8 by default
    rp_manager.add_static_config(
        regex='.*',
        operation_name=qtyping.TFLOperationName.ALL_SUPPORTED,
        activation_num_bits=8,
        weight_num_bits=8,
        algorithm_key=algorithm_manager.AlgorithmName.MIN_MAX_UNIFORM_QUANT,
    )

What am I doing wrong, or is my aim not possible?
