The converter works in three steps: (1) strip the QKeras model of its quantization attributes and store them in a dictionary; (2) convert the resulting plain Keras model with tf2onnx; (3) insert "Quant" nodes at the appropriate locations based on the stored dictionary of quantization attributes.
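The three-step flow above can be sketched in pure Python. The dict-based "model", node layout, and helper names below are hypothetical, purely for illustration; the real converter operates on QKeras and ONNX graph objects.

```python
# Illustrative sketch of the strip -> convert -> re-insert flow.
# The dict-based model and helper names are hypothetical; the real
# converter works on QKeras layers and ONNX graph nodes.

def strip_quantizers(layers):
    """Step 1: remove quantization attributes, remember them by layer name."""
    quant_map = {}
    plain = []
    for layer in layers:
        layer = dict(layer)  # copy so the caller's model is untouched
        quantizer = layer.pop("quantizer", None)
        if quantizer is not None:
            quant_map[layer["name"]] = quantizer
        plain.append(layer)
    return plain, quant_map

def convert_to_onnx(layers):
    """Step 2: stand-in for the plain-Keras -> ONNX conversion (tf2onnx)."""
    return [{"op": layer["op"], "name": layer["name"]} for layer in layers]

def insert_quant_nodes(nodes, quant_map):
    """Step 3: splice a Quant node after each node that had a quantizer."""
    out = []
    for node in nodes:
        out.append(node)
        if node["name"] in quant_map:
            out.append({"op": "Quant", "name": node["name"] + "_quant",
                        "attrs": quant_map[node["name"]]})
    return out

qkeras_model = [
    {"op": "Dense", "name": "fc1", "quantizer": {"bits": 8}},
    {"op": "Relu", "name": "act1"},
]
plain, qmap = strip_quantizers(qkeras_model)
onnx_nodes = insert_quant_nodes(convert_to_onnx(plain), qmap)
print([n["op"] for n in onnx_nodes])  # ['Dense', 'Quant', 'Relu']
```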

The current version has a few issues related to how the Quant nodes end up placed in the tf2onnx graph. Each problem has a suitable workaround, detailed below.

Quantized-Relu

When quantized_relu is used as the output activation of a Dense/Conv2D layer, the converter inserts a redundant quantization node.

Workaround: Only use the quantized_relu activation in a separate QActivation layer.

Quantized-Bits

The quantized_bits quantization node is not added to the model when quantized_bits is used in a QActivation layer.

Workaround: Use quantized_bits only at the output of a Dense/Conv2D layer.

Ternary Quantization

A threshold of 0.5 must be used with ternary quantization. (Even with threshold = 0.5, this is sometimes unstable.)