The converter works in three steps: (1) strip the QKeras model of its quantization attributes and store them in a dictionary; (2) convert the resulting plain Keras model with tf2onnx; (3) insert "Quant" nodes at the appropriate locations based on the stored dictionary of quantization attributes.
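The three-step flow above can be sketched in pure Python. The dict-based "model", node layout, and helper names below are hypothetical, purely for illustration; the real converter operates on QKeras and ONNX graph objects.

```python
# Illustrative sketch of the strip -> convert -> re-insert flow.
# The dict-based model and helper names are hypothetical; the real
# converter works on QKeras layers and ONNX graph nodes.

def strip_quantizers(layers):
    """Step 1: remove quantization attributes, remember them by layer name."""
    quant_map = {}
    plain = []
    for layer in layers:
        layer = dict(layer)  # copy so the caller's model is untouched
        quantizer = layer.pop("quantizer", None)
        if quantizer is not None:
            quant_map[layer["name"]] = quantizer
        plain.append(layer)
    return plain, quant_map

def convert_to_onnx(layers):
    """Step 2: stand-in for the plain-Keras -> ONNX conversion (tf2onnx)."""
    return [{"op": layer["op"], "name": layer["name"]} for layer in layers]

def insert_quant_nodes(nodes, quant_map):
    """Step 3: splice a Quant node after each node that had a quantizer."""
    out = []
    for node in nodes:
        out.append(node)
        if node["name"] in quant_map:
            out.append({"op": "Quant", "name": node["name"] + "_quant",
                        "attrs": quant_map[node["name"]]})
    return out

qkeras_model = [
    {"op": "Dense", "name": "fc1", "quantizer": {"bits": 8}},
    {"op": "Relu", "name": "act1"},
]
plain, qmap = strip_quantizers(qkeras_model)
onnx_nodes = insert_quant_nodes(convert_to_onnx(plain), qmap)
print([n["op"] for n in onnx_nodes])  # ['Dense', 'Quant', 'Relu']
```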

The current version has a few issues related to how the Quant nodes end up placed in the tf2onnx graph. Each problem has a suitable workaround, detailed below.

Quantized-Relu

When quantized_relu is used as the output activation of a Dense/Conv2D layer, the converter inserts a redundant quantization node.

Workaround: Only use the quantized_relu activation in a separate QActivation layer.

Quantized-Bits

The quantized_bits quantization node is not added to the model when quantized_bits is used in a QActivation layer.

Workaround: Use quantized_bits only at the output of a Dense/Conv2D layer.

Ternary Quantization

A threshold of 0.5 must be used with ternary quantization. (Even with threshold = 0.5, this is sometimes unstable.)