
LayerNorm conversion error #393

@spacycoder

Hi, in the latest version of TinyNeuralNetwork, LayerNorm causes the conversion to fail.

Error output:

Error in QNNPACK: failed to create add operator with 8.124962e-06 A-to-output scale ratio: scale ratio must be in [2**-14, 2**8) range
...
  File "../TinyNeuralNetwork/tinynn/graph/quantization/modules.py", line 136, in forward
    return self.f_add_2.add(norm_alpha, bias_fq_expand)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "../torch/ao/nn/quantized/modules/functional_modules.py", line 241, in add
    r = ops.quantized.add(x, y, scale=self.scale, zero_point=self.zero_point)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "../torch/_ops.py", line 1116, in __call__
    return self._op(*args, **(kwargs or {}))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: createStatus == pytorch_qnnp_status_success INTERNAL ASSERT FAILED at "../aten/src/ATen/native/quantized/cpu/BinaryOps.cpp":204, please report a bug to PyTorch. failed to create QNNPACK Add operator

torch version: 2.5.1
python version: 3.12
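
For reference, the constraint in the message appears to come straight from QNNPACK's quantized add, which requires the input-to-output scale ratio to be in [2**-14, 2**8). A minimal sketch that should trip the same assert directly, independent of TinyNeuralNetwork (the scale values here are illustrative, not the ones my model actually produces):

import torch

torch.backends.quantized.engine = "qnnpack"

# Two inputs quantized with a very small scale...
x = torch.quantize_per_tensor(torch.randn(8), scale=8.1e-06, zero_point=0, dtype=torch.quint8)
y = torch.quantize_per_tensor(torch.randn(8), scale=8.1e-06, zero_point=0, dtype=torch.quint8)

# ...added into an output with scale 1.0 gives a ratio of ~8.1e-06, which is
# below 2**-14 (~6.1e-05), so QNNPACK refuses to create the add operator.
torch.ops.quantized.add(x, y, 1.0, 0)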

This should reproduce it:

import torch.nn as nn
import torch
from tinynn.graph.quantization.quantizer import PostQuantizer
from tinynn.converter import TFLiteConverter
from tinynn.graph.tracer import model_tracer

class LayerNormModel(nn.Module):

    def __init__(self):
        super().__init__()
        self.layer_norm = nn.LayerNorm(256)

    def forward(self, x: torch.Tensor):
        return self.layer_norm(x)

def _main():
    dummy_input = torch.rand(1, 60, 256).float()
    model = LayerNormModel()
    qat_config = {
        "backend": "qnnpack",
        "per_tensor": True,
        "disable_requantization_for_cat": True,
    }
    with model_tracer():
        quantizer = PostQuantizer(
            model, (dummy_input,), work_dir="LayerNormModel", config=qat_config
        )

        layer_norm_model = quantizer.quantize()

    layer_norm_model(dummy_input)

    with torch.no_grad():
        layer_norm_model.eval()
        layer_norm_model.cpu()

        layer_norm_model = quantizer.convert(layer_norm_model)
        torch.backends.quantized.engine = quantizer.backend
        converter = TFLiteConverter(
            layer_norm_model,
            (dummy_input,),
            "layer_norm.tflite",
            fuse_quant_dequant=True,
            quantize_target_type="int8"
        )
        converter.convert()

if __name__ == '__main__':
    _main()

Is there a new flag or something I should set to make this work?
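
In case it helps with debugging, here is a rough diagnostic I can run on the converted model to locate the module with the mismatched scales. It assumes the converted modules expose a scale attribute, as torch's quantized modules and QFunctional do; nothing TinyNeuralNetwork-specific:

# Print the output scale recorded on each quantized module so the add whose
# input/output scale ratio falls outside [2**-14, 2**8) can be found by name.
for name, mod in layer_norm_model.named_modules():
    scale = getattr(mod, "scale", None)
    if scale is not None:
        print(name, float(scale))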
