
Cannot save compressed binary or ternary weights, saved as float32 parameters #806

Open
@BenCrulis

Description

I am trying to save a quantized ternary model to a .tflite file, but larq doesn't seem to store the weights in a reduced-precision datatype that would shrink the file. After converting and writing to disk, the file is about the same size as the float32 memory footprint predicted by larq.models.summary.
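
For reference, this is the back-of-the-envelope arithmetic behind that comparison (my own rough estimate; larq.models.summary reports the parameter count, the byte math is mine):

# Size estimate: 4 bytes per float32 parameter vs. 2 bits per packed ternary weight.
n_params = 500 * 1000                # e.g. the QuantDense kernel in the example below
float32_mb = n_params * 4 / 1e6      # roughly what I observe on disk
ternary_mb = n_params * 2 / 8 / 1e6  # what packed ternary storage could reach
print(f"float32: {float32_mb:.2f} MB vs packed ternary: {ternary_mb:.3f} MB")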

Even when I try the same thing with a single QuantDense layer, the weights are still saved as float32.

I am using code along these lines:

import tensorflow as tf
from tensorflow import keras
import larq
import larq_compute_engine as lce

# Build a binarized dense layer and call it once so its kernel is created.
quantDense = larq.layers.QuantDense(1000, kernel_quantizer="ste_sign", use_bias=False)
quantDense(tf.ones((1, 500)))

# quantized_scope(True) makes the layer expose its quantized weights
# instead of the latent float32 ones.
with larq.context.quantized_scope(True):
    inp_quant = keras.Input((500,))  # 500 features, matching the build call above
    out_quant = quantDense(inp_quant)
    quantModelTest = keras.Model(inputs=inp_quant, outputs=out_quant)
    print("Keras test model")
    larq.models.summary(quantModelTest)

    print("converting keras test model to tflite")
    converted = lce.convert_keras_model(quantModelTest)
    with open("test.tflite", "wb") as f:
        print("writing tflite model to disk")
        f.write(converted)
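
To double-check, I also listed the tensor dtypes in the converted flatbuffer using the stock TensorFlow Lite interpreter (a quick inspection I added myself, not part of the larq workflow):

# List the tensors of the converted model; the kernel still shows up as float32.
interpreter = tf.lite.Interpreter(model_content=converted)
for detail in interpreter.get_tensor_details():
    print(detail["name"], detail["dtype"], detail["shape"])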

Am I doing something wrong?
