
[Good First Issue][NNCF]: Support not transposed weight for data-aware weight compression methods #3494


Context

The MatMul operation in OpenVINO assumes implicit shape alignment of its input arguments and applies the transpositions specified by the optional transpose_a and transpose_b attributes: OV spec.
Currently, weight compression in NNCF does not support transpose_b=False.
Here's the test.
This potentially affects the Mixed-Precision, AWQ, Scale Estimation, GPTQ and Lora Correction algorithms.
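
For illustration, here is a minimal NumPy sketch (not NNCF or OpenVINO code; the shapes and names are assumptions) of what transpose_b changes: with transpose_b=True the weight is stored as [output_channels, input_channels], while with transpose_b=False it is stored as [input_channels, output_channels], so per-output-channel statistics and scales must be reduced over a different weight axis.

```python
import numpy as np

# Activation: [batch, tokens, input_channels]
x = np.random.rand(1, 4, 16).astype(np.float32)

# transpose_b=True layout (currently supported): weight stored as [out, in].
w_transposed = np.random.rand(32, 16).astype(np.float32)
y_transposed = x @ w_transposed.T   # MatMul applies the transposition itself

# transpose_b=False layout (this issue): weight stored as [in, out].
w_plain = w_transposed.T            # same values, different storage order
y_plain = x @ w_plain               # no transposition applied

assert np.allclose(y_transposed, y_plain)

# Consequence for compression: the output-channel axis of the weight flips,
# so per-channel reductions (max values, scales, data-aware statistics) must
# pick the axis depending on transpose_b.
scales_transposed = np.abs(w_transposed).max(axis=1)  # [out, in] -> reduce axis 1
scales_plain = np.abs(w_plain).max(axis=0)            # [in, out] -> reduce axis 0
assert np.allclose(scales_transposed, scales_plain)
```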

What needs to be done?

The task is to enable the data-aware weight compression methods (Mixed-Precision, AWQ, Scale Estimation, Lora Correction, GPTQ) for models whose matrix multiplications have a non-transposed weight (transpose_b=False); see the sketch below for the axis handling this implies.

test_compression_with_transpose should not raise an error for transpose_b=False.
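
A rough sketch of the kind of change this implies (a hypothetical helper, not an existing NNCF API): instead of hard-coding the axes for a transposed weight, derive them from the MatMul node's transpose_b attribute and use them when collecting statistics and computing scales in each algorithm.

```python
# Hypothetical helper, not part of the NNCF codebase; it only illustrates
# deriving the weight axes from the transpose_b attribute for a 2D weight.
def weight_axes(transpose_b: bool) -> tuple[int, int]:
    """Return (output_channel_axis, reduction_axis) of a 2D MatMul weight."""
    # transpose_b=True  -> weight layout [out, in]: out axis 0, reduce over axis 1
    # transpose_b=False -> weight layout [in, out]: out axis 1, reduce over axis 0
    return (0, 1) if transpose_b else (1, 0)


out_axis, reduction_axis = weight_axes(transpose_b=False)
assert (out_axis, reduction_axis) == (1, 0)
```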

Example Pull Requests

#3230
#3296

Resources

Contact points

@ljaljushkin

Ticket

No response
