
[Good First Issue][NNCF]: Support transposed input for data-aware weight compression methods #3230

Open
@ljaljushkin

Description


Context

The MatMul operation in OpenVINO performs implicit shape alignment of its input arguments, applying the transpositions specified by the optional transpose_a and transpose_b attributes: OV spec.
Currently, weight compression in NNCF does not support transpose_a=True.
Here's the check and the test.
This potentially affects the Mixed-Precision, AWQ, Scale Estimation, and Lora Correction algorithms.
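To make the transposition semantics concrete, here is a minimal NumPy sketch of what transpose_a means for the activation input of a MatMul (illustrative only; shape names are assumptions, not taken from the OV spec):

```python
import numpy as np

# With transpose_a=False, activations arrive as [tokens, hidden_dim].
# With transpose_a=True, they arrive as [hidden_dim, tokens] and are
# transposed before the multiplication, so the result is identical.
tokens, hidden_dim, out_dim = 3, 4, 2

activations = np.random.rand(tokens, hidden_dim)   # canonical layout
weight = np.random.rand(hidden_dim, out_dim)       # transpose_b=False

# MatMul(activations, weight, transpose_a=False)
y_plain = activations @ weight
# MatMul(activations.T, weight, transpose_a=True)
y_transposed = activations.T.T @ weight

assert y_plain.shape == (tokens, out_dim)
assert np.allclose(y_plain, y_transposed)
```

The practical consequence for data-aware compression is that the channel axis of the collected activation statistics moves when transpose_a=True, which is what the unsupported-case check currently guards against.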

What needs to be done?

The task is to enable data-aware weight compression methods (Mixed-Precision, AWQ, Scale Estimation, Lora Correction) for models with transposed input matrix multiplications.

  1. At least the process_stats function should be corrected, and the check removed.
  2. The test should pass and be converted to a templated test so that the OV and Torch backends are checked at once.
  3. Tests that use LMLinearModel with the default transpose_a=False should also pass with transpose_a=True.
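One possible shape of the fix for step 1 can be sketched as follows. This is a hypothetical helper (the real process_stats signature is not shown in this issue): the idea is to normalize transposed statistics back to the canonical [tokens, hidden_dim] layout before reducing, instead of rejecting the model.

```python
import numpy as np

def process_stats_sketch(stats: np.ndarray, transpose_a: bool) -> np.ndarray:
    """Reduce activation statistics to a per-channel importance vector.

    Hypothetical sketch: if the MatMul consumes a transposed activation
    input (transpose_a=True), the statistics arrive as [hidden_dim, tokens],
    so they are transposed back to the canonical [tokens, hidden_dim]
    layout before the reduction over the token axis.
    """
    if transpose_a:
        stats = stats.T
    # One value per hidden channel, averaged over tokens.
    return np.mean(stats, axis=0)

# The same statistics must produce the same result in either layout.
canonical = np.arange(6, dtype=float).reshape(3, 2)   # [tokens=3, hidden=2]
assert np.allclose(
    process_stats_sketch(canonical, transpose_a=False),
    process_stats_sketch(canonical.T, transpose_a=True),
)
```

A templated test (step 2) would then parametrize over both transpose_a values and both backends, asserting that the compressed-model outputs match regardless of layout.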

Example Pull Requests

#3179
#3129

Resources

Contact points

@ljaljushkin

Ticket

No response
