Description
Context
The MatMul operation in OpenVINO assumes an implicit shape alignment for its input arguments. It applies the transpositions specified by the optional `transpose_a` and `transpose_b` attributes: OV spec.
Currently, weight compression in NNCF does not support `transpose_b=False`.
Here's the test.
Potentially, this affects the Mixed-Precision, AWQ, Scale Estimation, GPTQ, and Lora Correction algorithms.
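As a sketch of the shape convention involved (pure NumPy, not the NNCF or OpenVINO API; the variable names are illustrative): with `transpose_b=True` the weight is stored as `[out, in]` and transposed before multiplication, while with `transpose_b=False` it is stored as `[in, out]` and used directly, so both layouts express the same linear map.

```python
import numpy as np

# Illustrative only: emulate MatMul's transpose_b attribute with NumPy.
# y = x @ op(w), where op transposes w when transpose_b=True.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8)).astype(np.float32)      # activations: [batch, in]

w_t = rng.standard_normal((16, 8)).astype(np.float32)   # transpose_b=True layout:  [out, in]
w_nt = w_t.T                                            # transpose_b=False layout: [in, out]

# Both layouts yield the same result: [batch, out]
assert np.allclose(x @ w_t.T, x @ w_nt)
assert (x @ w_nt).shape == (4, 16)
```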
What needs to be done?
The task is to enable the data-aware weight compression methods (Mixed-Precision, AWQ, Scale Estimation, Lora Correction, GPTQ) for models whose matrix multiplications have non-transposed weights.
`test_compression_with_transpose` should not raise an error for `transpose_b=False`.
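One way to see why `transpose_b` matters for the data-aware algorithms: per-output-channel weight statistics must be reduced over a different weight axis depending on the layout. A minimal NumPy sketch (the helper name is hypothetical, not NNCF's API):

```python
import numpy as np

def weight_channel_axis(transpose_b: bool) -> int:
    # Hypothetical helper: for a 2-D weight, the output-channel axis is
    # 0 when transpose_b=True (layout [out, in]) and 1 when
    # transpose_b=False (layout [in, out]).
    return 0 if transpose_b else 1

# Per-output-channel max-abs statistics: reduce over the *other* axis.
w_t = np.arange(6, dtype=np.float32).reshape(2, 3)  # transpose_b=True:  [out=2, in=3]
w_nt = w_t.T                                        # transpose_b=False: [in=3, out=2]
stats_t = np.max(np.abs(w_t), axis=1 - weight_channel_axis(True))
stats_nt = np.max(np.abs(w_nt), axis=1 - weight_channel_axis(False))
assert np.array_equal(stats_t, stats_nt)            # same statistics either way
```

Picking the wrong axis for `transpose_b=False` would silently group statistics by input channel instead of output channel, which is the kind of mismatch the test above should guard against.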
Example Pull Requests
Resources
- Contribution guide - start here!
- start-working-on-your-good-first-issue
- Intel DevHub Discord channel - engage in discussions, ask questions and talk to OpenVINO developers
- How to link your Pull Request to an issue
Contact points
Ticket
No response