
Enable transpose_a support for LoRA Correction #3864

Open

Shehrozkashif wants to merge 4 commits into openvinotoolkit:develop from Shehrozkashif:support-transpose

Conversation

@Shehrozkashif
Contributor

Summary of Changes

  • Updated process_stats to handle transpose_a for LoRA Correction.
  • LoRA algorithm now reads transpose_a from the weight node and processes activations accordingly.
  • Added tests:
    • test_process_stats_with_transpose_a_changes_layout to verify activation processing.
    • test_lora_transpose_a_fix to ensure LoRA compression works correctly with transpose_a=False.
  • Ensures LoRA Correction runs without errors when transpose_a is False.

Details of Changes

  • process_stats now supports a transpose_a flag that adjusts activation layouts when processing statistics.
  • LoraCorrectionAlgorithm.calculate_adapters reads the transpose_a attribute from the weight node and passes it to calculate_low_rank_matrices.
  • Low-rank adapter calculation (calculate_low_rank_matrices) now transposes residuals when transpose_a=True (see the sketch after this list).
  • Tests added in tests/openvino/native/quantization/test_weights_compression.py to verify correctness.
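
The conditional activation transpose described above can be pictured with a minimal, illustrative sketch (this is not the actual NNCF implementation; the function name, the NumPy stand-in for NNCF's tensor functions, and the assumed shapes are all hypothetical):

```python
import numpy as np


def compute_residual(weight_error: np.ndarray, X: np.ndarray, transpose_a: bool) -> np.ndarray:
    """Multiply the weight quantization error by collected activation samples.

    Assumed layout for illustration: weight_error is [OutDim, HiddenDim] and X is
    expected as [HiddenDim, SampleSize]. When the original MatMul was created with
    transpose_a=True, the activations are assumed to arrive as [SampleSize, HiddenDim]
    and are flipped back before the multiplication.
    """
    if transpose_a:
        X = np.transpose(X, (1, 0))  # -> [HiddenDim, SampleSize]
    return weight_error @ X  # [OutDim, SampleSize]
```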

Reason for Changes

  • Previously, LoRA Correction did not correctly handle layers with transpose_a=True.
  • These changes ensure that activations are processed with the correct layout and low-rank adapters are computed correctly, preventing errors during weight compression.

Related Tickets

Tests

  • test_process_stats_with_transpose_a_changes_layout confirms that activation layout changes when transpose_a=True.
  • test_lora_transpose_a_fix ensures LoRA Correction executes without errors for supported transpose configurations.
  • All pre-commit and linter checks passed.

@Shehrozkashif requested a review from a team as a code owner January 29, 2026 13:19
@github-actions bot added the NNCF OpenVINO label Jan 29, 2026
@Shehrozkashif
Contributor Author

@daniil-lyakhov, I hope I'm heading in the right direction?

Collaborator

@daniil-lyakhov left a comment

Hello @Shehrozkashif,
thank you for the PR! In general the direction is correct; please address a couple of comments from me.

Comment on lines 2625 to 2631

Collaborator

Contributor Author

Yes, that makes sense. I can update the existing tests to cover the act_ch_axis/transpose handling instead of adding separate ones, so the verification of LoRA Correction with transposed inputs is integrated with the current test suite.

Collaborator

Please don't forget to update the tests

@github-actions bot removed the NNCF OpenVINO label Jan 30, 2026
@Shehrozkashif
Contributor Author

@daniil-lyakhov Passed act_ch_axis from statistics to process_stats in LoRA Correction and added a conditional transpose of X so that residual multiplication works correctly for transposed inputs.

@github-actions bot added the NNCF OpenVINO label Feb 4, 2026
@Shehrozkashif
Contributor Author

@daniil-lyakhov Hi, I’ve updated the test decorator to skip the configurations where transpose_a=True or transpose_b=True, since LoRA correction does not support transposed activations yet. This change keeps the test function intact and avoids runtime failures while preserving all other test cases.
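
For reference, skipping individual parametrized configurations without touching the test body can be done with pytest's per-parameter marks. A hedged sketch follows (the test name, parameter set, and skip reasons are illustrative, not the exact ones used in test_weights_compression.py):

```python
import pytest


@pytest.mark.parametrize(
    ("transpose_a", "transpose_b"),
    [
        (False, False),
        pytest.param(
            True, False, marks=pytest.mark.skip(reason="LoRA correction does not support transposed activations yet")
        ),
        pytest.param(
            False, True, marks=pytest.mark.skip(reason="LoRA correction does not support transposed weights yet")
        ),
    ],
)
def test_lora_with_transposed_matmul(transpose_a, transpose_b):
    # Only the (False, False) configuration runs; the other cases are reported as skipped.
    ...
```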

@Shehrozkashif
Contributor Author

Hi @daniil-lyakhov, quick reminder on this PR. I’ve updated the tests and addressed previous feedback. Please let me know if anything else is needed. Thanks!

Collaborator

No new tests were enabled; could you please enable them and check everything is working?

Contributor Author

@daniil-lyakhov Thank you for the feedback. I have now fully enabled the tests and refactored the implementation to match the pattern used in PR #3794.

Updates:

  1. Enabled Tests: I have unskipped the transpose_a=True test cases in test_lora_adapters_in_the_graph.
  2. Refactored Implementation:
    • I reverted the changes to statistics.py (no act_ch_axis stored in WCTensorStatistic).
    • act_ch_axis is now calculated on-the-fly in openvino_backend.py using get_activation_channel_axis and passed directly to lora_correction_algo.calculate_adapters (see the sketch below).
    • lora_correction.py was updated to accept and use this argument.
  3. Test Overrides: I overrode test_compression_skipped_with_transposed_activations for this specific test class to exclude LoRA Correction from the expected failures, as it now supports transposed activations (while keeping the check for GPTQ/Scale Estimation).

I have verified that tests/openvino/native/quantization/test_weights_compression.py::test_lora_adapters_in_the_graph passes for transpose_a=True. Please review.
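
The axis logic behind point 2 can be summarized with a small self-contained sketch (illustrative only; the real get_activation_channel_axis in NNCF operates on the graph node and input port rather than a bare flag, and its exact return convention may differ):

```python
def activation_channel_axis(transpose_a: bool) -> int:
    """Axis of the activation input that is contracted by MatMul(A, B).

    Without transposition the reduction axis of A is the last one (-1);
    with transpose_a=True the activation is transposed, so the contracted
    axis becomes the second-to-last one (-2).
    """
    return -2 if transpose_a else -1


assert activation_channel_axis(False) == -1
assert activation_channel_axis(True) == -2
```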

Contributor

Copilot AI left a comment

Pull request overview

This PR enables support for transpose_a=True in the LoRA Correction algorithm for weight compression. The LoRA Correction algorithm previously did not handle MatMul operations with transposed activation inputs correctly. This PR updates the algorithm to read the transpose_a attribute from weight nodes and process activations accordingly.

Changes:

  • Removed the check that blocked LoRA Correction for nodes with transpose_a=True
  • Updated process_stats function to accept an act_ch_axis parameter for proper handling of different activation layouts
  • Modified LoRA adapter calculation to account for activation channel axis and conditionally transpose activations
  • Updated adapter insertion to use the correct transpose_a value when creating adapter MatMul operations
  • Added test coverage for transpose_a=True scenarios
  • Added test to verify other algorithms (scale_estimation, GPTQ) still correctly reject transpose_a=True

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file

  • tests/openvino/native/quantization/test_weights_compression.py: Added test parameters for transpose_a=True cases and a new test for unsupported algorithms with transposed activations
  • src/nncf/quantization/algorithms/weight_compression/openvino_backend.py: Updated insert_adapters to read and use the transpose_a flag, and to calculate the activation channel axis for LoRA
  • src/nncf/quantization/algorithms/weight_compression/lora_correction.py: Modified the calculate_adapters and calculate_low_rank_matrices signatures to accept act_ch_axis and added conditional transpose logic
  • src/nncf/quantization/algorithms/weight_compression/algorithm.py: Removed the transpose_a check for LoRA and updated variable naming for clarity
  • src/nncf/common/tensor_statistics/statistics.py: Minor refactoring to inline variable usage


Comment on lines +208 to +210

# Conditionally transpose X so samples are rows and channels are columns
if act_ch_axis != 0:  # if channel is not already the first axis
    X = fns.transpose(X, axes=(1, 0))  # [SS, H]
Copilot AI Feb 16, 2026

The conditional transpose logic appears incorrect. The process_stats function always returns X with shape [HiddenDim, SampleSize] (as documented in its docstring line 29), regardless of the act_ch_axis value. The act_ch_axis parameter is only used within process_stats for sampling logic, not for determining the output layout. Therefore, this conditional check if act_ch_axis != 0 doesn't achieve the intended purpose, and the transpose should either always be applied or never be applied. The expected shape after this line should be [SS, H] based on the comment, which means the transpose should always happen since process_stats returns [H, SS].

Suggested change:

-# Conditionally transpose X so samples are rows and channels are columns
-if act_ch_axis != 0:  # if channel is not already the first axis
-    X = fns.transpose(X, axes=(1, 0))  # [SS, H]
+# Transpose X so samples are rows and channels are columns.
+# process_stats returns X with shape [H, SS], so we convert to [SS, H].
+X = fns.transpose(X, axes=(1, 0))  # [SS, H]

    ),
)
-def test_lora_adapters_in_the_graph(params, transpose_b):
+def test_lora_adapters_in_the_graph(params, transpose_a, transpose_b):
Copilot AI Feb 16, 2026

The PR description mentions two new tests (test_process_stats_with_transpose_a_changes_layout and test_lora_transpose_a_fix) that are not present in the diff. These tests are important to verify that the transpose_a support is working correctly. Either the tests were not included in this PR, or the PR description needs to be updated to reflect the actual tests that were added.

Comment on lines 275 to 283
     def _get_serialized_data(self) -> dict[str, Tensor]:
-        backend = self.mean_values[0].backend
-        device = self.mean_values[0].device
         return {
             self.MEAN_STAT: fns.stack(self.mean_values),
             self.SHAPE_STAT: fns.tensor(
                 self.shape_values,
-                backend=backend,
+                backend=self.mean_values[0].backend,
                 dtype=TensorDataType.int32,
-                device=device,
+                device=self.mean_values[0].device,
             ),
Copilot AI Feb 16, 2026

These refactoring changes to inline variable usage are unrelated to the PR's stated goal of enabling transpose_a support for LoRA Correction. While the refactoring is a reasonable style improvement, it should ideally be in a separate commit or PR to keep changes focused and easier to review. Including unrelated refactoring makes it harder to understand the core changes and could complicate any future bisecting or reverting.

