Release QAT example with NLS #3480
Conversation
Signed-off-by: J. Pablo Muñoz <[email protected]> Co-authored-by: Yuan0320 <[email protected]>
Signed-off-by: J. Pablo Muñoz <[email protected]> Co-authored-by: Yuan0320 <[email protected]>
Thank you for the contribution and the very extensive evaluation!
It's great to see an improvement on top of the baseline with a constant LoRA rank!
At a high level, it looks good to me. Most of the logic is implemented in the sample; changes in NNCF are minimized by extending FQ with LoRA.
I have a few remarks to make it better in terms of integration into NNCF.
One thing that is important for potential customers is the total time to get the best checkpoint.
Could you please specify in the README how long the tuning and search stages took in both cases?
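For readers unfamiliar with the approach, here is a rough sketch of the idea behind extending a fake-quantizer (FQ) with LoRA adapters; the class, tensor shapes, and INT4 grid below are illustrative assumptions, not the actual NNCF implementation.

```python
import torch
import torch.nn as nn


class FakeQuantizeWithLora(nn.Module):
    """Illustrative only: fake-quantize the weight, then add a trainable low-rank correction."""

    def __init__(self, out_features: int, in_features: int, rank: int, scale: torch.Tensor):
        super().__init__()
        self.scale = scale  # per-output-channel quantization scale, shape [out_features, 1]
        self.lora_a = nn.Parameter(torch.zeros(rank, in_features))
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))

    def forward(self, weight: torch.Tensor) -> torch.Tensor:
        # INT4-style quantize-dequantize: round to the grid and map back to float.
        q = torch.clamp(torch.round(weight / self.scale), -8, 7)
        w_dq = q * self.scale
        # The LoRA correction is learned during tuning and compensates for quantization error.
        return w_dq + self.lora_b @ self.lora_a
```

The point is that the quantization logic stays in one module and only the small `lora_a`/`lora_b` matrices are trained, which matches the note above that the NNCF changes are limited to extending FQ with LoRA.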
examples/llm_compression/torch/qat_with_lora/NLSDownstreamTasks.md
| Qwen/Qwen2.5-7B-Instruct | BF16 | 0.6401 |
| | INT4 (QAT + LoRA) | 0.7356 |
| | INT4 (QAT + NLS) | **0.7382** |
Why is the average score for the BF16 model lower for every model? Usually the INT4 model has similar or lower accuracy.
@ljaljushkin @andreyanufr BF16 is the reference result for the uncompressed model without tuning. We have removed these results to avoid confusion, and, as discussed in our meeting, in a future PR we will try BF16 and BF16 + LoRA for a better comparison. Thanks!
Created a follow-up ticket (CVS-166802) for that.
As a result, numbers for fine-tuned BF16 models will be added, am I right?
> As a result, numbers for fine-tuned BF16 models will be added, am I right?

We discussed this with Pablo and agreed to add numbers to this PR for the fine-tuned BF16 baseline and also for BF16 + NLS without quantization + PTWC (AWQ + Scale Estimation + GPTQ).
Does the NLS example support fine-tuning a model without quantization? As far as I know, it doesn't.
For that, we need to add a new operation using LoRA adapters, and nncf.compress_weights does not suit that. Pablo will probably do it using PEFT, right?
@ljaljushkin @alexsu52 Yes, we are planning to use PEFT to get the reference BF16 numbers. Thanks!
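As a rough illustration of that plan (not part of this PR), a BF16 + LoRA reference run could be set up with PEFT along these lines; the model name, rank, and target modules are placeholder assumptions, not the settings that will actually be used.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder model and hyperparameters; the real baseline configuration is up to the follow-up work.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", torch_dtype=torch.bfloat16
)
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# The adapters would then be fine-tuned with the same data and schedule as the INT4 runs
# so that the BF16 + LoRA numbers are directly comparable.
```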
Signed-off-by: J. Pablo Muñoz <[email protected]> Co-authored-by: Yuan0320 <[email protected]>
Signed-off-by: J. Pablo Muñoz <[email protected]>
Signed-off-by: J. Pablo Muñoz <[email protected]>
Signed-off-by: J. Pablo Muñoz <[email protected]> Co-authored-by: Yuan0320 <[email protected]>
Signed-off-by: J. Pablo Muñoz <[email protected]> Co-authored-by: Yuan0320 <[email protected]>
Signed-off-by: J. Pablo Muñoz <[email protected]>
minor remarks
Co-authored-by: Lyalyushkin Nikolay <[email protected]>
Co-authored-by: Lyalyushkin Nikolay <[email protected]>
Co-authored-by: Lyalyushkin Nikolay <[email protected]>
Force-pushed from 45eb700 to f8dd856
Signed-off-by: J. Pablo Muñoz <[email protected]>
Force-pushed from f8dd856 to 8a5b7db
@jpablomch thanks for the contribution!
# If Neural Low-rank Adapter Search (NLS) is enabled,
# configure the LoRA adapters with a random rank configuration from the specified rank space.
if not disable_nls and grad_steps == 0:
    current_config = configure_lora_adapters(
Providing a scheduler in NNCF for NLS would simplify the example and improve UX. This comment is not blocking, but I recommend thinking about it. cc @ljaljushkin
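For context, the per-step logic being discussed might look roughly like the following; `configure_lora_adapters`, `disable_nls`, and `grad_steps` come from the example's naming, while the rank space and sampling details here are assumptions.

```python
import random

# Hypothetical rank space; in the example it is specified by the user.
RANK_SPACE = [8, 16, 24, 32]


def sample_rank_config(num_adapters: int) -> list[int]:
    """Pick a random rank for every LoRA adapter; an NNCF-provided NLS scheduler could hide this step."""
    return [random.choice(RANK_SPACE) for _ in range(num_adapters)]


# Usage inside the training loop (mirrors the snippet quoted above):
# if not disable_nls and grad_steps == 0:
#     current_config = configure_lora_adapters(model, ranks=sample_rank_config(num_adapters))
```

A scheduler object in NNCF could own the rank space and the sampling policy, so the example's loop would only need to call something like `scheduler.step()` once per gradient-accumulation window.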
@@ -25,6 +25,10 @@ The most significant accuracy improvements are usually observed within the first



## Fine-tuning with NLS for Downstream Tasks
In fact, we have two examples:
- Distillation of a quantized model on wikitext2 to improve similarity metrics between the compressed model and the original model. It handles the case when the user already has a pretrained model, am I right?
- Fine-tuning for downstream tasks with quantization.

For more precise positioning, I would suggest having two headings for these two cases and clearly explaining which one the user should use in which scenario.
examples/llm_compression/torch/qat_with_lora/NLSDownstreamTasks.md
Signed-off-by: J. Pablo Muñoz <[email protected]> Co-authored-by: Yuan0320 <[email protected]>
Force-pushed from b77ba4a to 407fbb3
Changes
Adds an example that uses NLS fine-tuning with quantization-aware LoRA on downstream tasks.
Reason for changes
To support fine-tuning for downstream scenarios; NLS often boosts the performance of LoRA fine-tuning on downstream tasks.
Related tickets
https://jira.devtools.intel.com/browse/CVS-166802
Tests
See the results in NLSDownstreamTasks.md. We have conducted an extensive evaluation on 11 language models and 4 downstream tasks.
examples job: https://github.com/openvinotoolkit/nncf/actions/runs/14934370942