[OpenVINO] Adopt new mxfp4 quantization logic #1465

nikita-savelyevv · 2025-10-09T09:34:28Z

What does this PR do?

Changes:

The NNCF develop version now has an explicit "mxfp4" compression mode. Its group size is fixed to 32, as the mxfp4 definition requires.
Remove the temporary cb4 pre-release logic introduced in "Add support for cb4_f8e4m3 quantization mode" (#1378).

NNCF also added other compression modes ("mxfp8_e4m3", "fp4_e2m1" and "nvfp4"), but they will be added to optimum-intel only after the next NNCF release.
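To illustrate what the fixed group size of 32 means in practice, here is a minimal pure-Python sketch of mxfp4-style block quantization: every group of 32 values shares one power-of-two scale, and each element is rounded to an FP4 E2M1 magnitude. This is not NNCF's implementation. The function name `quantize_group` is hypothetical, ties are resolved toward the smaller magnitude for simplicity, and the limited range of the shared scale is ignored.

```python
import math

# Magnitudes representable by an FP4 E2M1 element (the sign is stored separately).
FP4_E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
GROUP_SIZE = 32          # fixed by the mxfp4 definition
E2M1_MAX_EXPONENT = 2    # the largest E2M1 magnitude is 6 = 1.5 * 2**2


def quantize_group(values):
    """Hypothetical helper: return (shared_scale, dequantized_values) for one group."""
    if len(values) != GROUP_SIZE:
        raise ValueError(f"mxfp4 groups must hold exactly {GROUP_SIZE} values")

    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 1.0, [0.0] * GROUP_SIZE

    # One power-of-two scale per group, chosen so the largest value
    # lands near the top of the E2M1 range.
    shared_exp = math.floor(math.log2(max_abs)) - E2M1_MAX_EXPONENT
    scale = 2.0 ** shared_exp

    dequantized = []
    for v in values:
        # Clamp to the largest representable magnitude, then snap to the grid.
        magnitude = min(abs(v) / scale, FP4_E2M1_VALUES[-1])
        nearest = min(FP4_E2M1_VALUES, key=lambda q: abs(q - magnitude))
        dequantized.append(math.copysign(nearest * scale, v))
    return scale, dequantized
```

Because the scale is shared across exactly 32 elements, a different group size would change the format itself, which is presumably why the value is not configurable for this mode.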

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

nikita-savelyevv · 2025-10-09T09:36:42Z

optimum/intel/openvino/configuration.py

        )
        self.bits = bits
        self.sym = sym
-        self.group_size = group_size or (-1 if bits == 8 else 128)


Delegate group size selection to NNCF. The default values stay backward compatible, since NNCF also defaults to -1 for 8-bit types and 128 for 4-bit non-mxfp4 types.
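The defaults described in this comment can be sketched as a small standalone function. The helper `resolve_group_size` is hypothetical and exists only for illustration. It is neither the optimum-intel code nor an NNCF API, and rejecting a non-32 group size for mxfp4 is an assumption based on the PR description.

```python
from typing import Optional

MXFP4_GROUP_SIZE = 32  # fixed by the mxfp4 definition, per the PR description


def resolve_group_size(bits: int, dtype: str, group_size: Optional[int] = None) -> int:
    """Hypothetical helper mirroring the defaults described in this thread."""
    if dtype == "mxfp4":
        # Assumption: any explicit value other than 32 is treated as an error.
        if group_size not in (None, MXFP4_GROUP_SIZE):
            raise ValueError(f"mxfp4 requires group_size={MXFP4_GROUP_SIZE}")
        return MXFP4_GROUP_SIZE
    if group_size is not None:
        # An explicit user value always wins for the other types.
        return group_size
    # Defaults: -1 (per-channel) for 8-bit, 128 for 4-bit non-mxfp4 types.
    return -1 if bits == 8 else 128
```

For the non-mxfp4 cases with no explicit value, this returns the same result as the removed line `group_size or (-1 if bits == 8 else 128)`, which is the backward compatibility the comment refers to.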

nikita-savelyevv · 2025-10-09T09:37:43Z

cc @ljaljushkin @daniil-lyakhov

echarlaix

LGTM, thanks @nikita-savelyevv !

Remove cb4 pre-release logic; adopt new mxfp4 logic

b548177

nikita-savelyevv added 3 commits October 9, 2025 11:49

Merge branch 'main' into ns/mxfp4-compatibility

5058ad5

Fix tests

f6e6114

Fix tests 2

d64d77d

nikita-savelyevv requested review from IlyasMoutawwakil, echarlaix and rkazants October 10, 2025 08:36

rkazants reviewed Oct 13, 2025

optimum/intel/openvino/configuration.py

rkazants reviewed Oct 13, 2025

tests/openvino/test_quantization.py

nikita-savelyevv mentioned this pull request Oct 13, 2025

[OpenVINO] Add model inference check to weight-only and pipeline quantization testing #1470

Open

3 tasks

rkazants approved these changes Oct 14, 2025

ljaljushkin approved these changes Oct 14, 2025

IlyasMoutawwakil added the openvino-slow label (Runs OpenVINO slow tests with different versions of transformers) Oct 15, 2025

echarlaix approved these changes Oct 15, 2025

optimum/intel/openvino/configuration.py

Update setup.py

37ac46e

nikita-savelyevv merged commit 69d276e into huggingface:main Oct 16, 2025
35 of 38 checks passed
