nikita-savelyevv
Collaborator

@nikita-savelyevv nikita-savelyevv commented Oct 9, 2025

What does this PR do?

Changes:

NNCF has also added the compression modes "mxfp8_e4m3", "fp4_e2m1" and "nvfp4", but support for them in optimum-intel is planned for after the next NNCF release.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

)
self.bits = bits
self.sym = sym
self.group_size = group_size or (-1 if bits == 8 else 128)
Collaborator Author

Delegate group size value selection to NNCF. This is backward compatible in terms of the default value since NNCF also selects -1 for 8-bit types and 128 for 4-bit non-mxfp4 types by default.
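
For readers following along, the change described above can be sketched roughly as below. This is only an illustration: the helper names `legacy_default_group_size` and `delegated_group_size` are hypothetical and do not appear in optimum-intel or NNCF. The first helper mirrors the expression from the diff line under review; the second assumes that "delegating" simply means passing the user's value (possibly `None`) through so that NNCF applies its own default.

```python
from typing import Optional


def legacy_default_group_size(bits: int, group_size: Optional[int] = None) -> int:
    """Hypothetical helper mirroring the diff line under review.

    Falls back to -1 for 8-bit weights and 128 otherwise when no
    group size is given.
    """
    return group_size or (-1 if bits == 8 else 128)


def delegated_group_size(group_size: Optional[int] = None) -> Optional[int]:
    """Hypothetical helper sketching the delegated behaviour.

    Assumption: an unset value stays None so the backend (NNCF) can
    choose the default for the requested compression mode.
    """
    return group_size


if __name__ == "__main__":
    print(legacy_default_group_size(bits=8))
    print(legacy_default_group_size(bits=4))
    print(legacy_default_group_size(bits=4, group_size=64))
    print(delegated_group_size())
```

Under this reading, an explicitly passed group size behaves the same before and after the change; only the resolution of an unset value moves from the config class to NNCF.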

@nikita-savelyevv
Collaborator Author

cc @ljaljushkin @daniil-lyakhov

@IlyasMoutawwakil IlyasMoutawwakil added the openvino-slow label (Runs OpenVINO slow tests with different versions of transformers) Oct 15, 2025
Collaborator

@echarlaix echarlaix left a comment

LGTM, thanks @nikita-savelyevv !

@nikita-savelyevv nikita-savelyevv merged commit 69d276e into huggingface:main Oct 16, 2025
35 of 38 checks passed
5 participants