[OpenVINO] Adopt new mxfp4 quantization logic #1465
)
self.bits = bits
self.sym = sym
self.group_size = group_size or (-1 if bits == 8 else 128)
Delegate group size value selection to NNCF. This is backward compatible in terms of the default value since NNCF also selects -1 for 8-bit types and 128 for 4-bit non-mxfp4 types by default.
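To illustrate the backward-compatibility argument, here is a minimal sketch that contrasts the default-filling expression from the diff with simply passing the value through. The helper names `legacy_default_group_size` and `delegated_group_size` are hypothetical and are not part of optimum-intel or NNCF. The sketch assumes that an unset group size is represented as `None`, which the downstream library then resolves.

```python
from typing import Optional


def legacy_default_group_size(bits: int, group_size: Optional[int] = None) -> int:
    """Hypothetical helper mirroring the expression shown in the diff:
    fall back to -1 for 8-bit and 128 otherwise when no value is given."""
    return group_size or (-1 if bits == 8 else 128)


def delegated_group_size(group_size: Optional[int] = None) -> Optional[int]:
    """Hypothetical helper for the delegated variant: an unset value stays
    None, so the default is chosen downstream (by NNCF, per the comment)."""
    return group_size


if __name__ == "__main__":
    # Defaults produced by the legacy expression.
    print(legacy_default_group_size(bits=8))
    print(legacy_default_group_size(bits=4))
    # An explicit value is kept in both variants.
    print(legacy_default_group_size(bits=4, group_size=64))
    print(delegated_group_size(group_size=64))
    # With delegation, an unset value is passed through unchanged.
    print(delegated_group_size())
```

In the delegated variant the configuration no longer needs to know per-format defaults, which matters for formats such as mxfp4 whose default differs from the 128 used for other 4-bit types.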
LGTM, thanks @nikita-savelyevv!
What does this PR do?
Changes:
NNCF has also added the compression modes "mxfp8_e4m3", "fp4_e2m1" and "nvfp4", but support for them in optimum-intel is planned for after the next NNCF release.
Before submitting