[OpenVINO] Add model-specific quantization ignored scopes #1556
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
echarlaix
left a comment
Thanks for the PR!
```python
model._apply_quantization(
    quantization_config, compile_only, compile_model, str(model_id), trust_remote_code
)
model_id = export_model_id or getattr(config, "name_or_path", model_id)
```
Why use `export_model_id` instead of directly using the config? (Also, it looks like `export_model_id` will be used later on to load the config anyway, to get `config.name_or_path` to add as a candidate.)
```diff
- model_id = export_model_id or getattr(config, "name_or_path", model_id)
+ model_id = getattr(config, "name_or_path", model_id)
```
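The suggested one-liner relies on `getattr`'s default argument. A minimal standalone sketch of that fallback behavior, using stand-in config classes (not the real transformers config objects) for illustration:

```python
class NamedConfig:
    # stand-in for a config that carries name_or_path (hypothetical example value)
    name_or_path = "org/model-a"


class BareConfig:
    # stand-in for a config that lacks name_or_path entirely
    pass


def resolve_model_id(config, model_id):
    # getattr's third argument keeps the original model_id when the
    # attribute is absent, so configs built from scratch still work
    return getattr(config, "name_or_path", model_id)


print(resolve_model_id(NamedConfig(), "fallback-id"))  # org/model-a
print(resolve_model_id(BareConfig(), "fallback-id"))   # fallback-id
```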
Likely related to:

> This is needed for more robust matching of default quantization configuration / ignored scope when exporting a model via `OVModelForX.from_pretrained(model_id, export=True)`

Would you mind elaborating?
```python
        The quantization configuration to which the default ignored scope will be applied.
    Returns:
        Updated quantization configuration with the default ignored scope applied.
    """
```
We could add a check to make sure `quantization_config` is an instance of `OVPipelineQuantizationConfig`.
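A sketch of such a guard; the class here is a placeholder standing in for optimum-intel's real `OVPipelineQuantizationConfig` so the snippet stays self-contained:

```python
class OVPipelineQuantizationConfig:
    # placeholder for the real optimum.intel class of the same name
    pass


def apply_default_ignored_scope(quantization_config):
    # hypothetical guard suggested in the review: fail fast on unsupported types
    if not isinstance(quantization_config, OVPipelineQuantizationConfig):
        raise TypeError(
            f"Expected OVPipelineQuantizationConfig, got {type(quantization_config).__name__}"
        )
    return quantization_config
```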
```python
# 1. Create a local copy of the model so that we can override _name_or_path
pt_model = auto_model_cls.from_pretrained(MODEL_NAMES[test_model_id])
pt_model.save_pretrained(tmpdir)
try:
```
Here you could instead use `maybe_save_preprocessors`: https://github.com/huggingface/optimum/blob/0227a1ce9652b1b02da5a510bf513c585608f8c2/optimum/utils/save_utils.py#L65
What does this PR do?
- Added `_DEFAULT_IGNORED_SCOPE_CONFIGS` for storing default quantization ignored scope per model id. Such ignored scope will be applied irrespective of quantization method.
- Added `export_model_id` argument to `OVBaseModel._from_pretrained()` which is provided from `OVBaseModel._export()`. This is needed for more robust matching of default quantization configuration / ignored scope when exporting a model via `OVModelForX.from_pretrained(model_id, export=True)`.
- Extended the `test_quantization.py::OVWeightCompressionTest.test_ovmodel_4bit_auto_compression` test to cover more model types besides `OVModelForCausalLM`.
- Added `test_quantization.py::OVWeightCompressionTest.test_ovmodel_default_ignored_scope` and `test_exporters_cli.py::OVCLIExportTestCase.test_exporters_cli_with_default_ignored_scope` to test the new default ignored scope matching logic.
- Tickets CVS-175336, CVS-166099.
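An illustrative sketch of what a per-model-id registry like this could look like. The entry values and the exact-match lookup rule are invented for illustration and are not the PR's actual contents:

```python
# Hypothetical registry: model id -> default ignored scope (illustrative values only)
_DEFAULT_IGNORED_SCOPE_CONFIGS = {
    "org/model-a": {"names": ["lm_head"], "patterns": [".*layer_norm.*"]},
}


def lookup_default_ignored_scope(model_id):
    # Plain dict lookup on the model id; the PR's real matching also needs a
    # reliable model id, which is why export_model_id / config.name_or_path
    # candidates are discussed above.
    return _DEFAULT_IGNORED_SCOPE_CONFIGS.get(model_id)


print(lookup_default_ignored_scope("org/model-a"))
print(lookup_default_ignored_scope("org/unknown"))  # None
```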
Before submitting