-
Notifications
You must be signed in to change notification settings - Fork 146
[OV] Fix model inference not being affected by static quantization #1461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
nikita-savelyevv
wants to merge
9
commits into
huggingface:main
Choose a base branch
from
nikita-savelyevv:ns/fix-inplace-quantization
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
247fbec
Update all OVModel classes
nikita-savelyevv e4ee0b0
Update test
nikita-savelyevv 9151a89
Reorder inheritance
nikita-savelyevv 034c556
Fix
nikita-savelyevv e4f8ce5
Fix model replacement logic
nikita-savelyevv 380aa51
Add missing decorators
nikita-savelyevv e03a024
Remove references to the deprecated fields
nikita-savelyevv 9de6797
Merge branch 'main' into ns/fix-inplace-quantization
nikita-savelyevv ef56c7d
Merge branch 'main' into ns/fix-inplace-quantization
nikita-savelyevv File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -57,12 +57,97 @@ | |
logger = logging.getLogger(__name__) | ||
|
||
|
||
class OVModelHostMixin: | ||
""" | ||
Mixin class for models that contain OpenVINO models as submodels. | ||
""" | ||
|
||
@property | ||
def ov_models(self) -> Dict[str, Union[openvino.Model, openvino.runtime.CompiledModel]]: | ||
""" | ||
Returns a dictionary of all OpenVINO models associated with this model. Keys are model names, and values are | ||
either instances of `openvino.Model` or `openvino.runtime.CompiledModel`. Compiled model instances are returned | ||
if the model is initialized with `compile_only=True`. | ||
""" | ||
return {ov_model_name: getattr(self, ov_model_name) for ov_model_name in self._ov_model_names} | ||
|
||
@property | ||
def ov_submodels(self) -> Dict[str, Union[openvino.Model, openvino.runtime.CompiledModel]]: | ||
logger.warn( | ||
"`ov_submodels` property is deprecated and will be removed in v1.27. Please use `ov_models` property instead." | ||
) | ||
return self.ov_models | ||
|
||
@property | ||
def _ov_model_names(self) -> List[str]: | ||
""" | ||
List of openvino model names. Used as keys for a dictionary returned by `.ov_models` property. | ||
""" | ||
return ["model"] | ||
|
||
@property | ||
def _component_names(self) -> List[str]: | ||
""" | ||
List of model component names. Used as keys for a dictionary returned by `.components` property. | ||
""" | ||
return [] | ||
|
||
@property | ||
def components(self) -> Dict[str, "OVModelHostMixin"]: | ||
""" | ||
Dictionary of model components which are instances of OVModelHostMixin. | ||
""" | ||
return {component_name: getattr(self, component_name) for component_name in self._component_names} | ||
|
||
def replace_ov_model(self, current_model: openvino.Model, new_model: openvino.Model): | ||
""" | ||
Replace OpenVINO model within the model with new one. Replacement is performed by object id. | ||
|
||
Args: | ||
current_model (`openvino.Model`): | ||
Current OpenVINO model to be replaced. | ||
new_model (`openvino.Model`): | ||
New OpenVINO model to replace the current one. | ||
""" | ||
# Validate replacement parameters | ||
if isinstance(current_model, openvino.CompiledModel): | ||
raise ValueError( | ||
"OpenVINO model replacement is not supported for models initialized with `compile_only=True`." | ||
) | ||
# Replace OpenVINO model stored inside the model | ||
for ov_model_name in self.ov_models: | ||
if ov_model_name in ["lm_model", "vision_embeddings_model", "text_embeddings_model"] and isinstance( | ||
getattr(type(self), ov_model_name, None), property | ||
): | ||
# TODO (nikita.savelyevv): Remove this check when these properties are removed | ||
continue | ||
if id(getattr(self, ov_model_name, None)) == id(current_model): | ||
setattr(self, ov_model_name, new_model) | ||
# Replace OpenVINO model stored inside components | ||
for component in self.components.values(): | ||
component.replace_ov_model(current_model, new_model) | ||
# Clear requests to force recompilation with the new model | ||
self.clear_requests() | ||
|
||
def clear_requests(self): | ||
""" | ||
Clear model inference requests. | ||
""" | ||
raise NotImplementedError | ||
|
||
def compile(self): | ||
""" | ||
Compile all OpenVINO models within the model. | ||
""" | ||
raise NotImplementedError | ||
|
||
|
||
@add_start_docstrings( | ||
""" | ||
Base OVModel class. | ||
""", | ||
) | ||
class OVBaseModel(OptimizedModel): | ||
class OVBaseModel(OptimizedModel, OVModelHostMixin): | ||
auto_model_class = None | ||
export_feature = None | ||
_supports_cache_class = False # No loger defined/used in transformers | ||
|
@@ -210,17 +295,6 @@ def dtype(self) -> Optional[torch.dtype]: | |
|
||
return None | ||
|
||
@property | ||
def ov_submodels(self) -> Dict[str, openvino.Model]: | ||
return {submodel_name: getattr(self, submodel_name) for submodel_name in self._ov_submodel_names} | ||
Comment on lines
-213
to
-215
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. breaking change ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
|
||
@property | ||
def _ov_submodel_names(self) -> List[str]: | ||
""" | ||
List of openvino submodel names. Used as keys for a dictionary returned by `.ov_submodels` property. | ||
""" | ||
return ["model"] | ||
|
||
@staticmethod | ||
def load_model( | ||
file_name: Union[str, Path], | ||
|
@@ -803,7 +877,7 @@ def _incompatible_inputs_warning(self, inputs: Dict): | |
return None | ||
|
||
|
||
class OVModelPart: | ||
class OVModelPart(OVModelHostMixin): | ||
def __init__( | ||
self, | ||
model: Model, | ||
|
@@ -825,7 +899,7 @@ def __init__( | |
self.config = self.parent_model.config | ||
self._model_dir = Path(model_dir or parent_model._model_save_dir) | ||
|
||
def _compile(self): | ||
def compile(self): | ||
if self.parent_model._compile_only and isinstance(self.model, CompiledModel): | ||
self.request = self.model | ||
if self.request is None: | ||
|
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from my understanding
_ov_model_names
will differ from_component_names
for encoder-only or decoder-only models? I'm wondering if we should modify both so that they have the same behavior as other models (having only one component) so that we can always iterate on the models "components" to compile / clear request / set device, also would remove the distinction between_ov_model_names
and_component_names
wdyt @nikita-savelyevv ?