Open
* update quantization configurations for ryzenai
* add some combination checks
* fix typo
* change enable_dpu to enable_ipu_cnn
Contributor
Author
Doc build fix in: #141
fxmarty reviewed Jun 5, 2024
optimum/amd/ryzenai/configuration.py
Outdated
| Determines whether to generate a quantized model that is suitable for the DPU. If set to True, the quantization | ||
| process will create a model that is optimized for DPU computations. | ||
|
|
||
| format (Union[QuantFormat, str], defaults to `QuantFormat.QDQ`): |
Contributor
Suggested change
| format (Union[QuantFormat, str], defaults to `QuantFormat.QDQ`): | |
| format (`Union[QuantFormat, str]`, defaults to `QuantFormat.QDQ`): |
optimum/amd/ryzenai/configuration.py
Outdated
| into the tensor. Supports a wider range of bit-widths and precisions. | ||
| - `QuantFormat.FixNeuron` (Experimental): Quantizes the model by inserting FixNeuron (a combination of | ||
| QuantizeLinear and DeQuantizeLinear) into the tensor. Experimental and not recommended for deployment. | ||
| calibration_method (Union[CalibrationMethod, str], defaults to `CalibrationMethod.MinMSE`): |
optimum/amd/ryzenai/configuration.py
Outdated
| - `CalibrationMethod.MinMax`: Obtain quantization parameters based on minimum and maximum values of each tensor. | ||
| - `CalibrationMethod.Entropy`: Determine quantization parameters based on the entropy algorithm of each tensor's distribution. | ||
| - `CalibrationMethod.Percentile`: Calculate quantization parameters using percentiles of tensor values. | ||
| activations_dtype (QuantType, defaults to `QuantType.QUInt8`): |
Contributor
Same as above: wrap the type in backticks (this applies to all the other args below as well).
optimum/amd/ryzenai/configuration.py
Outdated
Comment on lines +420 to +422
| def check_dtype_and_format(dtype, dtype_name, format): | ||
| if dtype not in ["uint8", "int8"] and format not in ["vitisqdq"]: | ||
| raise ValueError(f'{dtype_name} is: "{dtype}", format must be "vitisqdq".') |
Contributor
I don't think this helper is needed; it would be clearer to inline the check in the post-init.
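A minimal sketch of what inlining could look like, assuming the config is a dataclass whose dtype and format fields are plain strings at validation time. `SketchQuantizationConfig` is a hypothetical stand-in, not the class from the PR, and only the fields needed for the check are modelled:

```python
from dataclasses import dataclass


@dataclass
class SketchQuantizationConfig:
    # Hypothetical stand-in for the real config class (assumption): only the
    # fields that take part in the dtype/format check are modelled here.
    format: str = "qdq"
    activations_dtype: str = "uint8"
    weights_dtype: str = "int8"

    def __post_init__(self):
        # Same condition as check_dtype_and_format, inlined and applied to
        # each dtype field in turn.
        for name in ("activations_dtype", "weights_dtype"):
            dtype = getattr(self, name)
            if dtype not in ["uint8", "int8"] and self.format not in ["vitisqdq"]:
                raise ValueError(f'{name} is: "{dtype}", format must be "vitisqdq".')
```

With this shape, constructing the config with a 16-bit dtype and the default format raises immediately, while the same dtype with `format="vitisqdq"` is accepted.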
optimum/amd/ryzenai/configuration.py
Outdated
Comment on lines +444 to +453
| mapping = { | ||
| "uint8": QuantType.QUInt8, | ||
| "int8": QuantType.QInt8, | ||
| "uint16": QuantType.QUInt16, | ||
| "int16": QuantType.QInt16, | ||
| "uint32": QuantType.QUInt32, | ||
| "int32": QuantType.QInt32, | ||
| "float16": QuantType.QFloat16, | ||
| "bfloat16": QuantType.QBFloat16, | ||
| } |
Contributor
Why not define a constant for this?
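One way the constant could look, sketched under assumptions: the `QuantType` enum below is a local stand-in so the snippet runs on its own (the real one comes from the quantizer library the PR uses), and the names `DTYPE_TO_QUANT_TYPE` and `to_quant_type` are hypothetical.

```python
from enum import Enum, auto


class QuantType(Enum):
    # Local stand-in (assumption) mirroring the member names used in the diff.
    QUInt8 = auto()
    QInt8 = auto()
    QUInt16 = auto()
    QInt16 = auto()
    QUInt32 = auto()
    QInt32 = auto()
    QFloat16 = auto()
    QBFloat16 = auto()


# Module-level constant: built once at import time instead of on every call.
DTYPE_TO_QUANT_TYPE = {
    "uint8": QuantType.QUInt8,
    "int8": QuantType.QInt8,
    "uint16": QuantType.QUInt16,
    "int16": QuantType.QInt16,
    "uint32": QuantType.QUInt32,
    "int32": QuantType.QInt32,
    "float16": QuantType.QFloat16,
    "bfloat16": QuantType.QBFloat16,
}


def to_quant_type(dtype: str) -> QuantType:
    """Look up the QuantType for a dtype string, with a readable error."""
    try:
        return DTYPE_TO_QUANT_TYPE[dtype]
    except KeyError:
        supported = ", ".join(sorted(DTYPE_TO_QUANT_TYPE))
        raise ValueError(f"Unsupported dtype {dtype!r}. Supported: {supported}") from None
```

Hoisting the mapping also makes the list of supported dtype strings reusable for validation and error messages.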
optimum/amd/ryzenai/configuration.py
Outdated
| if self.activations_dtype not in ["uint8", "int8"]: | ||
| raise ValueError('ipu cnn configuration only support activations_dtype "uint8" and "int8".') | ||
| if self.weights_dtype not in ["int8"]: | ||
| raise ValueError('ipu cnn configuration only support weights_dtype "int8".') |
Contributor
Suggested change
| raise ValueError('ipu cnn configuration only support weights_dtype "int8".') | |
| raise ValueError(f'ipu cnn configuration only support weights_dtype "int8". Got: weights_dtype={self.weights_dtype}') |
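The same idea could be applied to the neighbouring checks so that every message reports the value it rejected. The sketch below is an illustration only: `check_ipu_cnn_options` is a hypothetical free function taking plain strings, whereas the PR performs these checks on `self` inside the config class.

```python
def check_ipu_cnn_options(activations_dtype, weights_dtype, format, calibration_method):
    """Validate the IPU CNN constraints shown in the diff, echoing bad values."""
    if activations_dtype not in ["uint8", "int8"]:
        raise ValueError(
            'ipu cnn configuration only supports activations_dtype "uint8" and "int8". '
            f"Got: activations_dtype={activations_dtype}"
        )
    if weights_dtype not in ["int8"]:
        raise ValueError(
            'ipu cnn configuration only supports weights_dtype "int8". '
            f"Got: weights_dtype={weights_dtype}"
        )
    if format not in ["qdq"]:
        raise ValueError(
            f'ipu cnn configuration only supports format "qdq". Got: format={format}'
        )
    if calibration_method not in ["nonoverflow", "mse"]:
        raise ValueError(
            'ipu cnn configuration only supports calibration_method "nonoverflow" and "mse". '
            f"Got: calibration_method={calibration_method}"
        )
```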
optimum/amd/ryzenai/configuration.py
Outdated
| if self.activations_dtype not in ["uint8", "int8"]: | ||
| raise ValueError('ipu cnn configuration only support activations_dtype "uint8" and "int8".') | ||
| if self.weights_dtype not in ["int8"]: | ||
| raise ValueError('ipu cnn configuration only support weights_dtype "int8".') |
| raise ValueError('ipu cnn configuration only support calibration_method "nonoverflow" and "mse".') | ||
| if not (self.extra_options.activation_symmetric and self.extra_options.weight_symmetric): | ||
| raise ValueError( | ||
| "ipu cnn configuration requires setting activation_symmetric and weight_symmetric to true." |
optimum/amd/ryzenai/configuration.py
Outdated
| if self.format not in ["qdq"]: | ||
| raise ValueError('ipu cnn configuration only support format "qdq".') | ||
| if self.calibration_method not in ["nonoverflow", "mse"]: | ||
| raise ValueError('ipu cnn configuration only support calibration_method "nonoverflow" and "mse".') |
Comment on lines +394 to +417
| def to_diff_dict(self) -> dict: | ||
| """ | ||
| Returns a dictionary of non-default values in the configuration. | ||
| """ | ||
| non_default_values = {} | ||
| for option in fields(self): | ||
| if option.name == "extra_options": | ||
| extra_options_dict = getattr(self, option.name).to_diff_dict() | ||
| if extra_options_dict: | ||
| non_default_values[option.name] = extra_options_dict | ||
| else: | ||
| value = getattr(self, option.name) | ||
|
|
||
| if value != option.default and value not in ({}, []): | ||
| if option.name == "execution_providers" and value == ["CPUExecutionProvider"]: | ||
| continue | ||
|
|
||
| if isinstance(value, Enum): | ||
| value = value.name | ||
| elif isinstance(value, list): | ||
| value = [elem.name if isinstance(elem, Enum) else elem for elem in value] | ||
|
|
||
| non_default_values[option.name] = value | ||
| return non_default_values |
Contributor
To me, a method that compares two configs would be more interesting.
Contributor
Author
Is the compare method meant for loading quantization params from the config.json file?
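One possible reading of the compare suggestion, sketched under assumptions: a function that reports the fields on which two dataclass configs differ. `SketchConfig` and `diff_configs` are hypothetical names and the fields are simplified to plain values; this does not settle how the comparison would relate to config.json loading.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SketchConfig:
    # Hypothetical, simplified stand-in for the real configuration class.
    format: str = "qdq"
    calibration_method: str = "mse"
    execution_providers: list = field(default_factory=lambda: ["CPUExecutionProvider"])


def diff_configs(a, b) -> dict:
    """Return {field_name: (value_in_a, value_in_b)} for every differing field."""
    if type(a) is not type(b):
        raise TypeError(f"Cannot compare {type(a).__name__} with {type(b).__name__}")
    differences = {}
    for option in fields(a):
        left, right = getattr(a, option.name), getattr(b, option.name)
        if left != right:
            differences[option.name] = (left, right)
    return differences
```

Comparing a config against a default-constructed instance would give roughly the information `to_diff_dict` returns, so the two approaches are not mutually exclusive.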
Update quantization for Ryzen SDK 1.1