Add quantization options #102

Open
mht-sharma wants to merge 27 commits into main from add_quantization_options

Conversation

@mht-sharma
Contributor

@mht-sharma mht-sharma commented Mar 11, 2024

Update quantization for Ryzen SDK 1.1

@mht-sharma
Contributor Author

Doc build fix in: #141

Determines whether to generate a quantized model that is suitable for the DPU. If set to True, the quantization
process will create a model that is optimized for DPU computations.

format (Union[QuantFormat, str], defaults to `QuantFormat.QDQ`):
Contributor

Suggested change
format (Union[QuantFormat, str], defaults to `QuantFormat.QDQ`):
format (`Union[QuantFormat, str]`, defaults to `QuantFormat.QDQ`):

Contributor Author

Done!

into the tensor. Supports a wider range of bit-widths and precisions.
- `QuantFormat.FixNeuron` (Experimental): Quantizes the model by inserting FixNeuron (a combination of
QuantizeLinear and DeQuantizeLinear) into the tensor. Experimental and not recommended for deployment.
calibration_method (Union[CalibrationMethod, str], defaults to `CalibrationMethod.MinMSE`):
Contributor

same

Contributor Author

Done!

- `CalibrationMethod.MinMax`: Obtain quantization parameters based on minimum and maximum values of each tensor.
- `CalibrationMethod.Entropy`: Determine quantization parameters based on the entropy algorithm of each tensor's distribution.
- `CalibrationMethod.Percentile`: Calculate quantization parameters using percentiles of tensor values.
activations_dtype (QuantType, defaults to `QuantType.QUInt8`):
Contributor

same (and all other args below)

Contributor Author

Done!
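
To make the difference between the `MinMax` and `Percentile` calibration options in the docstring above concrete, here is a minimal sketch in plain Python. It is not the quantizer's implementation: the function names, the nearest-rank percentile, and the uint8 affine formula are all assumptions chosen for illustration.

```python
def minmax_range(values):
    """Range taken directly from the smallest and largest observed value."""
    return min(values), max(values)


def percentile_range(values, percentile=99.0):
    """Range covering the central `percentile` percent (nearest-rank, illustrative)."""
    ordered = sorted(values)
    tail = (100.0 - percentile) / 2.0 / 100.0  # fraction trimmed from each end
    last = len(ordered) - 1
    low = ordered[round(tail * last)]
    high = ordered[round((1.0 - tail) * last)]
    return low, high


def uint8_params(low, high):
    """Affine uint8 parameters (scale, zero_point) for a float range."""
    # The representable range must contain zero so that 0.0 maps to an integer.
    low, high = min(low, 0.0), max(high, 0.0)
    scale = (high - low) / 255.0 or 1.0  # avoid a zero scale for an all-zero tensor
    zero_point = round(-low / scale)
    return scale, zero_point


# Hypothetical activations: values in [-1, 1] plus a single large outlier.
activations = [i / 100 for i in range(-100, 101)] + [50.0]
minmax_scale, _ = uint8_params(*minmax_range(activations))
percentile_scale, _ = uint8_params(*percentile_range(activations))
```

With an outlier present, the min/max range is stretched to include it, so `minmax_scale` is much coarser than `percentile_scale`; the percentile range trades clipping of the outlier for finer resolution elsewhere.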

Comment on lines +420 to +422
def check_dtype_and_format(dtype, dtype_name, format):
    if dtype not in ["uint8", "int8"] and format not in ["vitisqdq"]:
        raise ValueError(f'{dtype_name} is: "{dtype}", format must be "vitisqdq".')
Contributor

I don't think this is needed, clearer to have it inlined in the post-init.

Contributor Author

Done!
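
As a rough illustration of what inlining the check into the post-init could look like, here is a minimal sketch. `SketchQuantizationConfig` is a hypothetical stand-in with only three fields and string-typed values; the real config class in this PR is not shown here and may differ.

```python
from dataclasses import dataclass


@dataclass
class SketchQuantizationConfig:
    """Hypothetical stand-in for the config class under review."""

    format: str = "qdq"
    activations_dtype: str = "uint8"
    weights_dtype: str = "int8"

    def __post_init__(self):
        # Same condition as check_dtype_and_format, written inline per dtype field.
        for name in ("activations_dtype", "weights_dtype"):
            dtype = getattr(self, name)
            if dtype not in ["uint8", "int8"] and self.format not in ["vitisqdq"]:
                raise ValueError(f'{name} is: "{dtype}", format must be "vitisqdq".')
```

Constructing `SketchQuantizationConfig(activations_dtype="uint16")` raises a `ValueError`, while the same dtype with `format="vitisqdq"` is accepted.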

Comment on lines +444 to +453
mapping = {
    "uint8": QuantType.QUInt8,
    "int8": QuantType.QInt8,
    "uint16": QuantType.QUInt16,
    "int16": QuantType.QInt16,
    "uint32": QuantType.QUInt32,
    "int32": QuantType.QInt32,
    "float16": QuantType.QFloat16,
    "bfloat16": QuantType.QBFloat16,
}
Contributor

Why not define a constant for this?

Contributor Author

Done!
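
For readers following the thread, a module-level constant for this mapping might look like the sketch below. The `QuantType` enum here is a hypothetical stand-in with arbitrary values so the example runs on its own, and the names `DTYPE_TO_QUANT_TYPE` and `to_quant_type` are assumptions, not names taken from the PR.

```python
from enum import Enum


class QuantType(Enum):
    """Hypothetical stand-in for the real QuantType enum; values are arbitrary."""

    QInt8 = 0
    QUInt8 = 1
    QInt16 = 2
    QUInt16 = 3
    QInt32 = 4
    QUInt32 = 5
    QFloat16 = 6
    QBFloat16 = 7


# Defined once at module level instead of being rebuilt inside a method.
DTYPE_TO_QUANT_TYPE = {
    "uint8": QuantType.QUInt8,
    "int8": QuantType.QInt8,
    "uint16": QuantType.QUInt16,
    "int16": QuantType.QInt16,
    "uint32": QuantType.QUInt32,
    "int32": QuantType.QInt32,
    "float16": QuantType.QFloat16,
    "bfloat16": QuantType.QBFloat16,
}


def to_quant_type(dtype: str) -> QuantType:
    """Look up the enum member for a dtype string, with a readable error."""
    try:
        return DTYPE_TO_QUANT_TYPE[dtype]
    except KeyError:
        supported = ", ".join(DTYPE_TO_QUANT_TYPE)
        raise ValueError(f'Unsupported dtype "{dtype}". Supported: {supported}') from None
```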

if self.activations_dtype not in ["uint8", "int8"]:
    raise ValueError('ipu cnn configuration only support activations_dtype "uint8" and "int8".')
if self.weights_dtype not in ["int8"]:
    raise ValueError('ipu cnn configuration only support weights_dtype "int8".')
Contributor

Suggested change
raise ValueError('ipu cnn configuration only support weights_dtype "int8".')
raise ValueError(f'ipu cnn configuration only support weights_dtype "int8". Got: weights_dtype={self.weights_dtype}')

Contributor Author

Done

if self.activations_dtype not in ["uint8", "int8"]:
    raise ValueError('ipu cnn configuration only support activations_dtype "uint8" and "int8".')
if self.weights_dtype not in ["int8"]:
    raise ValueError('ipu cnn configuration only support weights_dtype "int8".')
Contributor

same

Contributor Author

Done

    raise ValueError('ipu cnn configuration only support calibration_method "nonoverflow" and "mse".')
if not (self.extra_options.activation_symmetric and self.extra_options.weight_symmetric):
    raise ValueError(
        "ipu cnn configuration requires setting activation_symmetric and weight_symmetric to true."
Contributor

same

Contributor Author

Done

if self.format not in ["qdq"]:
    raise ValueError('ipu cnn configuration only support format "qdq".')
if self.calibration_method not in ["nonoverflow", "mse"]:
    raise ValueError('ipu cnn configuration only support calibration_method "nonoverflow" and "mse".')
Contributor

same
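
The suggestion repeated across these threads, reporting the received value in each error, could be applied to every check along the lines of the sketch below. The standalone function `validate_ipu_cnn_options` and its string-typed parameters are assumptions made so the example runs by itself; in the PR these checks operate on `self` inside the config class.

```python
def validate_ipu_cnn_options(activations_dtype, weights_dtype, format, calibration_method):
    """Hypothetical standalone version of the ipu cnn checks, each reporting what it got."""
    if activations_dtype not in ["uint8", "int8"]:
        raise ValueError(
            'ipu cnn configuration only support activations_dtype "uint8" and "int8". '
            f"Got: activations_dtype={activations_dtype}"
        )
    if weights_dtype not in ["int8"]:
        raise ValueError(
            'ipu cnn configuration only support weights_dtype "int8". '
            f"Got: weights_dtype={weights_dtype}"
        )
    if format not in ["qdq"]:
        raise ValueError(
            'ipu cnn configuration only support format "qdq". '
            f"Got: format={format}"
        )
    if calibration_method not in ["nonoverflow", "mse"]:
        raise ValueError(
            'ipu cnn configuration only support calibration_method "nonoverflow" and "mse". '
            f"Got: calibration_method={calibration_method}"
        )
```

Including the offending value saves the user from having to print the config to find out which setting triggered the error.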

Comment on lines +394 to +417
def to_diff_dict(self) -> dict:
    """
    Returns a dictionary of non-default values in the configuration.
    """
    non_default_values = {}
    for option in fields(self):
        if option.name == "extra_options":
            extra_options_dict = getattr(self, option.name).to_diff_dict()
            if extra_options_dict:
                non_default_values[option.name] = extra_options_dict
        else:
            value = getattr(self, option.name)

            if value != option.default and value not in ({}, []):
                if option.name == "execution_providers" and value == ["CPUExecutionProvider"]:
                    continue

                if isinstance(value, Enum):
                    value = value.name
                elif isinstance(value, list):
                    value = [elem.name if isinstance(elem, Enum) else elem for elem in value]

                non_default_values[option.name] = value
    return non_default_values
Contributor

To me what would be more interesting is a method to compare two configs.

Contributor Author

Is the compare method for loading quantization params from config.json file?
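
The thread does not settle what a compare method would be used for. One possible reading, comparing two config instances field by field, is sketched below. `SketchConfig`, `SketchExtraOptions`, and `diff_configs` are hypothetical names with a reduced set of fields, not part of the PR.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class SketchExtraOptions:
    """Hypothetical stand-in for the nested extra options."""

    activation_symmetric: bool = False
    weight_symmetric: bool = True


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the quantization config."""

    format: str = "qdq"
    calibration_method: str = "mse"
    extra_options: SketchExtraOptions = field(default_factory=SketchExtraOptions)


def diff_configs(left, right) -> dict:
    """Return {field_name: (left_value, right_value)} for every field that differs."""
    if type(left) is not type(right):
        raise TypeError("can only compare configs of the same type")
    differences = {}
    for option in fields(left):
        a = getattr(left, option.name)
        b = getattr(right, option.name)
        if is_dataclass(a) and is_dataclass(b):
            # Recurse into nested dataclasses such as extra_options.
            nested = diff_configs(a, b)
            if nested:
                differences[option.name] = nested
        elif a != b:
            differences[option.name] = (a, b)
    return differences
```

Under this reading, `to_diff_dict` is roughly the special case of comparing a config against a default-constructed instance.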

3 participants