Commit 4c499ab

ajrasane authored and kevalmorabia97 committed
[NVBUG 5801937] Disable dq_only by default (#777)
## What does this PR do?

**Type of change:** Bug fix

**Overview:** Disable the `dq_only` flag by default in modelopt ONNX quantization.

## Testing

Able to build and run the model with the modelopt ONNX Python CLI.

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: No - `dq_only` is set to False by default
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: No

## Summary by CodeRabbit

## Release Notes

* **Chores**
  * Updated quantization default behavior: Q/DQ (Quantize/Dequantize) nodes are now added by default instead of only Dequantize nodes.

Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
1 parent 841559e commit 4c499ab

1 file changed

Lines changed: 2 additions & 2 deletions

modelopt/onnx/quantization/quantize.py

@@ -223,7 +223,7 @@ def quantize(
     high_precision_dtype: str = "fp16",
     mha_accumulation_dtype: str = "fp16",
     disable_mha_qdq: bool = False,
-    dq_only: bool = True,
+    dq_only: bool = False,
     block_size: int | None = None,
     use_zero_point: bool = False,
     passes: list[str] = ["concat_elimination"],
@@ -302,7 +302,7 @@ def quantize(
         disable_mha_qdq:
             Don't add Q/DQ layers to MatMuls in MHA pattern.
         dq_only:
-            If True (default), only add DQ nodes to the model. If False, add Q/DQ nodes to the model.
+            If True, only add DQ nodes to the model. If False (default), add Q/DQ nodes to the model.
         block_size:
             Block size parameter for int4 quantization.
         use_zero_point:
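
Because the default flips from `True` to `False`, any caller that omits `dq_only` gets different behavior after this commit, while a caller that passes it explicitly is unaffected. The sketch below illustrates that with two stand-in functions; `quantize_before`, `quantize_after`, and `default_of` are hypothetical names made up for this example and are not part of modelopt. Only the `dq_only` parameter and its two default values come from the diff.

```python
import inspect


def quantize_before(onnx_path: str, dq_only: bool = True) -> str:
    """Hypothetical stand-in for the signature before this commit."""
    return "DQ only" if dq_only else "Q/DQ"


def quantize_after(onnx_path: str, dq_only: bool = False) -> str:
    """Hypothetical stand-in for the signature after this commit."""
    return "DQ only" if dq_only else "Q/DQ"


def default_of(func, name: str):
    """Return the declared default value of parameter `name`."""
    return inspect.signature(func).parameters[name].default


# A call that omits dq_only changes behavior across the commit.
implicit_before = quantize_before("model.onnx")
implicit_after = quantize_after("model.onnx")

# A call that passes dq_only explicitly behaves the same on both sides.
pinned_before = quantize_before("model.onnx", dq_only=True)
pinned_after = quantize_after("model.onnx", dq_only=True)

print(default_of(quantize_before, "dq_only"), default_of(quantize_after, "dq_only"))
print(implicit_before, "->", implicit_after)
print(pinned_before, "->", pinned_after)
```

Under these assumptions, callers that relied on the old DQ-only output would need to pass `dq_only=True` explicitly to keep it.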
