[NVBUG 5801937] Disable dq_only by default (#777)

ajrasane · kevalmorabia97 · commit 4c499abdf66e · 2026-01-14T09:27:30.000+05:30
## What does this PR do?

**Type of change:** Bug fix

**Overview:** Disable dq_only flag by default in modelopt onnx quantization

## Testing

Able to build and run model with modelopt onnx Python CLI

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: No - dq_only is set to False by default
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update [Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**: No

## Summary by CodeRabbit

## Release Notes

* **Chores**
  * Updated quantization default behavior: Q/DQ (Quantize/Dequantize) nodes are now added by default instead of only Dequantize nodes.

Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
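
Because the default flips, callers that relied on the old DQ-only output would presumably need to pass `dq_only=True` explicitly. The sketch below illustrates that with a hypothetical stand-in, `quantize_stub`, which mirrors only the `dq_only` default from this PR. It is not the real `quantize` function, and the `onnx_path` parameter name and the returned dictionary are assumptions made for illustration.

```python
import inspect
from functools import partial


def quantize_stub(onnx_path: str, dq_only: bool = False) -> dict:
    """Hypothetical stand-in that mirrors only the dq_only default from this PR."""
    # The real function rewrites an ONNX graph; this stub only reports the mode.
    return {"onnx_path": onnx_path, "mode": "DQ-only" if dq_only else "Q/DQ"}


def default_of(func, name: str):
    """Return the declared default value of a keyword parameter."""
    return inspect.signature(func).parameters[name].default


# Callers that depended on the previous default can pin it explicitly.
quantize_dq_only = partial(quantize_stub, dq_only=True)

new_default = default_of(quantize_stub, "dq_only")  # default after this PR
implicit = quantize_stub("model.onnx")              # relies on the new default
pinned = quantize_dq_only("model.onnx")             # restores the old behavior
```

Pinning the flag at the call site keeps the behavior stable regardless of which default a given release ships with.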
diff --git a/modelopt/onnx/quantization/quantize.py b/modelopt/onnx/quantization/quantize.py
@@ -223,7 +223,7 @@ def quantize(
     high_precision_dtype: str = "fp16",
     mha_accumulation_dtype: str = "fp16",
     disable_mha_qdq: bool = False,
-    dq_only: bool = True,
+    dq_only: bool = False,
     block_size: int | None = None,
     use_zero_point: bool = False,
     passes: list[str] = ["concat_elimination"],
@@ -302,7 +302,7 @@ def quantize(
         disable_mha_qdq:
             Don't add Q/DQ layers to MatMuls in MHA pattern.
         dq_only:
-            If True (default), only add DQ nodes to the model. If False, add Q/DQ nodes to the model.
+            If True, only add DQ nodes to the model. If False (default), add Q/DQ nodes to the model.
         block_size:
             Block size parameter for int4 quantization.
         use_zero_point: