[BUG] Premature precision/runtime validation when using --checkpoint DEFAULT_<PRECISION> #297

@hieunc278

Describe the bug
When a user passes a checkpoint such as --checkpoint DEFAULT_W4A16, the precision override
happens after argument parsing (in export_main, via determine_precision_from_checkpoint).
However, validate_precision_runtime() is called during parse_args(), at which point
parsed_args.precision still holds the model's DEFAULT_PRECISION (e.g. w4), not w4a16.
The result is a false "Model does not support runtime onnxruntime_genai with precision w4"
error, even though w4a16 + onnxruntime_genai is a valid combination.
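
The ordering problem can be illustrated with a minimal, self-contained sketch. It is not the actual qai_hub_models code: the two helpers are hypothetical stand-ins for determine_precision_from_checkpoint and validate_precision_runtime (their real signatures are not shown in this report), the support matrix is copied from the error message, and the resolve_first flag exists only to contrast the two orderings.

```python
import argparse

# Assumed support matrix, copied from the error message in this report.
SUPPORTED = {"w4": {"genie"}, "w4a16": {"genie", "onnxruntime_genai"}}
DEFAULT_PRECISION = "w4"


def precision_from_checkpoint(checkpoint, fallback):
    """Hypothetical stand-in for determine_precision_from_checkpoint."""
    prefix = "DEFAULT_"
    if checkpoint.startswith(prefix) and len(checkpoint) > len(prefix):
        # DEFAULT_W4A16 -> w4a16
        return checkpoint[len(prefix):].lower()
    return fallback


def validate(precision, runtime):
    """Hypothetical stand-in for validate_precision_runtime."""
    if runtime not in SUPPORTED.get(precision, set()):
        raise ValueError(
            f"Model does not support runtime {runtime} with precision {precision}."
        )


def parse_args(argv, resolve_first):
    parser = argparse.ArgumentParser()
    parser.add_argument("--checkpoint", default="DEFAULT")
    parser.add_argument("--target-runtime", default="genie")
    parser.add_argument("--precision", default=DEFAULT_PRECISION)
    args = parser.parse_args(argv)
    if resolve_first:
        # Possible fix: apply the checkpoint-derived precision before validating.
        args.precision = precision_from_checkpoint(args.checkpoint, args.precision)
    # With resolve_first=False this still sees the default precision (w4).
    validate(args.precision, args.target_runtime)
    return args
```

Calling parse_args with --checkpoint DEFAULT_W4A16 --target-runtime onnxruntime_genai raises when resolve_first is False and passes when it is True, which suggests one possible fix: resolve the precision from the checkpoint before validation, or defer validation until export_main has applied the override.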

To Reproduce
Steps to reproduce the behavior:

  1. Export the Llama 3.2 1B Instruct model for the onnxruntime_genai target runtime:

python -m qai_hub_models.models.llama_v3_2_1b_instruct.export \
  --device "SA7255P ADP" \
  --skip-inferencing \
  --skip-profiling \
  --output-dir ./ \
  --context-length 1024 \
  --checkpoint DEFAULT_W4A16 \
  --target-runtime onnxruntime_genai

  2. The export fails with an invalid-precision error:

Model does not support runtime onnxruntime_genai with precision w4. These combinations are supported:
w4: genie
w4a16: genie, onnxruntime_genai

Expected behavior
The export succeeds using the W4A16 checkpoint, since w4a16 with onnxruntime_genai is a supported combination.

Stack trace
python -m qai_hub_models.models.llama_v3_2_1b_instruct.export \
  --device "SA7255P ADP" \
  --skip-inferencing \
  --skip-profiling \
  --output-dir ./ \
  --context-length 1024 \
  --checkpoint DEFAULT_W4A16 \
  --target-runtime onnxruntime_genai
/mnt/disk1/qaihub/venv/lib/python3.10/site-packages/qai_hub_models/models/_shared/llm/model.py:91: FutureWarning: aimet_common package is deprecated since v2.20 and will be deleted in the future releases. Please directly import 👉 aimet_onnx.common 👈 instead.
import aimet_common.quantsim as qs
Unable to import cvxpy
Model does not support runtime onnxruntime_genai with precision w4. These combinations are supported:
w4: genie
w4a16: genie, onnxruntime_genai

Host configuration:

  • OS and version: Ubuntu 20.04
  • Browser: Chrome
  • QAI-Hub-Models version: 0.50
  • QAI-Hub client version: 0.47
