
exir "missing out vars" #3443

Open
@antmikinka

Description

The script I ran to reproduce this error:
python -m examples.models.llama2.export_llama --checkpoint /Users/anthonymikinka/executorch/llama-2-7b-chat/consolidated.00.pth --params /Users/anthonymikinka/executorch/llama-2-7b-chat/params.json -kv --use_sdpa_with_kv_cache --coreml --group_size 128 -qmode 8da4w -d fp32 --verbose --max_seq_length 512 -o "/Volumes/NVME 3/ExecuTorch Models"

Above this snippet the log contains many more of these EdgeOpOverload messages, but otherwise the MIL backend and default pipelines built. A lot of ops were removed earlier, before the MIL build. Below is the relevant terminal output and the traceback.

INFO:root:Failed converting '<EdgeOpOverload: quantized_decomposed.dequantize_per_token.default>: schema = quantized_decomposed::dequantize_per_token(Tensor input, Tensor scales, Tensor zero_points, int quant_min, int quant_max, ScalarType dtype, ScalarType output_dtype) -> Tensor' to its out variant with error: 'SchemaKind.out variant of operator quantized_decomposed::dequantize_per_token can't be found. We've found the schemas of all the overloads: ['quantized_decomposed::dequantize_per_token(Tensor input, Tensor scales, Tensor zero_points, int quant_min, int quant_max, ScalarType dtype, ScalarType output_dtype) -> Tensor']'

INFO:root:Failed converting '<EdgeOpOverload: quantized_decomposed.choose_qparams_per_token_asymmetric.default>: schema = quantized_decomposed::choose_qparams_per_token_asymmetric(Tensor input, ScalarType dtype) -> (Tensor, Tensor)' to its out variant with error: 'SchemaKind.out variant of operator quantized_decomposed::choose_qparams_per_token_asymmetric can't be found. We've found the schemas of all the overloads: ['quantized_decomposed::choose_qparams_per_token_asymmetric(Tensor input, ScalarType dtype) -> (Tensor, Tensor)']'

INFO:root:Failed converting '<EdgeOpOverload: quantized_decomposed.quantize_per_token.default>: schema = quantized_decomposed::quantize_per_token(Tensor input, Tensor scales, Tensor zero_points, int quant_min, int quant_max, ScalarType dtype) -> Tensor' to its out variant with error: 'SchemaKind.out variant of operator quantized_decomposed::quantize_per_token can't be found. We've found the schemas of all the overloads: ['quantized_decomposed::quantize_per_token(Tensor input, Tensor scales, Tensor zero_points, int quant_min, int quant_max, ScalarType dtype) -> Tensor']'

(The three messages above repeat several times before the traceback.)
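For context: these INFO lines come from ExecuTorch's pass that rewrites each functional op into its out variant (one that writes into a preallocated out tensor). For each op it looks up a registered .out overload and, finding only the functional schema, logs the failure; the hard error follows in the traceback below. A quick way to confirm what the dispatcher actually sees, assuming the library defining these ops is loaded in the current process (a diagnostic sketch, not part of the original report):

import torch

# Lists the registered overload names for this op packet, e.g. ['default'].
# If 'out' is absent from the list, the to-out-variant pass has nothing
# to rewrite the functional call into, which matches the log above.
print(torch.ops.quantized_decomposed.dequantize_per_token.overloads())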


Traceback (most recent call last):
  File "/opt/anaconda3/envs/executorch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/anaconda3/envs/executorch/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/anthonymikinka/executorch/examples/models/llama2/export_llama.py", line 30, in <module>
    main()  # pragma: no cover
  File "/Users/anthonymikinka/executorch/examples/models/llama2/export_llama.py", line 26, in main
    export_llama(modelname, args)
  File "/Users/anthonymikinka/executorch/examples/models/llama2/export_llama_lib.py", line 545, in export_llama
    return _export_llama(modelname, args)
  File "/Users/anthonymikinka/executorch/examples/models/llama2/export_llama_lib.py", line 869, in _export_llama
    builder = builder_exported_to_edge.to_backend(partitioners).to_executorch()
  File "/Users/anthonymikinka/executorch/examples/models/llama2/builder.py", line 319, in to_executorch
    self.export_program = self.edge_manager.to_executorch(
  File "/opt/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/program/_program.py", line 842, in to_executorch
    new_gm_res = p(new_gm)
  File "/opt/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/passes/infra/pass_base.py", line 40, in __call__
    res = self.call(graph_module)
  File "/opt/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/passes/__init__.py", line 422, in call
    raise RuntimeError(f"Missing out variants: {missing_out_vars}")
RuntimeError: Missing out variants: {'quantized_decomposed::dequantize_per_token', 'quantized_decomposed::choose_qparams_per_token_asymmetric', 'quantized_decomposed::dequantize_per_channel_group', 'quantized_decomposed::quantize_per_token'}
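The to_executorch() step requires every remaining op to have a registered .out overload, and the quantized_decomposed ops listed above only expose their functional schemas here. As a stopgap while the proper kernel registrations are unavailable, an out overload can be declared and routed through the functional op with torch.library. This is a minimal sketch for just one of the four ops, with the schema copied from the log; the dispatch key and the delegate-and-copy strategy are my assumptions, not the ExecuTorch source:

import torch
from torch.library import Library, impl

# Extend the existing "quantized_decomposed" namespace rather than redefining it.
lib = Library("quantized_decomposed", "FRAGMENT")

# Declare an .out overload mirroring the functional schema from the log,
# with a mutable `out` tensor appended (the standard out-variant convention).
lib.define(
    "dequantize_per_token.out(Tensor input, Tensor scales, Tensor zero_points, "
    "int quant_min, int quant_max, ScalarType dtype, ScalarType output_dtype, "
    "*, Tensor(a!) out) -> Tensor(a!)"
)

# Assumption: the functional op is already registered (otherwise the export
# would have failed earlier), so the out variant can simply delegate to it.
@impl(lib, "dequantize_per_token.out", "CompositeExplicitAutograd")
def dequantize_per_token_out(
    input, scales, zero_points, quant_min, quant_max, dtype, output_dtype, *, out
):
    result = torch.ops.quantized_decomposed.dequantize_per_token(
        input, scales, zero_points, quant_min, quant_max, dtype, output_dtype
    )
    return out.copy_(result)

The same pattern would apply to the other three ops in the error set; the proper fix is presumably ensuring that whatever library registers these out variants is built and imported before running the export script.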

cc @kimishpatel @YifanShenSZ @cymbalrush

Metadata

Labels

module: coreml (Issues related to Apple's Core ML delegation and code under backends/apple/coreml/)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
