Gemma 3 4B export input seqlen limit #210

@kamalkraj

@jackzhxng @guangy10
Currently the dynamic input shape is limited to sliding_window_len - 1. As a result, the maximum input in a single turn is limited to 1023 tokens.

max_seq_len = min(max_seq_len, sliding_window_len) - 1
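To make the arithmetic concrete, here is a minimal sketch of that clamp. It assumes a sliding window of 1024 tokens, which is inferred from the 1023-token limit and is not read from the model config; the helper name `effective_max_seq_len` is hypothetical and exists only for illustration.

```python
def effective_max_seq_len(max_seq_len: int, sliding_window_len: int) -> int:
    """Hypothetical helper mirroring the clamp quoted above."""
    return min(max_seq_len, sliding_window_len) - 1


# Assumed value: a 1024-token sliding window (inferred, not read from the config).
SLIDING_WINDOW_LEN = 1024

# Requesting the full context length still yields the window-bound limit.
print(effective_max_seq_len(131072, SLIDING_WINDOW_LEN))  # 1023

# A request below the window is only reduced by one.
print(effective_max_seq_len(512, SLIDING_WINDOW_LEN))  # 511
```

Under this assumption, any `--max_seq_len` at or above the window size collapses to the same 1023-token input limit.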

See also the code change from PR #209.

I tried raising the limit by hardcoding a larger value in the code and installing from source, but the export then fails with an error from Dynamo. Is this a Dynamo limitation when exporting sliding window attention?

optimum-cli export executorch \
    --model "google/gemma-3-4b-it" \
    --task "multimodal-text-to-text" \
    --recipe "xnnpack" \
    --device cpu \
    --use_custom_sdpa \
    --use_custom_kv_cache \
    --qlinear 8da4w \
    --qlinear_group_size 32 \
    --qlinear_encoder "8da4w,8da8w" \
    --qlinear_encoder_group_size 32 \
    --qembedding "8w" \
    --qembedding_encoder "8w" \
    --max_seq_len 131072 \
    --output_dir="gemma-3-4b-it-8da4w-executorch"
W0202 11:12:25.038000 61595 torch/distributed/elastic/multiprocessing/redirects.py:29] NOTE: Redirects are currently not supported in Windows or MacOs.
Loading weights: 100%|████████| 883/883 [00:10<00:00, 88.29it/s, Materializing param=model.vision_tower.vision_model.post_layernorm.weight]
`torch_dtype` is deprecated! Use `dtype` instead!
WARNING:coremltools:scikit-learn version 1.7.1 is not supported. Minimum required version: 0.17. Maximum required version: 1.5.1. Disabling scikit-learn conversion API.
WARNING:coremltools:Torch version 2.9.0 has not been tested with coremltools. You may run into unexpected errors. Torch 2.7.0 is the most recent version that has been tested.
I tokenizers:regex.cpp:27] Registering override fallback regex
`HybridCache` is deprecated and will be removed in version v4.59 Use `StaticCache(...)` instead which will correctly infer the type of each layer.
/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/output_graph.py:1711: UserWarning: While exporting, we found certain side effects happened in the model.forward. Here are the list of potential sources you can double check: ["L['self'].cache"]
  warnings.warn(
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0] Error while creating guard:
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0] Name: ''
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     Source: shape_env
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     Create Function: SHAPE_ENV
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     Guard Types: None
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     Code List: None
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     Object Weakref: None
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     Guarded Class Weakref: None
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0] Traceback (most recent call last):
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]   File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_guards.py", line 366, in create
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     return self.create_fn(builder, self)
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]   File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/guards.py", line 2537, in SHAPE_ENV
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     _get_code_parts(("python", "verbose_python", "cpp"))  # type: ignore[assignment]
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]   File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/guards.py", line 2522, in _get_code_parts
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     return output_graph.shape_env.produce_guards_verbose(
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]   File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/fx/experimental/symbolic_shapes.py", line 5928, in produce_guards_verbose
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]     raise ConstraintViolationError(
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0] torch.fx.experimental.symbolic_shapes.ConstraintViolationError: Constraints violated (seq_length_dim)! For more information, run with TORCH_LOGS="+dynamic".
E0202 11:13:07.471000 61595 torch/_guards.py:368] [0/0]   - Not all values of seq_length_dim = L['inputs_embeds'].size()[1] in the specified range seq_length_dim <= 131072 satisfy the generated guard 2 <= L['inputs_embeds'].size()[1] and L['inputs_embeds'].size()[1] <= 1024
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0] Created at:
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0]   File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py", line 773, in trace_frame
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0]     tracer = InstructionTranslator(
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0]   File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py", line 3847, in __init__
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0]     output=OutputGraph(
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0]   File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/output_graph.py", line 508, in __init__
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0]     self.init_ambient_guards()
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0]   File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/output_graph.py", line 668, in init_ambient_guards
E0202 11:13:07.476000 61595 torch/_guards.py:370] [0/0]     self.guards.add(ShapeEnvSource().make_guard(GuardBuilder.SHAPE_ENV))
Traceback (most recent call last):
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 812, in _export_to_torch_ir
    gm_torch_level, _ = torch._dynamo.export(
                        ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 2047, in inner
    raise constraint_violation_error
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 2002, in inner
    result_traced = opt_f(*args, **kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 414, in __call__
    return super().__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 832, in compile_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py", line 1874, in __call__
    result = self._torchdynamo_orig_backend(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py", line 688, in __call__
    result = _compile(
             ^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py", line 1433, in _compile
    guarded_code, tracer_output = compile_inner(code, one_graph, hooks)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_utils_internal.py", line 92, in wrapper_function
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py", line 1117, in compile_inner
    return _compile_inner(code, one_graph, hooks)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py", line 1251, in _compile_inner
    check_fn = dynamo_output.build_guards(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py", line 856, in build_guards
    return CheckFunctionManager(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/guards.py", line 3383, in __init__
    builder, guard_manager = self.build_guards(
                             ^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/guards.py", line 3674, in build_guards
    guard.create(builder)
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_guards.py", line 366, in create
    return self.create_fn(builder, self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/guards.py", line 2537, in SHAPE_ENV
    _get_code_parts(("python", "verbose_python", "cpp"))  # type: ignore[assignment]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/_dynamo/guards.py", line 2522, in _get_code_parts
    return output_graph.shape_env.produce_guards_verbose(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/fx/experimental/symbolic_shapes.py", line 5928, in produce_guards_verbose
    raise ConstraintViolationError(
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: Constraints violated (seq_length_dim)! For more information, run with TORCH_LOGS="+dynamic".
  - Not all values of seq_length_dim = L['inputs_embeds'].size()[1] in the specified range seq_length_dim <= 131072 satisfy the generated guard 2 <= L['inputs_embeds'].size()[1] and L['inputs_embeds'].size()[1] <= 1024

Suggested fixes:
  seq_length_dim = Dim('seq_length_dim', max=1024)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/bin/optimum-cli", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/optimum/commands/optimum_cli.py", line 219, in main
    service.run()
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/optimum/commands/export/executorch.py", line 267, in run
    main_export(
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/optimum/exporters/executorch/__main__.py", line 145, in main_export
    return export_to_executorch(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/optimum/exporters/executorch/convert.py", line 79, in export_to_executorch
    executorch_progs = recipe_func(model, **kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/optimum/exporters/executorch/recipes/xnnpack.py", line 113, in export_to_executorch_with_xnnpack
    exported_progs = model.export()
                     ^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/optimum/exporters/executorch/integrations.py", line 336, in export
    exported_program = exportable_module.export(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/transformers/integrations/executorch.py", line 330, in export
    exported_program = torch.export.export(
                       ^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/__init__.py", line 311, in export
    raise e
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/__init__.py", line 277, in export
    return _export(
           ^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 1163, in wrapper
    raise e
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 1129, in wrapper
    ep = fn(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/exported_program.py", line 124, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 2255, in _export
    ep = _export_for_training(
         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 1163, in wrapper
    raise e
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 1129, in wrapper
    ep = fn(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/exported_program.py", line 124, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 2071, in _export_for_training
    export_artifact = export_func(
                      ^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 1415, in _strict_export
    gm_torch_level = _export_to_torch_ir(
                     ^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamalkraj/Documents/Github/optimum-executorch/.venv/lib/python3.12/site-packages/torch/export/_trace.py", line 827, in _export_to_torch_ir
    raise UserError(UserErrorType.CONSTRAINT_VIOLATION, str(e))  # noqa: B904
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch._dynamo.exc.UserError: Constraints violated (seq_length_dim)! For more information, run with TORCH_LOGS="+dynamic".
  - Not all values of seq_length_dim = L['inputs_embeds'].size()[1] in the specified range seq_length_dim <= 131072 satisfy the generated guard 2 <= L['inputs_embeds'].size()[1] and L['inputs_embeds'].size()[1] <= 1024

Suggested fixes:
  seq_length_dim = Dim('seq_length_dim', max=1024)

The error above occurred when calling torch.export.export. If you would like to view some more information about this error, and get a list of all other errors that may occur in your export call, you can replace your `export()` call with `draft_export()`.
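Read as an interval check, the message says that the requested dynamic range for `seq_length_dim` (up to 131072) is not contained in the range the traced graph is valid for (2 to 1024). The sketch below restates only the upper-bound part of that check in plain Python. It is not PyTorch's implementation, and the function name `violates_guard` is hypothetical.

```python
def violates_guard(requested_max: int, guard_max: int) -> bool:
    """True when some value in the requested range exceeds the guard's upper bound."""
    return requested_max > guard_max


# Upper bound taken from the log: L['inputs_embeds'].size()[1] <= 1024
GUARD_MAX = 1024

print(violates_guard(131072, GUARD_MAX))  # True: matches the failing export
print(violates_guard(1023, GUARD_MAX))    # False: the current clamped limit fits
```

This is consistent with the suggested fix in the log, `Dim('seq_length_dim', max=1024)`, which narrows the requested range until it fits inside the guard rather than lifting the guard itself.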
