[Bug][gemma-4-E4B-it]: KeyError: 'sliding_attention' #1837

@XuehaoSun

Problem Description

Quant log: log

  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/bin/auto-round", line 10, in <module>
    sys.exit(run())
             ^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/__main__.py", line 866, in run
    start()
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/__main__.py", line 596, in start
    tune(args)
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/__main__.py", line 807, in tune
    model, folders = autoround.quantize_and_save(args.output_dir, format=args.format)  # pylint: disable=E1101
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/compressors/base.py", line 1521, in quantize_and_save
    self.quantize()
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/compressors/data_driven.py", line 1140, in quantize
    return self._quantize_impl()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/compressors/data_driven.py", line 1166, in _quantize_impl
    self._quant_rtn_with_imatrix()
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/compressors/data_driven.py", line 1100, in _quant_rtn_with_imatrix
    self._quantize_via_rtn_blockwise()
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/compressors/data_driven.py", line 1018, in _quantize_via_rtn_blockwise
    input_ids = self.quantizer._get_block_outputs(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/algorithms/quantization/base.py", line 452, in _get_block_outputs
    tmp_output = _bf(
                 ^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/auto_round/compressors/utils.py", line 182, in block_forward
    output = block(input_ids, *input_tuple, **input_others)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/transformers/modeling_layers.py", line 93, in __call__
    return super().__call__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/transformers/models/gemma4/modeling_gemma4.py", line 1395, in forward
    hidden_states, _ = self.self_attn(
                       ^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/uttest/miniforge3/envs/autoround_v0.13.0_release/lib/python3.12/site-packages/transformers/models/gemma4/modeling_gemma4.py", line 1235, in forward
    key_states, value_states = shared_kv_states[self.layer_type]
                               ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'sliding_attention'
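
The last frame is a plain dictionary lookup, `shared_kv_states[self.layer_type]`, failing because no entry exists for `'sliding_attention'`. The sketch below reproduces that failure mode in isolation. It is not the transformers or auto-round code: `ToyAttention`, `is_kv_shared`, and the producer/consumer split are hypothetical names, and it assumes that some layers read key/value states an earlier layer stored in a shared dict. Under that assumption, calling a consuming block on its own with an empty dict raises the same error.

```python
# Hypothetical stand-in for an attention layer; names are illustrative only
# and do not come from transformers or auto-round.
class ToyAttention:
    def __init__(self, layer_type, is_kv_shared):
        self.layer_type = layer_type
        self.is_kv_shared = is_kv_shared  # assumed flag: reuse earlier K/V

    def forward(self, hidden, shared_kv_states):
        if self.is_kv_shared:
            # Same lookup shape as the failing line in the traceback.
            key_states, value_states = shared_kv_states[self.layer_type]
        else:
            key_states, value_states = hidden, hidden
            # A producing layer publishes its K/V for later layers.
            shared_kv_states[self.layer_type] = (key_states, value_states)
        return key_states, value_states


producer = ToyAttention("sliding_attention", is_kv_shared=False)
consumer = ToyAttention("sliding_attention", is_kv_shared=True)

# Whole-model order: the producer runs first, so the consumer finds its entry.
shared = {}
producer.forward([1.0], shared)
ok = consumer.forward([2.0], shared)

# Isolated block call with a fresh dict: nothing has populated the entry.
try:
    consumer.forward([2.0], {})
    error = None
except KeyError as exc:
    error = exc.args[0]
```

If the real cause matches this pattern, the dict passed to each block during block-wise forwarding would need to carry the entries written by earlier blocks; whether that is what happens here has not been confirmed.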

Reproduction Steps

auto-round --model_name /models/gemma-4-E4B-it --bits 4 --iters 0 --tasks lambada_openai

Environment Information

No response

Error Logs

Additional Context

No response
