Unable to infer valid auto device map for gemma-4-E4B-it #4014

@ShenranTomWang

Description

System Info

Reproduced with accelerate 1.11.0, 1.12.0, and 1.13.0.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • My own task or dataset (give details below)

I'm trying to load the model evenly onto 2 of my 4 GPUs, but the inferred device map does not cover all parameters.

Reproduction

To reproduce:

import torch
from accelerate import infer_auto_device_map, init_empty_weights
from accelerate.utils import get_balanced_memory
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("google/gemma-4-E4B-it")
with init_empty_weights():
    # from_config takes a config object, not a repo id
    empty_model = AutoModelForCausalLM.from_config(config)

# Cap usage at 48 GiB on GPUs 0 and 1; keep GPUs 2 and 3 empty
memalloc = {"cpu": "8GiB", 0: "48GiB", 1: "48GiB", 2: "0GiB", 3: "0GiB"}
balanced_memory = get_balanced_memory(empty_model, max_memory=memalloc)
device_map = infer_auto_device_map(empty_model, max_memory=balanced_memory)
# device_map splits layer 8 oddly and leaves one of its parameters unassigned
model = AutoModelForCausalLM.from_pretrained("google/gemma-4-E4B-it", device_map=device_map)

Expected behavior

I expect the model to load with the inferred device map. Instead, from_pretrained raises:

ValueError: The device_map provided does not give any device for the following parameters: model.language_model.layers.8.layer_scalar
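As a stopgap, the inferred device_map can be post-processed so that any parameter accelerate left unmapped inherits the device of its nearest mapped ancestor module. This is only a sketch for this issue; patch_device_map is a hypothetical helper, not an accelerate API:

```python
def patch_device_map(device_map, missing_params):
    """Assign each missing parameter to the device of its nearest mapped ancestor."""
    patched = dict(device_map)
    for name in missing_params:
        parts = name.split(".")
        # Walk up the dotted module path until we hit a prefix the map covers
        for i in range(len(parts) - 1, 0, -1):
            prefix = ".".join(parts[:i])
            if prefix in patched:
                patched[name] = patched[prefix]
                break
    return patched

# Toy example mirroring the error message above
dm = {"model.language_model.layers.8": 1}
fixed = patch_device_map(dm, ["model.language_model.layers.8.layer_scalar"])
print(fixed["model.language_model.layers.8.layer_scalar"])  # 1
```

Alternatively, passing no_split_module_classes to infer_auto_device_map (with the name of the affected decoder-layer class) may keep layer 8 on a single device and avoid the orphaned parameter in the first place.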
