System Info
accelerate versions 1.11.0, 1.12.0, 1.13.0
Information
Tasks
One of the no_trainer scripts in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
I'm trying to load the model evenly onto 2 of my 4 GPUs, but inferring the device map is difficult.
Reproduction
To reproduce:

import torch
from accelerate import init_empty_weights, infer_auto_device_map
from accelerate.utils import get_balanced_memory
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("google/gemma-4-E4B-it")
with init_empty_weights():
    # from_config expects a config object, not a model id string
    empty_model = AutoModelForCausalLM.from_config(config)

n_devices = torch.cuda.device_count()  # 4 GPUs on this machine
# Give GPUs 2 and 3 zero memory so the model is balanced across GPUs 0 and 1 only
memalloc = {"cpu": "8GiB", 0: "48GiB", 1: "48GiB", 2: "0GiB", 3: "0GiB"}
balanced_memory = get_balanced_memory(empty_model, max_memory=memalloc)
device_map = infer_auto_device_map(empty_model, max_memory=balanced_memory)
# The resulting device_map handles layer 8 strangely: its submodules are
# mapped, but model.language_model.layers.8.layer_scalar is left out
model = AutoModelForCausalLM.from_pretrained("google/gemma-4-E4B-it", device_map=device_map)
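Before calling from_pretrained, you can check which parameters the inferred map fails to cover by walking each parameter name up the dotted module path and looking for any prefix that appears in the device_map. This is a hypothetical diagnostic sketch (find_unmapped and the toy map below are illustrative, not part of accelerate's API):

```python
def find_unmapped(param_names, device_map):
    """Return parameter names that no device_map entry (or prefix of one) covers."""
    unmapped = []
    for name in param_names:
        node = name
        # Walk up the dotted module path looking for a matching map entry.
        while node and node not in device_map:
            node = node.rsplit(".", 1)[0] if "." in node else ""
        if not node:
            unmapped.append(name)
    return unmapped

# Toy illustration mirroring the failure in this issue:
toy_map = {
    "model.language_model.layers.7": 0,
    "model.language_model.layers.8.mlp": 1,
}
params = [
    "model.language_model.layers.7.mlp.weight",
    "model.language_model.layers.8.layer_scalar",
]
print(find_unmapped(params, toy_map))
```

Running this against the actual device_map produced above should flag model.language_model.layers.8.layer_scalar as uncovered.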
Expected behavior
This raises:

ValueError: The device_map provided does not give any device for the following parameters: model.language_model.layers.8.layer_scalar
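As a stopgap until the inference bug is fixed, one possible workaround is to patch the inferred device_map before loading, placing each missing parameter on the same device as its nearest mapped ancestor. This is a sketch under the assumption that the stray parameter should live with its parent layer; patch_device_map is a hypothetical helper, not an accelerate API:

```python
def patch_device_map(device_map, missing_params):
    """Assign each missing parameter to the device of its nearest mapped ancestor."""
    for name in missing_params:
        node = name
        # Walk up the dotted module path until we hit an entry in the map.
        while node and node not in device_map:
            node = node.rsplit(".", 1)[0] if "." in node else ""
        if node:
            device_map[name] = device_map[node]
    return device_map

# Toy illustration using the parameter from this issue's traceback:
toy_map = {"model.language_model.layers.8": 1}
patched = patch_device_map(dict(toy_map), ["model.language_model.layers.8.layer_scalar"])
print(patched["model.language_model.layers.8.layer_scalar"])  # 1
```

The patched map can then be passed as device_map= to from_pretrained in place of the inferred one.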