I encountered `KeyError: 'model.layers.0.mlp.down_proj.qweight'` when loading Intel/Qwen2.5-VL-7B-Instruct-int4-mixed-AutoRound with SGLang.
Here is the code to reproduce:

```python
import sglang as sgl

llm = sgl.Engine(model_path="Intel/Qwen2.5-VL-7B-Instruct-int4-mixed-AutoRound")
```
I also used the following script to quantize the model myself:

```python
from auto_round import AutoRoundMLLM

# Load the model
model_name_or_path = "Qwen/Qwen2.5-VL-7B-Instruct"

# Quantize the model
ar = AutoRoundMLLM(model_name_or_path, scheme="W4A16")
output_dir = "./qmodel"
ar.quantize_and_save(output_dir)
```
The quantization itself succeeds, but loading the saved model fails the same way. I am using sglang==0.5.4.post2, auto_round==0.8.0, transformers==4.57.1.

```python
import sglang as sgl

llm = sgl.Engine(model_path="./qmodel")
```
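As background on why this particular key can be missing: in a *mixed*-precision AutoRound checkpoint, some linear layers may be stored quantized (as `qweight`/`scales`/`qzeros` tensors) while others are kept in full precision (as a plain `weight` tensor). A loader that assumes every layer is quantized will raise exactly this kind of `KeyError` on the unquantized layers. The sketch below is a hypothetical diagnostic (the `classify_layer` helper and the simulated key set are not from SGLang or AutoRound) showing how one could classify a layer from the tensor names listed in the checkpoint (e.g. from `model.safetensors.index.json`):

```python
# Hypothetical diagnostic: given the set of tensor names in a checkpoint,
# decide whether a linear layer was quantized, kept in full precision,
# or is missing entirely. The key set below is simulated for illustration.

def classify_layer(keys, prefix):
    """Classify one linear layer by which tensor names exist for it."""
    if prefix + ".qweight" in keys:
        return "quantized"        # AutoRound-style packed int weights
    if prefix + ".weight" in keys:
        return "full_precision"   # layer left unquantized by the mixed scheme
    return "missing"

# Simulated tensor names standing in for a real checkpoint index.
keys = {
    "model.layers.0.mlp.gate_proj.qweight",
    "model.layers.0.mlp.gate_proj.scales",
    "model.layers.0.mlp.down_proj.weight",  # kept in higher precision
}

print(classify_layer(keys, "model.layers.0.mlp.down_proj"))  # full_precision
print(classify_layer(keys, "model.layers.0.mlp.gate_proj"))  # quantized
```

If `down_proj` turns out to be stored as a plain `weight` in the real checkpoint, the failure would point at the loader not consulting the per-layer quantization config rather than at the checkpoint being corrupt.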