
SGLang cannot load AutoRound model #982

@haofanwang

Description


I hit KeyError: 'model.layers.0.mlp.down_proj.qweight' when loading Intel/Qwen2.5-VL-7B-Instruct-int4-mixed-AutoRound.

Here is the code to reproduce.

import sglang as sgl
llm = sgl.Engine(model_path="Intel/Qwen2.5-VL-7B-Instruct-int4-mixed-AutoRound")

I also used the following script to quantize the model myself.

from auto_round import AutoRoundMLLM

# Load the model
model_name_or_path = "Qwen/Qwen2.5-VL-7B-Instruct"
# Quantize the model
ar = AutoRoundMLLM(model_name_or_path, scheme="W4A16")
output_dir = "./qmodel"
ar.quantize_and_save(output_dir)

Quantization succeeds, but loading the saved model fails with the same error. Versions: sglang==0.5.4.post2, auto_round==0.8.0, transformers==4.57.1.

import sglang as sgl
llm = sgl.Engine(model_path="./qmodel")
