I encountered `KeyError: 'model.layers.0.mlp.down_proj.qweight'` when loading Intel/Qwen2.5-VL-7B-Instruct-int4-mixed-AutoRound with SGLang.
Here is the code to reproduce:

```python
import sglang as sgl

llm = sgl.Engine(model_path="Intel/Qwen2.5-VL-7B-Instruct-int4-mixed-AutoRound")
```
I also used the following script to quantize the model myself:

```python
from auto_round import AutoRoundMLLM

# Load the model
model_name_or_path = "Qwen/Qwen2.5-VL-7B-Instruct"

# Quantize the model
ar = AutoRoundMLLM(model_name_or_path, scheme="W4A16")
output_dir = "./qmodel"
ar.quantize_and_save(output_dir)
```
The quantization itself succeeds, but loading the saved model fails the same way. I am using sglang==0.5.4.post2, auto_round==0.8.0, transformers==4.57.1.

```python
import sglang as sgl

llm = sgl.Engine(model_path="./qmodel")
```
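As background on why this particular key can be missing: in a *mixed*-precision AutoRound checkpoint, some linear layers may be stored quantized (as `qweight`/`scales`/`qzeros` tensors) while others are kept in full precision (as a plain `weight` tensor). A loader that assumes every layer is quantized will raise exactly this kind of `KeyError` on the unquantized layers. The sketch below is a hypothetical diagnostic (the `classify_layer` helper and the simulated key set are not from SGLang or AutoRound) showing how one could classify a layer from the tensor names listed in the checkpoint (e.g. from `model.safetensors.index.json`):

```python
# Hypothetical diagnostic: given the set of tensor names in a checkpoint,
# decide whether a linear layer was quantized, kept in full precision,
# or is missing entirely. The key set below is simulated for illustration.

def classify_layer(keys, prefix):
    """Classify one linear layer by which tensor names exist for it."""
    if prefix + ".qweight" in keys:
        return "quantized"        # AutoRound-style packed int weights
    if prefix + ".weight" in keys:
        return "full_precision"   # layer left unquantized by the mixed scheme
    return "missing"

# Simulated tensor names standing in for a real checkpoint index.
keys = {
    "model.layers.0.mlp.gate_proj.qweight",
    "model.layers.0.mlp.gate_proj.scales",
    "model.layers.0.mlp.down_proj.weight",  # kept in higher precision
}

print(classify_layer(keys, "model.layers.0.mlp.down_proj"))  # full_precision
print(classify_layer(keys, "model.layers.0.mlp.gate_proj"))  # quantized
```

If `down_proj` turns out to be stored as a plain `weight` in the real checkpoint, the failure would point at the loader not consulting the per-layer quantization config rather than at the checkpoint being corrupt.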