
Commit 0004ec8

thiner authored and Binghua Wu committed
fix(autogptq): do not use_triton with qwen-vl (#1985)
* Enhance autogptq backend to support VL models
* Update dependencies for autogptq
* Remove redundant auto-gptq dependency
* Convert base64 to image_url for Qwen-VL model
* Implement model inference for qwen-vl
* Remove user prompt from generated answer
* Fix write image error
* Fix use_triton issue when loading Qwen-VL model

---------

Co-authored-by: Binghua Wu <[email protected]>
1 parent: d692b2c · commit: 0004ec8

File tree: 1 file changed (+0, -1)


backend/python/autogptq/autogptq.py

Lines changed: 0 additions & 1 deletion

@@ -39,7 +39,6 @@ def LoadModel(self, request, context):
             self.model_name = "Qwen-VL-Chat"
             model = AutoModelForCausalLM.from_pretrained(model_path,
                                                          trust_remote_code=request.TrustRemoteCode,
-                                                         use_triton=request.UseTriton,
                                                          device_map="auto").eval()
         else:
             model = AutoGPTQForCausalLM.from_quantized(model_path,
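For context, below is a minimal sketch of the corrected loading logic. The `load_model` helper and the path-based Qwen-VL check are hypothetical stand-ins for the backend's actual request handling; the `request.TrustRemoteCode` and `request.UseTriton` fields are taken from the diff above. The gist of the fix: `use_triton` is an auto-gptq option accepted by `AutoGPTQForCausalLM.from_quantized`, but Qwen-VL is loaded through transformers' `AutoModelForCausalLM.from_pretrained`, which does not take that keyword, so forwarding it broke model loading.

# Sketch only; function name and the Qwen-VL detection are assumptions,
# not the backend's actual code.
from transformers import AutoModelForCausalLM
from auto_gptq import AutoGPTQForCausalLM

def load_model(model_path: str, request):
    if "Qwen-VL" in model_path:  # hypothetical check for illustration
        # Qwen-VL ships its own modelling code and is loaded via
        # transformers. from_pretrained has no use_triton parameter,
        # so it must not be passed here (this was the bug).
        model = AutoModelForCausalLM.from_pretrained(
            model_path,
            trust_remote_code=request.TrustRemoteCode,
            device_map="auto",
        ).eval()
    else:
        # Plain GPTQ checkpoints go through auto-gptq, where
        # use_triton is a valid keyword selecting the Triton kernels.
        model = AutoGPTQForCausalLM.from_quantized(
            model_path,
            trust_remote_code=request.TrustRemoteCode,
            use_triton=request.UseTriton,
            device_map="auto",
        )
    return model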
