Commit c564119

Disable chunked prefill for vision model
1 parent dd8aac7 commit c564119

1 file changed: +1 -0 lines changed

examples/multi_modal_inference.py

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ def run_qwen_vl(questions: list[str], modality: str,
         max_model_len=args.max_model_len,
         tensor_parallel_size=args.tensor_parallel_size,
         gpu_memory_utilization=args.gpu_memory_utilization,
+        enable_chunked_prefill=False,
         max_num_seqs=5,
         mm_processor_kwargs={
             "size": {
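
For readers who want to see the changed argument in isolation, the following is a minimal sketch of the keyword arguments visible in the hunk, gathered into a plain dictionary. The helper name `build_engine_kwargs` and the placeholder values are hypothetical and not part of the commit; the constructor that receives these arguments lies outside the visible hunk, and `mm_processor_kwargs` is omitted because the diff truncates it.

```python
from argparse import Namespace
from typing import Any


def build_engine_kwargs(args: Namespace) -> dict[str, Any]:
    """Collect the keyword arguments shown in the hunk (hypothetical helper)."""
    return {
        "max_model_len": args.max_model_len,
        "tensor_parallel_size": args.tensor_parallel_size,
        "gpu_memory_utilization": args.gpu_memory_utilization,
        # The single line this commit adds: chunked prefill is disabled explicitly.
        "enable_chunked_prefill": False,
        "max_num_seqs": 5,
    }


# Placeholder values for illustration only; they are not taken from the commit.
example_args = Namespace(
    max_model_len=4096,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.9,
)
engine_kwargs = build_engine_kwargs(example_args)
print(engine_kwargs["enable_chunked_prefill"])
```

Assuming the example constructs its engine by passing keyword arguments (the call itself is not shown in the hunk), a dictionary like this could be unpacked into that call, which keeps the chunked-prefill setting easy to spot and override in one place.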
