Fix to adapt VLM multi-image benchmarking with sglang.bench #5682
+56
−13
Overview:
Benchmark CLI based on sglang.bench, to align with sglang's PR:

```
python -m sglang.bench_serving --model /raid/model_hub/Qwen/Qwen2.5-VL-7B-Instruct --num-prompts 1 --dataset-name image --random-input-len 128 --random-output-len 256 --image-count 8 --image-resolution 1080p --host localhost --port 8000 --backend vllm-chat --request-rate 1
```

Made some modifications to make this work; will open a PR. I fixed qps = 1 and only tuned the total num_prompts to be 1 or 10.
It seems like `--multimodal-encode-prefill-worker` only works with Llama4 but not the Qwen3-VL series due to vLLM limitations, so I put the E and P processes on the same GPU (id 0) to demonstrate EP/D, and limited the P worker's memory allocation to 0.4 to avoid OOM.

Details:

Successful benchmark
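The 0.4 memory cap on the P worker can be expressed through vLLM's `--gpu-memory-utilization` flag. A minimal sketch of co-locating both workers on GPU 0; the ports and the exact disaggregation flags are illustrative assumptions, not taken from this PR:

```shell
# Sketch: run the encode (E) and prefill (P) workers on the same GPU (id 0).
# The P worker is capped at 40% of GPU memory to leave room for the E worker
# and avoid OOM. Ports are hypothetical; real EP/D wiring flags are omitted.
CUDA_VISIBLE_DEVICES=0 vllm serve /raid/model_hub/Qwen/Qwen2.5-VL-7B-Instruct \
    --port 8100 \
    --gpu-memory-utilization 0.4
```

With qps fixed at 1, only `--num-prompts` (1 or 10) needs to vary between runs, as described above.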

Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)