How to test a multimodal model? #520
Unanswered
xiaozcy asked this question in User Support
Hi team, I just learned about GuideLLM recently and tried to use a custom dataset to test a multimodal model (Qwen2.5-VL-3B-Instruct), but I found that the request it sends is incorrect.
My dataset looks like:

```json
{"image": "/path/text.jpeg", "text": "What do you see happening in this image?\n<image>"}
```

And the generated request body looks like:

```json
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What do you see happening in this image?\n<image>"}
      ]
    },
    {
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": "data:image/jpeg;base64,xxxxxx"}
      ]
    }
  ]
}
```

But the request body I expected to be generated looks like:

```json
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What do you see happening in this image?\n<image>"},
        {"type": "image_url", "image_url": "data:image/jpeg;base64,xxxxxx"}
      ]
    }
  ]
}
```

My command looks like:
I'm not sure if I'm using it the wrong way; I'm really looking forward to your support. Thanks.
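For context, the merged format I expect can be sent to the server directly with a small script like the following (a minimal sketch; the `http://localhost:8000` endpoint is a placeholder for wherever the model is actually served behind an OpenAI-compatible API):

```python
import base64

import requests

# Encode the image as a base64 data URI, matching the
# "image_url" entries in the request bodies above.
with open("/path/text.jpeg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "Qwen2.5-VL-3B-Instruct",
    "messages": [
        {
            "role": "user",
            # Text and image combined as two content parts of a
            # single user message, which is the structure I expected.
            "content": [
                {
                    "type": "text",
                    "text": "What do you see happening in this image?\n<image>",
                },
                # Note: the OpenAI chat spec wraps the data URI in an
                # object with a "url" key, unlike the flat string above.
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
}

# Placeholder endpoint; adjust to wherever the model is served.
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
print(resp.json())
```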