Praveeni/vlm benchmark support by praveen-iyer · Pull Request #7 · lemonade-sdk/lemonade-eval

praveen-iyer · 2026-02-17T20:00:33Z

Summary

Add support for benchmarking Vision-Language Models (VLMs) served via Lemonade Server, alongside the existing text-only LLM benchmarking.

Changes

--image flag: Provide an image file path or URL to the bench tool. Each benchmark iteration sends a multimodal prompt (image + text) using the OpenAI chat completions format.
--image-size flag: Resize the image client-side before sending to control visual token count. Accepts WIDTHxHEIGHT (e.g. 1024x800) for exact dimensions or a single integer (e.g. 384) to cap the longest side while preserving aspect ratio.
VLM prompt prefix: When --image is used, the synthetic prompt prefix is replaced with one that encourages detailed image description for longer output.
README: Added VLM benchmarking section with usage examples.

Example Output

Add --image flag to the bench tool to enable benchmarking VLM models served via Lemonade Server. When an image file path or URL is provided, the benchmark sends multimodal prompts (image + text) using the OpenAI chat completions format. The -p flag controls the text portion of the prompt while image tokens are handled by the server. - Add _prepare_image_url() to ServerAdapter for base64/URL image handling - Add image parameter to ServerAdapter.generate() with multimodal payload - Add --image CLI argument to ServerBench parser - Add VLM-specific prompt prefix for image description benchmarks - Override parse() in ServerBench to extract image before prompt processing - Existing text-only LLM benchmarking is completely unaffected

Replace the --image-detail flag (low/high/auto) with --image-size which accepts either WIDTHxHEIGHT (e.g. 1024x800) for exact dimensions or a single integer (e.g. 384) to cap the longest side while preserving aspect ratio. The image is resized client-side using Pillow before base64 encoding, which reduces visual token count to fit within the model's context window.

Document the --image and --image-size flags for Vision-Language Model benchmarking, with examples showing exact dimensions and aspect-ratio- preserving resize modes.

…or imports inside function

formatting the line

ramkrishna2910

Feature works, added a few comments.

amd-pworfolk · 2026-02-18T05:55:05Z

@praveen-iyer Great addition! Can you also extend the llm-prompt tool to allow an image to be passed in? This allows the user to run llm-prompt prior to running the bench tool in order to verify that everything is working as expected (e.g., model producing good output, etc.).

ramkrishna2910

LGTM! Thanks for the contribution!

praveen-iyer · 2026-02-18T17:34:31Z

@praveen-iyer Great addition! Can you also extend the llm-prompt tool to allow an image to be passed in? This allows the user to run llm-prompt prior to running the bench tool in order to verify that everything is working as expected (e.g., model producing good output, etc.).

Thanks for the idea. I'm hoping to address this as part of a new PR as we need this PR to be merged soon for an engagement!

praveeni added 4 commits February 16, 2026 16:45

Add VLM benchmarking documentation to README

226efe1

Document the --image and --image-size flags for Vision-Language Model benchmarking, with examples showing exact dimensions and aspect-ratio- preserving resize modes.

Add full VLM benchmark example to README

fd0f723

praveen-iyer requested a review from ramkrishna2910 February 17, 2026 20:12

praveen-iyer marked this pull request as ready for review February 17, 2026 20:12

praveen-iyer added 3 commits February 17, 2026 12:54

Update server_load.py to disable pylint check which doesn't account f…

aed098e

…or imports inside function

Update server_load.py

94f4a83

formatting the line

Update server_load.py for black

ec42989

ramkrishna2910 requested changes Feb 18, 2026

View reviewed changes

Comment thread src/lemonade/tools/server_load.py

Comment thread src/lemonade/tools/server_bench.py

Comment thread src/lemonade/tools/server_load.py Outdated

Comment thread src/lemonade/tools/server_load.py

praveeni added 3 commits February 17, 2026 21:32

address review comments

4fd28e8

address pylint issue in CI

4583f95

fix black issue

dec1f84

praveen-iyer requested a review from ramkrishna2910 February 18, 2026 04:56

ramkrishna2910 approved these changes Feb 18, 2026

View reviewed changes

praveen-iyer merged commit 2968532 into main Feb 18, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Praveeni/vlm benchmark support#7

Praveeni/vlm benchmark support#7
praveen-iyer merged 10 commits intomainfrom
praveeni/vlm-benchmark-support

praveen-iyer commented Feb 17, 2026 •

edited

Loading

Uh oh!

ramkrishna2910 left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

amd-pworfolk commented Feb 18, 2026

Uh oh!

ramkrishna2910 left a comment

Uh oh!

praveen-iyer commented Feb 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

praveen-iyer commented Feb 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Example Output

Uh oh!

ramkrishna2910 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

amd-pworfolk commented Feb 18, 2026

Uh oh!

ramkrishna2910 left a comment

Choose a reason for hiding this comment

Uh oh!

praveen-iyer commented Feb 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

praveen-iyer commented Feb 17, 2026 •

edited

Loading