[VLM] Support request level max_dynamic_patch for OpenAI request #16268

yuan-luo · 2026-01-01T13:31:32Z

Motivation

In InternVL models, max_dynamic_patch is a configuration parameter that sets the maximum number of image patches (tiles) dynamically extracted from an image, which lets the model handle varying image resolutions and content. It typically defaults to 12 (InternVL3_5) but is configurable for different tasks and model versions. Together with min_dynamic_patch (default 1), it defines the allowed range of patch counts, which adds flexibility for multi-image and multi-round conversations and for detailed image understanding.

This PR adds support for a request-level max_dynamic_patch in OpenAI-compatible chat/completion requests. It works for both image and video inputs.
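To make the effect of the cap concrete, here is a minimal sketch of how a tile grid might be chosen under a patch budget. It is a simplification for illustration, not the InternVL or SGLang implementation: the function names `candidate_grids` and `pick_grid` are hypothetical, and the real preprocessing has its own tie-breaking rule and may add an extra thumbnail tile.

```python
from typing import List, Tuple


def candidate_grids(min_num: int, max_num: int) -> List[Tuple[int, int]]:
    """All (cols, rows) grids whose tile count lies in [min_num, max_num]."""
    grids = {
        (cols, rows)
        for cols in range(1, max_num + 1)
        for rows in range(1, max_num + 1)
        if min_num <= cols * rows <= max_num
    }
    # Sort by tile count so that ties in aspect-ratio distance prefer fewer tiles.
    return sorted(grids, key=lambda g: (g[0] * g[1], g))


def pick_grid(width: int, height: int, min_num: int = 1, max_num: int = 12) -> Tuple[int, int]:
    """Pick the grid whose aspect ratio is closest to the image's aspect ratio."""
    aspect = width / height
    return min(
        candidate_grids(min_num, max_num),
        key=lambda g: abs(aspect - g[0] / g[1]),
    )


# A 1600x1200 image: the default budget allows a 4x3 grid (12 tiles),
# while a budget of 2 falls back to a single tile.
print(pick_grid(1600, 1200, max_num=12))  # (4, 3)
print(pick_grid(1600, 1200, max_num=2))   # (1, 1)
```

Under this simplified model, lowering max_dynamic_patch shrinks the number of tiles per image or frame, which is why it reduces the visual token count for a request.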

Server:

SGLANG_MM_FEATURE_CACHE_MB=4096 \
SGLANG_USE_CUDA_IPC_TRANSPORT=1 \
SGLANG_VLM_CACHE_SIZE_MB=0 \
SGLANG_VIT_ENABLE_CUDA_GRAPH=0 \
python3 -m sglang.launch_server \
  --host 127.0.0.1 \
  --mem-fraction-static 0.7 \
  --port 30000 \
  --max-running-requests 64 \
  --chunked-prefill-size 8192 \
  --attention-backend fa3 \
  --mm-attention-backend fa3 \
  --enable-multimodal \
  --model OpenGVLab/InternVL3_5-8B \
  --disable-radix-cache \
  --tp-size 4 \
  --log-level debug

Client:

for i in {1..1}; do
  time curl 'http://127.0.0.1:30000/v1/chat/completions' --header 'Content-Type: application/json' --data '{
        "model": "auto",
        "messages": [
            {
                "role": "user",
                "content": [
                  {
                    "type": "video_url", 
                    "video_url": {
                      "max_dynamic_patch": 2,
                      "url": "/tmp/video_test.mp4"
                    }
                  },
                  {
                    "type": "text", 
                    "text": "视频里的招牌写的什么?"
                  }
                ]
            }
        ],
        "temperature": 0.0,
        "max_tokens": 1000,
        "stream": false,
        "chat_template_kwargs": {"enable_thinking": false}
    }'; done

root@6996fb46042d:/sgl-workspace/bench_script# bash bench_one_video.sh 
{"id":"70a7d38855c94e069c00415851041ea5","object":"chat.completion","created":1767274501,"model":"auto","choices":[{"index":0,"message":{"role":"assistant","content":"视频里的招牌上写着“小鞋匠洗鞋”，并附有一个电话号码。","reasoning_content":null,"tool_calls":null},"logprobs":null,"finish_reason":"stop","matched_stop":151645}],"usage":{"prompt_tokens":8522,"total_tokens":8541,"completion_tokens":19,"prompt_tokens_details":null,"reasoning_tokens":0},"metadata":{"weight_version":"default"}}
real    0m0.883s
user    0m0.002s
sys     0m0.004s
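The same request can be sketched in Python using only the standard library. The payload mirrors the curl body above; the helper names `build_payload` and `send` are hypothetical, and the base URL assumes the server command shown earlier.

```python
import json
import urllib.request


def build_payload(video_path: str, question: str, max_dynamic_patch: int) -> dict:
    """Build a chat-completions body with a per-video max_dynamic_patch."""
    return {
        "model": "auto",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "video_url",
                        "video_url": {
                            "url": video_path,
                            "max_dynamic_patch": max_dynamic_patch,
                        },
                    },
                    {"type": "text", "text": question},
                ],
            }
        ],
        "temperature": 0.0,
        "max_tokens": 1000,
        "stream": False,
        "chat_template_kwargs": {"enable_thinking": False},
    }


def send(payload: dict, base_url: str = "http://127.0.0.1:30000") -> dict:
    """POST the payload to a running server and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `send(build_payload("/tmp/video_test.mp4", "What does the sign in the video say?", 2))` would issue the equivalent of the curl call; the reply text is under `["choices"][0]["message"]["content"]`.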

Modifications

Accuracy Tests

Main:

root@6996fb46042d:/sgl-workspace/lmms-eval# python3 -m lmms_eval --model openai_compatible   --model_args model_version=OpenGVLab/InternVL3_5-8B   --tasks mmmu_val   --batch_size 16
2026-01-01 12:54:03 | INFO     | __main__:cli_evaluate:311 - Verbosity set to INFO
2026-01-01 12:54:05 | INFO     | __main__:cli_evaluate_single:400 - Evaluation tracker args: {'token': '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'}
2026-01-01 12:54:05 | INFO     | __main__:cli_evaluate_single:480 - Selected Tasks: ['mmmu_val']
2026-01-01 12:54:05 | INFO     | lmms_eval.evaluator:simple_evaluate:161 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2026-01-01 12:54:17 | INFO     | lmms_eval.evaluator:evaluate:402 - Running on rank 0 (local rank 0)
2026-01-01 12:54:17 | INFO     | lmms_eval.api.task:build_all_requests:427 - Building contexts for mmmu_val on rank 0...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 900/900 [00:00<00:00, 13574.92it/s]
2026-01-01 12:54:17 | INFO     | lmms_eval.evaluator:evaluate:495 - Running generate_until requests
Model Responding: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [03:09<00:00,  2.41s/it]2026-01-01 12:57:27 | INFO     | lmms_eval.models.model_utils.gen_metrics:log_metrics:48 - Metric summary - Total time: 1586.028s, Total tokens: 1963, Avg speed: 1.2 tokens/s
Model Responding: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [03:09<00:00,  3.33s/it]
Postprocessing: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 900/900 [00:00<00:00, 10998.50it/s]
{'Overall-Art and Design': {'num': 120, 'acc': 0.725}, 'Art': {'num': 30, 'acc': 0.76667}, 'Art_Theory': {'num': 30, 'acc': 0.83333}, 'Design': {'num': 30, 'acc': 0.9}, 'Music': {'num': 30, 'acc': 0.4}, 'Overall-Business': {'num': 150, 'acc': 0.48667}, 'Accounting': {'num': 30, 'acc': 0.46667}, 'Economics': {'num': 30, 'acc': 0.7}, 'Finance': {'num': 30, 'acc': 0.26667}, 'Manage': {'num': 30, 'acc': 0.4}, 'Marketing': {'num': 30, 'acc': 0.6}, 'Overall-Science': {'num': 150, 'acc': 0.48667}, 'Biology': {'num': 30, 'acc': 0.5}, 'Chemistry': {'num': 30, 'acc': 0.36667}, 'Geography': {'num': 30, 'acc': 0.63333}, 'Math': {'num': 30, 'acc': 0.23333}, 'Physics': {'num': 30, 'acc': 0.7}, 'Overall-Health and Medicine': {'num': 150, 'acc': 0.62}, 'Basic_Medical_Science': {'num': 30, 'acc': 0.73333}, 'Clinical_Medicine': {'num': 30, 'acc': 0.63333}, 'Diagnostics_and_Laboratory_Medicine': {'num': 30, 'acc': 0.53333}, 'Pharmacy': {'num': 30, 'acc': 0.46667}, 'Public_Health': {'num': 30, 'acc': 0.73333}, 'Overall-Humanities and Social Science': {'num': 120, 'acc': 0.78333}, 'History': {'num': 30, 'acc': 0.83333}, 'Literature': {'num': 30, 'acc': 0.93333}, 'Sociology': {'num': 30, 'acc': 0.73333}, 'Psychology': {'num': 30, 'acc': 0.63333}, 'Overall-Tech and Engineering': {'num': 210, 'acc': 0.4381}, 'Agriculture': {'num': 30, 'acc': 0.46667}, 'Architecture_and_Engineering': {'num': 30, 'acc': 0.33333}, 'Computer_Science': {'num': 30, 'acc': 0.5}, 'Electronics': {'num': 30, 'acc': 0.43333}, 'Energy_and_Power': {'num': 30, 'acc': 0.4}, 'Materials': {'num': 30, 'acc': 0.46667}, 'Mechanical_Engineering': {'num': 30, 'acc': 0.46667}, 'Overall': {'num': 900, 'acc': 0.56889}}
2026-01-01 12:57:27 | INFO     | lmms_eval.loggers.evaluation_tracker:save_results_aggregated:239 - Output path not provided, skipping saving results aggregated
openai_compatible (model_version=OpenGVLab/InternVL3_5-8B), gen_kwargs: (), limit: None, num_fewshot: None, batch_size: 16
| Tasks  |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|--------|------:|------|-----:|--------|---|-----:|---|------|
|mmmu_val|      0|none  |     0|mmmu_acc|↑  |0.5689|±  |   N/A|

PR:

root@6996fb46042d:/sgl-workspace/lmms-eval# python3 -m lmms_eval --model openai_compatible   --model_args model_version=OpenGVLab/InternVL3_5-8B   --tasks mmmu_val   --batch_size 16
2026-01-01 13:17:58 | INFO     | __main__:cli_evaluate:311 - Verbosity set to INFO
2026-01-01 13:18:00 | INFO     | __main__:cli_evaluate_single:400 - Evaluation tracker args: {'token': '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'}
2026-01-01 13:18:00 | INFO     | __main__:cli_evaluate_single:480 - Selected Tasks: ['mmmu_val']
2026-01-01 13:18:00 | INFO     | lmms_eval.evaluator:simple_evaluate:161 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2026-01-01 13:18:06 | INFO     | lmms_eval.evaluator:evaluate:402 - Running on rank 0 (local rank 0)
2026-01-01 13:18:06 | INFO     | lmms_eval.api.task:build_all_requests:427 - Building contexts for mmmu_val on rank 0...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 900/900 [00:00<00:00, 13753.52it/s]
2026-01-01 13:18:06 | INFO     | lmms_eval.evaluator:evaluate:495 - Running generate_until requests
Model Responding: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [03:11<00:00,  2.43s/it]2026-01-01 13:21:18 | INFO     | lmms_eval.models.model_utils.gen_metrics:log_metrics:48 - Metric summary - Total time: 1601.536s, Total tokens: 2018, Avg speed: 1.3 tokens/s
Model Responding: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [03:11<00:00,  3.36s/it]
Postprocessing: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 900/900 [00:00<00:00, 11217.08it/s]
{'Overall-Art and Design': {'num': 120, 'acc': 0.75}, 'Art': {'num': 30, 'acc': 0.8}, 'Art_Theory': {'num': 30, 'acc': 0.86667}, 'Design': {'num': 30, 'acc': 0.9}, 'Music': {'num': 30, 'acc': 0.43333}, 'Overall-Business': {'num': 150, 'acc': 0.46667}, 'Accounting': {'num': 30, 'acc': 0.36667}, 'Economics': {'num': 30, 'acc': 0.7}, 'Finance': {'num': 30, 'acc': 0.26667}, 'Manage': {'num': 30, 'acc': 0.4}, 'Marketing': {'num': 30, 'acc': 0.6}, 'Overall-Science': {'num': 150, 'acc': 0.48}, 'Biology': {'num': 30, 'acc': 0.46667}, 'Chemistry': {'num': 30, 'acc': 0.33333}, 'Geography': {'num': 30, 'acc': 0.66667}, 'Math': {'num': 30, 'acc': 0.23333}, 'Physics': {'num': 30, 'acc': 0.7}, 'Overall-Health and Medicine': {'num': 150, 'acc': 0.64667}, 'Basic_Medical_Science': {'num': 30, 'acc': 0.73333}, 'Clinical_Medicine': {'num': 30, 'acc': 0.63333}, 'Diagnostics_and_Laboratory_Medicine': {'num': 30, 'acc': 0.53333}, 'Pharmacy': {'num': 30, 'acc': 0.53333}, 'Public_Health': {'num': 30, 'acc': 0.8}, 'Overall-Humanities and Social Science': {'num': 120, 'acc': 0.775}, 'History': {'num': 30, 'acc': 0.83333}, 'Literature': {'num': 30, 'acc': 0.9}, 'Sociology': {'num': 30, 'acc': 0.73333}, 'Psychology': {'num': 30, 'acc': 0.63333}, 'Overall-Tech and Engineering': {'num': 210, 'acc': 0.42381}, 'Agriculture': {'num': 30, 'acc': 0.46667}, 'Architecture_and_Engineering': {'num': 30, 'acc': 0.36667}, 'Computer_Science': {'num': 30, 'acc': 0.46667}, 'Electronics': {'num': 30, 'acc': 0.43333}, 'Energy_and_Power': {'num': 30, 'acc': 0.4}, 'Materials': {'num': 30, 'acc': 0.43333}, 'Mechanical_Engineering': {'num': 30, 'acc': 0.4}, 'Overall': {'num': 900, 'acc': 0.56778}}
2026-01-01 13:21:18 | INFO     | lmms_eval.loggers.evaluation_tracker:save_results_aggregated:239 - Output path not provided, skipping saving results aggregated
openai_compatible (model_version=OpenGVLab/InternVL3_5-8B), gen_kwargs: (), limit: None, num_fewshot: None, batch_size: 16
| Tasks  |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|--------|------:|------|-----:|--------|---|-----:|---|------|
|mmmu_val|      0|none  |     0|mmmu_acc|↑  |0.5678|±  |   N/A|
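As a quick sanity check on the two runs, the overall accuracies can be converted back to question counts. This assumes the reported accuracy is simply the fraction of correct answers among the 900 questions, which the 'num': 900 entry suggests.

```python
TOTAL_QUESTIONS = 900

main_acc = 0.56889  # 'Overall' accuracy reported on main
pr_acc = 0.56778    # 'Overall' accuracy reported with this PR

# Convert each accuracy back to a whole number of correct answers.
main_correct = round(main_acc * TOTAL_QUESTIONS)
pr_correct = round(pr_acc * TOTAL_QUESTIONS)

print(main_correct, pr_correct, main_correct - pr_correct)  # 512 511 1
```

Under that assumption, the two runs differ by a single question out of 900.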

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review Process

Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments (/tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci) or contact authorized users to do so.
After green CI and required approvals, ask Merge Oncalls to merge.

gemini-code-assist · 2026-01-01T13:31:50Z

Summary of Changes

Hello @yuan-luo, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enables fine-grained control over multimodal processing by allowing users to define the maximum and minimum number of dynamic patches for images and videos at the individual request level. This feature is particularly beneficial for models like InternVL, which utilize dynamic patching to adapt to varying image resolutions and content, providing greater flexibility and optimization for diverse multimodal tasks within an OpenAI-compatible API.

Highlights

Request-level Dynamic Patch Control: Introduced support for specifying max_dynamic_patch and min_dynamic_patch parameters directly within OpenAI-compatible chat and completion requests for both image and video multimodal inputs.
InternVL Integration: The InternVLProcessor now dynamically adjusts the number of image and video patches based on the max_dynamic_patch value provided in the request, enhancing flexibility for different tasks.
API Extension: Extended the OpenAI protocol to include max_dynamic_patch and min_dynamic_patch fields in ChatCompletionMessageContentImageURL, ChatCompletionMessageContentVideoURL, and ChatCompletionRequest models.
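Because the field can appear both on an individual image or video item and on the request itself, some precedence rule is needed. The sketch below assumes, without confirmation from the PR, that a per-item value wins over a request-level value, which in turn wins over the model default. The function name `resolve_max_dynamic_patch` is hypothetical.

```python
from typing import Optional

DEFAULT_MAX_DYNAMIC_PATCH = 12  # default cited in the PR description for InternVL3_5


def resolve_max_dynamic_patch(
    item_value: Optional[int],
    request_value: Optional[int],
    default: int = DEFAULT_MAX_DYNAMIC_PATCH,
) -> int:
    """Return the first value that is set: item, then request, then default.

    The ordering is an assumption made for illustration only.
    """
    for value in (item_value, request_value):
        if value is not None:
            if value < 1:
                raise ValueError("max_dynamic_patch must be a positive integer")
            return value
    return default
```

For example, `resolve_max_dynamic_patch(2, 6)` picks the per-item value, while `resolve_max_dynamic_patch(None, None)` falls back to the default.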

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request adds support for request-level max_dynamic_patch for OpenAI requests, enhancing multimodal handling capabilities. The changes are generally well-implemented across protocol definitions, serving logic, and multimodal processors. However, I've identified a critical bug in jinja_template_utils.py where a variable is used without initialization, which could lead to incorrect behavior when processing video URLs. Additionally, I've provided suggestions to refactor duplicated code in serving_chat.py and internvl.py to improve code maintainability. Addressing these points will strengthen the robustness of this new feature.

python/sglang/srt/parser/jinja_template_utils.py

python/sglang/srt/entrypoints/openai/serving_chat.py

python/sglang/srt/multimodal/processors/internvl.py

yuan-luo · 2026-01-01T13:55:33Z

/tag-and-rerun-ci

yuan-luo · 2026-01-01T14:58:15Z

There's a regression on the Qwen VL model. Checking.

Traceback (most recent call last):
  File "/public_sglang_ci/runner-l1b-gpu-1/_work/sglang/sglang/python/sglang/srt/utils/common.py", line 2501, in retry
    return fn()
  File "/public_sglang_ci/runner-l1b-gpu-1/_work/sglang/sglang/python/sglang/test/test_utils.py", line 1720, in <lambda>
    lambda: super(CustomTestCase, self)._callTestMethod(method),
  File "/usr/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/public_sglang_ci/runner-l1b-gpu-1/_work/sglang/sglang/test/srt/test_vision_openai_server_common.py", line 508, in test_video_chat_completion
    response = client.chat.completions.create(
  File "/usr/local/lib/python3.10/dist-packages/openai/_utils/_utils.py", line 286, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/openai/resources/chat/completions/completions.py", line 1156, in create
    return self._post(
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1259, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1047, in request
    raise self._make_status_error_from_response(err.response) from None
openai.InternalServerError: Error code: 500 - {'object': 'error', 'message': "Internal server error: An exception occurred while loading multimodal data: Error while loading data {'url': '/root/.cache/jobs_presenting_ipod.mp4', 'max_dynamic_patch': None}: Unsupported video input type: <class 'dict'>", 'type': 'InternalServerError', 'param': None, 'code': 500}
WARNING:sglang.srt.utils.common:retry() failed once (0th try, maximum 1 retries). Will delay 1.71s and retry. Error: Error code: 500 - {'object': 'error', 'message': "Internal server error: An exception occurred while loading multimodal data: Error while loading data {'url': '/root/.cache/jobs_presenting_ipod.mp4', 'max_dynamic_patch': None}: Unsupported video input type: <class 'dict'>", 'type': 'InternalServerError', 'param': None, 'code': 500}
[2026-01-01 14:24:49] Error in request: An exception occurred while loading multimodal data: Error while loading data {'url': '/root/.cache/jobs_presenting_ipod.mp4', 'max_dynamic_patch': None}: Unsupported video input type: <class 'dict'>
Traceback (most recent call last):
  File "/public_sglang_ci/runner-l1b-gpu-1/_work/sglang/sglang/python/sglang/srt/multimodal/processors/base_processor.py", line 411, in _load_single_item
    return load_video(data, frame_count_limit)
  File "/public_sglang_ci/runner-l1b-gpu-1/_work/sglang/sglang/python/sglang/srt/utils/common.py", line 944, in load_video
    raise ValueError(f"Unsupported video input type: {type(video_file)}")
ValueError: Unsupported video input type: <class 'dict'>

yuan-luo · 2026-01-01T16:33:48Z

With Qwen2.5-VL, the video data is parsed as
video_data=[{'url': '/tmp/video_test.mp4', 'max_dynamic_patch': 2}]
whereas with InternVL3_5 it is parsed as
video_data=['/tmp/video_test.mp4']
This difference is the root of the problem.

[2026-01-01 16:31:50] Using regular tokenizer for 1 inputs
[2026-01-01 16:31:50] [internvl][qwen] placeholders image=0 video=1
_load_single_item ========= data={'url': '/tmp/video_test.mp4', 'max_dynamic_patch': 2}
[2026-01-01 16:31:50] Error in request: An exception occurred while loading multimodal data: Error while loading data {'url': '/tmp/video_test.mp4', 'max_dynamic_patch': 2}: Unsupported video input type: <class 'dict'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/sglang/srt/multimodal/processors/base_processor.py", line 412, in _load_single_item
    return load_video(data, frame_count_limit)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/sglang/srt/utils/common.py", line 944, in load_video
    raise ValueError(f"Unsupported video input type: {type(video_file)}")
ValueError: Unsupported video input type: <class 'dict'>

yuan-luo · 2026-01-01T16:55:21Z

The problem is fixed by adding normalization logic to the QwenVL model's process_mm_data_async.

root@6996fb46042d:/sgl-workspace/bench_script# bash bench_one_video.sh 
{"id":"1dcdec90393b439aba925eaddcfbee71","object":"chat.completion","created":1767286432,"model":"auto","choices":[{"index":0,"message":{"role":"assistant","content":"视频里的招牌上写着“小鞋匠洗鞋”，并且旁边还有一些中文文字，可能是店铺的联系方式或其他相关信息。","reasoning_content":null,"tool_calls":null},"logprobs":null,"finish_reason":"stop","matched_stop":151645}],"usage":{"prompt_tokens":20907,"total_tokens":20933,"completion_tokens":26,"prompt_tokens_details":null,"reasoning_tokens":0},"metadata":{"weight_version":"default"}}
real    0m2.130s
user    0m0.003s
sys     0m0.002s
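The kind of normalization described in this comment could look roughly like the sketch below. It is not the code from the PR: `normalize_video_items` is a hypothetical helper that unwraps dict-style entries so the video loader only ever sees plain sources.

```python
from typing import Any, List, Optional, Tuple


def normalize_video_items(video_data: List[Any]) -> Tuple[List[Any], List[Optional[int]]]:
    """Split dict-style video entries into plain sources and per-item patch limits."""
    sources: List[Any] = []
    limits: List[Optional[int]] = []
    for item in video_data:
        if isinstance(item, dict):
            # Dict form, as seen in the error log: {'url': ..., 'max_dynamic_patch': ...}
            sources.append(item["url"])
            limits.append(item.get("max_dynamic_patch"))
        else:
            # Already a plain path or URL; pass it through unchanged.
            sources.append(item)
            limits.append(None)
    return sources, limits


sources, limits = normalize_video_items(
    [{"url": "/tmp/video_test.mp4", "max_dynamic_patch": 2}]
)
print(sources, limits)  # ['/tmp/video_test.mp4'] [2]
```

Keeping the limits in a parallel list lets a processor that does not use dynamic patching ignore them while still loading the video from a plain path.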

yuan-luo · 2026-01-02T00:11:44Z

/rerun-failed-ci

yuan-luo · 2026-01-02T01:54:51Z

/rerun-failed-ci

yuan-luo requested review from CatherineSue, JustinTong0323, Ying1123, hnyls2002, ispobock, merrymercy, mickqian, slin1237, xiezhq-hermann and yhyang201 as code owners January 1, 2026 13:31

yuan-luo requested a review from BBuf January 1, 2026 13:32

gemini-code-assist bot reviewed Jan 1, 2026

yuan-luo force-pushed the support_max_dynamic_patch branch from 91c201a to 1c66938 Compare January 1, 2026 13:43

github-actions bot added the run-ci label Jan 1, 2026

yuan-luo marked this pull request as draft January 1, 2026 14:57

Support request level max_dynamic_patch for OpenAI request

749e5a3

yuan-luo force-pushed the support_max_dynamic_patch branch from 1c66938 to 749e5a3 Compare January 1, 2026 16:56

yuan-luo marked this pull request as ready for review January 1, 2026 16:56
