fix: forward VLM convert runtime generation settings by geoHeil · Pull Request #3322 · docling-project/docling

geoHeil · 2026-04-17T06:05:31Z

Forward the VLM convert stage generation settings from model_spec when constructing VlmEngineInput, and add debug timing around preprocessing and batch inference.

This change:

- adds temperature and extra_generation_config to VlmModelSpec
- threads temperature, max_new_tokens, stop_strings, and extra_generation_config through both VlmConvertModel.__call__() and process_images()
- logs rasterization/resize timing and batch inference timing in the VLM convert stage
- adds regression tests covering both VLM convert entry points
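The forwarding described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ModelSpecSketch`, `EngineInputSketch`, and `build_engine_input` are hypothetical stand-ins, and the only details taken from the description are the four setting names (temperature, max_new_tokens, stop_strings, extra_generation_config).

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ModelSpecSketch:
    """Hypothetical stand-in for a model spec; NOT docling's VlmModelSpec."""

    prompt: str
    max_new_tokens: int = 4096
    temperature: float = 0.0
    stop_strings: list[str] = field(default_factory=list)
    extra_generation_config: dict[str, Any] = field(default_factory=dict)


@dataclass
class EngineInputSketch:
    """Hypothetical stand-in for an engine input; NOT docling's VlmEngineInput."""

    image: Any
    prompt: str
    temperature: float
    max_new_tokens: int
    stop_strings: list[str]
    extra_generation_config: dict[str, Any]


def build_engine_input(image: Any, spec: ModelSpecSketch) -> EngineInputSketch:
    """Single construction point, so every entry point forwards the same settings."""
    return EngineInputSketch(
        image=image,
        prompt=spec.prompt,
        temperature=spec.temperature,
        max_new_tokens=spec.max_new_tokens,
        # Copy the containers so a later per-item mutation cannot leak into the spec.
        stop_strings=list(spec.stop_strings),
        extra_generation_config=dict(spec.extra_generation_config),
    )
```

Routing both entry points through one helper like this is one way to keep them from drifting apart, which is the kind of regression the new tests are described as guarding against.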

Issue resolved by this Pull Request:
Resolves #3321

Checklist:

Documentation has been updated, if necessary.
Examples have been added, if necessary.
Tests have been added, if necessary.

mergify · 2026-04-17T06:06:06Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
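To see what this rule accepts, the pattern can be tried locally against a few of the titles that appear on this page. The sketch below uses Python's `re` module; the helper name `is_conventional` is made up for illustration, and it assumes Mergify's `~=` operator behaves like a regex search on the title.

```python
import re

# Pattern copied from the merge protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True when the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


for title in (
    "fix: forward VLM convert runtime generation settings",
    "refactor(vlm): centralize VLM prediction conversion",
    "Forward VLM convert model spec runtime settings",
):
    print(f"{is_conventional(title)!s:5}  {title}")
```

The first two titles match (a bare type, and a type with a scope), while the third has no type prefix and would not pass the rule as a PR title.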

github-actions · 2026-04-17T06:14:37Z

✅ DCO Check Passed

Thanks @geoHeil, all your commits are properly signed off. 🎉

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR forwards VLM convert-stage generation settings from VlmModelSpec into VlmEngineInput, adds debug timing logs around preprocessing and inference, and introduces regression tests to ensure both VLM convert entry points respect the configured generation settings.

Changes:

Extend VlmModelSpec with temperature and extra_generation_config, and forward generation settings when constructing VlmEngineInput.
Add debug timing for image preparation (rasterize/resize) and batch inference in the VLM convert stage.
Add regression tests for VLM convert generation settings and API engine parameter precedence behavior.
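As a rough illustration of the debug timing mentioned in the list above, the following sketch wraps a preparation step and a batch inference step in a small timer. The names `debug_timer`, `run_batch`, `prepare`, and `infer` are hypothetical and the log format is an assumption; the PR's actual logging code is not shown on this page.

```python
import logging
import time
from contextlib import contextmanager

_log = logging.getLogger("vlm_convert_sketch")  # hypothetical logger name


@contextmanager
def debug_timer(label: str, n_items: int):
    """Log the wall-clock duration of the wrapped block at DEBUG level."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        _log.debug("%s: %d item(s) in %.3f s", label, n_items, elapsed)


def run_batch(pages, prepare, infer):
    """Time image preparation and batch inference as two separate phases."""
    with debug_timer("image preparation", len(pages)):
        images = [prepare(page) for page in pages]
    with debug_timer("batch inference", len(images)):
        return infer(images)
```

Timing the two phases separately makes it possible to tell whether a slow batch is dominated by rasterize/resize work or by the model call itself.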

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/test_vlm_convert_model.py | Adds regression tests asserting generation settings are forwarded for both `process_images()` and `__call__()` paths. |
| tests/test_api_vlm_engine.py | Adds tests validating API param precedence between model defaults, per-request settings, and explicit user params. |
| docling/models/stages/vlm_convert/vlm_convert_model.py | Centralizes `VlmEngineInput` construction and adds debug timing logs for preprocessing and inference. |
| docling/models/inference_engines/vlm/api_openai_compatible_engine.py | Changes API parameter precedence/merging and introduces separate storage for model vs user params. |
| docling/datamodel/stage_model_specs.py | Adds `temperature` and `extra_generation_config` fields to `VlmModelSpec`. |

codecov · 2026-04-17T07:41:01Z

Codecov Report

❌ Patch coverage is 96.72131% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ing/models/stages/vlm_convert/vlm_convert_model.py | 94.44% | 2 Missing ⚠️ |
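The percentages in this report are covered lines divided by changed lines. The report only states the percentages and the 2 missing lines; the totals used below (61 changed lines overall, 36 in the listed file) are inferred from those figures and are not stated by Codecov, so treat them as an assumption.

```python
def patch_coverage(total_lines: int, missing_lines: int) -> float:
    """Percentage of changed lines that were executed by the test suite."""
    return 100.0 * (total_lines - missing_lines) / total_lines


# Inferred totals (assumption): 61 changed lines overall, 36 in vlm_convert_model.py.
print(f"{patch_coverage(61, 2):.5f}%")  # 96.72131%
print(f"{patch_coverage(36, 2):.2f}%")  # 94.44%
```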

- api_openai_compatible_engine: apply per-request ``stop`` before merging user params so that explicit user ``stop`` wins over the generation defaults (addresses Copilot comment on docling-project#3322).
- vlm_convert_model: build the per-batch generation template (stop strings + runtime generation config) once per call and reuse it across every ``VlmEngineInput`` instead of re-allocating per item.
- tests: add regression coverage for the user-stop precedence, the watsonx-style vendor exclusivity, and the shared batch template.

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
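The ordering in the first bullet of this commit message can be sketched as a layered dictionary merge. This is a hypothetical illustration, not the engine's code: `merge_api_params` and its argument names are made up, and the full layering (model defaults, then per-request settings, then explicit user params) is an assumption. The commit message only confirms that the per-request `stop` is applied before user params are merged.

```python
from typing import Any


def merge_api_params(
    model_defaults: dict[str, Any],
    request_settings: dict[str, Any],
    user_params: dict[str, Any],
) -> dict[str, Any]:
    """Merge API parameters so that later layers override earlier ones."""
    params = dict(model_defaults)  # lowest priority (assumed)

    # Per-request generation settings go in BEFORE the user params ...
    if request_settings.get("stop_strings"):
        params["stop"] = list(request_settings["stop_strings"])
    if "temperature" in request_settings:
        params["temperature"] = request_settings["temperature"]

    # ... so an explicit user-supplied value, such as "stop", wins.
    params.update(user_params)
    return params


merged = merge_api_params(
    model_defaults={"temperature": 0.0, "max_tokens": 512},
    request_settings={"stop_strings": ["</doctag>"], "temperature": 0.2},
    user_params={"stop": ["<end>"]},
)
print(merged)  # {'temperature': 0.2, 'max_tokens': 512, 'stop': ['<end>']}
```

If the per-request `stop` were applied after `params.update(user_params)` instead, it would silently overwrite the user's value, which matches the problem the commit message describes fixing.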

geoHeil force-pushed the fix/vlm-convert-runtime-settings branch from 6b50315 to 18a5e20 on April 17, 2026 06:09

geoHeil requested a review from Copilot April 17, 2026 06:50

Copilot started reviewing on behalf of geoHeil April 17, 2026 06:56

Copilot AI reviewed Apr 17, 2026

geoHeil force-pushed the fix/vlm-convert-runtime-settings branch from 18a5e20 to 2fd433e on April 17, 2026 06:59

geoHeil added a commit to geoHeil/docling that referenced this pull request Apr 17, 2026

style: apply ruff formatter to PR docling-project#3322 tests

3523f1e

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

geoHeil marked this pull request as ready for review April 19, 2026 12:55

geoHeil added 4 commits May 1, 2026 13:49

Forward VLM convert model spec runtime settings

4ab3236

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

style: apply ruff formatter to PR docling-project#3322 tests

38fcc4d

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

refactor(vlm): centralize VLM prediction conversion

8d79e0f

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

geoHeil force-pushed the fix/vlm-convert-runtime-settings branch from 3523f1e to 8d79e0f on May 1, 2026 11:51

fix(vlm): annotate API engine initialization state

33c6824

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

geoHeil wants to merge 5 commits into docling-project:main from geoHeil:fix/vlm-convert-runtime-settings

2 participants
