fix: forward VLM convert runtime generation settings#3322

Open
geoHeil wants to merge 5 commits into
docling-project:mainfrom
geoHeil:fix/vlm-convert-runtime-settings
@geoHeil geoHeil commented Apr 17, 2026

Contributor

Forward the VLM convert stage generation settings from model_spec when constructing VlmEngineInput, and add debug timing around preprocessing and batch inference.

This change:

  • adds temperature and extra_generation_config to VlmModelSpec
  • threads temperature, max_new_tokens, stop_strings, and extra_generation_config through both VlmConvertModel.__call__() and process_images()
  • logs rasterization/resize timing and batch inference timing in the VLM convert stage
  • adds regression tests covering both VLM convert entry points
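The forwarding described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `ModelSpecSketch`, `EngineInputSketch`, and `build_engine_input` are hypothetical stand-ins for `VlmModelSpec`, `VlmEngineInput`, and a shared construction helper, and all field types and default values are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ModelSpecSketch:
    """Hypothetical stand-in for VlmModelSpec; defaults are assumptions."""

    prompt: str
    temperature: float = 0.0
    max_new_tokens: int = 4096
    stop_strings: list[str] = field(default_factory=list)
    extra_generation_config: dict[str, Any] = field(default_factory=dict)


@dataclass
class EngineInputSketch:
    """Hypothetical stand-in for VlmEngineInput."""

    image: Any
    prompt: str
    temperature: float
    max_new_tokens: int
    stop_strings: list[str]
    extra_generation_config: dict[str, Any]


def build_engine_input(image: Any, spec: ModelSpecSketch) -> EngineInputSketch:
    # One construction point, so every entry point forwards the same settings.
    return EngineInputSketch(
        image=image,
        prompt=spec.prompt,
        temperature=spec.temperature,
        max_new_tokens=spec.max_new_tokens,
        # Copy the containers so one input cannot mutate the shared spec.
        stop_strings=list(spec.stop_strings),
        extra_generation_config=dict(spec.extra_generation_config),
    )


spec = ModelSpecSketch(
    prompt="Convert this page to markdown.",
    temperature=0.2,
    max_new_tokens=512,
    stop_strings=["</doc>"],
    extra_generation_config={"top_p": 0.9},
)
engine_input = build_engine_input(image=None, spec=spec)
```

If both `__call__()` and `process_images()` go through a helper like this, a setting added to the spec later only has to be forwarded in one place.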

Issue resolved by this Pull Request:
Resolves #3321

Checklist:

  • Documentation has been updated, if necessary.
  • Examples have been added, if necessary.
  • Tests have been added, if necessary.

mergify Bot commented Apr 17, 2026

Contributor

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
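To check a title locally against this rule, the pattern can be tried with Python's `re` module. This is only a sketch: it assumes that Python's regex semantics are close enough to what the `~=` operator does for this pattern, and `is_conventional` is a hypothetical helper name.

```python
import re

# Pattern copied verbatim from the merge protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


candidate_titles = [
    "fix: forward VLM convert runtime generation settings",
    "feat(vlm)!: change default stop strings",
    "Fix: capitalised type",
    "forward VLM convert runtime generation settings",
]
results = {title: is_conventional(title) for title in candidate_titles}
```

The first two titles satisfy the pattern; the last two do not, because the type is case-sensitive and must be followed by a colon.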

@geoHeil geoHeil force-pushed the fix/vlm-convert-runtime-settings branch from 6b50315 to 18a5e20 Compare April 17, 2026 06:09

github-actions Bot commented Apr 17, 2026

Contributor

DCO Check Passed

Thanks @geoHeil, all your commits are properly signed off. 🎉

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR forwards VLM convert-stage generation settings from VlmModelSpec into VlmEngineInput, adds debug timing logs around preprocessing and inference, and introduces regression tests to ensure both VLM convert entry points respect the configured generation settings.

Changes:

  • Extend VlmModelSpec with temperature and extra_generation_config, and forward generation settings when constructing VlmEngineInput.
  • Add debug timing for image preparation (rasterize/resize) and batch inference in the VLM convert stage.
  • Add regression tests for VLM convert generation settings and API engine parameter precedence behavior.
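The debug timing mentioned above could look something like the following. This is a sketch under assumptions, not the PR's implementation: `debug_timer`, the logger name, and the placeholder "rasterize" and "inference" steps are all hypothetical.

```python
import logging
import time
from contextlib import contextmanager
from typing import Iterator, Optional

_log = logging.getLogger("vlm_convert_sketch")  # hypothetical logger name


@contextmanager
def debug_timer(label: str, sink: Optional[dict] = None) -> Iterator[None]:
    """Log how long the wrapped block took; optionally record it in `sink`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        _log.debug("%s took %.3f s", label, elapsed)
        if sink is not None:
            sink[label] = elapsed


timings: dict = {}

with debug_timer("prepare_images", timings):
    # Placeholder for rasterizing and resizing page images.
    images = [f"page-{i}" for i in range(3)]

with debug_timer("batch_inference", timings):
    # Placeholder for the batched engine call.
    outputs = [image.upper() for image in images]
```

Logging at debug level keeps the measurements out of normal runs, and `time.perf_counter()` is a monotonic clock intended for measuring short durations.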

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/test_vlm_convert_model.py | Adds regression tests asserting generation settings are forwarded for both process_images() and __call__() paths. |
| tests/test_api_vlm_engine.py | Adds tests validating API param precedence between model defaults, per-request settings, and explicit user params. |
| docling/models/stages/vlm_convert/vlm_convert_model.py | Centralizes VlmEngineInput construction and adds debug timing logs for preprocessing and inference. |
| docling/models/inference_engines/vlm/api_openai_compatible_engine.py | Changes API parameter precedence/merging and introduces separate storage for model vs user params. |
| docling/datamodel/stage_model_specs.py | Adds temperature and extra_generation_config fields to VlmModelSpec. |

Comment thread docling/models/stages/vlm_convert/vlm_convert_model.py Outdated
@geoHeil geoHeil force-pushed the fix/vlm-convert-runtime-settings branch from 18a5e20 to 2fd433e Compare April 17, 2026 06:59

codecov Bot commented Apr 17, 2026


Codecov Report

❌ Patch coverage is 96.72131% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ing/models/stages/vlm_convert/vlm_convert_model.py | 94.44% | 2 Missing ⚠️ |

geoHeil added a commit to geoHeil/docling that referenced this pull request Apr 17, 2026
- api_openai_compatible_engine: apply per-request ``stop`` before merging
  user params so that explicit user ``stop`` wins over the generation
  defaults (addresses Copilot comment on docling-project#3322).
- vlm_convert_model: build the per-batch generation template (stop
  strings + runtime generation config) once per call and reuse it
  across every ``VlmEngineInput`` instead of re-allocating per item.
- tests: add regression coverage for the user-stop precedence, the
  watsonx-style vendor exclusivity, and the shared batch template.

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
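The precedence described in this commit message can be illustrated with a plain dictionary merge. This is a sketch only: `merge_api_params` and the parameter names are hypothetical, and the ordering between model defaults and per-request settings is an assumption. The commit message itself only states that an explicit user ``stop`` wins over the generation defaults.

```python
from typing import Any


def merge_api_params(
    model_params: dict[str, Any],
    per_request: dict[str, Any],
    user_params: dict[str, Any],
) -> dict[str, Any]:
    """Merge request parameters; later sources override earlier ones."""
    merged = dict(model_params)  # assumed lowest precedence: model defaults
    merged.update(per_request)   # per-request generation settings, e.g. "stop"
    merged.update(user_params)   # explicit user params are applied last and win
    return merged


merged = merge_api_params(
    model_params={"temperature": 0.0, "max_tokens": 4096},
    per_request={"temperature": 0.2, "stop": ["</doc>"]},
    user_params={"stop": ["<end>"]},
)
```

Because the per-request ``stop`` is applied before the user params, the user's value replaces it instead of being overwritten by it.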
geoHeil added a commit to geoHeil/docling that referenced this pull request Apr 17, 2026
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
@geoHeil geoHeil marked this pull request as ready for review April 19, 2026 12:55
geoHeil added 4 commits May 1, 2026 13:49
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
- api_openai_compatible_engine: apply per-request ``stop`` before merging
  user params so that explicit user ``stop`` wins over the generation
  defaults (addresses Copilot comment on docling-project#3322).
- vlm_convert_model: build the per-batch generation template (stop
  strings + runtime generation config) once per call and reuse it
  across every ``VlmEngineInput`` instead of re-allocating per item.
- tests: add regression coverage for the user-stop precedence, the
  watsonx-style vendor exclusivity, and the shared batch template.

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
@geoHeil geoHeil force-pushed the fix/vlm-convert-runtime-settings branch from 3523f1e to 8d79e0f Compare May 1, 2026 11:51
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Development

Successfully merging this pull request may close these issues.

VlmConvertModel drops generation settings when building VlmEngineInput

2 participants