fix: forward VLM convert runtime generation settings#3322
Conversation
Merge ProtectionsYour pull request matches the following merge protections and will not be merged until they are valid. 🟢 Enforce conventional commitWonderful, this rule succeeded.Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
|
6b50315 to
18a5e20
Compare
|
✅ DCO Check Passed Thanks @geoHeil, all your commits are properly signed off. 🎉 |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR forwards VLM convert-stage generation settings from VlmModelSpec into VlmEngineInput, adds debug timing logs around preprocessing and inference, and introduces regression tests to ensure both VLM convert entry points respect the configured generation settings.
Changes:
- Extend
VlmModelSpecwithtemperatureandextra_generation_config, and forward generation settings when constructingVlmEngineInput. - Add debug timing for image preparation (rasterize/resize) and batch inference in the VLM convert stage.
- Add regression tests for VLM convert generation settings and API engine parameter precedence behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_vlm_convert_model.py | Adds regression tests asserting generation settings are forwarded for both process_images() and __call__() paths. |
| tests/test_api_vlm_engine.py | Adds tests validating API param precedence between model defaults, per-request settings, and explicit user params. |
| docling/models/stages/vlm_convert/vlm_convert_model.py | Centralizes VlmEngineInput construction and adds debug timing logs for preprocessing and inference. |
| docling/models/inference_engines/vlm/api_openai_compatible_engine.py | Changes API parameter precedence/merging and introduces separate storage for model vs user params. |
| docling/datamodel/stage_model_specs.py | Adds temperature and extra_generation_config fields to VlmModelSpec. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
18a5e20 to
2fd433e
Compare
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
- api_openai_compatible_engine: apply per-request ``stop`` before merging user params so that explicit user ``stop`` wins over the generation defaults (addresses Copilot comment on docling-project#3322). - vlm_convert_model: build the per-batch generation template (stop strings + runtime generation config) once per call and reuse it across every ``VlmEngineInput`` instead of re-allocating per item. - tests: add regression coverage for the user-stop precedence, the watsonx-style vendor exclusivity, and the shared batch template. Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
- api_openai_compatible_engine: apply per-request ``stop`` before merging user params so that explicit user ``stop`` wins over the generation defaults (addresses Copilot comment on docling-project#3322). - vlm_convert_model: build the per-batch generation template (stop strings + runtime generation config) once per call and reuse it across every ``VlmEngineInput`` instead of re-allocating per item. - tests: add regression coverage for the user-stop precedence, the watsonx-style vendor exclusivity, and the shared batch template. Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
3523f1e to
8d79e0f
Compare
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Forward the VLM convert stage generation settings from
model_specwhen constructingVlmEngineInput, and add debug timing around preprocessing and batch inference.This change:
temperatureandextra_generation_configtoVlmModelSpectemperature,max_new_tokens,stop_strings, andextra_generation_configthrough bothVlmConvertModel.__call__()andprocess_images()Issue resolved by this Pull Request:
Resolves #3321
Checklist: