LoRA support for VLM Pipeline #3402
Conversation
Pull request overview
Adds LoRA adapter support to the OpenVINO GenAI VLMPipeline, enabling runtime adapter application while preserving adapter-owned states across generation and chat resets. This also updates benchmarking utilities and adds Python/C++ samples plus README guidance to demonstrate VLM + LoRA usage.
Changes:
- Integrate LoRA adapter handling into `VLMPipeline` via adapter extraction from properties and `AdapterController` application during generation.
- Extend who-what benchmark VLM loader/model paths to accept adapters/alphas.
- Add new Python and C++ VLM LoRA samples and document how to run them.
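The overview's point about preserving adapter-owned state across chat resets can be illustrated with a toy sketch. The class and field names below are hypothetical, not the real `VLMPipeline` API; the sketch only shows the invariant that a chat reset clears conversation state while leaving adapter-controller state intact:

```python
class ToyVLMPipeline:
    """Toy illustration (not OpenVINO GenAI) of adapter state surviving resets."""

    def __init__(self, adapter_state):
        # State installed when LoRA adapters were applied; must outlive resets.
        self.adapter_state = adapter_state
        # Per-conversation state; rebuilt for every new chat.
        self.chat_history = []

    def generate(self, prompt):
        self.chat_history.append(prompt)
        # Adapter state influences every generation call (placeholder logic).
        return f"{prompt} [alpha={self.adapter_state['alpha']}]"

    def finish_chat(self):
        # Reset the conversation but keep the adapter-owned state intact.
        self.chat_history = []


pipe = ToyVLMPipeline({"alpha": 0.75})
pipe.generate("hello")
pipe.finish_chat()
```

After `finish_chat()`, `pipe.chat_history` is empty but `pipe.adapter_state` is unchanged, which is the behavior the pipeline changes aim for.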
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/who_what_benchmark/whowhatbench/model_loaders.py | Adds adapter plumbing for VLM GenAI pipeline and PEFT-based LoRA merging for HF visual-text models. |
| src/cpp/src/visual_language/pipeline_base.hpp | Stores an optional AdapterController in the VLM pipeline base for reuse by implementations. |
| src/cpp/src/visual_language/pipeline.cpp | Extracts adapters from properties, initializes/applies AdapterController, and preserves adapter state across resets. |
| src/cpp/src/lora/adapter.cpp | Adds adapter path existence/extension validation for safetensors adapters. |
| samples/python/visual_language_chat/visual_language_lora.py | New Python sample demonstrating VLM generation with and without LoRA adapters. |
| samples/python/visual_language_chat/README.md | Documents the new Python LoRA sample and alpha interpretation. |
| samples/cpp/visual_language_chat/visual_language_lora.cpp | New C++ sample demonstrating VLM generation with and without LoRA adapters. |
| samples/cpp/visual_language_chat/README.md | Documents the new C++ LoRA sample and alpha interpretation. |
| samples/cpp/visual_language_chat/CMakeLists.txt | Builds/installs the new C++ LoRA sample binary. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Vladimir Zlobin <vladimir.zlobin@intel.com>
```python
def apply_peft_adapters(model, adapters, alphas, merged_adapter_name="merged_lora"):
    adapters, alphas = normalize_lora_adapters_and_alphas(adapters, alphas)

    from peft import PeftModel

    adapter_names = ["adapter_0"]
    model = PeftModel.from_pretrained(model, adapters[0], adapter_name=adapter_names[0])
```
apply_peft_adapters() calls normalize_lora_adapters_and_alphas(), which returns (None, None) when adapters is None, and then immediately indexes adapters[0]. If apply_peft_adapters() is called with adapters=None (easy to do from external callers since this is a utility), it will crash with a TypeError rather than a clear validation error.
Add an explicit check after normalization (e.g., raise ValueError or return the model unchanged) so the function’s behavior is well-defined for None adapters.
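A minimal sketch of the suggested guard. The `normalize_lora_adapters_and_alphas` stand-in below is simplified for illustration, not the benchmark's actual helper, and the PEFT merging step is omitted:

```python
def normalize_lora_adapters_and_alphas(adapters, alphas):
    # Simplified stand-in for the benchmark helper: passes None through,
    # otherwise fills in a default alpha of 1.0 per adapter.
    if adapters is None:
        return None, None
    if alphas is None:
        alphas = [1.0] * len(adapters)
    return adapters, alphas


def apply_peft_adapters(model, adapters, alphas, merged_adapter_name="merged_lora"):
    adapters, alphas = normalize_lora_adapters_and_alphas(adapters, alphas)
    # Suggested guard: make the adapters=None case explicit instead of
    # letting adapters[0] raise an opaque TypeError below.
    if not adapters:
        return model  # nothing to apply; return the model unchanged
    # ... PEFT loading/merging would follow here (omitted in this sketch)
    return model
```

With the guard in place, `apply_peft_adapters(model, None, None)` returns the model unchanged rather than crashing, giving the utility well-defined behavior for external callers.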
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
tools/who_what_benchmark/tests/test_cli_vlm.py:3
`Path` is imported but never used in this test module. Unused imports can break linting and add noise; please remove it.
import sys
eb33ce8
Description
Enables LoRA support for VLMPipeline
CVS-180080
Documentation
https://likholat.github.io/openvino.genai/
Checklist: