Lora support for VLM Pipeline #3402
Merged
52 commits
3c7aefd lora for vlm (likholat)
afe925c hf vlm with lora (likholat)
5d1a42a python sample (likholat)
de0fe24 Merge remote-tracking branch 'origin/master' into lora_for_vlm (likholat)
5c9abc0 enable Lora for VLM CB wwb case (likholat)
bfea1cc Merge remote-tracking branch 'origin/master' into lora_for_vlm (likholat)
61e3b3a lora vlm sdpa acc fix (likholat)
b8f802b Update src/cpp/src/lora/adapter.cpp (likholat)
83ad53d Update samples/cpp/visual_language_chat/visual_language_lora.cpp (likholat)
eccee7d add test (likholat)
f32818b Merge remote-tracking branch 'origin/master' into lora_for_vlm (likholat)
a01b4fb codestyle fixes (likholat)
8bd2d5a codestyle fixes (likholat)
fb6a2b8 wwb with and without adapter fix (likholat)
f218a13 add includes (likholat)
521bff3 fix ci test (likholat)
6669197 get_tensor_name_prefix().value_or (likholat)
e03ee2b reset_language_state func (likholat)
24426d4 copyright for python sample (likholat)
f893dae codestyle fix (likholat)
6597e0c Merge remote-tracking branch 'origin/master' into lora_for_vlm (likholat)
9c9fa74 mv lora test (likholat)
e210c08 review fixes (likholat)
a98e1f4 samples fix (likholat)
0bf8454 review fixes: samples update (likholat)
192d48d docs update (likholat)
d8759f7 align python sample output with cpp sample (likholat)
a709568 docs fix (likholat)
79bd448 python sample CACHE_DIR fix (likholat)
0b76255 docs fix (likholat)
b4a1544 fix sample (likholat)
f171114 normalize_lora_adapters_and_alphas (likholat)
13a1a20 codestyle fixes (likholat)
00b2c76 review fixes (likholat)
fffa862 rm device from samples (likholat)
ba60ad1 review fixes (likholat)
2f09df7 func for create adapter config (likholat)
8553f83 lora wwb test (likholat)
4a80bcc fix ci fails (likholat)
4f72979 add lora tests (likholat)
422ed76 fix lora (likholat)
c54fd31 fix wwb test (likholat)
970c833 enable tests (likholat)
db5cc27 codestyle fix (likholat)
5a25225 Merge remote-tracking branch 'origin/master' into lora_for_vlm (likholat)
bce9726 Update samples/cpp/visual_language_chat/visual_language_lora.cpp (likholat)
ba7c2b2 mv _download_hf_files_to_cache (likholat)
944956d update requirements (likholat)
508eda2 random vlm + random lora for wwb test (likholat)
13b51c1 codestyle fix (likholat)
5e1f187 fix cache dir for test (likholat)
812b3c7 specify peft version (likholat)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| // Copyright (C) 2026 Intel Corporation | ||
| // SPDX-License-Identifier: Apache-2.0 | ||
| | ||
| #include "load_image.hpp" | ||
| #include <openvino/core/except.hpp> | ||
| #include <openvino/genai/visual_language/pipeline.hpp> | ||
| #include <cstdlib> | ||
| #include <filesystem> | ||
| #include <iostream> | ||
| #include <stdexcept> | ||
| #include <string> | ||
| #include <vector> | ||
| | ||
| ov::genai::StreamingStatus print_subword(std::string&& subword) { | ||
|     std::cout << subword << std::flush; | ||
|     return ov::genai::StreamingStatus::RUNNING; | ||
| } | ||
| int main(int argc, char* argv[]) try { | ||
|     // At least one LoRA adapter must be provided. | ||
|     OPENVINO_ASSERT(argc >= 6 && ((argc - 4) % 2) == 0, | ||
|                     "Usage: ", argv[0], | ||
|                     " <MODEL_DIR> <IMAGE_FILE OR DIR_WITH_IMAGES> <PROMPT> <LORA_SAFETENSORS> <ALPHA> [<LORA_SAFETENSORS> <ALPHA> ...]"); | ||
| | ||
|     std::vector<ov::Tensor> rgbs = utils::load_images(argv[2]); | ||
| | ||
|     const std::string device = "CPU"; // GPU can be used as well | ||
|     ov::AnyMap pipeline_properties; | ||
| | ||
|     const std::string prompt = argv[3]; | ||
| | ||
|     // LoRA args parsed as pairs: <LORA_SAFETENSORS> <ALPHA> | ||
|     ov::genai::AdapterConfig adapter_config; | ||
|     for (int idx = 4; idx + 1 < argc; idx += 2) { | ||
|         ov::genai::Adapter adapter(argv[idx]); | ||
|         float alpha = std::stof(argv[idx + 1]); | ||
|         adapter_config.add(adapter, alpha); | ||
|     } | ||
|     pipeline_properties.insert({ov::genai::adapters(adapter_config)}); | ||
| | ||
|     ov::genai::VLMPipeline pipe(argv[1], device, pipeline_properties); | ||
| | ||
|     ov::genai::GenerationConfig generation_config; | ||
|     generation_config.max_new_tokens = 100; | ||
| | ||
|     std::cout << "Generating answer with LoRA adapters applied:\n"; | ||
|     pipe.generate(prompt, | ||
|                   ov::genai::images(rgbs), | ||
|                   ov::genai::generation_config(generation_config), | ||
|                   ov::genai::streamer(print_subword)); | ||
| | ||
|     std::cout << "\n----------\nGenerating answer without LoRA adapters applied:\n"; | ||
|     pipe.generate(prompt, | ||
|                   ov::genai::images(rgbs), | ||
|                   ov::genai::generation_config(generation_config), | ||
|                   ov::genai::adapters(), | ||
|                   ov::genai::streamer(print_subword)); | ||
|     std::cout << "\n----------\n"; | ||
| | ||
| } catch (const std::exception& error) { | ||
|     try { | ||
|         std::cerr << error.what() << '\n'; | ||
|     } catch (const std::ios_base::failure&) {} | ||
|     return EXIT_FAILURE; | ||
| } catch (...) { | ||
|     try { | ||
|         std::cerr << "Non-exception object thrown\n"; | ||
|     } catch (const std::ios_base::failure&) {} | ||
|     return EXIT_FAILURE; | ||
| } | ||
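
Both `generate()` calls in the sample run the same prompt and images; only the adapter configuration differs. As a rough mental model, and assuming the simplified textbook LoRA formulation W' = W + Σ alpha_i · (B_i · A_i) rather than anything confirmed by this diff, the effect of the alpha weights and of passing an empty adapter set can be sketched in plain Python. The helper names `matmul` and `apply_adapters` and the toy matrices are hypothetical; how OpenVINO GenAI applies adapters internally is not shown in this PR view.

```python
# Conceptual sketch only: plain-Python LoRA blending, not OpenVINO GenAI internals.
# Assumes the simplified form W' = W + sum(alpha_i * B_i @ A_i); rank scaling is omitted.


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_adapters(base, adapters):
    """Return base + sum(alpha * B @ A) over (B, A, alpha) triples.

    An empty adapter list leaves the base weights unchanged, which is the
    analogue of the second generate() call that passes an empty adapter set.
    """
    result = [row[:] for row in base]  # copy so the base weights stay untouched
    for b, a, alpha in adapters:
        delta = matmul(b, a)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                result[i][j] += alpha * value
    return result


base = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 2x2 base weight
b = [[1.0], [2.0]]               # rank-1 factor B (2x1)
a = [[3.0, 4.0]]                 # rank-1 factor A (1x2)

with_lora = apply_adapters(base, [(b, a, 0.5)])
without_lora = apply_adapters(base, [])
print(with_lora)
print(without_lora)
```

With several `(B, A, alpha)` triples the deltas simply add up, which is one way to read the repeated `<LORA_SAFETENSORS> <ALPHA>` pairs on the command line.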
128 changes: 128 additions & 0 deletions
samples/python/visual_language_chat/visual_language_lora.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| #!/usr/bin/env python3 | ||
| # Copyright (C) 2026 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| | ||
| import argparse | ||
| import numpy as np | ||
| import openvino_genai as ov_genai | ||
| | ||
| from pathlib import Path | ||
| from PIL import Image | ||
| from openvino import Tensor | ||
| | ||
| | ||
| def streamer(subword: str) -> None: | ||
|     """ | ||
| | ||
|     Args: | ||
|         subword: sub-word of the generated text. | ||
| | ||
|     Returns: None, which is treated the same as openvino_genai.StreamingStatus.RUNNING, so generation continues. | ||
| | ||
|     """ | ||
|     print(subword, end="", flush=True) | ||
| | ||
|     # No value is returned because this example never stops generation from the streamer. | ||
|     # "return None" will be treated the same as "return openvino_genai.StreamingStatus.RUNNING". | ||
| | ||
| | ||
| def read_image(path: str) -> Tensor: | ||
|     """ | ||
| | ||
|     Args: | ||
|         path: The path to the image. | ||
| | ||
|     Returns: the ov.Tensor containing the image. | ||
| | ||
|     """ | ||
|     pic = Image.open(path).convert("RGB") | ||
|     image_data = np.array(pic) | ||
|     return Tensor(image_data) | ||
| | ||
| | ||
| def read_images(path: str) -> list[Tensor]: | ||
|     entry = Path(path) | ||
|     if entry.is_dir(): | ||
|         return [read_image(str(file)) for file in sorted(entry.iterdir())] | ||
|     return [read_image(path)] | ||
| | ||
| | ||
| def parse_lora_pairs(raw): | ||
|     if len(raw) < 2: | ||
|         raise argparse.ArgumentTypeError( | ||
|             "At least one LoRA adapter pair is required: <LORA_SAFETENSORS> <ALPHA> [<LORA_SAFETENSORS> <ALPHA> ...]" | ||
|         ) | ||
|     if len(raw) % 2 != 0: | ||
|         raise argparse.ArgumentTypeError("LoRA args must come in pairs: <LORA_SAFETENSORS> <ALPHA> ...") | ||
| | ||
|     pairs = [] | ||
|     for i in range(0, len(raw), 2): | ||
|         path = raw[i] | ||
|         try: | ||
|             alpha = float(raw[i + 1]) | ||
|         except ValueError as e: | ||
|             raise argparse.ArgumentTypeError(f"Invalid alpha '{raw[i + 1]}' for LoRA '{path}'") from e | ||
|         pairs.append((path, alpha)) | ||
|     return pairs | ||
| | ||
| | ||
| def main() -> int: | ||
|     p = argparse.ArgumentParser( | ||
|         description="OpenVINO GenAI VLM sample: run with and without LoRA adapters.", | ||
|         formatter_class=argparse.RawTextHelpFormatter, | ||
|     ) | ||
|     p.add_argument("model_dir", help="Path to model directory") | ||
|     p.add_argument("images_path", help="Image file OR directory with images") | ||
|     p.add_argument("prompt", help="Prompt/question to ask") | ||
|     p.add_argument( | ||
|         "lora_pairs", | ||
|         nargs="+", | ||
|         metavar="LORA_ALPHA", | ||
|         help="Pairs: <LORA_SAFETENSORS> <ALPHA> ...", | ||
|     ) | ||
| | ||
|     args = p.parse_args() | ||
|     prompt = args.prompt | ||
|     loras = parse_lora_pairs(args.lora_pairs) | ||
| | ||
|     rgbs = read_images(args.images_path) | ||
| | ||
|     device = "CPU"  # GPU can be used as well | ||
| | ||
|     pipe_kwargs = {} | ||
| | ||
|     # Configure LoRA adapters with weights (alphas) | ||
|     if loras: | ||
|         adapter_config = ov_genai.AdapterConfig() | ||
|         for lora_path, alpha in loras: | ||
|             adapter_config.add(ov_genai.Adapter(lora_path), alpha) | ||
|         pipe_kwargs["adapters"] = adapter_config | ||
| | ||
|     pipe = ov_genai.VLMPipeline(args.model_dir, device, **pipe_kwargs) | ||
| | ||
|     gen_cfg = ov_genai.GenerationConfig() | ||
|     gen_cfg.max_new_tokens = 100 | ||
| | ||
|     print("Generating answer with LoRA adapters applied:") | ||
|     pipe.generate( | ||
|         prompt, | ||
|         images=rgbs, | ||
|         generation_config=gen_cfg, | ||
|         streamer=streamer, | ||
|     ) | ||
| | ||
|     print("\n----------\nGenerating answer without LoRA adapters applied:") | ||
|     pipe.generate( | ||
|         prompt, | ||
|         images=rgbs, | ||
|         generation_config=gen_cfg, | ||
|         adapters=ov_genai.AdapterConfig(), | ||
|         streamer=streamer, | ||
|     ) | ||
| | ||
|     print("\n----------") | ||
|     return 0 | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     raise SystemExit(main()) | ||
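
One design point in `parse_lora_pairs`: `argparse.ArgumentTypeError` is only turned into a usage message when it is raised from a `type=` callback, and here it is raised after `parse_args()` has returned, so a bad argument would surface as a traceback. A possible alternative, sketched here with only the standard library, reports the same problems through `ArgumentParser.error()`, which prints the usage line to stderr and exits with status 2. The names `build_parser`, `parse_pairs`, and `parse_cli` are illustrative and not part of the PR, and the parser is reduced to the LoRA arguments.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Reduced parser: only the trailing LoRA arguments from the sample.
    parser = argparse.ArgumentParser(description="LoRA pair parsing sketch")
    parser.add_argument("lora_pairs", nargs="+", metavar="LORA_ALPHA")
    return parser


def parse_pairs(parser, raw):
    """Turn [path, alpha, path, alpha, ...] into [(path, float), ...].

    Problems are reported through parser.error(), which prints the usage
    string to stderr and exits with status 2.
    """
    if len(raw) % 2 != 0:
        parser.error("LoRA args must come in pairs: <LORA_SAFETENSORS> <ALPHA> ...")
    pairs = []
    for path, alpha_text in zip(raw[0::2], raw[1::2]):
        try:
            pairs.append((path, float(alpha_text)))
        except ValueError:
            parser.error(f"Invalid alpha '{alpha_text}' for LoRA '{path}'")
    return pairs


def parse_cli(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    return parse_pairs(parser, args.lora_pairs)
```

For example, `parse_cli(["a.safetensors", "0.5"])` returns one `(path, alpha)` tuple, while an odd number of arguments ends the process with the usual argparse usage error instead of a stack trace.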