[NPUW][VLM] Add support for multiple images for VLMs on NPU #3110
base: master
AlexanderKalistratov commented Dec 12, 2025
- Remove unnecessary assert
- Add test
Pull request overview
This PR enables support for multiple images in VLM (Vision Language Model) pipelines when using NPU devices by removing an overly restrictive assertion that limited batch sizes to 1.
Key Changes:
- Removed the assertion that restricted NPU VLM pipelines to a single image/video
- Added a new test to verify multiple image support on NPU
- Simplified an existing test by removing an unnecessary loop over multiple images
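
To make the shape of the new test concrete, here is a minimal sketch in Python. It does not use the real pipeline: `StubVLMPipeline`, its `enforce_single_image` flag, and `check_multiple_images` are hypothetical names invented for illustration, and the flag only models the kind of single-image restriction the PR describes removing. The `generate(prompt, images=..., generation_config=...)` call shape mirrors the snippet quoted in the review.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class StubVLMPipeline:
    """Hypothetical stand-in for a VLM pipeline; not the real library API."""

    device: str
    # Assumption: models the removed restriction (one image only on NPU).
    enforce_single_image: bool = False
    # Records how many images each generate() call received.
    calls: List[int] = field(default_factory=list)

    def generate(self, prompt: str, images: List[Any], generation_config: Any = None) -> str:
        if self.device == "NPU" and self.enforce_single_image and len(images) > 1:
            raise ValueError("NPU pipeline accepts a single image only")
        self.calls.append(len(images))
        return f"{prompt} [{len(images)} image(s)]"


def check_multiple_images(pipe: StubVLMPipeline, prompt: str, images: List[Any]) -> str:
    """One call carrying several images must succeed and return text."""
    result = pipe.generate(prompt, images=images)
    assert result, "expected non-empty output"
    return result


def restriction_triggers(pipe: StubVLMPipeline, images: List[Any]) -> bool:
    """Return True if the pipeline rejects the given image list."""
    try:
        pipe.generate("prompt", images=images)
    except ValueError:
        return True
    return False
```

With the restriction disabled, a single call with two images goes through; with it enabled, the same call is rejected, which is the behavior the removed assertion would have produced.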
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/cpp/src/visual_language/pipeline.cpp | Removed assertion limiting NPU to single image/video processing |
| tests/python_tests/test_vlm_pipeline.py | Added test for multiple images, simplified existing test, and added missing platform skip decorator |

    ov_pipe.generate(
        PROMPTS[0], images=[cat_tensor], generation_config=generation_config
    )

Copilot AI, Dec 12, 2025
The indentation of the ov_pipe.generate() call has changed from being inside a loop to standalone, but the indentation level appears inconsistent. The function call should be indented 4 spaces from the function definition, matching the style of other statements at the same level.