# [Samples]: Add VLM alerts sample in python 1/2 #620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Changes from all commits (35 commits):
- `4d1e8b0` samples: add Python VLM alerts sample using HF Optimum + gvagenai (oonyshch)
- `032c733` fix for the pipeline and additional packages resolution (oonyshch)
- `361fe2f` Merge branch 'main' into oonyshch/vlm_alerts (oonyshch)
- `085783d` modify requirements.txt (oonyshch)
- `a6c5825` fix in requirements.txt (oonyshch)
- `026849d` Merge branch 'main' into oonyshch/vlm_alerts (oonyshch)
- `edd52a2` vlm_alerts: fix pipeline string (oonyshch)
- `cd3601f` vlm_alerts: add README.md (oonyshch)
- `173d146` vlm_alerts: refactoring script after being pylint-shamed (oonyshch)
- `2bad652` vlm_alerts: trying to avoid the gst pylint error and restoring the li… (oonyshch)
- `cbcf7e6` vlm_alerts: make pylint ignore the gst import (oonyshch)
- `4e33bf2` vlm_alerts: disable pylint on both gst and glib (oonyshch)
- `d2ccfed` Windows - install VS Build Tools in setup script (#630) (dmichalo)
- `94ee21f` Fixed inconsistencies between code and comments. (#632) (jmotow)
- `0d0f348` Enable custom code to add GstAnalytics data outside of DLS components… (tjanczak)
- `2cfcd49` Extend Optimizer about input device selection and improved results re… (tbujewsk)
- `a7d9843` Disable gstreamer gpl plugins (#636) (mholowni)
- `ac6c189` [POST-PROC][YOLOv26 OBB] add blob parsing function to handle obb dime… (walidbarakat)
- `b3ee1db` Install Visual C++ runtime in setup (#635) (yunowo)
- `d9a159d` [GST gvawatermark] fix watermark default text backgroung behaviour (#… (walidbarakat)
- `2eecaf7` [DOCS] fix formatting (#641) (kblaszczak-intel)
- `c18fb0f` Fix yolo_v10.cpp compile error on windows (#645) (yunowo)
- `3fdc177` [DOCS] Add a warning about improper proxy handling by PAHO library (#… (msmiatac)
- `0941c5c` Update to OpenVino 2026.0.0 (#640) (tbujewsk)
- `b13008e` [vlm_alerts.py]: refine alert logic and improve processing flow (oonyshch)
- `fa1e62d` Merge branch 'main' into oonyshch/vlm_alerts (oonyshch)
- `c1b9594` Merge branch 'main' into oonyshch/vlm_alerts (oonyshch)
- `223951c` cancel changes in cmake and dockerfiles (oonyshch)
- `ced3814` refactoring README.md and requirements.txt (oonyshch)
- `8684109` Merge branch 'main' into oonyshch/vlm_alerts (oonyshch)
- `59a80be` vlm_alerts: add CLI help section to README and fix gi import order (oonyshch)
- `70caa31` vlm_alerts: improve graph in README and change venv name (oonyshch)
- `50d78eb` Merge branch 'main' into oonyshch/vlm_alerts (oonyshch)
- `60c1dc3` vlm_alerts: forgot parenthesis in graph (oonyshch)
- `86f4c1e` vlm_alerts: refactoring of requirements to match new sample conventio… (oonyshch)
# VLM Alerts

This sample demonstrates an edge AI alerting pipeline using Vision-Language Models (VLMs).

It shows how to:

- Download a VLM from Hugging Face
- Convert it to OpenVINO IR using `optimum-cli`
- Run inference inside a DL Streamer pipeline
- Generate structured JSON alerts per processed frame
- Produce MP4 output

## Use Case: Alert-Based Monitoring

VLMs can accurately detect rare or contextual events described in natural language prompts, for example a police car appearing in a traffic video. This makes it possible to raise alerts from prompts such as:

- Is there a police car?
- Is there smoke or fire?
- Is a person lying on the ground?

## Model Preparation

Any image-text-to-text model supported by `optimum-intel` can be used. Smaller models (1B-4B parameters), for example `OpenGVLab/InternVL3_5-2B`, are recommended for edge deployment.
The script runs:

```sh
optimum-cli export openvino \
  --model <model_id> \
  --task image-text-to-text \
  --trust-remote-code \
  <output_dir>
```

Exported artifacts are stored under `models/<ModelName>/`. The export runs once and is cached. To skip export, pass `--model-path` directly.
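The export-and-cache flow can be sketched in Python as follows. This is illustrative, not the sample's actual code: the helper name `export_model` and the derivation of the output directory from the model id are assumptions.

```python
import subprocess
from pathlib import Path


def export_model(model_id: str, models_dir: str = "models") -> Path:
    """Export a Hugging Face VLM to OpenVINO IR, skipping if already cached."""
    # e.g. "OpenGVLab/InternVL3_5-2B" -> models/InternVL3_5-2B
    out_dir = Path(models_dir) / model_id.split("/")[-1]
    if out_dir.exists():
        return out_dir  # already exported; reuse the cached IR

    cmd = [
        "optimum-cli", "export", "openvino",
        "--model", model_id,
        "--task", "image-text-to-text",
        "--trust-remote-code",
        str(out_dir),
    ]
    subprocess.run(cmd, check=True)
    return out_dir
```

Because the check happens before the subprocess call, re-running the sample with the same model id is a no-op.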
## Video Preparation

As with the model, provide either:

- `--video-path` for a local file
- `--video-url` to download automatically

Downloaded videos are cached under `videos/`.
## Pipeline Architecture

The pipeline is built dynamically in Python using `Gst.parse_launch`.

```mermaid
graph LR
A[filesrc] --> B[decodebin3]
B --> C[gvagenai]
C --> D[gvametapublish]
D --> E[gvafpscounter]
E --> F[gvawatermark]
F --> G["encode (vah264enc + h264parse + mp4mux)"]
G --> H[filesink]
```
## Setup

1. Create and activate a virtual environment:

   ```sh
   cd samples/gstreamer/python/vlm_alerts
   python3 -m venv .vlm-venv
   source .vlm-venv/bin/activate
   ```

2. Install dependencies:

   ```sh
   curl -LO https://raw.githubusercontent.com/openvinotoolkit/openvino.genai/refs/heads/releases/2026/0/samples/export-requirements.txt
   pip install -r export-requirements.txt PyGObject==3.50.0
   ```

> A DL Streamer build that includes the `gvagenai` element is required.
## Running

Required arguments:

- `--prompt`
- `--video-path` or `--video-url`
- `--model-id` or `--model-path`

Example:

```sh
python3 vlm_alerts.py \
  --video-url https://videos.pexels.com/video-files/2103099/2103099-hd_1280_720_60fps.mp4 \
  --model-id OpenGVLab/InternVL3_5-2B \
  --prompt "Is there a police car? Answer yes or no."
```
Optional arguments:

| Argument | Default | Description |
|---|---|---|
| `--device` | `GPU` | Inference device |
| `--max-tokens` | `20` | Maximum tokens in the model response |
| `--frame-rate` | `1.0` | Frames per second passed to `gvagenai` |
| `--videos-dir` | `./videos` | Directory for downloaded videos |
| `--models-dir` | `./models` | Directory for exported models |
| `--results-dir` | `./results` | Directory for output files |
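The required and optional arguments above suggest an `argparse` definition along these lines. This is a sketch consistent with the table, not the sample's exact code; in particular, whether the real script uses mutually exclusive groups is an assumption:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """CLI mirroring the arguments and defaults documented above."""
    p = argparse.ArgumentParser(description="VLM alerts sample")
    p.add_argument("--prompt", required=True)

    # exactly one video source and one model source must be given
    video = p.add_mutually_exclusive_group(required=True)
    video.add_argument("--video-path")
    video.add_argument("--video-url")
    model = p.add_mutually_exclusive_group(required=True)
    model.add_argument("--model-id")
    model.add_argument("--model-path")

    p.add_argument("--device", default="GPU")
    p.add_argument("--max-tokens", type=int, default=20)
    p.add_argument("--frame-rate", type=float, default=1.0)
    p.add_argument("--videos-dir", default="./videos")
    p.add_argument("--models-dir", default="./models")
    p.add_argument("--results-dir", default="./results")
    return p
```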
## Output

```
results/<ModelName>-<video_stem>.jsonl
results/<ModelName>-<video_stem>.mp4
```

The `.jsonl` file contains one model response per processed frame and can be used to trigger downstream alerting logic.
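A downstream consumer might scan that file as follows. Since the exact JSON schema emitted by `gvametapublish`/`gvagenai` is not specified here, this sketch searches the whole serialized record for a keyword rather than assuming a field name:

```python
import json
from pathlib import Path


def find_alerts(jsonl_path: str, keyword: str = "yes") -> list[int]:
    """Return indices of frames whose serialized response contains `keyword`."""
    alerts = []
    for i, line in enumerate(Path(jsonl_path).read_text().splitlines()):
        if not line.strip():
            continue
        record = json.loads(line)  # one JSON object per processed frame
        # schema-agnostic: match against the whole serialized record
        if keyword.lower() in json.dumps(record).lower():
            alerts.append(i)
    return alerts
```

With a prompt like "Is there a police car? Answer yes or no.", the returned frame indices are the ones worth alerting on.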
### Help

To display all available arguments and defaults:

```sh
python3 vlm_alerts.py --help
```
> Review comment: I know this is out of scope, just noticed misalignment in formatting.