Merged
35 commits
4d1e8b0
samples: add Python VLM alerts sample using HF Optimum + gvagenai
oonyshch Feb 17, 2026
032c733
fix for the pipeline and additional packages resolution
oonyshch Feb 17, 2026
361fe2f
Merge branch 'main' into oonyshch/vlm_alerts
oonyshch Feb 17, 2026
085783d
modify requirements.txt
oonyshch Feb 17, 2026
a6c5825
fix in requirements.txt
oonyshch Feb 19, 2026
026849d
Merge branch 'main' into oonyshch/vlm_alerts
oonyshch Feb 23, 2026
edd52a2
vlm_alerts: fix pipeline string
oonyshch Feb 23, 2026
cd3601f
vlm_alerts: add README.md
oonyshch Feb 23, 2026
173d146
vlm_alerts: refactoring script after being pylint-shamed
oonyshch Feb 23, 2026
2bad652
vlm_alerts: trying to avoid the gst pylint error and restoring the li…
oonyshch Feb 23, 2026
cbcf7e6
vlm_alerts: make pylint ignore the gst import
oonyshch Feb 23, 2026
4e33bf2
vlm_alerts: disable pylint on both gst and glib
oonyshch Feb 23, 2026
d2ccfed
Windows - install VS Build Tools in setup script (#630)
dmichalo Feb 23, 2026
94ee21f
Fixed inconsistencies between code and comments. (#632)
jmotow Feb 23, 2026
0d0f348
Enable custom code to add GstAnalytics data outside of DLS components…
tjanczak Feb 24, 2026
2cfcd49
Extend Optimizer about input device selection and improved results re…
tbujewsk Feb 24, 2026
a7d9843
Disable gstreamer gpl plugins (#636)
mholowni Feb 24, 2026
ac6c189
[POST-PROC][YOLOv26 OBB] add blob parsing function to handle obb dime…
walidbarakat Feb 25, 2026
b3ee1db
Install Visual C++ runtime in setup (#635)
yunowo Feb 25, 2026
d9a159d
[GST gvawatermark] fix watermark default text backgroung behaviour (#…
walidbarakat Feb 25, 2026
2eecaf7
[DOCS] fix formatting (#641)
kblaszczak-intel Feb 25, 2026
c18fb0f
Fix yolo_v10.cpp compile error on windows (#645)
yunowo Feb 26, 2026
3fdc177
[DOCS] Add a warning about improper proxy handling by PAHO library (#…
msmiatac Feb 26, 2026
0941c5c
Update to OpenVino 2026.0.0 (#640)
tbujewsk Feb 26, 2026
b13008e
[vlm_alerts.py]: refine alert logic and improve processing flow
oonyshch Feb 26, 2026
fa1e62d
Merge branch 'main' into oonyshch/vlm_alerts
oonyshch Feb 26, 2026
c1b9594
Merge branch 'main' into oonyshch/vlm_alerts
oonyshch Feb 26, 2026
223951c
cancel changes in cmake and dockerfiles
oonyshch Feb 26, 2026
ced3814
refactoring README.md and requirements.txt
oonyshch Feb 26, 2026
8684109
Merge branch 'main' into oonyshch/vlm_alerts
oonyshch Feb 26, 2026
59a80be
vlm_alerts: add CLI help section to README and fix gi import order
oonyshch Feb 26, 2026
70caa31
vlm_alerts: improve graph in README and change venv name
oonyshch Feb 27, 2026
50d78eb
Merge branch 'main' into oonyshch/vlm_alerts
oonyshch Feb 27, 2026
60c1dc3
vlm_alerts: forgot parenthesis in graph
oonyshch Feb 27, 2026
86f4c1e
vlm_alerts: refactoring of requirements to match new sample conventio…
oonyshch Feb 27, 2026
2 changes: 1 addition & 1 deletion dependencies/gstreamer.cmake
Expand Up @@ -39,7 +39,7 @@ ExternalProject_Add(
-Ddevtools=disabled
-Dorc=disabled
-Dgpl=disabled
-Dpython=enabled
-Dpython=enabled
Contributor Author
I know this is out of scope, just noticed misalignment in formatting.

-Dgst-plugins-base:nls=disabled
-Dgst-plugins-base:gl=disabled
-Dgst-plugins-base:xvideo=enabled
Expand Down
122 changes: 122 additions & 0 deletions samples/gstreamer/python/vlm_alerts/README.md
@@ -0,0 +1,122 @@
# VLM Alerts

This sample demonstrates an edge AI alerting pipeline using Vision-Language Models (VLMs).

It shows how to:

- Download a VLM from Hugging Face
- Convert it to OpenVINO IR using `optimum-cli`
- Run inference inside a DL Streamer pipeline
- Generate structured JSON alerts per processed frame
- Produce MP4 output

## Use Case: Alert-Based Monitoring

VLMs can help accurately detect rare or contextual events using natural language prompts — for example, detecting a police car in a traffic video.
This enables alerting on events expressed as natural-language prompts, for example:

- Is there a police car?
- Is there smoke or fire?
- Is a person lying on the ground?

## Model Preparation

Any image-text-to-text model supported by `optimum-intel` can be used. Smaller models (1B-4B parameters) are recommended for edge deployment, for example `OpenGVLab/InternVL3_5-2B`.

The script runs:

```code
optimum-cli export openvino \
--model <model_id> \
--task image-text-to-text \
--trust-remote-code \
<output_dir>
```

Exported artifacts are stored under `models/<ModelName>/`.
The export runs once and is cached. To skip export, pass `--model-path` directly.
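
The run-once-and-cache behaviour can be sketched as follows. This is a minimal illustration, not the sample's exact code; the function name and the `models/<ModelName>/` layout derived from the model id are assumptions:

```python
import subprocess
from pathlib import Path

def export_model(model_id: str, models_dir: str = "models") -> Path:
    """Export a HF model to OpenVINO IR once; reuse the cached copy afterwards."""
    # e.g. "OpenGVLab/InternVL3_5-2B" -> models/InternVL3_5-2B
    out_dir = Path(models_dir) / model_id.split("/")[-1]
    if not out_dir.exists():  # export only on the first run
        subprocess.run(
            ["optimum-cli", "export", "openvino",
             "--model", model_id,
             "--task", "image-text-to-text",
             "--trust-remote-code",
             str(out_dir)],
            check=True,
        )
    return out_dir
```

Passing `--model-path` in the real script bypasses this step entirely.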

## Video Preparation

As with the model, provide either:

- `--video-path` for a local file
- `--video-url` to download automatically

Downloaded videos are cached under `videos/`.
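
The download-and-cache step can be sketched like this (a hypothetical helper, assuming the local file name is taken from the last URL path segment):

```python
import urllib.parse
import urllib.request
from pathlib import Path

def fetch_video(video_url: str, videos_dir: str = "videos") -> Path:
    """Download a video once; later runs reuse the cached file."""
    videos = Path(videos_dir)
    videos.mkdir(parents=True, exist_ok=True)
    # derive the local file name from the last URL path segment
    target = videos / Path(urllib.parse.urlparse(video_url).path).name
    if not target.exists():  # skip the download when the file is cached
        urllib.request.urlretrieve(video_url, target)
    return target
```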

## Pipeline Architecture

The pipeline is built dynamically in Python using `Gst.parse_launch`.

```mermaid
graph LR
A[filesrc] --> B[decodebin3]
B --> C[gvagenai]
C --> D[gvametapublish]
D --> E[gvafpscounter]
E --> F[gvawatermark]
F --> G["encode (vah264enc + h264parse + mp4mux)"]
G --> H[filesink]
```
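
The graph above is assembled as a single launch string and handed to `Gst.parse_launch`. A minimal sketch of that assembly follows; the property names on `gvagenai` and `gvametapublish` are illustrative assumptions, not the sample's exact code:

```python
def build_pipeline(video_path, model_path, prompt,
                   device="GPU", frame_rate=1.0, out_path="output.mp4"):
    """Build the launch string for Gst.parse_launch.

    Element properties used here (model, prompt, device, frame-rate,
    method, file-format) are assumptions for illustration.
    """
    return " ! ".join([
        f"filesrc location={video_path}",
        "decodebin3",
        f'gvagenai model={model_path} prompt="{prompt}" '
        f"device={device} frame-rate={frame_rate}",
        "gvametapublish method=file file-format=json-lines",
        "gvafpscounter",
        "gvawatermark",
        "vah264enc",    # encode branch from the graph above
        "h264parse",
        "mp4mux",
        f"filesink location={out_path}",
    ])
```

The resulting string would then be run with `Gst.parse_launch(build_pipeline(...))` after `Gst.init(None)`.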

## Setup

1. Create and activate a virtual environment:
```code
cd samples/gstreamer/python/vlm_alerts
python3 -m venv .vlm-venv
source .vlm-venv/bin/activate
```

2. Install dependencies:
```code
curl -LO https://raw.githubusercontent.com/openvinotoolkit/openvino.genai/refs/heads/releases/2026/0/samples/export-requirements.txt
pip install -r export-requirements.txt PyGObject==3.50.0
```

> A DL Streamer build that includes the `gvagenai` element is required.

## Running

Required arguments:

- `--prompt`
- `--video-path` or `--video-url`
- `--model-id` or `--model-path`

Example:

```code
python3 vlm_alerts.py \
--video-url https://videos.pexels.com/video-files/2103099/2103099-hd_1280_720_60fps.mp4 \
--model-id OpenGVLab/InternVL3_5-2B \
--prompt "Is there a police car? Answer yes or no."
```

Optional arguments:

| Argument | Default | Description |
|---|---|---|
| `--device` | `GPU` | Inference device |
| `--max-tokens` | `20` | Maximum tokens in the model response |
| `--frame-rate` | `1.0` | Frames per second passed to `gvagenai` |
| `--videos-dir` | `./videos` | Directory for downloaded videos |
| `--models-dir` | `./models` | Directory for exported models |
| `--results-dir` | `./results` | Directory for output files |
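
The argument surface above can be mirrored with a standard `argparse` parser; this is a sketch of the CLI contract, not the sample's implementation, and the mutually-exclusive grouping is an assumption based on the "or" wording:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the required and optional arguments listed above."""
    p = argparse.ArgumentParser(description="VLM alerts sample")
    p.add_argument("--prompt", required=True)
    src = p.add_mutually_exclusive_group(required=True)
    src.add_argument("--video-path")
    src.add_argument("--video-url")
    model = p.add_mutually_exclusive_group(required=True)
    model.add_argument("--model-id")
    model.add_argument("--model-path")
    p.add_argument("--device", default="GPU")
    p.add_argument("--max-tokens", type=int, default=20)
    p.add_argument("--frame-rate", type=float, default=1.0)
    p.add_argument("--videos-dir", default="./videos")
    p.add_argument("--models-dir", default="./models")
    p.add_argument("--results-dir", default="./results")
    return p
```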

## Output

```
results/<ModelName>-<video_stem>.jsonl
results/<ModelName>-<video_stem>.mp4
```

The `.jsonl` file contains one model response per processed frame and can be used to trigger downstream alerting logic.
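
Downstream alerting can be as simple as scanning the `.jsonl` for a keyword in the model's answer. A minimal sketch, assuming one JSON object per line; matching on the serialized record avoids guessing the exact field name used for the response:

```python
import json

def scan_alerts(jsonl_path, keyword="yes"):
    """Return records whose VLM response contains the alert keyword."""
    hits = []
    with open(jsonl_path) as f:
        for line in f:
            record = json.loads(line)
            # case-insensitive search over the whole serialized record
            if keyword.lower() in json.dumps(record).lower():
                hits.append(record)
    return hits
```

With the prompt "Is there a police car? Answer yes or no.", each hit marks a frame where the model answered yes.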

### Help

To display all available arguments and defaults:

```code
python3 vlm_alerts.py --help
```