Commits (50)
e035171
add NPU env var
ajagadi1 Sep 26, 2025
90a85d8
added npu and cpu pipelines
ajagadi1 Sep 28, 2025
1fc8534
added npu and cpu pipelines for all apps
ajagadi1 Sep 28, 2025
22f844d
updated worker safety docs
ajagadi1 Sep 29, 2025
27dc24b
updated weld porosity docs
ajagadi1 Sep 29, 2025
e4ff787
update weld docs
ajagadi1 Sep 29, 2025
47e4ca9
updated docs for pcb
ajagadi1 Sep 29, 2025
ab05537
added cpu, gpu and npu pipelines for smart parking
ajagadi1 Oct 7, 2025
6930be0
updated docs for smart parking
ajagadi1 Oct 7, 2025
4cff8c9
updated smart parking pipelines
ajagadi1 Oct 7, 2025
58a1bbc
added gpu pipeline for loitering detection
ajagadi1 Oct 7, 2025
91687f6
Merge branch 'main' of https://github.com/open-edge-platform/edge-ai-…
ajagadi1 Oct 8, 2025
1e1b837
Update config.json to use the correct device for NPU pipelines (#771)
sairampillai Oct 9, 2025
a487baf
[Metro AI Suite] Update smart parking with yolov11s model (#814)
ajagadi1 Oct 15, 2025
83c618a
updated model paths (#817)
ajagadi1 Oct 16, 2025
dd584c9
Update loitering detection model from FP32 to FP16 (#822)
ajagadi1 Oct 16, 2025
984f982
Loitering detection: add cpu gpu and npu docs (#844)
ajagadi1 Oct 23, 2025
841842d
Add benchmarking mode for loitering-detection
sairampillai Oct 23, 2025
21c462d
Add benchmarking mode for smart-parking
sairampillai Oct 23, 2025
9f9cfe9
removing duplicated overlay on the final frame (#864)
deepaks2 Oct 24, 2025
9bc6eae
Merge branch 'magic9-manufacturing-metro-vision-apps' into sairampill…
deepaks2 Oct 24, 2025
70fcb46
Update benchmark script for looped video and benchmark payload
sairampillai Oct 24, 2025
66c032e
Add npu payload
sairampillai Oct 24, 2025
c7cf541
Update benchmarking logic inline with WSF
sairampillai Oct 27, 2025
7ade5f4
Merge main
sairampillai Oct 27, 2025
0801351
Merge branch 'main' of https://github.com/open-edge-platform/edge-ai-…
sairampillai Oct 27, 2025
7c2cfc7
Fix failing CI for docs
sairampillai Oct 27, 2025
9e8153a
Fix another CI and merge conflict reversal
sairampillai Oct 27, 2025
2760560
Merge branch 'main' into sairampillai/metro-m9-benchmarking
sairampillai Oct 27, 2025
92f10f4
Merge branch 'main' into sairampillai/metro-m9-benchmarking
sairampillai Oct 28, 2025
867d3bd
Update docs
sairampillai Oct 28, 2025
8f88005
Update how-to-use-cpu-for-inference.md to fix CI
sairampillai Oct 28, 2025
f4e8bf1
Remove benchmarking pipelines
sairampillai Oct 29, 2025
7bab047
Merge branch 'sairampillai/metro-m9-benchmarking' of https://github.c…
sairampillai Oct 29, 2025
9141720
Update and refactor payload files
sairampillai Oct 29, 2025
56f632b
Update benchmark script to take in refactored payload file and pipeline
sairampillai Oct 29, 2025
1a1b1d0
Update documentation
sairampillai Oct 29, 2025
928d5ac
Merge branch 'main' into sairampillai/metro-m9-benchmarking
sairampillai Oct 29, 2025
be1a0a5
Merge branch 'main' into sairampillai/metro-m9-benchmarking
sairampillai Oct 29, 2025
a60bf38
Merge branch 'main' into sairampillai/metro-m9-benchmarking
sairampillai Oct 30, 2025
c2c26bb
Update default pipeline parameters
deepaks2 Oct 31, 2025
ca062c9
Update smart-parking default pipeline parameters
deepaks2 Oct 31, 2025
d219dff
Update docs
deepaks2 Oct 31, 2025
119c0b6
Merge branch 'main' into sairampillai/metro-m9-benchmarking
sairampillai Oct 31, 2025
cee51f9
Merge branch 'main' into sairampillai/metro-m9-benchmarking
sairampillai Oct 31, 2025
de7797c
Add device group rules for gpu support
tejaswinijayashanker943 Oct 31, 2025
aadb4c7
Add GPU pipelines
tejaswinijayashanker943 Oct 31, 2025
d07a9d0
Update sample scripts for smart-intersection
tejaswinijayashanker943 Oct 31, 2025
2f489b8
Merge branch 'sairampillai/metro-m9-benchmarking' into sairam/metro-s…
sairampillai Oct 31, 2025
28855dd
Merge branch 'main' into sairam/metro-smart-intersection-gpu
sairampillai Nov 4, 2025
486 changes: 486 additions & 0 deletions metro-ai-suite/metro-vision-ai-app-recipe/benchmark_start.sh

Large diffs are not rendered by default.

@@ -164,8 +164,16 @@ services:
      - scenescape
    privileged: true
    entrypoint: ["./run.sh"]
    group_add:
      - "109"
      - "110"
      - "992"
    device_cgroup_rules:
      - 'c 189:* rmw'
      - 'c 209:* rmw'
      - 'a 189:* rwm'
    devices:
      - "/dev/dri:/dev/dri"
      - "/dev:/dev"
    depends_on:
      - broker
      - ntpserver
@@ -0,0 +1,80 @@
[
    {
        "pipeline": "object_tracking_cpu",
        "payload":{
            "source": {
                "uri": "file:///home/pipeline-server/videos/VIRAT_S_000101_looped.mp4",
                "type": "uri"
            },
            "destination": {
                "metadata": {
                    "type": "mqtt",
                    "topic": "object_tracking_$x",
                    "publish_frame":false
                },
                "frame": {
                    "type": "webrtc",
                    "peer-id": "object_tracking_$x"
                }
            },
            "parameters": {
                "detection-properties": {
                    "model": "/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml",
                    "device": "CPU"
                }
            }
        }
    },
    {
        "pipeline": "object_tracking_gpu",
        "payload":{
            "source": {
                "uri": "file:///home/pipeline-server/videos/VIRAT_S_000101_looped.mp4",
                "type": "uri"
            },
            "destination": {
                "metadata": {
                    "type": "mqtt",
                    "topic": "object_tracking_$x",
                    "publish_frame":false
                },
                "frame": {
                    "type": "webrtc",
                    "peer-id": "object_tracking_$x"
                }
            },
            "parameters": {
                "detection-properties": {
                    "model": "/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml",
                    "device": "GPU"
                }
            }
        }
    },
    {
        "pipeline": "object_tracking_npu",
        "payload":{
            "source": {
                "uri": "file:///home/pipeline-server/videos/VIRAT_S_000101_looped.mp4",
                "type": "uri"
            },
            "destination": {
                "metadata": {
                    "type": "mqtt",
                    "topic": "object_tracking_$x",
                    "publish_frame":false
                },
                "frame": {
                    "type": "webrtc",
                    "peer-id": "object_tracking_$x"
                }
            },
            "parameters": {
                "detection-properties": {
                    "model": "/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml",
                    "device": "NPU"
                }
            }
        }
    }
]
@@ -0,0 +1,140 @@
# How to Benchmark Performance

This document provides instructions on how to run performance benchmarks for the Vision AI applications using the provided benchmarking script. The script determines the maximum number of concurrent video streams a system can process (the stream density) while maintaining a target performance level.

## Prerequisites

- The `edge-ai-suites` repository must be cloned to your system.

## Step 1: Understand the Benchmarking Script

The core of the benchmarking process is the `benchmark_start.sh` script, located in the `metro-vision-ai-app-recipe/` directory. This script automates starting video streams, monitoring their performance in frames per second (FPS), and calculating key performance indicators (KPIs) to find the maximum sustainable stream density.

### Stream Density Logic

The script uses a binary search algorithm to efficiently find the optimal stream count within a given range (`lower_bound` and `upper_bound`). Here is a summary of the logic from the `benchmark_start.sh` script:

1. **Initialization:** The script starts with a lower bound (`lns`) and an upper bound (`uns`) for the number of streams. The current number of streams to test (`ns`) is initialized to the lower bound. A variable (`tns`) tracks the highest successful stream count found so far.

2. **Binary Search Loop:** The script iterates until the range between the lower and upper bounds is 1, and both bounds have been tested. In each iteration:
* It runs a workload with the current number of streams (`ns`).
* It measures the `throughput min` (the lowest FPS achieved among all streams) and compares it to the `target_fps`.

3. **Adjusting the Range:**
* **If Performance Target is NOT Met** (`throughput min` < `target_fps`): The current stream count (`ns`) is too high. It becomes the new upper bound (`uns = ns`). The next stream count to test is calculated as the midpoint between the old lower bound and this new upper bound.
* **If Performance Target is Met** (`throughput min` >= `target_fps`): The system can handle this workload. The current stream count (`ns`) becomes the new lower bound (`lns = ns`), and the highest successful stream count (`tns`) is updated. The next stream count to test is calculated as the midpoint between this new lower bound and the old upper bound.

4. **Convergence:** This process of testing midpoints and narrowing the search range continues until the loop condition is met. The final value of `tns` represents the highest number of streams that successfully met the performance target, which is reported as the final stream density.
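
The following is a simplified sketch of this search loop, assuming a hypothetical `run_workload` helper that launches the requested number of streams and returns the measured `throughput min`. It is illustrative only and omits details such as ensuring both bounds get tested, which the real script handles:

```bash
# Illustrative sketch only -- not the actual benchmark_start.sh internals.
lns=1; uns=16; target_fps=15   # lower bound, upper bound, target FPS
tns=0                          # highest stream count that met the target
ns=$lns                        # stream count to test next
while true; do
    min_fps=$(run_workload "$ns")   # hypothetical: start $ns streams, return min FPS
    if (( $(echo "$min_fps < $target_fps" | bc -l) )); then
        uns=$ns                     # target missed: ns becomes the new upper bound
    else
        lns=$ns; tns=$ns            # target met: record success, raise the lower bound
    fi
    (( uns - lns <= 1 )) && break   # bounds are adjacent: search has converged
    ns=$(( (lns + uns) / 2 ))       # test the midpoint next
done
echo "stream density: $tns"
```

For example, with `-l 1 -u 16`, a run where 8 streams meet the target but 12 do not would test 10 next, then 9, converging in a handful of iterations instead of sixteen.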

### Average FPS Calculation

During each test run, the script logs the `avg_fps` of every active pipeline instance at regular intervals. At the end of the run, an `awk` script processes these logs to calculate several KPIs from the FPS samples collected for each stream:

- **Percentile Throughput:** A specific percentile (e.g., the 90th) of the FPS values, used to discount outliers.
- **Average Throughput:** The mean FPS across all streams.
- **Median Throughput:** The median FPS value.
- **Cumulative Throughput:** The sum of the FPS from all streams.
- **Min Throughput:** The lowest (worst-case) FPS achieved among all streams. This value is critical for the stream density calculation.
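
As a rough illustration of how such KPIs can be computed, the snippet below derives them with `awk` from a file containing one average-FPS value per stream. The file name, input format, and percentile choice here are assumptions for the sketch, not the script's actual internals:

```bash
# Illustrative sketch: one FPS value per line in fps_samples.txt.
sort -n fps_samples.txt | awk '
{ v[NR] = $1; sum += $1 }
END {
    n = NR
    printf "throughput cumulative: %.2f\n", sum
    printf "throughput average: %.2f\n", sum / n
    # Median: middle value, or the mean of the two middle values.
    med = (n % 2) ? v[(n + 1) / 2] : (v[n / 2] + v[n / 2 + 1]) / 2
    printf "throughput median: %.2f\n", med
    # 90th percentile by the nearest-rank method.
    idx = int(0.9 * n); if (idx < 1) idx = 1
    printf "throughput 90th percentile: %.2f\n", v[idx]
    # Input is sorted ascending, so the first value is the minimum.
    printf "throughput min: %.2f\n", v[1]
}'
```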

## Step 2: Prepare for Benchmarking

1. **Set Up and Start the Application:** Before running the benchmark, you must set up and start the desired application (e.g., Loitering Detection). This ensures all services, including the DL Streamer Pipeline Server, are running and available. For setup instructions, please refer to the `get-started.md` guide located in the specific application's documentation folder (e.g., `loitering-detection/docs/user-guide/`).

2. **Navigate to Script Directory:** Open a terminal and navigate to the `metro-vision-ai-app-recipe` directory.

   ```bash
   cd edge-ai-suites/metro-ai-suite/metro-vision-ai-app-recipe/
   ```

3. **Stop Existing Pipelines:** Ensure no other pipelines are running before you start the benchmark. You can stop any running pipelines with the `sample_stop.sh` script.

   ```bash
   ./sample_stop.sh
   ```

## Step 3: Run the Benchmark

The `benchmark_start.sh` script requires a pipeline name and stream count boundaries to run. The available pipelines are defined in the `benchmark_app_payload.json` file located within each application's directory (e.g., `loitering-detection/`).

<details>
<summary>Example Payload with Detection Parameters</summary>

The `benchmark_app_payload.json` file contains an array of pipeline configurations. Each configuration specifies the pipeline name and a payload with parameters for source, destination, and AI models. The script uses the pipeline name to select the corresponding payload for benchmarking.
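
For instance, the payload that corresponds to a given pipeline name could be pulled out manually with a `jq` one-liner like the following. This is a sketch only; it assumes `jq` is installed and that the command is run from the application's directory:

```bash
# Hypothetical: print the payload for one pipeline from the payload array.
jq '.[] | select(.pipeline == "object_tracking_cpu") | .payload' benchmark_app_payload.json
```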

Here is an example of a GPU pipeline configuration whose `detection-properties` include additional tuning parameters (inference interval, batch size, number of inference requests, and so on):

```json
{
    "pipeline": "object_tracking_gpu",
    "payload": {
        "source": {
            "uri": "file:///home/pipeline-server/videos/VIRAT_S_000101_looped.mp4",
            "type": "uri"
        },
        "destination": {
            "metadata": {
                "type": "mqtt",
                "topic": "object_tracking_$x",
                "publish_frame": false
            },
            "frame": {
                "type": "webrtc",
                "peer-id": "object_tracking_$x"
            }
        },
        "parameters": {
            "detection-properties": {
                "model": "/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml",
                "device": "GPU",
                "inference-interval": 3,
                "inference-region": 0,
                "batch-size": 8,
                "nireq": 2,
                "ie-config": "NUM_STREAMS=2",
                "pre-process-backend": "va-surface-sharing",
                "threshold": 0.7
            }
        }
    }
}
```
</details>

### Example: Running Stream Density Benchmark for Loitering Detection

This example will find the maximum number of loitering detection streams that can run on the CPU while maintaining at least 15 FPS.

1. Execute the `benchmark_start.sh` script, providing the desired pipeline name (`object_tracking_cpu` in this case). Here, we test a range of 1 to 16 streams.

   ```bash
   # Usage: ./benchmark_start.sh -p <pipeline_name> -l <lower_bound> -u <upper_bound> -t <target_fps>

   ./benchmark_start.sh -p object_tracking_cpu -l 1 -u 16 -t 15
   ```

2. The script will output its progress as it tests different stream counts. The final output will show the optimal stream density found.

   ```text
   ✅ FINAL RESULT: Stream-Density Benchmark Completed!
   stream density: 8
   ======================================================

   KPIs for the optimal configuration (8 streams):
   throughput #1: 29.98
   throughput #2: 29.98
   ...
   throughput #8: 29.98
   throughput median: 29.98
   throughput average: 29.98
   throughput stdev: 0
   throughput cumulative: 239.84
   throughput min: 29.98
   ```

## Step 4: Stop the Benchmark

After the benchmark is complete, or if you need to stop it manually, use the `sample_stop.sh` script. This will delete all running pipeline instances.

```bash
./sample_stop.sh
```
@@ -1,15 +1,14 @@
# How to use CPU for inference

## CPU specific element properties

DL Streamer inference elements also provide properties such as `device=CPU` and `pre-process-backend=opencv` to run inference and pre-processing on the CPU. See the DL Streamer [docs](https://dlstreamer.github.io/dev_guide/model_preparation.html#model-pre-and-post-processing) for more details.
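
As a minimal, standalone illustration of these properties (not one of the application's shipped pipelines; the file paths are placeholders), a DL Streamer launch on the CPU could look like:

```sh
# Illustrative only: CPU inference with OpenCV pre-processing.
gst-launch-1.0 filesrc location=/path/to/video.mp4 ! decodebin ! \
  gvadetect model=/path/to/model.xml device=CPU pre-process-backend=opencv ! \
  gvafpscounter ! fakesink
```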

## Tutorial on how to use CPU specific pipelines
### Tutorial on how to use CPU specific pipelines

The pipeline `object_tracking_cpu` in [pipeline-server-config](../../src/dlstreamer-pipeline-server/config.json) contains CPU-specific elements and uses the CPU backend for inference. Start the pipeline as follows:

```sh
./sample_start.sh cpu
```

Go to grafana as explained in [get-started](./get-started.md) to view the dashboard.
Go to grafana as explained in [get-started](./get-started.md) to view the dashboard.
@@ -20,6 +20,7 @@ By utilizing cutting-edge technologies and pre-trained deep learning models, thi
how-to-customize-application
how-to-deploy-with-helm
how-to-deploy-with-edge-orchestrator
how-to-benchmark
how-to-view-telemetry-data
how-to-use-gpu-for-inference
how-to-use-cpu-for-inference
@@ -60,4 +60,4 @@
}
]
}
}
}
@@ -5,7 +5,7 @@
"name": "object_tracking_cpu",
"source": "gstreamer",
"queue_maxsize": 50,
"pipeline": "{auto_source} name=source ! decodebin force-sw-decoders=true ! gvaattachroi roi=0,200,300,400 ! gvadetect inference-region=1 model=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml model_proc=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/pedestrian-and-vehicle-detector-adas-0001.json device=CPU pre-process-backend=opencv model-instance-id=inst0 name=detection ! queue ! gvatrack tracking-type=short-term-imageless ! queue ! gvametaconvert add-empty-results=true name=metaconvert ! queue ! gvafpscounter ! appsink name=destination",
"pipeline": "{auto_source} name=source ! decodebin force-sw-decoders=true ! gvaattachroi roi=0,200,300,400 ! gvadetect inference-region=1 model=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml model_proc=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/pedestrian-and-vehicle-detector-adas-0001.json device=CPU pre-process-backend=opencv model-instance-id=inst0 inference-interval=3 threshold=0.7 name=detection ! queue ! gvatrack tracking-type=short-term-imageless ! queue ! gvametaconvert add-empty-results=true name=metaconvert ! queue ! gvafpscounter ! appsink name=destination",
"description": "Object detection with yolov8",
"parameters": {
"type": "object",
@@ -18,13 +18,14 @@
}
}
},

"auto_start": false
},
{
"name": "object_tracking_gpu",
"source": "gstreamer",
"queue_maxsize": 50,
"pipeline": "{auto_source} name=source ! parsebin ! vah264dec ! vapostproc ! video/x-raw(memory:VAMemory) ! gvaattachroi roi=0,200,300,400 ! gvadetect inference-region=1 model=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml model_proc=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/pedestrian-and-vehicle-detector-adas-0001.json device=GPU pre-process-backend=va-surface-sharing model-instance-id=instgpu0 name=detection ! queue ! gvatrack tracking-type=short-term-imageless ! queue ! gvametaconvert add-empty-results=true name=metaconvert ! queue ! gvafpscounter ! appsink name=destination",
"pipeline": "{auto_source} name=source ! parsebin ! vah264dec ! vapostproc ! video/x-raw(memory:VAMemory) ! gvaattachroi roi=0,200,300,400 ! gvadetect inference-region=1 inference-interval=3 batch-size=8 nireq=2 ie-config=\"NUM_STREAMS=2\" threshold=0.7 model=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml model_proc=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/pedestrian-and-vehicle-detector-adas-0001.json device=GPU pre-process-backend=va-surface-sharing model-instance-id=instgpu0 name=detection ! queue ! gvatrack tracking-type=short-term-imageless ! queue ! gvametaconvert add-empty-results=true name=metaconvert ! queue ! gvafpscounter ! appsink name=destination",
"description": "Object detection with yolov8",
"parameters": {
"type": "object",
@@ -43,7 +44,7 @@
"name": "object_tracking_npu",
"source": "gstreamer",
"queue_maxsize": 50,
"pipeline": "{auto_source} name=source ! parsebin ! vah264dec ! vapostproc ! video/x-raw(memory:VAMemory) ! gvaattachroi roi=0,200,300,400 ! gvadetect inference-region=1 model=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml model_proc=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/pedestrian-and-vehicle-detector-adas-0001.json device=NPU pre-process-backend=va model-instance-id=instnpu0 name=detection ! queue ! gvatrack tracking-type=short-term-imageless ! queue ! gvametaconvert add-empty-results=true name=metaconvert ! queue ! gvafpscounter ! appsink name=destination",
"pipeline": "{auto_source} name=source ! parsebin ! vah264dec ! vapostproc ! video/x-raw(memory:VAMemory) ! gvaattachroi roi=0,200,300,400 ! gvadetect inference-region=1 inference-interval=3 batch-size=1 nireq=4 threshold=0.7 model=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP16/pedestrian-and-vehicle-detector-adas-0001.xml model_proc=/home/pipeline-server/models/intel/pedestrian-and-vehicle-detector-adas-0001/pedestrian-and-vehicle-detector-adas-0001.json device=NPU pre-process-backend=va model-instance-id=instnpu0 name=detection ! queue ! gvatrack tracking-type=short-term-imageless ! queue ! gvametaconvert add-empty-results=true name=metaconvert ! queue ! gvafpscounter ! appsink name=destination",
"description": "Object detection with yolov8",
"parameters": {
"type": "object",