
Commit 3a811c7

[Edge AI Suites] IRD and Metro apps: Add how to use NPU docs (open-edge-platform#1873)
Signed-off-by: Katakol, Rohit <rohit.katakol@intel.com>
Co-authored-by: Rajput, Sajeev <sajeev.rajput@intel.com>
1 parent 1b0dc7a commit 3a811c7

File tree: 11 files changed, +315 −3 lines

manufacturing-ai-suite/industrial-edge-insights-vision/docs/user-guide/pcb-anomaly-detection/how-to-guides.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -6,6 +6,7 @@ This section collects guides for PCB Anomaly Detection sample application.
 - [Manage pipelines](./how-to-guides/manage-pipelines.md)
 - [Run multiple AI pipelines](./how-to-guides/run-multiple-ai-pipelines.md)
 - [Use GPU For Inference](./how-to-guides/use-gpu-for-inference.md)
+- [Use NPU For Inference](./how-to-guides/use-npu-for-inference.md)
 - [Use Your AI Model and Video](./how-to-guides/use-your-ai-model-and-video.md)
 - [Change the Input Video Source](./how-to-guides/change-input-video-source.md)
 - [Scale Video Resolution](./how-to-guides/scale-video-resolution.md)
@@ -25,6 +26,7 @@ This section collects guides for PCB Anomaly Detection sample application.
 ./how-to-guides/manage-pipelines
 ./how-to-guides/run-multiple-ai-pipelines
 ./how-to-guides/use-gpu-for-inference
+./how-to-guides/use-npu-for-inference
 ./how-to-guides/use-your-ai-model-and-video
 ./how-to-guides/change-input-video-source
 ./how-to-guides/scale-video-resolution
```
Lines changed: 77 additions & 0 deletions
# How to use NPU for inference

## Pre-requisites

To take full advantage of hardware acceleration, pipelines can be designed so that different stages, such as decoding and inference, are executed on the most suitable hardware devices.

Low-power accelerators like a Neural Processing Unit (NPU) can offload neural network computation from the CPU or GPU, enabling more efficient resource utilization and improved overall system performance.

DL Streamer and the DL Streamer Pipeline Server support inference on NPU devices, allowing applications built on these frameworks to leverage NPU acceleration for improved efficiency and performance.

Before running inference on an NPU, ensure that:

- The host system includes a supported NPU device
- The required NPU drivers are installed and properly configured

For detailed setup instructions, refer to the [documentation](https://docs.openedgeplatform.intel.com/dev/edge-ai-libraries/dlstreamer/dev_guide/advanced_install/advanced_install_guide_prerequisites.html#optional-prerequisite-2-install-intel-npu-drivers).

For containerized applications, the following additional changes are required.
### Provide NPU access to the container

This can be done by making the following changes to the Docker Compose file.

```yaml
services:
  dlstreamer-pipeline-server:
    group_add:
      # render group ID for Ubuntu 22.04 host OS
      - "110"
      # render group ID for Ubuntu 24.04 host OS
      - "992"
    devices:
      # you can list specific devices here instead if you don't want to expose all of /dev
      - "/dev:/dev"
```
The changes above add the container user to the `render` group and provide access to the NPU devices.
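The render group ID varies across distributions and releases. One way to look up the value on your host (assuming a `render` group exists there) is:

```shell
# Print the numeric GID of the host's render group; use this value
# in the compose file's group_add entry.
render_gid=$(getent group render | cut -d: -f3)
echo "render GID: ${render_gid:-no render group on this host}"
```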
### Hardware specific encoder/decoders

Unlike the changes made for the container above, the following requires a modification to the media pipeline itself.

GStreamer provides a variety of hardware-specific encoder and decoder elements, such as the Intel VA-API elements, that you can benefit from by adding them to your media pipeline. Examples of such elements are `vah264dec`, `vah264enc`, `vajpegdec`, and `vajpegenc`.

Additionally, you can enforce zero-copy of buffers by adding the GStreamer caps (capabilities) `video/x-raw(memory:VAMemory)` to the pipeline for Intel NPUs.

Read the DL Streamer [docs](https://dlstreamer.github.io/dev_guide/gpu_device_selection.html) for more details.
### NPU specific element properties

DL Streamer inference elements also provide properties such as `device=NPU` and `pre-process-backend=va`, which should be used in pipelines with NPU memory. This maps buffers to system memory and uses the VA pre-processor. Read the DL Streamer [docs](https://dlstreamer.github.io/dev_guide/model_preparation.html#model-pre-and-post-processing) for more.
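As an illustrative sketch only (the element chain, model path, and input file are assumptions, not taken from this application's config), a pipeline description combining VA decode, zero-copy caps, and NPU inference might look like:

```
filesrc location=input.mp4 ! decodebin3 ! vapostproc ! \
  video/x-raw(memory:VAMemory) ! \
  gvadetect model=model.xml device=NPU pre-process-backend=va ! \
  gvafpscounter ! fakesink
```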
## Tutorial on how to use NPU specific pipelines

> Note: This sample application already provides a default `docker-compose.yml` file that includes the necessary NPU access for the containers.

The pipeline `pcb_anomaly_detection_npu` in `pipeline-server-config.json` contains NPU-specific elements and uses the NPU backend for inferencing. Follow the steps below to run the pipeline.
### Steps

1. Ensure that the sample application is up and running. If not, follow the steps [here](../get-started.md#set-up-the-application) to set up the application and then bring the services up:

   > If you're running multiple instances of the app, start the services using `./run.sh up` instead.

   ```sh
   docker compose up -d
   ```

2. Start the pipeline:

   ```sh
   ./sample_start.sh -p pcb_anomaly_detection_npu
   ```

   This will start the pipeline. The inference stream can be viewed over WebRTC in a browser at the following URL:

   > If you're running multiple instances of the app, include the `NGINX_HTTPS_PORT` number in the URL for that app instance, i.e. replace `<HOST_IP>` with `<HOST_IP>:<NGINX_HTTPS_PORT>`.

   ```bash
   https://<HOST_IP>/mediamtx/anomaly/
   ```

manufacturing-ai-suite/industrial-edge-insights-vision/docs/user-guide/weld-porosity/how-to-guides.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -6,6 +6,7 @@ This section collects guides for Weld Porosity sample application.
 - [Manage pipelines](./how-to-guides/manage-pipelines.md)
 - [Run multiple AI pipelines](./how-to-guides/run-multiple-ai-pipelines.md)
 - [Use GPU For Inference](./how-to-guides/use-gpu-for-inference.md)
+- [Use NPU For Inference](./how-to-guides/use-npu-for-inference.md)
 - [Use Your AI Model and Video](./how-to-guides/use-your-ai-model-and-video.md)
 - [Change the Input Video Source](./how-to-guides/change-input-video-source.md)
 - [Scale Video Resolution](./how-to-guides/scale-video-resolution.md)
@@ -25,6 +26,7 @@ This section collects guides for Weld Porosity sample application.
 ./how-to-guides/manage-pipelines
 ./how-to-guides/run-multiple-ai-pipelines
 ./how-to-guides/use-gpu-for-inference
+./how-to-guides/use-npu-for-inference
 ./how-to-guides/use-your-ai-model-and-video
 ./how-to-guides/change-input-video-source
 ./how-to-guides/scale-video-resolution
```
Lines changed: 77 additions & 0 deletions
# How to use NPU for inference

## Pre-requisites

To take full advantage of hardware acceleration, pipelines can be designed so that different stages, such as decoding and inference, are executed on the most suitable hardware devices.

Low-power accelerators like a Neural Processing Unit (NPU) can offload neural network computation from the CPU or GPU, enabling more efficient resource utilization and improved overall system performance.

DL Streamer and the DL Streamer Pipeline Server support inference on NPU devices, allowing applications built on these frameworks to leverage NPU acceleration for improved efficiency and performance.

Before running inference on an NPU, ensure that:

- The host system includes a supported NPU device
- The required NPU drivers are installed and properly configured

For detailed setup instructions, refer to the [documentation](https://docs.openedgeplatform.intel.com/dev/edge-ai-libraries/dlstreamer/dev_guide/advanced_install/advanced_install_guide_prerequisites.html#optional-prerequisite-2-install-intel-npu-drivers).

For containerized applications, the following additional changes are required.
### Provide NPU access to the container

This can be done by making the following changes to the Docker Compose file.

```yaml
services:
  dlstreamer-pipeline-server:
    group_add:
      # render group ID for Ubuntu 22.04 host OS
      - "110"
      # render group ID for Ubuntu 24.04 host OS
      - "992"
    devices:
      # you can list specific devices here instead if you don't want to expose all of /dev
      - "/dev:/dev"
```
The changes above add the container user to the `render` group and provide access to the NPU devices.
### Hardware specific encoder/decoders

Unlike the changes made for the container above, the following requires a modification to the media pipeline itself.

GStreamer provides a variety of hardware-specific encoder and decoder elements, such as the Intel VA-API elements, that you can benefit from by adding them to your media pipeline. Examples of such elements are `vah264dec`, `vah264enc`, `vajpegdec`, and `vajpegenc`.

Additionally, you can enforce zero-copy of buffers by adding the GStreamer caps (capabilities) `video/x-raw(memory:VAMemory)` to the pipeline for Intel NPUs.

Read the DL Streamer [docs](https://dlstreamer.github.io/dev_guide/gpu_device_selection.html) for more details.
### NPU specific element properties

DL Streamer inference elements also provide properties such as `device=NPU` and `pre-process-backend=va`, which should be used in pipelines with NPU memory. This maps buffers to system memory and uses the VA pre-processor. Read the DL Streamer [docs](https://dlstreamer.github.io/dev_guide/model_preparation.html#model-pre-and-post-processing) for more.
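As a sketch only (the element chain, model path, and input file are assumptions, not this application's actual config), a classification pipeline using these properties might be described as:

```
filesrc location=input.mp4 ! decodebin3 ! vapostproc ! \
  video/x-raw(memory:VAMemory) ! \
  gvaclassify model=model.xml device=NPU pre-process-backend=va ! \
  gvafpscounter ! fakesink
```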
## Tutorial on how to use NPU specific pipelines

> Note: This sample application already provides a default `docker-compose.yml` file that includes the necessary NPU access for the containers.

The pipeline `weld_porosity_classification_npu` in `pipeline-server-config.json` contains NPU-specific elements and uses the NPU backend for inferencing. Follow the steps below to run the pipeline.
### Steps

1. Ensure that the sample application is up and running. If not, follow the steps [here](../get-started.md#set-up-the-application) to set up the application and then bring the services up:

   > If you're running multiple instances of the app, start the services using `./run.sh up` instead.

   ```sh
   docker compose up -d
   ```

2. Start the pipeline:

   ```sh
   ./sample_start.sh -p weld_porosity_classification_npu
   ```

   This will start the pipeline. The inference stream can be viewed over WebRTC in a browser at the following URL:

   > If you're running multiple instances of the app, include the `NGINX_HTTPS_PORT` number in the URL for that app instance, i.e. replace `<HOST_IP>` with `<HOST_IP>:<NGINX_HTTPS_PORT>`.

   ```bash
   https://<HOST_IP>/mediamtx/weld/
   ```

metro-ai-suite/image-based-video-search/docs/user-guide/how-to-use-gpu-for-inference.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -7,7 +7,7 @@ if not already done.
 
 ### Volume mount GPU config
 
-Comment out CPU and NPU config and uncomment the GPU config present in [compose.yml](https://github.com/open-edge-platform/edge-ai-suites/blob/main/metro-ai-suite/image-based-video-search/compose.yml)
+Comment out CPU and NPU volume mount and uncomment the GPU volume mount present in [compose.yml](https://github.com/open-edge-platform/edge-ai-suites/blob/main/metro-ai-suite/image-based-video-search/compose.yml)
 file under `volumes` section as shown below:
 
 ```sh
@@ -19,7 +19,7 @@ file under `volumes` section as shown below:
 
 ### Start and run the application
 
-After the above changes to docker compose file, follow from step 3 as mentioned in the
+After the above changes to docker compose file, follow from step 3 till end of the section as mentioned in the
 [Get Started](./get-started.md#set-up-and-first-use) guide.
 
 ## Helm deployment
@@ -28,7 +28,7 @@ Follow step 1 mentioned in this [document](./get-started/deploy-with-helm.md#ste
 
 ### Update values.yaml
 
-In `values.yaml` file, change value of `pipeline` config present under
+In [`values.yaml`](https://github.com/open-edge-platform/edge-ai-suites/blob/main/metro-ai-suite/image-based-video-search/chart/values.yaml) file, change value of `pipeline` config present under
 `dlstreamerpipelineserver` section as shown below:
 
 ```sh
````
Lines changed: 53 additions & 0 deletions
# How to use NPU for inference

## Docker deployment

Follow steps 1 and 2 mentioned in the [Get Started](./get-started.md#set-up-and-first-use) guide if not already done.

### Volume mount NPU config

Comment out the CPU and GPU volume mounts and uncomment the NPU volume mount present in the [compose.yml](https://github.com/open-edge-platform/edge-ai-suites/blob/main/metro-ai-suite/image-based-video-search/compose.yml) file under the `volumes` section, as shown below:
```yaml
volumes:
  # - "./src/dlstreamer-pipeline-server/configs/filter-pipeline/config.cpu.json:/home/pipeline-server/config.json"
  # - "./src/dlstreamer-pipeline-server/configs/filter-pipeline/config.gpu.json:/home/pipeline-server/config.json"
  - "./src/dlstreamer-pipeline-server/configs/filter-pipeline/config.npu.json:/home/pipeline-server/config.json"
```
### Start and run the application

After the above changes to the Docker Compose file, follow from step 3 till the end of the section as mentioned in the [Get Started](./get-started.md#set-up-and-first-use) guide.

## Helm deployment

Follow step 1 mentioned in this [document](./get-started/deploy-with-helm.md#steps-to-deploy) if not already done.

### Update values.yaml

In the [values.yaml](https://github.com/open-edge-platform/edge-ai-suites/blob/main/metro-ai-suite/image-based-video-search/chart/values.yaml) file, change the value of the `pipeline` config present under the `dlstreamerpipelineserver` section, as shown below:
```yaml
dlstreamerpipelineserver:
  # key: dlstreamerpipelineserver.repository
  repository:
    # key: dlstreamerpipelineserver.repository.image
    image: docker.io/intel/dlstreamer-pipeline-server
    # key: dlstreamerpipelineserver.repository.tag
    tag: 2025.2.0-ubuntu24
  # key: dlstreamerpipelineserver.replicas
  replicas: 1
  # key: dlstreamerpipelineserver.nodeSelector
  nodeSelector: {}
  # key: dlstreamerpipelineserver.pipeline
  pipeline: config.npu.json  # changed value from config.cpu.json to config.npu.json
```
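Equivalently, instead of editing the file, the same value can be overridden on the Helm command line. This is a sketch: the release name and chart path below are placeholders, not taken from this guide.

```sh
helm upgrade --install <release-name> <chart-path> \
  --set dlstreamerpipelineserver.pipeline=config.npu.json
```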
### Start the application

After the above changes to the `values.yaml` file, follow from step 2 as mentioned in the [Helm Deployment Guide](./get-started/deploy-with-helm.md#steps-to-deploy).

metro-ai-suite/image-based-video-search/docs/user-guide/index.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -87,6 +87,7 @@ continuously and appears in the UI as soon as the application starts.
 get-started
 how-it-works
 how-to-use-gpu-for-inference
+how-to-use-npu-for-inference
 troubleshooting
 release-notes
 
```

metro-ai-suite/metro-vision-ai-app-recipe/loitering-detection/docs/user-guide/how-to-guides.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -13,6 +13,7 @@ This section collects guides for the Loitering Detection sample application.
 
 ./how-to-guides/customize-application
 ./how-to-guides/use-gpu-for-inference
+./how-to-guides/use-npu-for-inference
 ./how-to-guides/view-telemetry-data
 ./how-to-guides/benchmark
 
```
Lines changed: 49 additions & 0 deletions
# Use NPU for Inference

## Pre-requisites

In order to benefit from hardware acceleration, pipelines can be constructed so that different stages, such as decoding and inference, make use of the most suitable devices. For containerized applications built using the DL Streamer Pipeline Server, we first need to give the container user access to the NPU device(s).
### Provide NPU access to the container

This can be done by making the following changes to the Docker Compose file.

```yaml
services:
  dlstreamer-pipeline-server:
    group_add:
      # render group ID for Ubuntu 22.04 host OS
      - "110"
      # render group ID for Ubuntu 24.04 host OS
      - "992"
    devices:
      # you can list specific devices here instead if you don't want to expose all of /dev
      - "/dev:/dev"
```
The changes above add the container user to the `render` group and provide access to the NPU devices.
### Hardware specific encoder/decoders

Unlike the changes made for the container above, the following requires a modification to the media pipeline itself.

GStreamer provides a variety of hardware-specific encoder and decoder elements, such as the Intel VA-API elements, that you can benefit from by adding them to your media pipeline. Examples of such elements are `vah264dec`, `vah264enc`, `vajpegdec`, and `vajpegenc`.
## Tutorial on how to use NPU specific pipelines

> **Note:** This sample application already provides a default `compose-without-scenescape.yml`
> file that includes the necessary NPU access for the containers.

The pipeline `object_tracking_npu` in the DL Streamer Pipeline Server's `config.json` contains NPU-specific elements and uses the NPU backend for inferencing. We can start the pipeline as follows:

```sh
./sample_start.sh npu
```

Go to Grafana as explained in [Get Started](../get-started.md) to view the dashboard.

metro-ai-suite/metro-vision-ai-app-recipe/smart-parking/docs/user-guide/how-to-guides.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -15,6 +15,7 @@ This section collects guides for the Smart Parking sample application.
 ./how-to-guides/customize-application
 ./how-to-guides/generate-offline-package
 ./how-to-guides/use-gpu-for-inference
+./how-to-guides/use-npu-for-inference
 ./how-to-guides/view-telemetry-data
 ./how-to-guides/benchmark
 
```
