Merged
128 changes: 103 additions & 25 deletions docs/user-guide/dev_guide/optimizer.md
Expand Up @@ -46,10 +46,18 @@ Arguments:
Options:
--search-duration SEARCH_DURATION How long should the optimizer search for better pipelines.
--sample-duration SAMPLE_DURATION How long should every pipeline be sampled for performance.
--detection-threshold THRESHOLD Minimum threshold of detections that tested pipelines are
not allowed to cross in order to count as valid alternatives.
--multistream-fps-limit LIMIT Minimum fps limit which streams are not allowed to cross
when optimizing for a multi-stream scenario.
--enable-cross-stream-batching Enable cross stream batching for inference elements in fps mode.
--allowed-devices ALLOWED_DEVICES List of allowed devices (CPU, GPU, NPU) to be used by the optimizer.
If not specified, all available, detected devices will be used.
Tool does not support discrete GPU selection.
e.g. --allowed-devices CPU NPU, --allowed-devices GPU
--log-level LEVEL Configure the logging detail level.
-v, --verbose Print information about every candidate pipeline investigated during
optimization process.
```
**`search-duration`** default: `300` seconds \
Increasing the **search duration** will increase the chances of discovering more performant pipelines.
Expand All @@ -64,9 +72,15 @@ but the final result is liable to support less streams overall.
**`enable-cross-stream-batching`** \
Leverage the inference instance feature of DL Streamer to batch work across multiple streams in fps mode.

**`allowed-devices`** \
Allows you to limit the set of devices that will be considered during the optimization process.

**`log-level`** default: `INFO` \
Available **log levels** are: CRITICAL, FATAL, ERROR, WARN, INFO, DEBUG.

**`verbose`** \
Prints extra information about the candidate pipelines which were examined during the optimization process.

>**Note**\
>Search duration and sample duration both affect the number of pipelines that will be explored during the search. \
>The total number should be approximately `search_duration / sample_duration` pipelines.
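
The relationship in the note above can be sketched as a small calculation. The helper name `estimate_candidate_count` is hypothetical and is not part of the optimizer. The result is only a rough upper bound, assuming that per-candidate setup and teardown time is negligible.

```python
def estimate_candidate_count(search_duration: float, sample_duration: float) -> int:
    """Approximate how many candidate pipelines fit into one search.

    Hypothetical helper, not part of the optimizer: it only restates
    search_duration / sample_duration and ignores pipeline start-up cost.
    """
    if search_duration < 0 or sample_duration <= 0:
        raise ValueError("search_duration must be >= 0 and sample_duration must be > 0")
    return int(search_duration // sample_duration)


# With the CLI defaults (300 s search, 10 s per sample) this gives 30 candidates.
print(estimate_candidate_count(300, 10))
```

Raising the sample duration gives steadier measurements per candidate but lowers the number of candidates examined within the same search duration.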
Expand Down Expand Up @@ -99,14 +113,14 @@ In this case the optimizer started with a pipeline that ran at ~45fps, and found

## Using the optimizer as a library

The easiest way of importing the optimizer into your scripts is to include it in your `PYTHONPATH` environment variable:
The easiest way of importing the optimizer into your scripts is to include it in your `PYTHONPATH` environment variable: \
```export PYTHONPATH=/opt/intel/dlstreamer/scripts/optimizer```

Targets which are exported in order to facilitate usage inside of scripts:

### `preprocess_pipeline(pipeline) -> processed_pipeline`
- `pipeline` - A string containing a valid DL Streamer pipeline.
- `processed_pipeline` - A string containing the pipeline with all relevant substitutions.
- `pipeline: string` - A string containing a valid DL Streamer pipeline.
- `processed_pipeline: string` - A string containing the pipeline with all relevant substitutions.

Perform quick search and replace for known combinations of elements with more performant alternatives.

Expand All @@ -117,69 +131,134 @@ Initialized without any arguments
optimizer = DLSOptimizer()
```
#### Methods
**`set_search_duration(duration)`**
- `duration` - The duration of searching for optimized pipelines in seconds, default `300`.
**`get_baseline_pipeline() -> pipeline, fps, streams`**
- `pipeline: string` - The baseline pipeline from which optimization started.
- `fps: float` - Fps measured for the baseline pipeline.
- `streams: int` - Number of streams in the baseline pipeline.

Configures the search duration used in optimization sessions.
Returns information about the original pipeline used in the optimization process. Returned values are meaningless until at least one optimization operation is performed.
```
optimizer = DLSOptimizer()
optimizer.set_search_duration(600)
for (_, _) in optimizer.iter_optimize_for_fps(pipeline):
    pass
pipeline, fps, streams = optimizer.get_baseline_pipeline()
```
---
**`get_optimal_pipeline() -> pipeline, fps, streams`**
- `pipeline: string` - The best pipeline found during optimization.
- `fps: float` - Fps measured for the optimal pipeline.
- `streams: int` - Number of streams in the optimal pipeline.

Returns information about the best pipeline found during the optimization process. Returned values are meaningless until at least one optimization operation is performed.
```
optimizer = DLSOptimizer()
for (_, _, _) in optimizer.iter_optimize_for_streams(pipeline):
    pass
best_pipeline, best_fps, best_streams = optimizer.get_optimal_pipeline()
```
---
**`set_sample_duration(duration)`**
- `duration` - The duration of sampling each candidate pipeline in seconds, default `10`.
- `duration: int` - The duration of sampling each candidate pipeline in seconds, default `10`.

Configures the sample duration used in optimization sessions.
```
optimizer = DLSOptimizer()
optimizer.set_sample_duration(15)
```
---
**`set_detections_error_threshold(threshold)`**
- `threshold: float` - The threshold of counted detections, between `0.0` and `1.0`, default `0.95`.

Minimum threshold of detections that tested pipelines are not allowed to cross in order to count as valid alternatives.
```
optimizer = DLSOptimizer()
optimizer.set_detections_error_threshold(0.8)
```
---
**`enable_cross_stream_batching(enable)`**
- `enable` - Enable the cross stream batching feature, default `False`.
- `enable: bool` - Enable the cross stream batching feature, default `False`.

Leverage the inference instance feature of DL Streamer to batch work across multiple streams when optimizing for fps.
```
optimizer = DLSOptimizer()
optimizer.enable_cross_stream_batching(True)
```

---
**`set_multistream_fps_limit(limit)`**
- `limit` - The minimum fps limit allowed for individual streams when optimizing for amount of streams, default `30`.
- `limit: int` - The minimum fps limit allowed for individual streams when optimizing for amount of streams, default `30`.

Configures the minimum fps limit that streams are not allowed to fall below when optimizing for a multi-stream scenario.
```
optimizer = DLSOptimizer()
optimizer.set_multistream_fps_limit(45)
```
---
**`set_allowed_devices(devices)`**
- `devices: list[string]` - A list of device identifiers.

**`optimize_for_fps(pipeline) -> optimized_pipeline, fps`**
- `pipeline` - A string containing a valid DL Streamer pipeline.
- `optimized_pipeline` - A string containing the best performing pipeline that has been found during the search.
- `fps` - The measured fps of the best perfmorming pipeline.
Limits the set of devices which will be considered during the optimization process.
```
optimizer = DLSOptimizer()
optimizer.set_allowed_devices(["CPU", "GPU"])
```
---
**`optimize_for_fps(pipeline, search_duration) -> optimized_pipeline, fps`**
- `pipeline: string` - A string containing a valid DL Streamer pipeline.
- `search_duration: int` - The duration of searching for better pipelines, default `300`.
- `optimized_pipeline: string` - A string containing the best performing pipeline that has been found during the search.
- `fps: float` - The measured fps of the best performing pipeline.

Runs a series of optimization steps on the pipeline searching for version with better performance measured by fps.
Runs a series of optimization steps on the pipeline searching for a version with better performance measured by fps.
```
pipeline = "urisourcebin buffer-size=4096 uri=https://videos.pexels.com/video-files/1192116/1192116-sd_640_360_30fps.mp4 ! decodebin ! gvadetect model=/home/optimizer/models/public/yolo11s/INT8/yolo11s.xml ! queue ! gvawatermark ! fakesink"
optimizer = DLSOptimizer()
optimizer.optimize_for_fps(pipeline)
```
---
**`iter_optimize_for_fps(pipeline) -> optimized_pipeline, fps`**
- `pipeline: string` - A string containing a valid DL Streamer pipeline.
- `optimized_pipeline: string` - A string containing a candidate pipeline that has been tested.
- `fps: float` - The measured fps of the candidate pipeline.

**`optimize_for_streams(pipeline) -> optimized_pipeline, fps, streams`**
- `pipeline` - A string containing a valid DL Streamer pipeline.
- `optimized_pipeline` - A string containing the best performing pipeline that has been found during the search.
- `fps` - The measured fps of the best perfmorming pipeline.
- `streams` - The number of streams capable of running above the fps limit with the optimized pipeline.
Runs a series of optimization steps on the pipeline searching for a version with better performance measured by fps. Returns each and every candidate pipeline that has been considered.
```
pipeline = "urisourcebin buffer-size=4096 uri=https://videos.pexels.com/video-files/1192116/1192116-sd_640_360_30fps.mp4 ! decodebin ! gvadetect model=/home/optimizer/models/public/yolo11s/INT8/yolo11s.xml ! queue ! gvawatermark ! fakesink"
optimizer = DLSOptimizer()
for (pipeline, fps) in optimizer.iter_optimize_for_fps(pipeline):
    print(f"Tested: {pipeline} @ {fps}")
best_pipeline, best_fps, _ = optimizer.get_optimal_pipeline()
print(f"Optimal pipeline: {best_pipeline} @ {best_fps}")
```
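
Because the `iter_` variants take no `search_duration` argument, the caller decides when to stop iterating. The following is a minimal sketch of a wall-clock bound, similar in spirit to the loop in `__main__.py` from this change. Both `take_until_deadline` and `fake_candidates` are hypothetical names. The stand-in generator replaces `optimizer.iter_optimize_for_fps(pipeline)` so the sketch runs without DL Streamer installed.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple

Candidate = Tuple[str, float]  # (pipeline string, measured fps)


def take_until_deadline(
    candidates: Iterable[Candidate],
    search_duration: float,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Candidate]:
    """Yield candidates until search_duration seconds have elapsed.

    The deadline is checked after each candidate, so the candidate that
    crosses the deadline is still yielded.
    """
    start = clock()
    for candidate in candidates:
        yield candidate
        if clock() - start > search_duration:
            break


def fake_candidates() -> Iterator[Candidate]:
    """Hypothetical stand-in for optimizer.iter_optimize_for_fps(pipeline)."""
    for i in range(5):
        yield (f"candidate-pipeline-{i}", 30.0 + i)


for pipeline, fps in take_until_deadline(fake_candidates(), search_duration=600):
    print(f"Tested: {pipeline} @ {fps}")
```

In a real script the stand-in would be replaced by the optimizer's generator, and `get_optimal_pipeline()` would be called after the loop to read the best result.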
---
**`optimize_for_streams(pipeline, search_duration) -> optimized_pipeline, fps, streams`**
- `pipeline: string` - A string containing a valid DL Streamer pipeline.
- `search_duration: int` - The duration of searching for better pipelines, default `300`.
- `optimized_pipeline: string` - A string containing the best performing pipeline that has been found during the search.
- `fps: float` - The measured fps of the best performing pipeline.
- `streams: int` - The number of streams capable of running above the fps limit with the optimized pipeline.

Searches for a version of the input pipeline which can support the highest number of concurrent streams.
```
pipeline = "urisourcebin buffer-size=4096 uri=https://videos.pexels.com/video-files/1192116/1192116-sd_640_360_30fps.mp4 ! decodebin ! gvadetect model=/home/optimizer/models/public/yolo11s/INT8/yolo11s.xml ! queue ! gvawatermark ! fakesink"
optimizer = DLSOptimizer()
optimizer.optimize_for_streams(pipeline)
```
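
The returned `optimized_pipeline` describes a single stream. One way to turn it into a multi-stream launch string is to repeat it `streams` times separated by spaces, which is what the command-line tool in this change does for its summary. The helper name `build_multistream_pipeline` is hypothetical and is shown only as a sketch.

```python
def build_multistream_pipeline(pipeline: str, streams: int) -> str:
    """Repeat a single-stream pipeline description once per stream.

    Hypothetical helper: joins the copies with a single space, matching
    the summary output of the command-line tool.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return " ".join([pipeline] * streams)


# Placeholder pipeline string used purely for illustration.
print(build_multistream_pipeline("videotestsrc ! fakesink", 3))
```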
---
**`iter_optimize_for_streams(pipeline) -> candidate_pipeline, fps, streams`**
- `pipeline: string` - A string containing a valid DL Streamer pipeline.
- `candidate_pipeline: string` - A string containing a candidate pipeline that has been tested.
- `fps: float` - The measured fps of the candidate pipeline.
- `streams: int` - The number of streams capable of running above the fps limit with the candidate pipeline.

Runs a series of optimization steps on the pipeline searching for a better performing versions.

Searches for a version of the input pipeline which can support the highest number of concurrent streams. Returns each and every candidate pipeline that has been considered.
```
pipeline = "urisourcebin buffer-size=4096 uri=https://videos.pexels.com/video-files/1192116/1192116-sd_640_360_30fps.mp4 ! decodebin ! gvadetect model=/home/optimizer/models/public/yolo11s/INT8/yolo11s.xml ! queue ! gvawatermark ! fakesink"
optimizer = DLSOptimizer()
for (pipeline, fps, streams) in optimizer.iter_optimize_for_streams(pipeline):
    print(f"Tested: {pipeline} @ {streams} & {fps}")
best_pipeline, best_fps, best_streams = optimizer.get_optimal_pipeline()
print(f"Optimal pipeline: {best_pipeline} @ {best_streams} & {best_fps}")
```
---

**Example:**
Expand All @@ -190,9 +269,8 @@ from optimizer import get_optimized_pipeline
pipeline = "urisourcebin buffer-size=4096 uri=https://videos.pexels.com/video-files/1192116/1192116-sd_640_360_30fps.mp4 ! decodebin ! gvadetect model=/home/optimizer/models/public/yolo11s/INT8/yolo11s.xml ! queue ! gvawatermark ! fakesink"

optimizer = DLSOptimizer()
optimizer.set_search_duration(600)
optimizer.set_sample_duration(15)
optimized_pipeline, fps = optimizer.optimize_for_fps(pipeline)
optimized_pipeline, fps = optimizer.optimize_for_fps(pipeline, search_duration = 600)
print("Best discovered pipeline: " + optimized_pipeline)
print(f"Measured fps: {fps}")
```
Expand Down
83 changes: 68 additions & 15 deletions scripts/optimizer/__main__.py
Expand Up @@ -8,9 +8,44 @@
import logging
import textwrap
import sys
import time

from optimizer import DLSOptimizer # pylint: disable=no-name-in-module

def _display_result(pipeline, fps):
    logger.info("============================== CANDIDATE =============================")
    logger.info("Sampled pipeline: %s", str(pipeline))
    logger.info("")
    logger.info("Recorded fps: %.2f", fps)
    logger.info("======================================================================")

def _display_summary_fps(best_pipeline, best_fps, initial_pipeline, initial_fps):
    logger.info("=============================== SUMMARY ==============================")
    if best_fps > initial_fps:
        logger.info("Optimized pipeline found with %.2f fps improvement over the original pipeline.", best_fps - initial_fps)
        logger.info("Original pipeline FPS: %.2f", initial_fps)
        logger.info("Optimized pipeline: %s", str(best_pipeline))
        logger.info("Optimized pipeline FPS: %.2f", best_fps)
    else:
        logger.info("No optimized pipeline found that outperforms the original pipeline.")
        logger.info("Original pipeline: %s", str(initial_pipeline))
        logger.info("Original pipeline FPS: %.2f", initial_fps)
    logger.info("======================================================================")

def _display_summary_streams(best_pipeline, best_fps, streams):
    full_pipeline = []
    for _ in range(0, streams):
        full_pipeline.append(best_pipeline)
    full_pipeline = " ".join(full_pipeline)

    logger.info("=============================== SUMMARY ==============================")
    logger.info("Optimized pipeline: %s", str(best_pipeline))
    logger.info("Number of streams pipeline can support: %d", streams)
    logger.info("Optimized pipeline FPS at max streams: %.2f", best_fps)
    logger.info("")
    logger.info("Full pipeline: %s", full_pipeline)
    logger.info("======================================================================")

parser = argparse.ArgumentParser(
prog="DLStreamer Pipeline Optimization Tool",
formatter_class=argparse.RawTextHelpFormatter,
Expand All @@ -34,6 +69,8 @@
'''))
parser.add_argument("PIPELINE", nargs="+",
help="Pipeline to be analyzed")
parser.add_argument("-v", "--verbose", action="store_true",
help="Print more information about the optimization progress")
parser.add_argument("--search-duration", default=300, type=float,
help="Duration in seconds of time which should be spent searching for optimized pipelines (default: %(default)s)")
parser.add_argument("--sample-duration", default=10, type=float,
Expand All @@ -58,7 +95,6 @@

try:
optimizer = DLSOptimizer()
optimizer.set_search_duration(args.search_duration)
optimizer.set_sample_duration(args.sample_duration)
optimizer.set_detections_error_threshold(args.detection_threshold)
optimizer.set_multistream_fps_limit(args.multistream_fps_limit)
Expand All @@ -77,18 +113,35 @@
try:
match args.mode:
case "fps":
best_pipeline, best_fps = optimizer.optimize_for_fps(pipeline)
case "streams":
best_pipeline, best_fps, streams = optimizer.optimize_for_streams(pipeline)

full_pipeline = []
for _ in range(0, streams):
full_pipeline.append(best_pipeline)
start_time = time.time()
for (pipeline, fps) in optimizer.iter_optimize_for_fps(pipeline):
if args.verbose:
_display_result(pipeline, fps)

cur_time = time.time()
if cur_time - start_time > args.search_duration:
break

base_pipeline, base_fps, _ = optimizer.get_baseline_pipeline()
best_pipeline, best_fps, _ = optimizer.get_optimal_pipeline()
_display_summary_fps(best_pipeline, best_fps, base_pipeline, base_fps)

full_pipeline = " ".join(full_pipeline)

logger.info("Optimized found pipeline for multi-streams: %s", full_pipeline)
logger.info("with fps: %.2f", best_fps)
logger.info("max achieved streams: %d", streams)
except Exception as e: # pylint: disable=broad-exception-caught
logger.error("Failed to optimize pipeline: %s", e)
case "streams":
start_time = time.time()
for (pipeline, fps, streams) in optimizer.iter_optimize_for_streams(pipeline):
full_pipeline = []
for _ in range(0, streams):
full_pipeline.append(pipeline)
full_pipeline = " ".join(full_pipeline)

if args.verbose:
_display_result(full_pipeline, fps)

cur_time = time.time()
if cur_time - start_time > args.search_duration:
break

best_pipeline, best_fps, streams = optimizer.get_optimal_pipeline()
_display_summary_streams(best_pipeline, best_fps, streams)
except RuntimeError as e: # pylint: disable=broad-exception-caught
logger.error("Failed to optimize pipeline: %s", e)