Merged
7 changes: 5 additions & 2 deletions docs/source/elements/gvaclassify.md
Original file line number Diff line number Diff line change
@@ -66,6 +66,9 @@ Element Properties:
batch-size : Number of frames batched together for a single inference. If the batch-size is 0, then it will be set by default to be optimal for the device. Not all models support batching. Use model optimizer to ensure that the model has batching support.
flags: readable, writable
Unsigned Integer. Range: 0 - 1024 Default: 0
batch-timeout : Timeout (ms) for OpenVINO™ Automatic Batching. Waits for batch to accumulate inference requests before execution. If the number of frames collected reaches batch-size, inference is executed with a full batch and the timer is reset. If timeout occurs before collecting all frames specified by batch-size, inference is executed on collected frames individually (as if batch-size=1) and the timer is reset. If batch-timeout is set to 0, it operates as if batch-size were set to 1, executing inference on individual frames. Value -1 disables timeout, waiting indefinitely for full batch. Note: Not supported with VA backends (pre-process-backend=va or va-surface-sharing).
flags: readable, writable
Integer. Range: -1 - 2147483647 Default: -1
cpu-throughput-streams: Deprecated. Use ie-config=CPU_THROUGHPUT_STREAMS=<number-streams> instead
flags: readable, writable, deprecated
Unsigned Integer. Range: 0 - 4294967295 Default: 0
@@ -155,8 +158,8 @@ scale-method : Scale method to use in pre-processing before inference.
String. Default: null
scheduling-policy : Scheduling policy across streams sharing same model instance: throughput (select first incoming frame), latency (select frames with earliest presentation time out of the streams sharing same model-instance-id; recommended batch-size less than or equal to the number of streams)
flags: readable, writable
String. Default: null
share-va-display-ctx: Feature allowing sharing VA Display context across inference elements
String. Default: "throughput"
share-va-display-ctx: Whether to share VA Display context across inference elements: true (share context, default), false (do not share context)
flags: readable, writable
Boolean. Default: true
```
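The batch-timeout semantics described in the property text above can be modeled as a small decision function. This is an illustrative sketch only, not the element's actual control flow; the function name, parameters, and return labels are invented here:

```cpp
#include <string>

// Model of the documented batch-timeout behavior: given the number of frames
// collected so far, the configured batch-size and batch-timeout, and the
// milliseconds elapsed since the timer was last reset, decide the next action.
std::string batch_action(unsigned collected, unsigned batch_size, int batch_timeout, int elapsed_ms) {
    if (batch_timeout == 0)
        return "infer-individually";           // behaves as if batch-size were 1
    if (collected >= batch_size)
        return "infer-full-batch";             // full batch ready; timer resets
    if (batch_timeout > 0 && elapsed_ms >= batch_timeout)
        return "infer-collected-individually"; // timeout hit before a full batch
    return "wait";                             // -1, or timer still running: keep waiting
}
```

For example, with `batch-size=4` and `batch-timeout=100`, two collected frames at 150 ms elapsed are inferred individually, while `batch-timeout=-1` waits indefinitely for the fourth frame.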
7 changes: 5 additions & 2 deletions docs/source/elements/gvadetect.md
@@ -67,6 +67,9 @@ Element Properties:
batch-size : Number of frames batched together for a single inference. If the batch-size is 0, then it will be set by default to be optimal for the device. Not all models support batching. Use model optimizer to ensure that the model has batching support.
flags: readable, writable
Unsigned Integer. Range: 0 - 1024 Default: 0
batch-timeout : Timeout (ms) for OpenVINO™ Automatic Batching. Waits for batch to accumulate inference requests before execution. If the number of frames collected reaches batch-size, inference is executed with a full batch and the timer is reset. If timeout occurs before collecting all frames specified by batch-size, inference is executed on collected frames individually (as if batch-size=1) and the timer is reset. If batch-timeout is set to 0, it operates as if batch-size were set to 1, executing inference on individual frames. Value -1 disables timeout, waiting indefinitely for full batch. Note: Not supported with VA backends (pre-process-backend=va or va-surface-sharing).
flags: readable, writable
Integer. Range: -1 - 2147483647 Default: -1
cpu-throughput-streams: Deprecated. Use ie-config=CPU_THROUGHPUT_STREAMS=<number-streams> instead
flags: readable, writable, deprecated
Unsigned Integer. Range: 0 - 4294967295 Default: 0
@@ -149,8 +152,8 @@ Element Properties:
String. Default: null
scheduling-policy : Scheduling policy across streams sharing same model instance: throughput (select first incoming frame), latency (select frames with earliest presentation time out of the streams sharing same model-instance-id; recommended batch-size less than or equal to the number of streams)
flags: readable, writable
String. Default: null
share-va-display-ctx: Feature allowing sharing VA Display context across inference elements
String. Default: "throughput"
share-va-display-ctx: Whether to share VA Display context across inference elements: true (share context, default), false (do not share context)
flags: readable, writable
Boolean. Default: true
threshold : Threshold for detection results. Only regions of interest with confidence values above the threshold will be added to the frame
7 changes: 5 additions & 2 deletions docs/source/elements/gvainference.md
@@ -65,6 +65,9 @@ Element Properties:
batch-size : Number of frames batched together for a single inference. If the batch-size is 0, then it will be set by default to be optimal for the device. Not all models support batching. Use model optimizer to ensure that the model has batching support.
flags: readable, writable
Unsigned Integer. Range: 0 - 1024 Default: 0
batch-timeout : Timeout (ms) for OpenVINO™ Automatic Batching. Waits for batch to accumulate inference requests before execution. If the number of frames collected reaches batch-size, inference is executed with a full batch and the timer is reset. If timeout occurs before collecting all frames specified by batch-size, inference is executed on collected frames individually (as if batch-size=1) and the timer is reset. If batch-timeout is set to 0, it operates as if batch-size were set to 1, executing inference on individual frames. Value -1 disables timeout, waiting indefinitely for full batch. Note: Not supported with VA backends (pre-process-backend=va or va-surface-sharing).
flags: readable, writable
Integer. Range: -1 - 2147483647 Default: -1
cpu-throughput-streams: Deprecated. Use ie-config=CPU_THROUGHPUT_STREAMS=<number-streams> instead
flags: readable, writable, deprecated
Unsigned Integer. Range: 0 - 4294967295 Default: 0
@@ -147,8 +150,8 @@ Element Properties:
String. Default: null
scheduling-policy : Scheduling policy across streams sharing same model instance: throughput (select first incoming frame), latency (select frames with earliest presentation time out of the streams sharing same model-instance-id; recommended batch-size less than or equal to the number of streams)
flags: readable, writable
String. Default: null
share-va-display-ctx: Feature allowing sharing VA Display context across inference elements
String. Default: "throughput"
share-va-display-ctx: Whether to share VA Display context across inference elements: true (share context, default), false (do not share context)
flags: readable, writable
Boolean. Default: true
```
57 changes: 42 additions & 15 deletions src/monolithic/gst/inference_elements/base/gva_base_inference.cpp
@@ -44,6 +44,10 @@
#define DEFAULT_MAX_BATCH_SIZE 1024
#define DEFAULT_BATCH_SIZE 0

// Note: DEFAULT_BATCH_TIMEOUT is defined in gva_base_inference.h for external access
#define DEFAULT_MIN_BATCH_TIMEOUT -1
#define DEFAULT_MAX_BATCH_TIMEOUT INT_MAX

#define DEFAULT_MIN_RESHAPE_WIDTH 0
#define DEFAULT_MAX_RESHAPE_WIDTH UINT_MAX
#define DEFAULT_RESHAPE_WIDTH 0
@@ -90,6 +94,7 @@ enum {
PROP_INFERENCE_INTERVAL,
PROP_RESHAPE,
PROP_BATCH_SIZE,
PROP_BATCH_TIMEOUT,
PROP_RESHAPE_WIDTH,
PROP_RESHAPE_HEIGHT,
PROP_NO_BLOCK,
@@ -289,6 +294,21 @@ void gva_base_inference_class_init(GvaBaseInferenceClass *klass) {
"that the model has batching support.",
DEFAULT_MIN_BATCH_SIZE, DEFAULT_MAX_BATCH_SIZE, DEFAULT_BATCH_SIZE, param_flags));

g_object_class_install_property(
gobject_class, PROP_BATCH_TIMEOUT,
g_param_spec_int("batch-timeout", "Batch timeout",
"Timeout (ms) for OpenVINO™ Automatic Batching. Waits for batch to accumulate inference "
"requests before execution. "
"If the number of frames collected reaches batch-size, inference is executed with a full "
"batch and the timer is reset. "
"If timeout occurs before collecting all frames specified by batch-size, inference is "
"executed on collected frames individually (as if batch-size=1) and the timer is reset. "
"If batch-timeout is set to 0, it operates as if batch-size were set to 1, executing "
"inference on individual frames. "
"Value -1 disables timeout, waiting indefinitely for full batch. "
"Note: Not supported with VA backends (pre-process-backend=va or va-surface-sharing).",
DEFAULT_MIN_BATCH_TIMEOUT, DEFAULT_MAX_BATCH_TIMEOUT, DEFAULT_BATCH_TIMEOUT, param_flags));

g_object_class_install_property(
gobject_class, PROP_INFERENCE_INTERVAL,
g_param_spec_uint("inference-interval", "Inference Interval",
@@ -489,6 +509,7 @@ void gva_base_inference_init(GvaBaseInference *base_inference) {
base_inference->inference_interval = DEFAULT_INFERENCE_INTERVAL;
base_inference->reshape = DEFAULT_RESHAPE;
base_inference->batch_size = DEFAULT_BATCH_SIZE;
base_inference->batch_timeout = DEFAULT_BATCH_TIMEOUT;
base_inference->reshape_width = DEFAULT_RESHAPE_WIDTH;
base_inference->reshape_height = DEFAULT_RESHAPE_HEIGHT;
base_inference->no_block = DEFAULT_NO_BLOCK;
@@ -619,6 +640,9 @@ void gva_base_inference_set_property(GObject *object, guint property_id, const G
case PROP_BATCH_SIZE:
base_inference->batch_size = g_value_get_uint(value);
break;
case PROP_BATCH_TIMEOUT:
base_inference->batch_timeout = g_value_get_int(value);
break;
case PROP_RESHAPE_WIDTH:
base_inference->reshape_width = g_value_get_uint(value);
break;
@@ -750,6 +774,9 @@ void gva_base_inference_get_property(GObject *object, guint property_id, GValue
case PROP_BATCH_SIZE:
g_value_set_uint(value, base_inference->batch_size);
break;
case PROP_BATCH_TIMEOUT:
g_value_set_int(value, base_inference->batch_timeout);
break;
case PROP_RESHAPE_WIDTH:
g_value_set_uint(value, base_inference->reshape_width);
break;
@@ -1043,21 +1070,21 @@ gboolean gva_base_inference_start(GstBaseTransform *trans) {

GST_DEBUG_OBJECT(base_inference, "start");

GST_INFO_OBJECT(base_inference,
"%s inference parameters:\n -- Model: %s\n -- Model proc: %s\n "
"-- Device: %s\n -- Inference interval: %d\n -- Reshape: %s\n -- Batch size: %d\n "
"-- Reshape width: %d\n -- Reshape height: %d\n -- No block: %s\n -- Num of requests: %d\n "
"-- Model instance ID: %s\n -- CPU streams: %d\n -- GPU streams: %d\n -- IE config: %s\n "
"-- Allocator name: %s\n -- Preprocessing type: %s\n -- Object class: %s\n "
"-- Labels: %s\n",
GST_ELEMENT_NAME(GST_ELEMENT_CAST(base_inference)), base_inference->model,
base_inference->model_proc, base_inference->device, base_inference->inference_interval,
base_inference->reshape ? "true" : "false", base_inference->batch_size,
base_inference->reshape_width, base_inference->reshape_height,
base_inference->no_block ? "true" : "false", base_inference->nireq,
base_inference->model_instance_id, base_inference->cpu_streams, base_inference->gpu_streams,
base_inference->ie_config, base_inference->allocator_name, base_inference->pre_proc_type,
base_inference->object_class, base_inference->labels);
GST_INFO_OBJECT(
base_inference,
"%s inference parameters:\n -- Model: %s\n -- Model proc: %s\n "
"-- Device: %s\n -- Inference interval: %d\n -- Reshape: %s\n -- Batch size: %d\n -- Batch timeout: %d\n "
"-- Reshape width: %d\n -- Reshape height: %d\n -- No block: %s\n -- Num of requests: %d\n "
"-- Model instance ID: %s\n -- CPU streams: %d\n -- GPU streams: %d\n -- IE config: %s\n "
"-- Allocator name: %s\n -- Preprocessing type: %s\n -- Object class: %s\n "
"-- Labels: %s\n",
GST_ELEMENT_NAME(GST_ELEMENT_CAST(base_inference)), base_inference->model, base_inference->model_proc,
base_inference->device, base_inference->inference_interval, base_inference->reshape ? "true" : "false",
base_inference->batch_size, base_inference->batch_timeout, base_inference->reshape_width,
base_inference->reshape_height, base_inference->no_block ? "true" : "false", base_inference->nireq,
base_inference->model_instance_id, base_inference->cpu_streams, base_inference->gpu_streams,
base_inference->ie_config, base_inference->allocator_name, base_inference->pre_proc_type,
base_inference->object_class, base_inference->labels);

if (!gva_base_inference_check_properties_correctness(base_inference)) {
return base_inference->initialized;
src/monolithic/gst/inference_elements/base/gva_base_inference.h
@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright (C) 2018-2025 Intel Corporation
* Copyright (C) 2018-2026 Intel Corporation
*
* SPDX-License-Identifier: MIT
******************************************************************************/
@@ -16,6 +16,8 @@
#include <gst/gst.h>
#include <gst/video/video.h>

#define DEFAULT_BATCH_TIMEOUT -1

G_BEGIN_DECLS

#define GST_TYPE_GVA_BASE_INFERENCE_REGION (gst_gva_base_inference_get_inf_region())
Expand All @@ -40,6 +42,7 @@ typedef struct _GvaBaseInference {
gboolean share_va_display_ctx;
guint inference_interval;
guint batch_size;
gint batch_timeout;
guint reshape_width;
guint reshape_height;
guint nireq;
18 changes: 14 additions & 4 deletions src/monolithic/gst/inference_elements/base/inference_impl.cpp
@@ -13,13 +13,15 @@
#include "config.h"
#include "gmutex_lock_guard.h"
#include "gst_allocator_wrapper.h"
#include "gva_base_inference.h"
#include "gva_base_inference_priv.hpp"
#include "gva_caps.h"
#include "gva_utils.h"
#include "inference_backend/logger.h"
#include "inference_backend/pre_proc.h"
#include "logger_functions.h"
#include "model_proc_provider.h"
#include "processor_types.h"
#include "region_of_interest.h"
#include "safe_arithmetic.hpp"
#include "scope_guard.h"
@@ -33,6 +35,7 @@
#include <gst/analytics/analytics.h>
#include <map>
#include <memory>
#include <openvino/runtime/core.hpp>
#include <openvino/runtime/properties.hpp>
#include <regex>
#include <sstream>
@@ -214,6 +217,11 @@ InferenceConfig CreateNestedInferenceConfig(GvaBaseInference *gva_base_inference
}
base[KEY_CAPS_FEATURE] = std::to_string(static_cast<int>(gva_base_inference->caps_feature));

const int batch_timeout = gva_base_inference->batch_timeout;
if (batch_timeout > -1) {
inference[ov::auto_batch_timeout.name()] = std::to_string(batch_timeout);
}

// add KEY_VAAPI_THREAD_POOL_SIZE, KEY_VAAPI_FAST_SCALE_LOAD_FACTOR elements to preprocessor config
// other elements from pre_processor info are consumed by model proc info
for (const auto &element : Utils::stringToMap(gva_base_inference->pre_proc_config)) {
@@ -834,8 +842,9 @@ InferenceImpl::Model InferenceImpl::CreateModel(GvaBaseInference *gva_base_infer
model.inference = image_inference;
model.name = image_inference->GetModelName();

// if auto batch size was requested, use the actual batch size determined by inference instance
if (gva_base_inference->batch_size == 0)
// if auto batch size or OpenVINO Automatic Batching was requested, use the actual batch size determined by
// inference instance
if (gva_base_inference->batch_size == 0 || gva_base_inference->batch_timeout != DEFAULT_BATCH_TIMEOUT)
gva_base_inference->batch_size = model.inference->GetBatchSize();

return model;
@@ -866,7 +875,8 @@ InferenceImpl::InferenceImpl(GvaBaseInference *gva_base_inference) {
allocator = CreateAllocator(gva_base_inference->allocator_name);

GVA_INFO("Loading model: device=%s, path=%s", std::string(gva_base_inference->device).c_str(), model_file.c_str());
GVA_INFO("Initial settings: batch_size=%u, nireq=%u", gva_base_inference->batch_size, gva_base_inference->nireq);
GVA_INFO("Initial settings: batch_size=%u, batch_timeout=%d, nireq=%u", gva_base_inference->batch_size,
gva_base_inference->batch_timeout, gva_base_inference->nireq);
this->model = CreateModel(gva_base_inference, model_file, model_proc, labels_str, custom_preproc_lib);
}

@@ -911,7 +921,7 @@ void InferenceImpl::UpdateModelReshapeInfo(GvaBaseInference *gva_base_inference)
return;
}

if (gva_base_inference->batch_size > 1) {
if (gva_base_inference->batch_size > 1 && gva_base_inference->batch_timeout == -1) {
GVA_WARNING("reshape switched to TRUE because batch-size (%u) is greater than one",
gva_base_inference->batch_size);
gva_base_inference->reshape = true;
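The `CreateNestedInferenceConfig` hunk above forwards `batch-timeout` into the OpenVINO inference config only when it differs from the disabled default. A minimal sketch of that mapping, with the Automatic Batching key assumed to be the string `ov::auto_batch_timeout.name()` resolves to (`"AUTO_BATCH_TIMEOUT"`) and the map standing in for the real nested config:

```cpp
#include <map>
#include <string>

// Sketch: values above -1 are serialized under the Automatic Batching timeout
// key, while the default -1 leaves the key unset so OpenVINO never enables
// Automatic Batching implicitly.
std::map<std::string, std::string> make_inference_config(int batch_timeout) {
    std::map<std::string, std::string> inference;
    if (batch_timeout > -1)
        inference["AUTO_BATCH_TIMEOUT"] = std::to_string(batch_timeout);
    return inference;
}
```

Note that `0` is still forwarded: per the property description it is a valid setting that makes the element behave as if `batch-size=1`, which is distinct from not configuring the timeout at all.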
@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright (C) 2018-2025 Intel Corporation
* Copyright (C) 2018-2026 Intel Corporation
*
* SPDX-License-Identifier: MIT
******************************************************************************/
Expand Down Expand Up @@ -79,6 +79,7 @@ void fillElementProps(GvaBaseInference *targetElem, GvaBaseInference *masterElem
COPY_GSTRING(targetElem->device, masterElem->device);
COPY_GSTRING(targetElem->model_proc, masterElem->model_proc);
targetElem->batch_size = masterElem->batch_size;
targetElem->batch_timeout = masterElem->batch_timeout;
targetElem->inference_interval = masterElem->inference_interval;
targetElem->no_block = masterElem->no_block;
targetElem->nireq = masterElem->nireq;