
[Feat] Support multi-device inference and OCR batch inference #3923


Open

wants to merge 49 commits into base: develop

Conversation

Bobholamovic (Member) commented Apr 28, 2025

  1. Most OCR-type pipelines now support batch size > 1.
  2. Added parallel inference support: built-in multi-device inference capability; application code examples for multi-device and multi-instance inference (a rough sketch follows below); new and updated documentation.
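As a rough illustration of the multi-instance pattern the new examples document, the sketch below runs one OCR pipeline instance per GPU and shards the inputs across processes. `create_pipeline` is the public PaddleX API; the file list and device IDs are hypothetical.

```python
# Hedged sketch: one OCR pipeline instance per GPU, inputs sharded round-robin.
# `create_pipeline` is the public PaddleX API; paths and device IDs are examples.
from multiprocessing import Process

from paddlex import create_pipeline


def run_on_device(device, inputs):
    # Each worker process builds its own pipeline bound to one device.
    pipeline = create_pipeline(pipeline="OCR", device=device)
    for result in pipeline.predict(inputs):
        result.save_to_json(save_path="./output")


if __name__ == "__main__":
    devices = ["gpu:0", "gpu:1"]
    files = [f"imgs/{i}.png" for i in range(8)]
    procs = [
        Process(target=run_on_device, args=(dev, files[i :: len(devices)]))
        for i, dev in enumerate(devices)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```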

Bobholamovic changed the title [Fix] Support multi-device inference and OCR batch inference → [Feat] Support multi-device inference and OCR batch inference, Apr 28, 2025

paddle-bot (bot) commented Apr 28, 2025

Thanks for your contribution!

Bobholamovic removed the wip label May 6, 2025
@@ -39,7 +39,6 @@ In short, just three steps:
* `use_hpip`:`bool` type, whether to enable the high-performance inference plugin;
* `hpi_config`:`dict | None` type, high-performance inference configuration;
* _`inference hyperparameters`_: used to set common inference hyperparameters. Please refer to specific model description document for details.
-* Return Value: `BasePredictor` type.
Bobholamovic (Member Author):

The actual return type is not `BasePredictor` but an internal class whose name starts with a single underscore, so this incorrect description is removed.
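In other words, callers should treat the object returned by `create_model` as opaque rather than type-checking against `BasePredictor`. A minimal usage sketch (the model name and input path are examples):

```python
from paddlex import create_model

# The concrete class of `model` is internal; rely only on its documented
# interface (predict, etc.) rather than isinstance checks.
model = create_model(model_name="PP-OCRv4_mobile_det", device="gpu:0")
for res in model.predict("example.png", batch_size=2):
    res.print()
```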

@@ -4,7 +4,7 @@ comments: true

# PaddleX High-Performance Inference Guide

-In real production environments, many applications impose strict performance metrics—especially in response time—on deployment strategies to ensure system efficiency and a smooth user experience. To address this, PaddleX offers a high-performance inference plugin that, through automatic configuration and multi-backend inference capabilities, enables users to significantly accelerate model inference without concerning themselves with complex configurations and low-level details.
+In real production environments, many applications impose strict performance metrics—especially in response time—on deployment strategies to ensure system efficiency and a smooth user experience. To address this, PaddleX offers a high-performance inference plugin that, through automatic configuration and multi-backend inference capabilities, enables users to significantly accelerate model inference without concerning themselves with complex configurations and low-level details. In addition to supporting inference acceleration on pipelines, the PaddleX high-performance inference plugin can also be used to accelerate inference when modules are used standalone.
Bobholamovic (Member Author):

In general, the various deployment-related concepts apply to pipelines, but high-performance inference is special in that it also applies to standalone modules, so that is called out explicitly here; see the sketch below.
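For example (a sketch; the model name and the `hpi_config` key are assumptions), enabling the plugin for a standalone module:

```python
from paddlex import create_model

# Hedged sketch: high-performance inference on a standalone module.
model = create_model(
    model_name="PP-OCRv4_mobile_rec",
    use_hpip=True,  # parameter documented in the diff above
    hpi_config={"backend": "onnxruntime"},  # key assumed for illustration
)
for res in model.predict("word.png", batch_size=1):
    res.print()
```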

@@ -1,6 +1,8 @@

pipeline_name: PP-StructureV3

+batch_size: 8
Bobholamovic (Member Author):

Use the configuration that performed best in testing as the default; a usage sketch follows below.
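To override the new default, a locally edited pipeline config can be passed to `create_pipeline` (a sketch; the config path and input file are examples):

```python
# Hedged sketch: loading PP-StructureV3 from a locally edited pipeline config
# whose top-level batch_size differs from the new default of 8.
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="./PP-StructureV3.yaml")  # path is an example
for res in pipeline.predict("doc.png"):
    res.save_to_json(save_path="./output")
```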

@@ -833,7 +833,7 @@ def _build_ui_runtime(self, backend, backend_config, ui_option=None):
            for name, shapes in backend_config.dynamic_shapes.items():
                ui_option.trt_option.set_shape(name, *shapes)
        else:
-            logging.warning(
+            logging.info(
Bobholamovic (Member Author):

This is expected behavior rather than a problem, so it should be logged at the info level instead of as a warning.

@@ -335,6 +335,8 @@ def _prepare_pp_option(
        device_info = None
        if pp_option is None:
            pp_option = PaddlePredictorOption(model_name=self.model_name)
+        elif pp_option.model_name is None:
Bobholamovic (Member Author):

Support automatic injection of `model_name`, so that pipelines can also set basic configuration through `pp_option`; a sketch of the idea follows below.
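Presumably the new branch fills in the predictor's own model name, along these lines (a sketch, not the verbatim patch):

```python
if pp_option is None:
    pp_option = PaddlePredictorOption(model_name=self.model_name)
elif pp_option.model_name is None:
    # A pipeline-level pp_option created without a model name now
    # gets the predictor's model name filled in automatically.
    pp_option.model_name = self.model_name
```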

@@ -105,6 +104,8 @@ def resize_image_type1(self, img):
        resize_w = ori_w * resize_h / ori_h
        N = math.ceil(resize_w / 32)
        resize_w = N * 32
+        if resize_h == ori_h and resize_w == ori_w:
Bobholamovic (Member Author):

When no resizing is needed, return the original image directly to reduce overhead; see the sketch below.
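The added check presumably short-circuits like this (a sketch; the return signature is assumed from typical DB-style resize helpers):

```python
if resize_h == ori_h and resize_w == ori_w:
    # Target size equals the original: skip cv2.resize entirely and
    # report identity scaling ratios.
    return img, (1.0, 1.0)
```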

config["use_hpip"] = use_hpip
if hpi_config is not None:
    config["hpi_config"] = hpi_config
if use_hpip is None:
Bobholamovic (Member Author):

The original logic here was slightly flawed and could cause settings from the configuration file to take no effect; this fixes it (a sketch of the intended precedence follows below).
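The intended precedence is presumably that an explicit argument overrides the configuration file, while an unspecified argument leaves the file's value intact. A sketch of that logic (assumed, not the verbatim patch):

```python
# Hedged sketch: explicit arguments win; config-file values survive
# when the caller passes None.
if use_hpip is not None:
    config["use_hpip"] = use_hpip
else:
    config.setdefault("use_hpip", False)  # assumed default
if hpi_config is not None:
    config["hpi_config"] = hpi_config
```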

@@ -21,7 +21,7 @@
set_env_for_device,
update_device_num,
)
-from ...utils.flags import FLAGS_json_format_model, DISABLE_CINN_MODEL_WL
+from ...utils.flags import DISABLE_CINN_MODEL_WL, FLAGS_json_format_model
Bobholamovic (Member Author):

Automatic fix applied by the linter (import reordering).

                block.region_label not in mask_labels
                and block.secondary_direction == cut_direction
            ):
+                if len(all_boxes) > 0:
Bobholamovic (Member Author):

The edge case where `all_boxes` is empty was not handled.

@@ -101,7 +101,9 @@ def create_model(self, config: Dict, **kwargs) -> BasePredictor:
            model_dir=model_dir,
            device=self.device,
            batch_size=config.get("batch_size", 1),
-            pp_option=self.pp_option,
+            pp_option=(
+                self.pp_option.copy() if self.pp_option is not None else self.pp_option
+            ),
Bobholamovic (Member Author):

`PaddlePredictorOption` and the predictor form a composition (the predictor is responsible for the lifecycle of its `pp_option`) rather than an aggregation, and the option is supplied via dependency injection. To prevent a predictor's in-place modification of `pp_option` from causing different models in the same pipeline to incorrectly share information such as TRT dynamic shapes, `pp_option` is copied every time a model is created; a toy illustration follows below.
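A toy illustration of the aliasing hazard the copy prevents (class and attribute names are illustrative, not the actual PaddleX types):

```python
import copy


class Option:
    """Stand-in for a mutable per-predictor option object."""

    def __init__(self):
        self.trt_dynamic_shapes = {}


shared = Option()
det_opt = shared                 # aliasing: mutations leak across models
rec_opt = copy.deepcopy(shared)  # per-model copy keeps state isolated

det_opt.trt_dynamic_shapes["x"] = [(1, 3, 640, 640)]
assert "x" not in rec_opt.trt_dynamic_shapes  # the copied option is unaffected
```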

Bobholamovic requested a review from cuicheng01 May 6, 2025 11:55