diff --git a/docs/installation/paddlepaddle_install.en.md b/docs/installation/paddlepaddle_install.en.md index 4e6b693004..e7935fdd9b 100644 --- a/docs/installation/paddlepaddle_install.en.md +++ b/docs/installation/paddlepaddle_install.en.md @@ -46,7 +46,7 @@ nvidia-docker run --name paddlex -v $PWD:/paddle --shm-size=8G --network=host - To use [Paddle Inference TensorRT Subgraph Engine](https://www.paddlepaddle.org.cn/documentation/docs/en/install/pip/linux-pip_en.html#gpu), install TensorRT by executing the following instructions in the 'paddlex' container that has just been started ```bash -python -m pip install /usr/local/TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp310-none-linux_x86_64.whl +python -m pip install /usr/local/TensorRT-*/python/tensorrt-*-cp310-none-linux_x86_64.whl ``` ## Installing PaddlePaddle via pip @@ -94,7 +94,7 @@ tar xvf TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz # Install TensorRT wheel package python -m pip install TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp310-none-linux_x86_64.whl # Add the absolute path of TensorRT's `lib` directory to LD_LIBRARY_PATH -export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:TensorRT-8.6.1.6/lib +export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:TensorRT-8.6.1.6/lib" ``` > ❗ Note: If you encounter any issues during the installation process, feel free to [submit an issue](https://github.com/PaddlePaddle/Paddle/issues) in the Paddle repository. diff --git a/docs/installation/paddlepaddle_install.md b/docs/installation/paddlepaddle_install.md index 06aeb190f1..dd16cead5f 100644 --- a/docs/installation/paddlepaddle_install.md +++ b/docs/installation/paddlepaddle_install.md @@ -47,7 +47,7 @@ nvidia-docker run --name paddlex -v $PWD:/paddle --shm-size=8G --network=host -i 在刚刚启动的 `paddlex` 容器中执行下面指令安装 TensorRT,即可使用 [Paddle Inference TensorRT 子图引擎](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/paddle_v3_features/paddle_trt_cn.html): ```bash -python -m pip install /usr/local/TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp310-none-linux_x86_64.whl +python -m pip install /usr/local/TensorRT-*/python/tensorrt-*-cp310-none-linux_x86_64.whl ``` ## 基于 pip 安装飞桨 @@ -94,7 +94,7 @@ tar xvf TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz # 安装 TensorRT wheel 包 python -m pip install TensorRT-8.6.1.6/python/tensorrt-8.6.1-cp310-none-linux_x86_64.whl # 添加 TensorRT 的 `lib` 目录的绝对路径到 LD_LIBRARY_PATH 中 -export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:TensorRT-8.6.1.6/lib +export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:TensorRT-8.6.1.6/lib" ``` > ❗ :如果在安装的过程中,出现任何问题,欢迎在Paddle仓库中[提Issue](https://github.com/PaddlePaddle/Paddle/issues)。 diff --git a/docs/module_usage/instructions/model_python_API.en.md b/docs/module_usage/instructions/model_python_API.en.md index f99d8789e4..71cc956bf1 100644 --- a/docs/module_usage/instructions/model_python_API.en.md +++ b/docs/module_usage/instructions/model_python_API.en.md @@ -39,7 +39,6 @@ In short, just three steps: * `use_hpip`:`bool` type, whether to enable the high-performance inference plugin; * `hpi_config`:`dict | None` type, high-performance inference configuration; * _`inference hyperparameters`_: used to set common inference hyperparameters. Please refer to specific model description document for details. - * Return Value: `BasePredictor` type. ### 2. 
Perform Inference Prediction by Calling the `predict()` Method of the Prediction Model Object diff --git a/docs/module_usage/instructions/model_python_API.md b/docs/module_usage/instructions/model_python_API.md index 238ab6c0ce..9cfa1b3377 100644 --- a/docs/module_usage/instructions/model_python_API.md +++ b/docs/module_usage/instructions/model_python_API.md @@ -40,7 +40,6 @@ for res in output: * `use_hpip`:`bool` 类型,是否启用高性能推理插件; * `hpi_config`:`dict | None` 类型,高性能推理配置; * _`推理超参数`_:支持常见推理超参数的修改,具体参数说明详见具体模型文档; - * 返回值:`BasePredictor` 类型。 ### 2. 调用预测模型对象的`predict()`方法进行推理预测 diff --git a/docs/pipeline_deploy/edge_deploy.en.md b/docs/pipeline_deploy/edge_deploy.en.md index aaf9e1730b..564accfe4f 100644 --- a/docs/pipeline_deploy/edge_deploy.en.md +++ b/docs/pipeline_deploy/edge_deploy.en.md @@ -190,7 +190,7 @@ This guide applies to 8 models across 6 modules: Note: - `{Pipeline_Name}` and `{Demo_Name}` are placeholders. Refer to the table at the end of this section for specific values. - `download.sh` and `run.sh` support passing in model names to specify models. If not specified, the default model will be used. Refer to the `Model_Name` column in the table at the end of this section for currently supported models. - - To use your own trained model, refer to the [Model Conversion Method](https://paddlepaddle.github.io/Paddle-Lite/develop/model_optimize_tool/) to obtain the `.nb` model, place it in the `PaddleX_Lite_Deploy/{Pipeline_Name}/assets/{Model_Name}` directory, where `{Model_Name}` is the model name, e.g., `PaddleX_Lite_Deploy/object_detection/assets/PicoDet-L`. + - To use your own trained model, refer to the [Model Conversion Method](https://paddlepaddle.github.io/Paddle-Lite/develop/model_optimize_tool/) to obtain the `.nb` model, place it in the `PaddleX_Lite_Deploy/{Pipeline_Name}/assets/{Model_Name}` directory, where `{Model_Name}` is the model name, e.g., `PaddleX_Lite_Deploy/object_detection/assets/PicoDet-L`. Please note that converting static graph models in `.json` format to `.nb` format is currently not supported. When exporting a static graph model using PaddleX, please set the environment variable `FLAGS_json_format_model` to `0`. - Before running the `build.sh` script, change the path specified by `NDK_ROOT` to the actual installed NDK path. - Keep ADB connected when running the `build.sh` script. - On Windows systems, you can use Git Bash to execute the deployment steps. @@ -305,6 +305,7 @@ This section describes the deployment steps applicable to the demos listed in th Note + - Currently, there is no demo for deploying the Layout Area Detection module on the edge, so the `picodet_detection` demo is reused to deploy the `PicoDet_layout_1x` model. 
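+As noted in the notes above, static graph models exported in the `.json` format cannot currently be converted to `.nb`. For reference, an export that keeps the `.pdmodel` format might look like the following sketch; the config and weight paths are placeholders and depend on the model you actually trained.
+
+```bash
+# Force the exported static graph to use the .pdmodel format instead of .json,
+# so that the exported model can then be converted to .nb.
+export FLAGS_json_format_model=0
+# Placeholder export command; substitute your own config file and weights.
+python main.py -c paddlex/configs/modules/object_detection/PicoDet-L.yaml \
+    -o Global.mode=export \
+    -o Export.weight_path=./output/best_model/best_model.pdparams
+```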
## Reference Materials diff --git a/docs/pipeline_deploy/edge_deploy.md b/docs/pipeline_deploy/edge_deploy.md index 26dc0b290d..4ffc556b27 100644 --- a/docs/pipeline_deploy/edge_deploy.md +++ b/docs/pipeline_deploy/edge_deploy.md @@ -190,7 +190,7 @@ comments: true 注意: - `Pipeline_Name` 和 `Demo_Name` 为占位符,具体值可参考本节最后的表格。 - `download.sh` 和 `run.sh` 支持传入模型名来指定模型,若不指定则使用默认模型。目前适配的模型可参考本节最后表格的 `Model_Name` 列。 - - 若想使用自己训练的模型,参考 [模型转换方法](https://paddlepaddle.github.io/Paddle-Lite/develop/model_optimize_tool/) 得到 `.nb` 模型,放到`PaddleX_Lite_Deploy/{Pipeline_Name}/assets/{Model_Name}`目录下, `Model_Name`为模型名,例如 `PaddleX_Lite_Deploy/object_detection/assets/PicoDet-L`。 + - 若想使用自己训练的模型,参考 [模型转换方法](https://paddlepaddle.github.io/Paddle-Lite/develop/model_optimize_tool/) 得到 `.nb` 模型,放到`PaddleX_Lite_Deploy/{Pipeline_Name}/assets/{Model_Name}`目录下, `Model_Name`为模型名,例如 `PaddleX_Lite_Deploy/object_detection/assets/PicoDet-L`。请注意,目前暂不支持将 `.json` 格式的静态图模型转换为 `.nb` 格式。在使用 PaddleX 导出静态图模型时,请设置环境变量 `FLAGS_json_format_model` 为 `0`。 - 在运行 `build.sh` 脚本前,需要更改 `NDK_ROOT` 指定的路径为实际安装的 NDK 路径。 - 在运行 `build.sh` 脚本时需保持 ADB 连接。 - 在 Windows 系统上可以使用 Git Bash 执行部署步骤。 @@ -307,7 +307,8 @@ detection, image size: 768, 576, detect object: dog, score: 0.731584, location: 备注 -- 目前没有版面区域检测模块的端侧部署 demo,因此复用 `picodet_detection`demo 来部署`PicoDet_layout_1x`模型。 + +- 目前没有版面区域检测模块的端侧部署 demo,因此复用 `picodet_detection` demo 来部署 `PicoDet_layout_1x` 模型。 ## 参考资料 diff --git a/docs/pipeline_deploy/high_performance_inference.en.md b/docs/pipeline_deploy/high_performance_inference.en.md index bc4e510fa0..1b082224d9 100644 --- a/docs/pipeline_deploy/high_performance_inference.en.md +++ b/docs/pipeline_deploy/high_performance_inference.en.md @@ -4,7 +4,7 @@ comments: true # PaddleX High-Performance Inference Guide -In real production environments, many applications impose strict performance metrics—especially in response time—on deployment strategies to ensure system efficiency and a smooth user experience. To address this, PaddleX offers a high-performance inference plugin that, through automatic configuration and multi-backend inference capabilities, enables users to significantly accelerate model inference without concerning themselves with complex configurations and low-level details. +In real production environments, many applications impose strict performance metrics—especially in response time—on deployment strategies to ensure system efficiency and a smooth user experience. To address this, PaddleX offers a high-performance inference plugin that, through automatic configuration and multi-backend inference capabilities, enables users to significantly accelerate model inference without concerning themselves with complex configurations and low-level details. In addition to supporting inference acceleration on pipelines, the PaddleX high-performance inference plugin can also be used to accelerate inference when modules are used standalone. ## Table of Contents @@ -24,7 +24,7 @@ In real production environments, many applications impose strict performance met Before using the high-performance inference plugin, please ensure that you have completed the PaddleX installation according to the [PaddleX Local Installation Tutorial](../installation/installation.en.md) and have run the quick inference using the PaddleX pipeline command line or the PaddleX pipeline Python script as described in the usage instructions. 
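+For example, a minimal check that the basic setup works can be a single CLI call such as the sketch below (this assumes the general OCR pipeline and a local test image; any pipeline you have already run works equally well):
+
+```bash
+# Quick sanity check before installing the high-performance inference plugin:
+# run a pipeline once with the default (non-accelerated) configuration.
+paddlex --pipeline OCR --input your_test_image.png --device gpu:0
+```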
-The high-performance inference plugin supports handling multiple model formats, including **PaddlePaddle static graph (`.pdmodel`, `.json`)**, **ONNX (`.onnx`)** and **Huawei OM (`.om`)**, among others. For ONNX models, you can convert them using the [Paddle2ONNX Plugin](./paddle2onnx.en.md). If multiple model formats are present in the model directory, PaddleX will automatically choose the appropriate one as needed, and automatic model conversion may be performed. **It is recommended to install the Paddle2ONNX plugin first before installing the high-performance inference plugin, so that PaddleX can convert model formats when needed.** +The high-performance inference plugin supports handling multiple model formats, including **PaddlePaddle static graph (`.pdmodel`, `.json`)**, **ONNX (`.onnx`)** and **Huawei OM (`.om`)**, among others. For ONNX models, you can convert them using the [Paddle2ONNX Plugin](./paddle2onnx.en.md). If multiple model formats are present in the model directory, PaddleX will automatically choose the appropriate one as needed, and automatic model conversion may be performed. ### 1.1 Installing the High-Performance Inference Plugin @@ -86,12 +86,14 @@ Refer to [Get PaddleX based on Docker](../installation/installation.en.md#21-obt -In the official PaddleX Docker image, TensorRT is installed by default. The high-performance inference plugin can then accelerate inference using the Paddle Inference TensorRT subgraph engine. +The official PaddleX Docker images come with the Paddle2ONNX plugin pre-installed, allowing PaddleX to convert model formats on demand. In addition, the GPU version of the image includes TensorRT, so the high-performance inference plugin can leverage the Paddle Inference TensorRT subgraph engine for accelerated inference. **Please note that the aforementioned Docker image refers to the official PaddleX image described in [Get PaddleX via Docker](../installation/installation.en.md#21-get-paddlex-based-on-docker), rather than the PaddlePaddle official image described in [PaddlePaddle Local Installation Tutorial](../installation/paddlepaddle_install.en.md#installing-paddlepaddle-via-docker). For the latter, please refer to the local installation instructions for the high-performance inference plugin.** #### 1.1.2 Installing the High-Performance Inference Plugin Locally +**It is recommended to install the Paddle2ONNX plugin first before installing the high-performance inference plugin, so that PaddleX can convert model formats when needed.** + **To install the CPU version of the high-performance inference plugin:** Run: @@ -322,7 +324,7 @@ The available configuration items for `backend_config` vary for different backen ### 2.3 Modifying the High-Performance Inference Configuration -Due to the diversity of actual deployment environments and requirements, the default configuration might not meet all needs. In such cases, manual adjustment of the high-performance inference configuration may be necessary. Users can modify the configuration by editing the **pipeline/module configuration file** or by passing the `hpi_config` field in the parameters via **CLI** or **Python API**. **Parameters passed via CLI or Python API will override the settings in the pipeline/module configuration file.** Different levels of configurations in the config file are automatically merged, and the deepest-level settings take the highest priority. The following examples illustrate how to modify the configuration. 
+When the model is initialized, the log will, by default, record the high-performance inference configuration that is about to be used. Due to the diversity of actual deployment environments and requirements, the default configuration might not meet all needs. In such cases, manual adjustment of the high-performance inference configuration may be necessary. Users can modify the configuration by editing the pipeline/module configuration file or by passing the `hpi_config` field in the parameters via CLI or Python API. Parameters passed via CLI or Python API will override the settings in the pipeline/module configuration file. Different levels of configurations in the config file are automatically merged, and the deepest-level settings take the highest priority. The following examples illustrate how to modify the configuration. **For the general OCR pipeline, use the `onnxruntime` backend for all models:** @@ -566,3 +568,11 @@ For the GPU version of the high-performance inference plugin, the official Paddl **4. Why does the program freeze during runtime or display some "WARNING" and "ERROR" messages after using the high-performance inference feature? What should be done in such cases?** When initializing the model, operations such as subgraph optimization may take longer and may generate some "WARNING" and "ERROR" messages. However, as long as the program does not exit automatically, it is recommended to wait patiently, as the program usually continues to run to completion. + +**5. When using GPU for inference, enabling the high-performance inference plugin increases memory usage and causes OOM. How can this be resolved?** + +Some acceleration methods trade off memory usage to support a broader range of inference scenarios. If memory becomes a bottleneck, consider the following optimization strategies: + +* **Adjust pipeline configurations**: Disable unnecessary features to avoid loading redundant models. Appropriately reduce the batch size based on business requirements to balance throughput and memory usage. +* **Switch inference backends**: Different inference backends have varying memory management strategies. Try benchmarking various backends to compare memory usage and performance. +* **Optimize dynamic shape configurations**: For modules using TensorRT or Paddle Inference TensorRT subgraph engine, narrow the dynamic shape range based on the actual distribution of input data. diff --git a/docs/pipeline_deploy/high_performance_inference.md b/docs/pipeline_deploy/high_performance_inference.md index 4ee284a274..d7757a8913 100644 --- a/docs/pipeline_deploy/high_performance_inference.md +++ b/docs/pipeline_deploy/high_performance_inference.md @@ -4,7 +4,7 @@ comments: true # PaddleX 高性能推理指南 -在实际生产环境中,许多应用对部署策略的性能指标(尤其是响应速度)有着较严苛的标准,以确保系统的高效运行与用户体验的流畅性。为此,PaddleX 提供高性能推理插件,通过自动配置和多后端推理功能,让用户无需关注复杂的配置和底层细节,即可显著提升模型的推理速度。 +在实际生产环境中,许多应用对部署策略的性能指标(尤其是响应速度)有着较严苛的标准,以确保系统的高效运行与用户体验的流畅性。为此,PaddleX 提供高性能推理插件,通过自动配置和多后端推理功能,让用户无需关注复杂的配置和底层细节,即可显著提升模型的推理速度。除了支持产线的推理加速外,PaddleX 高性能推理插件也可用于单独使用模块时的推理加速。 ## 目录 @@ -22,9 +22,9 @@ comments: true ## 1. 
安装与基础使用方法 -使用高性能推理插件前,请确保您已经按照 [PaddleX本地安装教程](../installation/installation.md) 完成了PaddleX的安装,且按照PaddleX产线命令行使用说明或PaddleX产线Python脚本使用说明跑通了产线的快速推理。 +使用高性能推理插件前,请确保您已经按照 [PaddleX本地安装教程](../installation/installation.md) 完成了 PaddleX 的安装,且按照 PaddleX 产线命令行使用说明或 PaddleX 产线 Python 脚本使用说明跑通了产线的快速推理。 -高性能推理插件支持处理 **PaddlePaddle 静态图(`.pdmodel`、 `.json`)**、**ONNX(`.onnx`)**、**华为 OM(`.om`)** 等多种模型格式。对于 ONNX 模型,可以使用 [Paddle2ONNX 插件](./paddle2onnx.md) 转换得到。如果模型目录中存在多种格式的模型,PaddleX 会根据需要自动选择,并可能进行自动模型转换。**建议在安装高性能推理插件前,首先安装 Paddle2ONNX 插件,以便 PaddleX 可以在需要时转换模型格式。** +高性能推理插件支持处理 **飞桨静态图(`.pdmodel`、 `.json`)**、**ONNX(`.onnx`)**、**华为 OM(`.om`)** 等多种模型格式。对于 ONNX 模型,可以使用 [Paddle2ONNX 插件](./paddle2onnx.md) 转换得到。如果模型目录中存在多种格式的模型,PaddleX 会根据需要自动选择,并可能进行自动模型转换。 ### 1.1 安装高性能推理插件 @@ -86,12 +86,14 @@ comments: true -PaddleX 官方 Docker 镜像中默认安装了 TensorRT,高性能推理插件可以使用 Paddle Inference TensorRT 子图引擎进行推理加速。 +PaddleX 官方 Docker 镜像中预装了 Paddle2ONNX 插件,以便 PaddleX 可以在需要时转换模型格式。此外,GPU 版本的镜像中安装了 TensorRT,高性能推理插件可以使用 Paddle Inference TensorRT 子图引擎进行推理加速。 **请注意,以上提到的镜像指的是 [基于Docker获取PaddleX](../installation/installation.md#21-基于docker获取paddlex) 中描述的 PaddleX 官方镜像,而非 [飞桨PaddlePaddle本地安装教程](../installation/paddlepaddle_install.md#基于-docker-安装飞桨) 中描述的飞桨框架官方镜像。对于后者,请参考高性能推理插件本地安装说明。** #### 1.1.2 本地安装高性能推理插件 +**建议在安装高性能推理插件前,首先安装 Paddle2ONNX 插件,以便 PaddleX 可以在需要时转换模型格式。** + **安装 CPU 版本的高性能推理插件:** 执行: @@ -107,7 +109,7 @@ paddlex --install hpi-cpu - [安装 CUDA 11.8](https://developer.nvidia.com/cuda-11-8-0-download-archive) - [安装 cuDNN 8.9](https://docs.nvidia.com/deeplearning/cudnn/archives/cudnn-890/install-guide/index.html) -如果使用的是飞桨框架官方镜像,则镜像中的 CUDA 和 cuDNN 版本已经是满足要求的,无需重新安装。 +如果使用的是飞桨框架官方镜像,则镜像中的 CUDA 和 cuDNN 版本已经是满足要求的,无需额外安装。 如果通过 pip 安装飞桨,通常 CUDA、cuDNN 的相关 Python 包将被自动安装。在这种情况下,**仍需要通过安装非 Python 专用的 CUDA 与 cuDNN**。同时,建议安装的 CUDA 和 cuDNN 版本与环境中存在的 Python 包版本保持一致,以避免不同版本的库共存导致的潜在问题。可以通过如下方式可以查看 CUDA 和 cuDNN 相关 Python 包的版本: @@ -132,7 +134,7 @@ paddlex --install hpi-gpu **注意:** -1. **目前 PaddleX 官方仅提供 CUDA 11.8 + cuDNN 8.9 的预编译包**;CUDA 12 已经在支持中。 +1. **目前 PaddleX 官方仅提供 CUDA 11.8 + cuDNN 8.9 的预编译包**。CUDA 12 已经在支持中。 2. 同一环境中只应该存在一个版本的高性能推理插件。 @@ -246,7 +248,7 @@ output = model.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/ auto_paddle2onnx - 是否将 PaddlePaddle 静态图模型自动转换为 ONNX 模型。当 Paddle2ONNX 插件不可用时,不执行转换。 + 是否将飞桨静态图模型自动转换为 ONNX 模型。当 Paddle2ONNX 插件不可用时,不执行转换。 bool True @@ -323,7 +325,7 @@ output = model.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/ ### 2.3 修改高性能推理配置 -由于实际部署环境和需求的多样性,默认配置可能无法满足所有要求。这时,可能需要手动调整高性能推理配置。用户可以通过修改**产线/模块配置文件**、**CLI**或**Python API**所传递参数中的 `hpi_config` 字段内容来修改配置。**通过 CLI 或 Python API 传递的参数将覆盖产线/模块配置文件中的设置**。配置文件中不同层级的配置将自动合并,最深层的配置具有最高的优先级。以下将结合一些例子介绍如何修改配置。 +模型初始化时,日志中默认会记录将要使用的高性能推理配置。由于实际部署环境和需求的多样性,默认配置可能无法满足所有要求。这时,可能需要手动调整高性能推理配置。用户可以通过修改产线/模块配置文件、CLI或Python API所传递参数中的 `hpi_config` 字段内容来修改配置。通过 CLI 或 Python API 传递的参数将覆盖产线/模块配置文件中的设置。配置文件中不同层级的配置将自动合并,最深层的配置具有最高的优先级。以下将结合一些例子介绍如何修改配置。 **通用OCR产线的所有模型使用 `onnxruntime` 后端:** @@ -566,3 +568,11 @@ python -m pip install ../../python/dist/ultra_infer*.whl **4. 为什么使用高性能推理功能后,程序在运行过程中会卡住或者显示一些“WARNING”和“ERROR”信息?这种情况下应该如何处理?** 在初始化模型时,子图优化等操作可能会导致程序耗时较长,并生成一些“WARNING”和“ERROR”信息。然而,只要程序没有自动退出,建议耐心等待,程序通常会继续运行至完成。 + +**5. 
使用 GPU 推理时,启用高性能推理插件后显存占用增大并导致 OOM,如何解决?** + +部分提速手段会以牺牲显存为代价,以支持更广泛的推理场景。如果显存成为瓶颈,可参考以下优化思路: + +- 调整产线配置:禁用不需要使用的功能,避免加载多余模型;根据业务需求合理降低 batch size,平衡吞吐与显存使用。 +- 切换推理后端:不同推理后端在显存管理策略上各有差异,可尝试各种后端测评显存占用与性能。 +- 优化动态形状配置:对于使用 TensorRT 或 Paddle Inference TensorRT 子图引擎的模块,根据实际输入数据的分布,缩小动态形状范围。 diff --git a/docs/pipeline_deploy/paddle2onnx.en.md b/docs/pipeline_deploy/paddle2onnx.en.md index 3ceb9ad7ad..11291833f6 100644 --- a/docs/pipeline_deploy/paddle2onnx.en.md +++ b/docs/pipeline_deploy/paddle2onnx.en.md @@ -34,7 +34,7 @@ paddlex --install paddle2onnx opset_version int - The ONNX opset version to use. Defaults to 7. + The ONNX opset version to use. If a lower-version opset cannot complete the conversion, a higher-version opset will be automatically selected for the conversion. Defaults to 7. diff --git a/docs/pipeline_deploy/paddle2onnx.md b/docs/pipeline_deploy/paddle2onnx.md index 3ec5fd042b..84255f15e3 100644 --- a/docs/pipeline_deploy/paddle2onnx.md +++ b/docs/pipeline_deploy/paddle2onnx.md @@ -1,7 +1,7 @@ # Paddle2ONNX 插件的安装与使用 -PaddleX 的 Paddle2ONNX 插件提供了将 PaddlePaddle 静态图模型转化到 ONNX 格式模型的能力,底层使用[Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX)。 +PaddleX 的 Paddle2ONNX 插件提供了将飞桨静态图模型转化到 ONNX 格式模型的能力,底层使用 [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX)。 ## 1. 安装 @@ -25,17 +25,17 @@ paddlex --install paddle2onnx paddle_model_dir str - 包含Paddle模型的目录。 + 包含 Paddle 模型的目录。 onnx_model_dir str - ONNX模型的输出目录,可以与Paddle模型目录相同。默认为onnx。 + ONNX 模型的输出目录,可以与 Paddle 模型目录相同。默认为 onnx。 opset_version int - 使用的ONNX opset版本。默认为7。 + 使用的 ONNX opset 版本。当使用低版本 opset 无法完成转换时,将自动选择更高版本的 opset 进行转换。默认为 7。 @@ -47,9 +47,9 @@ paddlex --install paddle2onnx ```bash paddlex \ --paddle2onnx \ # 使用paddle2onnx功能 - --paddle_model_dir /your/paddle_model/dir \ # 指定Paddle模型所在的目录 - --onnx_model_dir /your/onnx_model/output/dir \ # 指定转换后ONNX模型的输出目录 - --opset_version 7 # 指定要使用的ONNX opset版本 + --paddle_model_dir /your/paddle_model/dir \ # 指定 Paddle 模型所在的目录 + --onnx_model_dir /your/onnx_model/output/dir \ # 指定转换后 ONNX 模型的输出目录 + --opset_version 7 # 指定要使用的 ONNX opset 版本 ``` 以 image_classification 模块中的 ResNet18 模型为例: diff --git a/docs/pipeline_deploy/serving.en.md b/docs/pipeline_deploy/serving.en.md index 730fc378fb..a20dd474dc 100644 --- a/docs/pipeline_deploy/serving.en.md +++ b/docs/pipeline_deploy/serving.en.md @@ -76,11 +76,11 @@ The command-line options related to serving are as follows: --host -Hostname or IP address the server binds to. Defaults to `0.0.0.0`. +Hostname or IP address the server binds to. Defaults to 0.0.0.0. --port -Port number the server listens on. Defaults to `8080`. +Port number the server listens on. Defaults to 8080. 
--use_hpip @@ -323,8 +323,8 @@ With the image prepared, navigate to the `server` directory and execute the foll docker run \ -it \ -e PADDLEX_HPS_DEVICE_TYPE={deployment device type} \ - -v "$(pwd)":/workspace \ - -w /workspace \ + -v "$(pwd)":/app \ + -w /app \ --rm \ --gpus all \ --init \ diff --git a/docs/pipeline_deploy/serving.md b/docs/pipeline_deploy/serving.md index e1900f459d..28cca2c63f 100644 --- a/docs/pipeline_deploy/serving.md +++ b/docs/pipeline_deploy/serving.md @@ -76,11 +76,11 @@ INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit) --host -服务器绑定的主机名或 IP 地址。默认为 `0.0.0.0`。 +服务器绑定的主机名或 IP 地址。默认为 0.0.0.0--port -服务器监听的端口号。默认为 `8080`。 +服务器监听的端口号。默认为 8080--use_hpip @@ -323,8 +323,8 @@ paddlex --serve --pipeline image_classification --use_hpip docker run \ -it \ -e PADDLEX_HPS_DEVICE_TYPE={部署设备类型} \ - -v "$(pwd)":/workspace \ - -w /workspace \ + -v "$(pwd)":/app \ + -w /app \ --rm \ --gpus all \ --init \ diff --git a/docs/pipeline_usage/instructions/parallel_inference.en.md b/docs/pipeline_usage/instructions/parallel_inference.en.md new file mode 100644 index 0000000000..2e2231865f --- /dev/null +++ b/docs/pipeline_usage/instructions/parallel_inference.en.md @@ -0,0 +1,196 @@ +# Pipeline Parallel Inference + +## Specifying Multiple Inference Devices + +For some pipelines in both the CLI and Python API, PaddleX supports specifying multiple inference devices simultaneously. If multiple devices are specified, at initialization each device will host its own instance of the underlying pipeline class, and incoming inputs will be inferred in parallel across them. For example, for the PP-StructureV3 pipeline: + +```bash +paddlex --pipeline PP-StructureV3 \ + --input input_images/ \ + --use_doc_orientation_classify False \ + --use_doc_unwarping False \ + --use_textline_orientation False \ + --save_path ./output \ + --device 'gpu:0,1,2,3' +``` + +```python +pipeline = create_pipeline(pipeline="PP-StructureV3", device="gpu:0,1,2,3") +output = pipeline.predict( + input="input_images/", + use_doc_orientation_classify=False, + use_doc_unwarping=False, + use_textline_orientation=False, +) +``` + +In both examples above, four GPUs (IDs 0, 1, 2, 3) are used to perform parallel inference on all files in the `input_images` directory. + +When specifying multiple devices, the inference interface remains the same as when specifying a single device. Please refer to the pipeline usage guide to check whether a given pipeline supports multiple-device inference. + +## Example of Multi-Process Parallel Inference + +Beyond PaddleX’s built-in multi-GPU parallel inference, users can also implement parallelism by wrapping PaddleX pipeline API calls themselves according to their specific scenario, with a view to achieving a better speedup. 
Below is an example of using Python’s `multiprocessing` to run multiple cards and multiple pipeline instances in parallel over the files in an input directory: + +```python +import argparse +import sys +from multiprocessing import Manager, Process +from pathlib import Path +from queue import Empty + +from paddlex import create_pipeline +from paddlex.utils.device import constr_device, parse_device + + +def worker(pipeline_name_or_config_path, device, task_queue, batch_size, output_dir): + pipeline = create_pipeline(pipeline_name_or_config_path, device=device) + + should_end = False + batch = [] + + while not should_end: + try: + input_path = task_queue.get_nowait() + except Empty: + should_end = True + else: + batch.append(input_path) + + if batch and (len(batch) == batch_size or should_end): + try: + for result in pipeline.predict(batch): + input_path = Path(result["input_path"]) + if result.get("page_index") is not None: + output_path = f"{input_path.stem}_{result['page_index']}.json" + else: + output_path = f"{input_path.stem}.json" + output_path = str(Path(output_dir, output_path)) + result.save_to_json(output_path) + print(f"Processed {repr(str(input_path))}") + except Exception as e: + print( + f"Error processing {batch} on {repr(device)}: {e}", file=sys.stderr + ) + batch.clear() + + +def main(): + parser = argparse.ArgumentParser() + parser.add_argument( + "--pipeline", type=str, required=True, help="Pipeline name or config path." + ) + parser.add_argument("--input_dir", type=str, required=True, help="Input directory.") + parser.add_argument( + "--device", + type=str, + required=True, + help="Specifies the devices for performing parallel inference.", + ) + parser.add_argument( + "--output_dir", type=str, default="output", help="Output directory." 
+ ) + parser.add_argument( + "--instances_per_device", + type=int, + default=1, + help="Number of pipeline instances per device.", + ) + parser.add_argument( + "--batch_size", + type=int, + default=1, + help="Inference batch size for each pipeline instance.", + ) + parser.add_argument( + "--input_glob_pattern", + type=str, + default="*", + help="Pattern to find the input files.", + ) + args = parser.parse_args() + + input_dir = Path(args.input_dir) + if not input_dir.exists(): + print(f"The input directory does not exist: {input_dir}", file=sys.stderr) + return 2 + if not input_dir.is_dir(): + print(f"{repr(str(input_dir))} is not a directory.", file=sys.stderr) + return 2 + + output_dir = Path(args.output_dir) + if output_dir.exists() and not output_dir.is_dir(): + print(f"{repr(str(output_dir))} is not a directory.", file=sys.stderr) + return 2 + output_dir.mkdir(parents=True, exist_ok=True) + + device_type, device_ids = parse_device(args.device) + if device_ids is None or len(device_ids) == 1: + print( + "Please specify at least two devices for performing parallel inference.", + file=sys.stderr, + ) + sys.exit(2) + + if args.batch_size <= 0: + print("Batch size must be greater than 0.", file=sys.stderr) + sys.exit(2) + + manager = Manager() + task_queue = manager.Queue() + for img_path in input_dir.glob(args.input_glob_pattern): + task_queue.put(str(img_path)) + + processes = [] + for device_id in device_ids: + for _ in range(args.instances_per_device): + device = constr_device(device_type, [device_id]) + p = Process( + target=worker, + args=( + args.pipeline, + device, + task_queue, + args.batch_size, + str(output_dir), + ), + ) + p.start() + processes.append(p) + + for p in processes: + p.join() + + print("All done") + + return 0 + + +if __name__ == "__main__": + sys.exit(main()) +``` + +Assuming you save the script above as `mp_infer.py`, here are some example invocations: + +```bash +# PP-StructureV3 pipeline +# Process all files in `input_images` +# Use GPUs 0,1,2,3 with 1 pipeline instance per GPU and batch size 1 +python mp_infer.py \ + --pipeline PP-StructureV3 \ + --input_dir input_images/ \ + --device 'gpu:0,1,2,3' \ + --output_dir output1 + +# PP-StructureV3 pipeline +# Process all `.jpg` files in `input_images` +# Use GPUs 0 and 2 with 2 pipeline instances per GPU and batch size 4 +python mp_infer.py \ + --pipeline PP-StructureV3 \ + --input_dir input_images/ \ + --device 'gpu:0,2' \ + --output_dir output2 \ + --instances_per_device 2 \ + --batch_size 4 \ + --input_glob_pattern '*.jpg' +``` diff --git a/docs/pipeline_usage/instructions/parallel_inference.md b/docs/pipeline_usage/instructions/parallel_inference.md new file mode 100644 index 0000000000..db101afcc2 --- /dev/null +++ b/docs/pipeline_usage/instructions/parallel_inference.md @@ -0,0 +1,196 @@ +# 产线并行推理 + +## 指定多个推理设备 + +对于部分产线的 CLI 和 Python API,PaddleX 支持同时指定多个推理设备。如果指定了多个设备,产线初始化时将在每个设备上创建一个底层产线类对象的实例,并对接收到的输入进行并行推理。例如,对于通用版面解析 v3 产线: + +```bash +paddlex --pipeline PP-StructureV3 \ + --input input_images/ \ + --use_doc_orientation_classify False \ + --use_doc_unwarping False \ + --use_textline_orientation False \ + --save_path ./output \ + --device 'gpu:0,1,2,3' +``` + +```python +pipeline = create_pipeline(pipeline="PP-StructureV3", device="gpu:0,1,2,3") +output = pipeline.predict( + input="input_images/", + use_doc_orientation_classify=False, + use_doc_unwarping=False, + use_textline_orientation=False, +) +``` + +以上两个例子均使用 4 块 GPU(编号为 0、1、2、3)对 `input_images` 目录中的文件进行并行推理。 + 
+指定多个设备时,推理接口仍然与指定单设备时保持一致。请查看产线使用教程以了解某一产线是否支持指定多个推理设备。 + +## 多进程并行推理示例 + +除了使用 PaddleX 内置的多设备并行推理功能外,用户也可以结合实际场景,通过包装 PaddleX 产线推理 API 调用逻辑等手段实现并行推理加速,以期达到更优的加速比。如下是使用 Python 多进程实现多卡、多实例并行处理输入目录中的文件的示例代码: + +```python +import argparse +import sys +from multiprocessing import Manager, Process +from pathlib import Path +from queue import Empty + +from paddlex import create_pipeline +from paddlex.utils.device import constr_device, parse_device + + +def worker(pipeline_name_or_config_path, device, task_queue, batch_size, output_dir): + pipeline = create_pipeline(pipeline_name_or_config_path, device=device) + + should_end = False + batch = [] + + while not should_end: + try: + input_path = task_queue.get_nowait() + except Empty: + should_end = True + else: + batch.append(input_path) + + if batch and (len(batch) == batch_size or should_end): + try: + for result in pipeline.predict(batch): + input_path = Path(result["input_path"]) + if result.get("page_index") is not None: + output_path = f"{input_path.stem}_{result['page_index']}.json" + else: + output_path = f"{input_path.stem}.json" + output_path = str(Path(output_dir, output_path)) + result.save_to_json(output_path) + print(f"Processed {repr(str(input_path))}") + except Exception as e: + print( + f"Error processing {batch} on {repr(device)}: {e}", file=sys.stderr + ) + batch.clear() + + +def main(): + parser = argparse.ArgumentParser() + parser.add_argument( + "--pipeline", type=str, required=True, help="Pipeline name or config path." + ) + parser.add_argument("--input_dir", type=str, required=True, help="Input directory.") + parser.add_argument( + "--device", + type=str, + required=True, + help="Specifies the devices for performing parallel inference.", + ) + parser.add_argument( + "--output_dir", type=str, default="output", help="Output directory." 
+ ) + parser.add_argument( + "--instances_per_device", + type=int, + default=1, + help="Number of pipeline instances per device.", + ) + parser.add_argument( + "--batch_size", + type=int, + default=1, + help="Inference batch size for each pipeline instance.", + ) + parser.add_argument( + "--input_glob_pattern", + type=str, + default="*", + help="Pattern to find the input files.", + ) + args = parser.parse_args() + + input_dir = Path(args.input_dir) + if not input_dir.exists(): + print(f"The input directory does not exist: {input_dir}", file=sys.stderr) + return 2 + if not input_dir.is_dir(): + print(f"{repr(str(input_dir))} is not a directory.", file=sys.stderr) + return 2 + + output_dir = Path(args.output_dir) + if output_dir.exists() and not output_dir.is_dir(): + print(f"{repr(str(output_dir))} is not a directory.", file=sys.stderr) + return 2 + output_dir.mkdir(parents=True, exist_ok=True) + + device_type, device_ids = parse_device(args.device) + if device_ids is None or len(device_ids) == 1: + print( + "Please specify at least two devices for performing parallel inference.", + file=sys.stderr, + ) + sys.exit(2) + + if args.batch_size <= 0: + print("Batch size must be greater than 0.", file=sys.stderr) + sys.exit(2) + + manager = Manager() + task_queue = manager.Queue() + for img_path in input_dir.glob(args.input_glob_pattern): + task_queue.put(str(img_path)) + + processes = [] + for device_id in device_ids: + for _ in range(args.instances_per_device): + device = constr_device(device_type, [device_id]) + p = Process( + target=worker, + args=( + args.pipeline, + device, + task_queue, + args.batch_size, + str(output_dir), + ), + ) + p.start() + processes.append(p) + + for p in processes: + p.join() + + print("All done") + + return 0 + + +if __name__ == "__main__": + sys.exit(main()) +``` + +假设将上述脚本存储为 `mp_infer.py`,以下是一些调用示例: + +```bash +# 通用版面解析 v3 产线 +# 处理 `input_images` 目录中所有文件 +# 使用 GPU 0、1、2、3,每块 GPU 上 1 个产线实例,每个实例一次处理 1 个输入文件 +python mp_infer.py \ + --pipeline PP-StructureV3 \ + --input_dir input_images/ \ + --device 'gpu:0,1,2,3' \ + --output_dir output1 + +# 通用版面解析 v3 产线 +# 处理 `input_images` 目录中所有后缀为 `.jpg` 的文件 +# 使用 GPU 0、2,每块 GPU 上 2 个产线实例,每个实例一次处理 4 个输入文件 +python mp_infer.py \ + --pipeline PP-StructureV3 \ + --input_dir input_images/ \ + --device 'gpu:0,2' \ + --output_dir output2 \ + --instances_per_device 2 \ + --batch_size 4 \ + --input_glob_pattern '*.jpg' +``` diff --git a/docs/pipeline_usage/instructions/pipeline_python_API.en.md b/docs/pipeline_usage/instructions/pipeline_python_API.en.md index f5a7abcb60..799b21c0d5 100644 --- a/docs/pipeline_usage/instructions/pipeline_python_API.en.md +++ b/docs/pipeline_usage/instructions/pipeline_python_API.en.md @@ -36,7 +36,7 @@ In short, there are only three steps: * `pp_option`: `PaddlePredictorOption` type, used to change inference settings (e.g. the operating mode). Please refer to [4-Inference Configuration](#4-inference-configuration) for more details; * `use_hpip`:`bool | None` type, whether to enable the high-performance inference plugin (`None` for using the setting from the configuration file); * `hpi_config`:`dict | None` type, high-performance inference configuration; - * Return Value: `BasePredictor` type. + * Return Value: `BasePipeline` type. ### 2. 
Perform Inference by Calling the `predict()` Method of the Prediction Model Pipeline Object diff --git a/docs/pipeline_usage/instructions/pipeline_python_API.md b/docs/pipeline_usage/instructions/pipeline_python_API.md index 1be0345dcd..1287cf9530 100644 --- a/docs/pipeline_usage/instructions/pipeline_python_API.md +++ b/docs/pipeline_usage/instructions/pipeline_python_API.md @@ -37,7 +37,7 @@ for res in output: * `pp_option`:`PaddlePredictorOption` 类型,用于改变运行模式等配置项,关于推理配置的详细说明,请参考下文[4-推理配置](#4-推理配置); * `use_hpip`:`bool | None` 类型,是否启用高性能推理插件(`None` 表示使用配置文件中的配置); * `hpi_config`:`dict | None` 类型,高性能推理配置; - * 返回值:`BasePredictor`类型。 + * 返回值:`BasePipeline`类型。 ### 2. 调用预测模型产线对象的`predict()`方法进行推理预测 diff --git a/docs/pipeline_usage/pipeline_develop_guide.en.md b/docs/pipeline_usage/pipeline_develop_guide.en.md index e993b39d27..c7a69e7950 100644 --- a/docs/pipeline_usage/pipeline_develop_guide.en.md +++ b/docs/pipeline_usage/pipeline_develop_guide.en.md @@ -117,8 +117,9 @@ The following steps are executed: > ❗ The results obtained from running the Python script are the same as those from the command line method. -If the pre-trained model pipeline meets your expectations, you can proceed directly to [development integration/deployment](#6-development-integration-and-deployment). If not, optimize the pipeline effects according to the following steps. +If you’d like to perform parallel inference, please refer to [Pipeline Parallel Inference](../pipeline_usage/instructions/parallel_inference.en.md). +If the pre-trained model pipeline meets your expectations, you can proceed directly to [development integration/deployment](#6-development-integration-and-deployment). If not, optimize the pipeline effects according to the following steps. ## 3. Model Selection (Optional) diff --git a/docs/pipeline_usage/pipeline_develop_guide.md b/docs/pipeline_usage/pipeline_develop_guide.md index 7517a4c0d6..7572f68bca 100644 --- a/docs/pipeline_usage/pipeline_develop_guide.md +++ b/docs/pipeline_usage/pipeline_develop_guide.md @@ -117,8 +117,10 @@ for res in output: > ❗ Python脚本运行得到的结果与命令行方式相同。 +如果希望进行并行推理,可参考 [产线并行推理](../pipeline_usage/instructions/parallel_inference.md)。 如果预训练模型产线的效果符合您的预期,即可直接进行[开发集成/部署](#6开发集成部署),如果不符合,再根据后续步骤对产线的效果进行优化。 + ## 3、模型选择(可选) 由于一个产线中可能包含一个或多个单功能模块,在进行模型微调时,您需要根据测试的情况确定微调其中的哪个模块的模型。 @@ -166,7 +168,6 @@ Pipeline: PaddleX 也提供了其他三种部署方式,详细说明如下: - 🚀 高性能推理:在实际生产环境中,许多应用对部署策略的性能指标(尤其是响应速度)有着较严苛的标准,以确保系统的高效运行与用户体验的流畅性。为此,PaddleX 提供高性能推理插件,旨在对模型推理及前后处理进行深度性能优化,实现端到端流程的显著提速,详细的高性能部署流程请参考[PaddleX高性能部署指南](../pipeline_deploy/high_performance_inference.md)。 ☁️ 服务化部署:服务化部署是实际生产环境中常见的一种部署形式。通过将推理功能封装为服务,客户端可以通过网络请求来访问这些服务,以获取推理结果。PaddleX 支持多种产线服务化部署方案,详细的产线服务化部署流程请参考[PaddleX服务化部署指南](../pipeline_deploy/serving.md)。 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.en.md index c6a2249e21..a0f5ac65f4 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.en.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.en.md @@ -152,7 +152,7 @@ paddlex --pipeline human_keypoint_detection \ --device gpu:0 ``` -The relevant parameter descriptions and results explanations can be referred to in the parameter explanations and results explanations of [2.2.2 Integration via Python Script](#222-integration-via-python-script). 
+The relevant parameter descriptions and results explanations can be referred to in the parameter explanations and results explanations of [2.2.2 Integration via Python Script](#222-integration-via-python-script). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices). The visualization results are saved to `save_path`, as shown below: @@ -202,7 +202,7 @@ In the above Python script, the following steps are executed: device -The device used for pipeline inference. It supports specifying the specific card number of GPU, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU, such as "cpu". +The device used for pipeline inference. It supports specifying the specific card number of GPU, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU, such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference. str gpu:0 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.md b/docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.md index 4539237954..806e7bedb0 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.md @@ -148,7 +148,7 @@ paddlex --pipeline human_keypoint_detection \ --save_path ./output/ \ --device gpu:0 ``` -相关参数和运行结果说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明和结果解释。 +相关参数和运行结果说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明和结果解释。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。 可视化结果保存至`save_path`,如下所示: @@ -197,7 +197,7 @@ for res in output: device -产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。 +产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理str gpu:0 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.en.md index 3b6c1e3acd..cf11de23d5 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.en.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.en.md @@ -86,7 +86,7 @@ Note: Due to network issues, the above URL could not be successfully parsed. If paddlex --pipeline anomaly_detection --input uad_grid.png --device gpu:0 --save_path ./output ``` -The relevant parameter descriptions can be found in the [2.1.2 Python Script Integration](#212-python脚本方式集成) section. +The relevant parameter descriptions can be found in the [2.1.2 Python Script Integration](#212-python脚本方式集成) section. Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices). After running, the results will be printed to the terminal as follows: @@ -141,7 +141,7 @@ In the above Python script, the following steps are executed: device -Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu". +Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu". 
Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference. str gpu:0 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.md b/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.md index 1fe5d36b39..8a005c522c 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.md @@ -88,7 +88,7 @@ PaddleX 所提供的模型产线均可以快速体验效果,您可以在本地 paddlex --pipeline anomaly_detection --input uad_grid.png --device gpu:0 --save_path ./output ``` -相关的参数说明可以参考[2.1.2 Python脚本方式集成](#212-python脚本方式集成)中的参数说明。 +相关的参数说明可以参考[2.1.2 Python脚本方式集成](#212-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。 运行后,会将结果打印到终端上,结果如下: @@ -146,7 +146,7 @@ for res in output: device -产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。 +产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理str gpu:0 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/image_classification.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/image_classification.en.md index 265af00c58..eca8e7dd77 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/image_classification.en.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/image_classification.en.md @@ -751,7 +751,7 @@ You can quickly experience the image classification pipeline with a single comma paddlex --pipeline image_classification --input general_image_classification_001.jpg --device gpu:0 --save_path ./output/ ``` -The relevant parameter descriptions can be found in the parameter explanation section of [2.2.2 Python Script Integration](#222-integration-via-python-script). +The relevant parameter descriptions can be found in the parameter explanation section of [2.2.2 Python Script Integration](#222-integration-via-python-script). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices). ```bash {'res': {'input_path': 'general_image_classification_001.jpg', 'page_index': None, 'class_ids': array([296, 170, 356, 258, 248], dtype=int32), 'scores': array([0.62736, 0.03752, 0.03256, 0.0323 , 0.03194], dtype=float32), 'label_names': ['ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus', 'Irish wolfhound', 'weasel', 'Samoyed, Samoyede', 'Eskimo dog, husky']}} @@ -808,7 +808,7 @@ In the above Python script, the following steps are executed: device -The device used for pipeline inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu". +The device used for pipeline inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference. 
str gpu:0 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/image_classification.md b/docs/pipeline_usage/tutorials/cv_pipelines/image_classification.md index 8f9ae89a45..c85808fbd0 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/image_classification.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/image_classification.md @@ -748,7 +748,7 @@ PaddleX 所提供的模型产线均可以快速体验效果,你可以在星河 ```bash paddlex --pipeline image_classification --input general_image_classification_001.jpg --device gpu:0 --save_path ./output/ ``` -相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。 +相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。 运行后,会将结果打印到终端上,结果如下: @@ -805,7 +805,7 @@ for res in output: device -产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。 +产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理str gpu:0 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.en.md index eddd0460a7..12622d06f0 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.en.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.en.md @@ -116,7 +116,7 @@ You can quickly experience the image multi-label classification pipeline effect paddlex --pipeline image_multilabel_classification --input general_image_classification_001.jpg --device gpu:0 ``` -The relevant parameter descriptions can be referred to in the parameter explanations in [2.2.2 Python Script Integration](). +The relevant parameter descriptions can be referred to in the parameter explanations in [2.2.2 Python Script Integration](). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices). After running, the result will be printed to the terminal as follows: @@ -176,7 +176,7 @@ In the above Python script, the following steps are performed: device -Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu". +Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference. 
str gpu:0 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.md b/docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.md index d8a28fa940..10f8397eb2 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/image_multi_label_classification.md @@ -119,7 +119,7 @@ PaddleX 所提供的模型产线均可以快速体验效果,你可以在星河 ```bash paddlex --pipeline image_multilabel_classification --input general_image_classification_001.jpg --device gpu:0 ``` -相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。 +相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。 运行后,会将结果打印到终端上,结果如下: @@ -177,7 +177,7 @@ for res in output: device -产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。 +产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理str gpu:0 diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/instance_segmentation.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/instance_segmentation.en.md index d06cb93d00..96bc13e7b8 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/instance_segmentation.en.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/instance_segmentation.en.md @@ -229,7 +229,7 @@ paddlex --pipeline instance_segmentation \ --device gpu:0 ``` -The relevant parameter descriptions can be referred to in the parameter explanations in [2.2.2 Python Script Integration](). +The relevant parameter descriptions can be referred to in the parameter explanations in [2.2.2 Python Script Integration](). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices). After running, the result will be printed to the terminal as follows: @@ -285,7 +285,7 @@ In the above Python script, the following steps are performed: device -Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu". +Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference. 
str None diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/instance_segmentation.md b/docs/pipeline_usage/tutorials/cv_pipelines/instance_segmentation.md index 99a0ced4d1..897f70de8b 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/instance_segmentation.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/instance_segmentation.md @@ -234,7 +234,7 @@ paddlex --pipeline instance_segmentation \ --save_path ./output \ --device gpu:0 ``` -相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。 +相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。 运行后,会将结果打印到终端上,结果如下: ```bash @@ -287,7 +287,7 @@ for res in output: device -产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。 +产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理str None diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/object_detection.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/object_detection.en.md index d62863fc0d..69836aaf63 100644 --- a/docs/pipeline_usage/tutorials/cv_pipelines/object_detection.en.md +++ b/docs/pipeline_usage/tutorials/cv_pipelines/object_detection.en.md @@ -416,7 +416,7 @@ paddlex --pipeline object_detection \ --device gpu:0 ``` -For the description of parameters and interpretation of results, please refer to the parameter explanation and result interpretation in [2.2.2 Integration via Python Script](#222-integration-via-python-script). +For the description of parameters and interpretation of results, please refer to the parameter explanation and result interpretation in [2.2.2 Integration via Python Script](#222-integration-via-python-script). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices). The visualization results are saved to `save_path`, as shown below: @@ -465,7 +465,7 @@ In the above Python script, the following steps are executed: device -The device for pipeline inference. It supports specifying the specific card number of GPU, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU as "cpu". +The device for pipeline inference. It supports specifying the specific card number of GPU, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference. 
 str
 gpu:0
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/object_detection.md b/docs/pipeline_usage/tutorials/cv_pipelines/object_detection.md
index b8ec073c1d..9889b57f55 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/object_detection.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/object_detection.md
@@ -435,7 +435,7 @@ paddlex --pipeline object_detection \
     --save_path ./output/ \
     --device gpu:0
 ```
-相关参数和运行结果说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明和结果解释。
+相关参数和运行结果说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明和结果解释。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。
 可视化结果保存至`save_path`,如下所示:
@@ -483,7 +483,7 @@ for res in output:
 None
 device
-产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。
+产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理
 str
 gpu:0
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.en.md
index ecea5a3471..f81966b3db 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.en.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.en.md
@@ -134,7 +134,7 @@ You can quickly experience the pedestrian attribute recognition pipeline with a
 paddlex --pipeline pedestrian_attribute_recognition --input pedestrian_attribute_002.jpg --device gpu:0 --save_path ./output/
 ```
-The relevant parameter descriptions can be found in the parameter explanation section of [2.2.2 Python Script Integration](#222-python脚本方式集成).
+The relevant parameter descriptions can be found in the parameter explanation section of [2.2.2 Python Script Integration](#222-python脚本方式集成). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices).
 After running, the result will be printed to the terminal, as shown below:
@@ -194,7 +194,7 @@ In the above Python script, the following steps are executed:
 device
-The device used for pipeline inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu".
+The device used for pipeline inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference.
 str
 gpu:0
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.md b/docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.md
index 3838b19047..0e849488a6 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/pedestrian_attribute_recognition.md
@@ -134,7 +134,7 @@ PaddleX 所提供的模型产线均可以快速体验效果,你可以在星河
 ```bash
 paddlex --pipeline pedestrian_attribute_recognition --input pedestrian_attribute_002.jpg --device gpu:0 --save_path ./output/
 ```
-相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。
+相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。
 运行后,会将结果打印到终端上,结果如下:
@@ -192,7 +192,7 @@ for res in output:
 device
-产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。
+产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理
 str
 gpu:0
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.en.md
index 362c39d16b..25902f1a2c 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.en.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.en.md
@@ -95,7 +95,7 @@ paddlex --pipeline rotated_object_detection \
     --device gpu:0 \
 ```
-The relevant parameter descriptions can be referred to in the parameter explanations of [2.2.2 Integration via Python Script](#222-integration-via-python-script).
+The relevant parameter descriptions can be referred to in the parameter explanations of [2.2.2 Integration via Python Script](#222-integration-via-python-script). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices).
 After running, the results will be printed to the terminal, as follows:
@@ -150,7 +150,7 @@ In the above Python script, the following steps were executed:
 device
-The device used for pipeline inference. It supports specifying the specific card number of the GPU, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU, such as "cpu".
+The device used for pipeline inference. It supports specifying the specific card number of the GPU, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU, such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference.
 str
 None
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.md b/docs/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.md
index 65d07af48f..730d2d11a1 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/rotated_object_detection.md
@@ -95,7 +95,7 @@ paddlex --pipeline rotated_object_detection \
     --save_path ./output \
     --device gpu:0 \
 ```
-相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。
+相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。
 运行后,会将结果打印到终端上,结果如下:
 ```bash
@@ -148,7 +148,7 @@ for res in output:
 device
-产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。
+产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理
 str
 None
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.en.md
index bc68b90435..81b250b139 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.en.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.en.md
@@ -262,7 +262,7 @@ paddlex --pipeline semantic_segmentation \
     --device gpu:0 \
 ```
-The relevant parameter descriptions can be referred to in the parameter explanations in [2.2.2 Python Script Integration]().
+The relevant parameter descriptions can be referred to in the parameter explanations in [2.2.2 Python Script Integration](). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices).
 After running, the result will be printed to the terminal, as follows:
@@ -317,7 +317,7 @@ In the above Python script, the following steps are executed:
 device
-Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu".
+Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference.
 str
 None
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.md b/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.md
index 350322176c..84ea719a01 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/semantic_segmentation.md
@@ -269,7 +269,7 @@ paddlex --pipeline semantic_segmentation \
     --save_path ./output \
     --device gpu:0 \
 ```
-相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。
+相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。
 运行后,会将结果打印到终端上,结果如下:
 ```bash
@@ -321,7 +321,7 @@ for res in output:
 device
-产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。
+产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理
 str
 None
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.en.md
index 6309df2a4a..999f128c82 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.en.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.en.md
@@ -113,7 +113,7 @@ paddlex --pipeline small_object_detection \
     --device gpu:0
 ```
-The relevant parameter descriptions can be referred to in the parameter explanations in [2.2.2 Python Script Integration](#222-python-script-integration).
+The relevant parameter descriptions can be referred to in the parameter explanations in [2.2.2 Python Script Integration](#222-python-script-integration). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices).
 After running, the result will be printed to the terminal as follows:
@@ -168,7 +168,7 @@ In the above Python script, the following steps are performed:
 device
-Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu".
+Pipeline inference device. Supports specifying the specific GPU card number, such as "gpu:0", other hardware specific card numbers, such as "npu:0", CPU such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference.
 str
 None
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.md b/docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.md
index c73a2ea08b..41359e74ce 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/small_object_detection.md
@@ -117,7 +117,7 @@ paddlex --pipeline small_object_detection \
     --save_path ./output \
     --device gpu:0
 ```
-相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。
+相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。
 运行后,会将结果打印到终端上,结果如下:
 ```bash
@@ -169,7 +169,7 @@ for res in output:
 device
-产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。
+产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理
 str
 None
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.en.md b/docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.en.md
index 7f768a9b33..05f600d4ef 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.en.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.en.md
@@ -136,7 +136,7 @@ Parameter Description:
 {'res': {'input_path': 'vehicle_attribute_002.jpg', 'boxes': [{'labels': ['red(红色)', 'sedan(轿车)'], 'cls_scores': array([0.96375, 0.94025]), 'det_score': 0.9774094820022583, 'coordinate': [196.32553, 302.3847, 639.3131, 655.57904]}, {'labels': ['suv(SUV)', 'brown(棕色)'], 'cls_scores': array([0.99968, 0.99317]), 'det_score': 0.9705657958984375, 'coordinate': [769.4419, 278.8417, 1401.0217, 641.3569]}]}}
 ```
-For the explanation of the running result parameters, you can refer to the result interpretation in [Section 2.2.2 Integration via Python Script](#222-integration-via-python-script).
+For the explanation of the running result parameters, you can refer to the result interpretation in [Section 2.2.2 Integration via Python Script](#222-integration-via-python-script). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices).
 The visualization results are saved under `save_path`, and the visualization result is as follows:
@@ -176,7 +176,7 @@ In the above Python script, the following steps are executed:
 device
-The device used for pipeline inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu".
+The device used for pipeline inference. It supports specifying the specific card number of GPUs, such as "gpu:0", other hardware card numbers, such as "npu:0", and CPUs, such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference.
 str
 gpu:0
diff --git a/docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.md b/docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.md
index 36aa5fb66c..3c08c8a165 100644
--- a/docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.md
+++ b/docs/pipeline_usage/tutorials/cv_pipelines/vehicle_attribute_recognition.md
@@ -132,7 +132,7 @@ PaddleX 所提供的模型产线均可以快速体验效果,你可以在星河
 ```bash
 paddlex --pipeline vehicle_attribute_recognition --input vehicle_attribute_002.jpg --device gpu:0 --save_path ./output/
 ```
-相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。
+相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。
 运行后,会将结果打印到终端上,结果如下:
@@ -190,7 +190,7 @@ for res in output:
 device
-产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。
+产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理
 str
 gpu:0
diff --git a/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md b/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md
index 3beb75304f..73d22dc36b 100644
--- a/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md
+++ b/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.en.md
@@ -1582,7 +1582,7 @@ To remove the page limit, please add the following configuration to the pipeline
 vectorInfo
 object | null
-Serialized result of the vector database. Provided by the buildVectorStore operation.
+Serialized result of the vector database. Provided by the buildVectorStore operation. Please note that the deserialization process involves performing an unpickle operation. To prevent malicious attacks, be sure to use data from trusted sources.
 No
diff --git a/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md b/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md
index 42fbfb0f42..940bd547f6 100644
--- a/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md
+++ b/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v3.md
@@ -1587,7 +1587,7 @@ for res in visual_predict_res:
 vectorInfo
 object | null
-向量数据库序列化结果。由buildVectorStore操作提供。
+向量数据库序列化结果。由buildVectorStore操作提供。请注意,反序列化过程需要进行 unpickle 操作,为了防止恶意攻击,请确保使用可信任来源的数据。
 否
diff --git a/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.en.md b/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.en.md
index 88312303c3..6fb1d5e577 100644
--- a/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.en.md
+++ b/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.en.md
@@ -1775,7 +1775,7 @@ To remove the page limit, please add the following configuration to the pipeline
 vectorInfo
 object | null
-Serialized result of the vector database. Provided by the buildVectorStore operation.
+Serialized result of the vector database. Provided by the buildVectorStore operation. Please note that the deserialization process involves performing an unpickle operation. To prevent malicious attacks, be sure to use data from trusted sources.
 No
diff --git a/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.md b/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.md
index 6b0de8123e..c0f1c9f4d9 100644
--- a/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.md
+++ b/docs/pipeline_usage/tutorials/information_extraction_pipelines/document_scene_information_extraction_v4.md
@@ -1979,7 +1979,7 @@ for res in visual_predict_res:
 vectorInfo
 object | null
-向量数据库序列化结果。由buildVectorStore操作提供。
+向量数据库序列化结果。由buildVectorStore操作提供。请注意,反序列化过程需要进行 unpickle 操作,为了防止恶意攻击,请确保使用可信任来源的数据。
 否
diff --git a/docs/pipeline_usage/tutorials/ocr_pipelines/OCR.en.md b/docs/pipeline_usage/tutorials/ocr_pipelines/OCR.en.md
index d11541a238..33b4c162be 100644
--- a/docs/pipeline_usage/tutorials/ocr_pipelines/OCR.en.md
+++ b/docs/pipeline_usage/tutorials/ocr_pipelines/OCR.en.md
@@ -471,7 +471,7 @@ paddlex --pipeline OCR \
     --device gpu:0
 ```
-For details on the relevant parameter descriptions, please refer to the parameter descriptions in [2.2.2 Python Script Integration](#222-python-script-integration).
+For details on the relevant parameter descriptions, please refer to the parameter descriptions in [2.2.2 Python Script Integration](#222-python-script-integration). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices).
 After running, the results will be printed to the terminal as follows:
@@ -539,7 +539,7 @@ In the above Python script, the following steps are executed:
 device
-The device used for pipeline inference. It supports specifying specific GPU card numbers, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU, such as "cpu".
+The device used for pipeline inference. It supports specifying specific GPU card numbers, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU, such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference.
 str
 gpu:0
diff --git a/docs/pipeline_usage/tutorials/ocr_pipelines/OCR.md b/docs/pipeline_usage/tutorials/ocr_pipelines/OCR.md
index d300bc733c..a4497ae614 100644
--- a/docs/pipeline_usage/tutorials/ocr_pipelines/OCR.md
+++ b/docs/pipeline_usage/tutorials/ocr_pipelines/OCR.md
@@ -482,7 +482,7 @@ paddlex --pipeline OCR \
     --save_path ./output \
     --device gpu:0
 ```
-相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。
+相关的参数说明可以参考[2.2.2 Python脚本方式集成](#222-python脚本方式集成)中的参数说明。支持同时指定多个设备以进行并行推理,详情请参考 [产线并行推理](../../instructions/parallel_inference.md#指定多个推理设备)。
 运行后,会将结果打印到终端上,结果如下:
 ```bash
@@ -549,7 +549,7 @@ for res in output:
 device
-产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。
+产线推理设备。支持指定GPU具体卡号,如“gpu:0”,其他硬件具体卡号,如“npu:0”,CPU如“cpu”。支持同时指定多个设备以进行并行推理,详情请参考 产线并行推理
 str
 gpu:0
diff --git a/docs/pipeline_usage/tutorials/ocr_pipelines/PP-StructureV3.en.md b/docs/pipeline_usage/tutorials/ocr_pipelines/PP-StructureV3.en.md
index c6b7ce3613..6f78887b7f 100644
--- a/docs/pipeline_usage/tutorials/ocr_pipelines/PP-StructureV3.en.md
+++ b/docs/pipeline_usage/tutorials/ocr_pipelines/PP-StructureV3.en.md
@@ -648,7 +648,7 @@ paddlex --pipeline PP-StructureV3 \
     --device gpu:0
 ```
-The parameter description can be found in [2.2.2 Python Script Integration](#222-python-script-integration).
+The parameter description can be found in [2.2.2 Python Script Integration](#222-python-script-integration). Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to [Pipeline Parallel Inference](../../instructions/parallel_inference.en.md#specifying-multiple-inference-devices).
 After running, the result will be printed to the terminal, as follows:
@@ -784,7 +784,7 @@ In the above Python script, the following steps are executed:
 device
-The inference device for the pipeline. It supports specifying the specific GPU card number, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU, such as "cpu".
+The inference device for the pipeline. It supports specifying the specific GPU card number, such as "gpu:0", other hardware card numbers, such as "npu:0", or CPU, such as "cpu". Supports specifying multiple devices simultaneously for parallel inference. For details, please refer to Pipeline Parallel Inference.
 str
 gpu:0
@@ -1523,6 +1523,18 @@ To remove the page limit, please add the following configuration to the pipeline
 No
+useChartRecognition
+boolean | null
+Please refer to the description of the use_chart_recognition parameter of the pipeline object's predict method.
+No
+
+
+useRegionDetection
+boolean | null
+Please refer to the description of the use_region_detection parameter of the pipeline object's predict method.
+No
+
+
 layoutThreshold
 number | null
 Please refer to the description of the layout_threshold parameter of the pipeline object's predict method.
 No
@@ -1636,6 +1648,12 @@ To remove the page limit, please add the following configuration to the pipeline
 Please refer to the description of the use_e2e_wireless_table_rec_model parameter of the pipeline object's predict method.
 No
+
+prettyMarkdown
+boolean
+Please refer to the description of the pretty_markdown parameter of the pipeline object's predict method.
+No
+
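Most of the hunks above add the same note: the `device` parameter may name more than one device so that the pipeline runs parallel inference across them, with the exact syntax documented in the referenced "Pipeline Parallel Inference" page. As a rough illustration only, the sketch below assumes the comma-separated form (e.g. `gpu:0,1`) described there; the device string, pipeline name, and input file names are assumptions for illustration and are not taken from the diff.

```python
from paddlex import create_pipeline

# Sketch only: "gpu:0,1" assumes the comma-separated multi-device syntax
# described in the "Pipeline Parallel Inference" document referenced above.
pipeline = create_pipeline(pipeline="OCR", device="gpu:0,1")

# With several devices configured, a batch of inputs can be spread across
# pipeline instances on the listed GPUs.
inputs = ["img_0001.png", "img_0002.png", "img_0003.png"]  # placeholder file names
for res in pipeline.predict(inputs):
    res.print()
    res.save_to_img(save_path="./output/")
```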
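The `vectorInfo` hunks above warn that the server unpickles this value, so it must only be round-tripped from a response produced by your own deployment's buildVectorStore operation and never accepted from an untrusted source. A minimal sketch of that usage pattern follows; the base URL, endpoint paths, and request fields are placeholders rather than the documented routes, so consult the pipeline's serving documentation for the real ones.

```python
import requests

BASE_URL = "http://localhost:8080"      # placeholder service address
BUILD_VECTOR_PATH = "/chatocr-vector"   # placeholder path for the buildVectorStore operation
CHAT_PATH = "/chatocr-chat"             # placeholder path for the follow-up chat operation

visual_info = []  # placeholder: result of an earlier visual-analysis request

# 1. Obtain vectorInfo from our own server; this value is trusted because we produced it.
vector_resp = requests.post(f"{BASE_URL}{BUILD_VECTOR_PATH}", json={"visualInfo": visual_info})
vector_info = vector_resp.json()["result"]["vectorInfo"]

# 2. Reuse the trusted vectorInfo in a later request. Never substitute a vectorInfo
#    payload received from an external, unverified party, since it is unpickled server-side.
chat_resp = requests.post(
    f"{BASE_URL}{CHAT_PATH}",
    json={"keyList": ["invoice number"], "visualInfo": visual_info, "vectorInfo": vector_info},
)
print(chat_resp.json())
```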