
How do I enable high-performance inference and specify the backend for a PaddleX server deployment? #4832

@Jimmy-L99

Description

  • docker image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps:paddlex3.3-gpu
  • docker compose file is as follows:
services:
  paddlex-server:
    image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps:paddlex3.3-gpu
    container_name: paddlex-server
    command: /bin/bash server.sh
    environment:
      - CUDA_VISIBLE_DEVICES=4,5,6,7
      - PADDLEX_HPS_DEVICE_TYPE=GPU
      - PADDLEX_HPS_USE_HPIP=1 
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /usr/share/zoneinfo/Asia/Shanghai:/usr/share/zoneinfo/Asia/Shanghai:ro
      - /home/vg_llm/PaddleX/paddlex_hps_OCR_sdk/server:/app
      - /home/vg_llm/PaddleX/model:/model
    working_dir: /app
    network_mode: host
    privileged: true
    shm_size: 16g
    stdin_open: true
    tty: true
    init: true
    restart: "always"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["4", "5", "6", "7"]
              capabilities: [gpu]
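Whether high-performance inference is switched on ultimately depends on the environment the server process sees. A minimal sanity-check sketch that can be run inside the container (the helper name `hps_env_summary` is made up for illustration; the variable names are taken from the compose file above):

```python
import os

def hps_env_summary() -> dict:
    """Collect the PaddleX HPS-related environment variables for inspection."""
    devices = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return {
        # GPU indices the process is allowed to see, e.g. ["4", "5", "6", "7"]
        "visible_gpus": [d for d in devices.split(",") if d],
        "device_type": os.environ.get("PADDLEX_HPS_DEVICE_TYPE", "unset"),
        "use_hpip": os.environ.get("PADDLEX_HPS_USE_HPIP", "unset"),
    }

print(hps_env_summary())
```

If `use_hpip` prints as anything other than `1` here, the compose `environment` section is not reaching the server process.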

The pipeline_config.yaml in the server folder of the OCR SDK is as follows:


pipeline_name: OCR

text_type: general

use_doc_preprocessor: True
use_textline_orientation: True

SubPipelines:
  DocPreprocessor:
    pipeline_name: doc_preprocessor
    use_doc_orientation_classify: True
    use_doc_unwarping: True
    SubModules:
      DocOrientationClassify:
        module_name: doc_text_orientation
        model_name: PP-LCNet_x1_0_doc_ori
        model_dir: /model/PP-LCNet_x1_0_doc_ori_infer
        auto_config: False
        use_hpip: True
      DocUnwarping:
        module_name: image_unwarping
        model_name: UVDoc
        model_dir: /model/UVDoc_infer
        auto_config: False
        use_hpip: True

SubModules:
  TextDetection:
    module_name: text_detection
    model_name: PP-OCRv5_server_det
    model_dir: /model/PP-OCRv5_server_det_infer
    auto_config: False
    use_hpip: True
    limit_side_len: 64
    limit_type: min
    max_side_limit: 4000
    thresh: 0.3
    box_thresh: 0.6
    unclip_ratio: 1.5
  TextLineOrientation:
    module_name: textline_orientation
    model_name: PP-LCNet_x1_0_textline_ori 
    model_dir: /model/PP-LCNet_x1_0_textline_ori_infer
    auto_config: False
    use_hpip: True
    batch_size: 6    
  TextRecognition:
    module_name: text_recognition
    model_name: PP-OCRv5_server_rec 
    model_dir: /model/PP-OCRv5_server_rec_infer
    auto_config: False
    use_hpip: True
    batch_size: 6
    score_thresh: 0.0

As shown above, the docker compose file sets PADDLEX_HPS_USE_HPIP=1, following the serving-deployment documentation.
The high-performance inference documentation also says that use_hpip: True must be added to each model's configuration in pipeline_config.yaml, but on startup the service logs: "The Paddle Inference backend is selected with the default configuration. This may not provide optimal performance." I then read that a TensorRT backend can be specified, so under each model's use_hpip: True I added:

    use_hpip: True
    hpi_config:
      backend: tensorrt
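For reference, the PaddleX high-performance inference docs also allow passing backend-specific options alongside the backend name. A hedged sketch of what a fuller per-module block might look like (the backend_config key and the dynamic_shapes layout are assumptions to verify against the current PaddleX HPI documentation; the input name x and the shape values are placeholders, not real settings):

```yaml
    use_hpip: True
    hpi_config:
      backend: tensorrt
      backend_config:
        # Placeholder dynamic-shape ranges per input tensor: [min, opt, max].
        dynamic_shapes:
          x:
            - [1, 3, 32, 32]
            - [1, 3, 640, 640]
            - [1, 3, 1280, 1280]
```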

After recreating the container, it fails with: E1211 07:30:50.603498 7 model_repository_manager.cc:1186] failed to load 'ocr' version 1: Internal: RuntimeError: No inference backend and configuration could be suggested. Reason: Inference backend 'tensorrt' is unavailable.
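The "Inference backend 'tensorrt' is unavailable" message suggests the TensorRT runtime is missing (or incompatible) in the image. A quick diagnostic sketch to run inside the container, assuming the backend ships Python bindings under the module name tensorrt (the helper name is made up for illustration):

```python
import importlib.util

def backend_importable(module_name: str) -> bool:
    """Return True if the named Python package can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None

# If this prints False inside the serving container, the TensorRT Python
# bindings are absent and a 'tensorrt' hpi_config backend cannot be selected.
print(backend_importable("tensorrt"))
```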

How do I correctly enable high-performance inference and specify the backend?
