
How do I enable high-performance inference and specify the backend for a PaddleX server deployment? #4832

@Jimmy-L99

Description

  • docker image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps:paddlex3.3-gpu
  • docker compose file is as follows:
services:
  paddlex-server:
    image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/hps:paddlex3.3-gpu
    container_name: paddlex-server
    command: /bin/bash server.sh
    environment:
      - CUDA_VISIBLE_DEVICES=4,5,6,7
      - PADDLEX_HPS_DEVICE_TYPE=GPU
      - PADDLEX_HPS_USE_HPIP=1 
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /usr/share/zoneinfo/Asia/Shanghai:/usr/share/zoneinfo/Asia/Shanghai:ro
      - /home/vg_llm/PaddleX/paddlex_hps_OCR_sdk/server:/app
      - /home/vg_llm/PaddleX/model:/model
    working_dir: /app
    network_mode: host
    privileged: true
    shm_size: 16g
    stdin_open: true
    tty: true
    init: true
    restart: "always"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["4", "5", "6", "7"]
              capabilities: [gpu]
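Whether high-performance inference is switched on ultimately depends on the environment the server process sees. A minimal sanity-check sketch that can be run inside the container (the helper name `hps_env_summary` is made up for illustration; the variable names are taken from the compose file above):

```python
import os

def hps_env_summary() -> dict:
    """Collect the PaddleX HPS-related environment variables for inspection."""
    devices = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return {
        # GPU indices the process is allowed to see, e.g. ["4", "5", "6", "7"]
        "visible_gpus": [d for d in devices.split(",") if d],
        "device_type": os.environ.get("PADDLEX_HPS_DEVICE_TYPE", "unset"),
        "use_hpip": os.environ.get("PADDLEX_HPS_USE_HPIP", "unset"),
    }

print(hps_env_summary())
```

If `use_hpip` prints as anything other than `1` here, the compose `environment` section is not reaching the server process.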

The pipeline_config.yaml in the server folder of the OCR SDK is as follows:


pipeline_name: OCR

text_type: general

use_doc_preprocessor: True
use_textline_orientation: True

SubPipelines:
  DocPreprocessor:
    pipeline_name: doc_preprocessor
    use_doc_orientation_classify: True
    use_doc_unwarping: True
    SubModules:
      DocOrientationClassify:
        module_name: doc_text_orientation
        model_name: PP-LCNet_x1_0_doc_ori
        model_dir: /model/PP-LCNet_x1_0_doc_ori_infer
        auto_config: False
        use_hpip: True
      DocUnwarping:
        module_name: image_unwarping
        model_name: UVDoc
        model_dir: /model/UVDoc_infer
        auto_config: False
        use_hpip: True

SubModules:
  TextDetection:
    module_name: text_detection
    model_name: PP-OCRv5_server_det
    model_dir: /model/PP-OCRv5_server_det_infer
    auto_config: False
    use_hpip: True
    limit_side_len: 64
    limit_type: min
    max_side_limit: 4000
    thresh: 0.3
    box_thresh: 0.6
    unclip_ratio: 1.5
  TextLineOrientation:
    module_name: textline_orientation
    model_name: PP-LCNet_x1_0_textline_ori 
    model_dir: /model/PP-LCNet_x1_0_textline_ori_infer
    auto_config: False
    use_hpip: True
    batch_size: 6    
  TextRecognition:
    module_name: text_recognition
    model_name: PP-OCRv5_server_rec 
    model_dir: /model/PP-OCRv5_server_rec_infer
    auto_config: False
    use_hpip: True
    batch_size: 6
    score_thresh: 0.0

As shown above, the docker compose file sets PADDLEX_HPS_USE_HPIP=1, following the serving-deployment documentation.
The high-performance inference documentation also says that use_hpip: True must be added to each model's configuration in pipeline_config.yaml, but on startup the service logs: "The Paddle Inference backend is selected with the default configuration. This may not provide optimal performance." I then read that a TensorRT backend can be specified, so under each model's use_hpip: True I added:

    use_hpip: True
    hpi_config:
      backend: tensorrt
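For reference, the PaddleX high-performance inference docs also allow passing backend-specific options alongside the backend name. A hedged sketch of what a fuller per-module block might look like (the backend_config key and the dynamic_shapes layout are assumptions to verify against the current PaddleX HPI documentation; the input name x and the shape values are placeholders, not real settings):

```yaml
    use_hpip: True
    hpi_config:
      backend: tensorrt
      backend_config:
        # Placeholder dynamic-shape ranges per input tensor: [min, opt, max].
        dynamic_shapes:
          x:
            - [1, 3, 32, 32]
            - [1, 3, 640, 640]
            - [1, 3, 1280, 1280]
```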

After recreating the container, it fails with: E1211 07:30:50.603498 7 model_repository_manager.cc:1186] failed to load 'ocr' version 1: Internal: RuntimeError: No inference backend and configuration could be suggested. Reason: Inference backend 'tensorrt' is unavailable.
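The "Inference backend 'tensorrt' is unavailable" message suggests the TensorRT runtime is missing (or incompatible) in the image. A quick diagnostic sketch to run inside the container, assuming the backend ships Python bindings under the module name tensorrt (the helper name is made up for illustration):

```python
import importlib.util

def backend_importable(module_name: str) -> bool:
    """Return True if the named Python package can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None

# If this prints False inside the serving container, the TensorRT Python
# bindings are absent and a 'tensorrt' hpi_config backend cannot be selected.
print(backend_importable("tensorrt"))
```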

How do I correctly enable high-performance inference and specify the backend?
