
Unable to deploy PaddleOCR-VL on Iluvatar BI-150S #7461

@megemini

Description


Deploying PaddleOCR-VL in an AI Studio Iluvatar BI-150S environment with the following command:

python -m fastdeploy.entrypoints.openai.api_server \
    --model /home/aistudio/baidu/PaddleOCR-VL-1.5 \
    --port 8185 \
    --metrics-port 8186 \
    --engine-worker-queue-port 8187 \
    --max-model-len 16384 \
    --max-num-batched-tokens 16384 \
    --gpu-memory-utilization 0.8 \
    --max-num-seqs 256

It fails with:

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
INFO     2026-04-17 14:15:14,144 344971 engine.py[line:151] Waiting for worker processes to be ready...
Loading Weights:   0%|                                                                                                                                     | 0/100 [00:16<?, ?it/s]
ERROR    2026-04-17 14:15:35,184 344971 engine.py[line:160] Failed to launch worker processes, check log/workerlog.* for more details.
ERROR    2026-04-17 14:15:41,788 344971 engine.py[line:452] Error extracting sub services: [Errno 3] No such process, Traceback (most recent call last):
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/engine/engine.py", line 449, in _exit_sub_services
    pgid = os.getpgid(self.worker_proc.pid)
ProcessLookupError: [Errno 3] No such process

log/workerlog.0 shows the underlying error:

INFO     2026-04-17 14:11:37,472 339066 sampler.py[line:132] GuidedDecoding max_num_seqs=256 fill_bitmask_parallel_batch_size=4 is_cuda_platform=False max_workers=64.0
INFO     2026-04-17 14:11:37,479 339066 input_batch.py[line:346] Enabled logits processors: []
INFO     2026-04-17 14:11:37,480 339066 iluvatar.py[line:29] Using ixinfer MHA backend instead of append attention
Traceback (most recent call last):
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/engine/../worker/worker_process.py", line 1333, in <module>
    run_worker_proc()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/base/dygraph/base.py", line 406, in _decorate_function
    return func(*args, **kwargs)
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/engine/../worker/worker_process.py", line 1313, in run_worker_proc
    worker_proc.init_device()
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/worker_process.py", line 726, in init_device
    self.worker.init_device()
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/iluvatar_worker.py", line 68, in init_device
    self.model_runner: IluvatarModelRunner = IluvatarModelRunner(
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/iluvatar_model_runner.py", line 54, in __init__
    super(IluvatarModelRunner, self).__init__(
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/gpu_model_runner.py", line 226, in __init__
    self._initialize_attn_backend()
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/worker/iluvatar_model_runner.py", line 94, in _initialize_attn_backend
    attn_backend = attn_cls(
  File "/home/aistudio/external-libraries/lib/python3.10/site-packages/fastdeploy/model_executor/layers/backends/iluvatar/attention/mha_attn_backend.py", line 81, in __init__
    assert self.block_size == 16, "Iluvatar paged attn requires block_size must be 16."
AssertionError: Iluvatar paged attn requires block_size must be 16.
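The assertion in mha_attn_backend.py indicates that the KV-cache block size reaching the Iluvatar attention backend is not 16, so the default value apparently does not satisfy the backend's requirement. If FastDeploy's api_server exposes the cache block size as a CLI flag (the `--block-size` flag below is an assumption; please verify it against `python -m fastdeploy.entrypoints.openai.api_server --help`), explicitly forcing it to 16 might be a workaround:

```shell
# Hypothetical workaround: force the KV-cache block size to 16,
# the value the Iluvatar paged-attention backend asserts on.
# NOTE: --block-size is an assumed flag name; check --help first.
python -m fastdeploy.entrypoints.openai.api_server \
    --model /home/aistudio/baidu/PaddleOCR-VL-1.5 \
    --port 8185 \
    --metrics-port 8186 \
    --engine-worker-queue-port 8187 \
    --max-model-len 16384 \
    --max-num-batched-tokens 16384 \
    --gpu-memory-utilization 0.8 \
    --max-num-seqs 256 \
    --block-size 16
```

If no such flag exists, the equivalent engine configuration option (or an environment variable) may need to be set instead; the root cause either way is the block-size mismatch shown in the traceback.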
