Startup error when deploying MinerU with Docker on CentOS 7 #4362

leavesosoon · 2026-01-14T04:50:41Z

I'm running CentOS 7 in a CPU-only environment. I pulled the 2.7.0 image from the Xuanyuan mirror site and deployed and started it with Docker. The start command is:
docker run -d \
  --name mineru-webui \
  -p 7860:7860 \
  -e MINERU_DEVICE_MODE=cpu \
  -v /root/data/disk/mineru/input:/app/data \
  docker.xuanyuan.run/alexsuntop/mineru:2.7.0 \
  mineru-gradio --server-name 0.0.0.0 --server-port 7860

The mineru-webui page is reachable, but after uploading a PDF the following error appears:
2026-01-13 18:35:41.173 | INFO | mineru.utils.engine_utils:get_vlm_engine:32 - Using vllm-async-engine as the inference engine for VLM.
2026-01-13 18:35:41.174 | INFO | mineru.cli.gradio_app:main:388 - Start init vllm-async-engine...
INFO 01-13 18:35:42 [__init__.py:245] No platform detected, vLLM is running on UnspecifiedPlatform
2026-01-13 18:35:44.046 | INFO | mineru.backend.vlm.utils:enable_custom_logits_processors:22 - CUDA not available, disabling custom_logits_processors
2026-01-13 18:35:44.076 | ERROR | mineru.cli.gradio_app:main:399 - Device string must not be empty

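Judging from the log, neither the platform nor CUDA was detected, and then the error Device string must not be empty is raised. I tried setting the device to CPU through an environment variable, but the same error still appears.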

@leavesosoon · 2026-01-14T04:54:58Z

dosubot[bot]
bot Jan 14, 2026

Hi @leavesosoon! I'm Dosu and I’m helping the MinerU team.

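The root cause of the startup error is that the official MinerU Docker image (including 2.7.0) bundles vLLM inference acceleration and CUDA-related dependencies by default. It targets GPU environments and does not support CPU-only use. Starting the official image directly on CPU-only CentOS 7 fails because the GPU driver or CUDA environment is missing. Typical symptoms include the container restarting repeatedly, the main process failing to start, and dependency errors (see the related notes).

Solution: build a custom CPU-only image as follows.

Create a Dockerfile with the following content (for CPU-only environments, no CUDA/vLLM required):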

FROM python:3.10-slim

# Install system dependencies
RUN apt-get update && \
    apt-get install -y fonts-noto-core fonts-noto-cjk fontconfig libgl1 && \
    fc-cache -fv && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install MinerU (core only, no CUDA dependencies)
RUN python3 -m pip install -U 'mineru[core]' && \
    python3 -m pip cache purge

# (Optional) Download the models; a local model directory can be mounted instead
RUN mineru-models-download -s huggingface -m all

# Set the entrypoint
ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]

Build the image:

docker build -t mineru-cpu:latest -f Dockerfile .

Start the service (be sure to use the pipeline backend, and do not add GPU options such as --gpus):

docker run -p 8000:8000 mineru-cpu:latest mineru-api --host 0.0.0.0 --port 8000 --backend pipeline --device cpu

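Common pitfalls and troubleshooting tips:

Do not use the official image or the default GPU configuration in compose.yaml (such as --gpus all or nvidia-related devices).
The batch_size parameter must be an integer, not a fraction such as 0.5; otherwise PyTorch DataLoader raises an error (see related cases).
Make sure model paths, permissions, and dependencies are accessible inside the container.
If OpenBLAS fails to create threads, the cause is usually the container's ulimit or RLIMIT_NPROC restriction; adjust the container's resource limits (see related cases).
CentOS 7 is not a mainstream recommended environment, and some dependencies may need manual adaptation.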
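
For the OpenBLAS item above, it can help to see which process limit the container actually applies. The following is a minimal Python sketch, assuming a Linux container with Python available; the function name describe_nproc_limit is illustrative and not part of MinerU.

```python
import resource


def describe_nproc_limit() -> str:
    """Return a readable summary of RLIMIT_NPROC for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

    def fmt(value: int) -> str:
        # RLIM_INFINITY means no limit is enforced.
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"RLIMIT_NPROC soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    print(describe_nproc_limit())
```

If the soft limit turns out to be low, Docker's --ulimit option on docker run (for example --ulimit nproc=...) is one way to raise it. Whether that resolves a given OpenBLAS failure depends on the host.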

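If errors persist after building the custom image as described above, please provide the exact start command and the full container log (docker logs) so the problem can be pinpointed further.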

11 replies

leavesosoon Jan 15, 2026
Author

Hello, let me reproduce my problem in full.
I added the --backend pipeline --device cpu options. The command is:

docker run -it --rm \
  -p 7860:7860 \
  -v /root/data/disk/mineru/input:/app/data \
  mineru-myself-build-cpu:2.7.1 \
  mineru-gradio --server-name 0.0.0.0 --server-port 7860 --backend pipeline --device cpu

Detailed error log:
2026-01-14 18:28:08.593 | INFO | mineru.utils.engine_utils:get_vlm_engine:32 - Using vllm-async-engine as the inference engine for VLM.
2026-01-14 18:28:08.593 | INFO | mineru.cli.gradio_app:main:388 - Start init vllm-async-engine...
2026-01-14 18:28:08.593 | ERROR | mineru.cli.gradio_app:main:399 - ModelSingleton.get_model() got multiple values for argument 'backend'
Traceback (most recent call last):

File "/usr/local/bin/mineru-gradio", line 7, in <module>
sys.exit(main())
│ │ └
│ └
└ <module 'sys' (built-in)>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1442, in __call__
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function Command.main at 0x7f0ff1871800>
└
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1363, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x7f0ff1b94740>
│ └ <function Command.invoke at 0x7f0ff18714e0>
└
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1226, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ │ │ │ └ <click.core.Context object at 0x7f0ff1b94740>
│ │ │ └ <function main at 0x7f0ed67cb560>
│ │ └
│ └ <function Context.invoke at 0x7f0ff1870720>
└ <click.core.Context object at 0x7f0ff1b94740>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 794, in invoke
return callback(*args, **kwargs)
│ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ └ ()
└ <function main at 0x7f0ed67cb560>
File "/usr/local/lib/python3.12/dist-packages/click/decorators.py", line 34, in new_func
return f(get_current_context(), *args, **kwargs)
│ │ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ │ └ ()
│ └ <function get_current_context at 0x7f0ff181b240>
└ <function main at 0x7f0ed67cb9c0>

File "/usr/local/lib/python3.12/dist-packages/mineru/cli/gradio_app.py", line 391, in main
predictor = model_singleton.get_model(
│ └ <function ModelSingleton.get_model at 0x7f0ed67c94e0>
└ <mineru.backend.vlm.vlm_analyze.ModelSingleton object at 0x7f0ee1df92e0>

TypeError: ModelSingleton.get_model() got multiple values for argument 'backend'

Running on local URL: http://0.0.0.0:7860
To create a public link, set share=True in launch().

leavesosoon Jan 15, 2026
Author

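The page preselects one backend parsing model, [hybrid-auto-engine], and offers two other options, vlm-auto-engine and pipeline. Whichever option I choose, an error is raised. The log says the backend argument is duplicated. The page also carries this parameter, which should be sent to the backend with each request, and my start command has a backend option too, so could the two be related? I then removed the backend option from the start command and relied on the value passed from the page. The start command is:

docker run -it --rm \
  -p 7860:7860 \
  -v /root/data/disk/mineru/input:/app/data \
  mineru-myself-build-cpu:2.7.1 \
  mineru-gradio --server-name 0.0.0.0 --server-port 7860 --device cpu

This time the error is Device string must not be empty, so it looks like both the backend and the device arguments are being passed incorrectly. Detailed log: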

2026-01-14 19:12:58.161 | INFO | mineru.utils.engine_utils:get_vlm_engine:32 - Using vllm-async-engine as the inference engine for VLM.
2026-01-14 19:12:58.162 | INFO | mineru.cli.gradio_app:main:388 - Start init vllm-async-engine...
INFO 01-14 19:13:07 [__init__.py:245] No platform detected, vLLM is running on UnspecifiedPlatform
2026-01-14 19:13:09.605 | INFO | mineru.backend.vlm.utils:enable_custom_logits_processors:22 - CUDA not available, disabling custom_logits_processors
2026-01-14 19:13:09.628 | ERROR | mineru.cli.gradio_app:main:399 - Device string must not be empty
Traceback (most recent call last):

File "/usr/local/bin/mineru-gradio", line 7, in <module>
sys.exit(main())
│ │ └
│ └
└ <module 'sys' (built-in)>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1442, in __call__
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function Command.main at 0x7fae39b95800>
└
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1363, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x7fae39eb8710>
│ └ <function Command.invoke at 0x7fae39b954e0>
└
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1226, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ │ │ │ └ <click.core.Context object at 0x7fae39eb8710>
│ │ │ └ <function main at 0x7fad1eecf560>
│ │ └
│ └ <function Context.invoke at 0x7fae39b94720>
└ <click.core.Context object at 0x7fae39eb8710>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 794, in invoke
return callback(*args, **kwargs)
│ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ └ ()
└ <function main at 0x7fad1eecf560>
File "/usr/local/lib/python3.12/dist-packages/click/decorators.py", line 34, in new_func
return f(get_current_context(), *args, **kwargs)
│ │ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ │ └ ()
│ └ <function get_current_context at 0x7fae39b43240>
└ <function main at 0x7fad1eecf9c0>

File "/usr/local/lib/python3.12/dist-packages/mineru/cli/gradio_app.py", line 391, in main
predictor = model_singleton.get_model(
│ └ <function ModelSingleton.get_model at 0x7fad1eecd4e0>
└ <mineru.backend.vlm.vlm_analyze.ModelSingleton object at 0x7fad1ee93b30>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/vlm/vlm_analyze.py", line 125, in get_model
vllm_async_llm = AsyncLLM.from_engine_args(AsyncEngineArgs(**kwargs))
│ │ │ └ {'gpu_memory_utilization': 0.5, 'model': '/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B'}
│ │ └ <class 'vllm.engine.arg_utils.AsyncEngineArgs'>
│ └ <classmethod(<function AsyncLLM.from_engine_args at 0x7face2fe2de0>)>
└ <class 'vllm.v1.engine.async_llm.AsyncLLM'>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 198, in from_engine_args
vllm_config = engine_args.create_engine_config(usage_context)
│ │ └ <UsageContext.ENGINE_CONTEXT: 'ENGINE_CONTEXT'>
│ └ <function EngineArgs.create_engine_config at 0x7fad00ad1120>
└ AsyncEngineArgs(model='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', served_model_name=None, token...
File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1055, in create_engine_config
device_config = DeviceConfig(
└ <class 'vllm.config.DeviceConfig'>
File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 123, in __init__
s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
│ │ │ │ │ │ └ DeviceConfig(device='', device_type='')
│ │ │ │ │ └ {'device': ''}
│ │ │ │ └ ()
│ │ │ └ <class 'pydantic_core._pydantic_core.ArgsKwargs'>
│ │ └ <method 'validate_python' of 'pydantic_core._pydantic_core.SchemaValidator' objects>
│ └ SchemaValidator(title="DeviceConfig", validator=Dataclass(
│ DataclassValidator {
│ strict: false,
│ validator:...
└ DeviceConfig(device='', device_type='')
File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 1900, in __post_init__
self.device = torch.device(self.device_type)
│ │ │ │ │ └ ''
│ │ │ │ └ DeviceConfig(device='', device_type='')
│ │ │ └ <class 'torch.device'>
│ │ └ <module 'torch' from '/usr/local/lib/python3.12/dist-packages/torch/__init__.py'>
│ └ ''
└ DeviceConfig(device='', device_type='')

RuntimeError: Device string must not be empty

Running on local URL: http://0.0.0.0:7860
To create a public link, set share=True in launch().

dosubot[bot] bot Jan 15, 2026

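Your analysis is exactly right. The backend conflict occurs because both the CLI start command and the WebUI (the page) pass backend, and a Python function that receives the same argument twice raises “got multiple values for argument 'backend'”. After you removed --backend from the CLI, the page reported that the device must not be empty, because device can only be passed through a CLI option or an environment variable; the WebUI does not send a device field.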
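
As background on the message itself: Python raises this TypeError when one parameter receives a value both positionally and by keyword. The traceback above shows it coming from the model_singleton.get_model( call in gradio_app.py. How MinerU assembles the arguments for that call is not visible in the log, so the sketch below uses a hypothetical get_model signature and a hypothetical extra dictionary purely to reproduce the same message.

```python
def get_model(backend, model_path=None, **kwargs):
    """Hypothetical stand-in for a loader that takes `backend` positionally."""
    return backend, model_path, kwargs


def reproduce() -> str:
    # Extra options collected elsewhere (for example from command-line flags).
    extra = {"backend": "pipeline", "device": "cpu"}
    try:
        # `backend` is bound positionally AND again through **extra.
        get_model("vllm-async-engine", None, **extra)
    except TypeError as exc:
        return str(exc)
    return "no error"


def call_without_duplicate() -> tuple:
    extra = {"backend": "pipeline", "device": "cpu"}
    # Drop the key that is already passed positionally before forwarding.
    extra.pop("backend", None)
    return get_model("vllm-async-engine", None, **extra)


if __name__ == "__main__":
    print(reproduce())
    print(call_without_duplicate())
```

Filtering out keys that are already bound, as call_without_duplicate does, is one common way to avoid the clash.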

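There are two safe ways to fix this (choose one):

A. Recommended: specify backend and device only in the CLI start command, and do not switch the backend in the WebUI

Leave the WebUI backend dropdown untouched so that the pipeline backend set on the CLI is always used; switching the backend on the page would cause the conflict again.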

docker run -it --rm \
  -p 7860:7860 \
  -v /root/data/disk/mineru/input:/app/data \
  mineru-myself-build-cpu:2.7.1 \
  mineru-gradio --server-name 0.0.0.0 --server-port 7860 --backend pipeline --device cpu

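Note: do not change the WebUI backend dropdown; always use pipeline.

B. Alternatively: do not pass backend on the CLI, set device through an environment variable, and choose the backend in the WebUI

If you want to be able to switch backends in the WebUI (for example pipeline/vlm-auto-engine/hybrid-auto-engine), do not add --backend to the CLI start command, but you must set device through an environment variable: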

docker run -it --rm \
  -p 7860:7860 \
  -v /root/data/disk/mineru/input:/app/data \
  -e MINERU_DEVICE_MODE=cpu \
  mineru-myself-build-cpu:2.7.1 \
  mineru-gradio --server-name 0.0.0.0 --server-port 7860

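This way device is never empty. When pipeline is selected in the WebUI it automatically uses cpu; selecting another backend that does not support CPU raises an error (for example, hybrid-auto-engine/vlm-auto-engine require a GPU).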
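
The idea behind option B, supplying a device so that the string is never empty, can be sketched in Python as follows. This is only an illustration: resolve_device is a hypothetical helper, and the precedence order (CLI value, then MINERU_DEVICE_MODE, then a cpu default) is an assumption rather than MinerU's documented behavior.

```python
import os
from typing import Mapping, Optional


def resolve_device(cli_device: Optional[str] = None,
                   env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a non-empty device string.

    Hypothetical helper: the order CLI value -> MINERU_DEVICE_MODE -> "cpu"
    is an assumption made for this example.
    """
    env = os.environ if env is None else env
    for candidate in (cli_device, env.get("MINERU_DEVICE_MODE")):
        # Skip None, empty strings, and whitespace-only values.
        if candidate and candidate.strip():
            return candidate.strip()
    return "cpu"


if __name__ == "__main__":
    print(resolve_device())
```

The point of the fallback is that an empty string never reaches the code that constructs the device.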

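Notes:

The CLI and the WebUI must not both pass backend, otherwise they will conflict [ref 1] [ref 2]
device can only be passed through a CLI option or an environment variable; the WebUI does not send a device field
hybrid-auto-engine/vlm-auto-engine support GPU environments only; CPU-only environments can use pipeline only
The WebUI selects hybrid-auto-engine by default, which is unsuitable for CPU-only environments; change it to pipeline manually with the browser developer tools, or pin pipeline on the CLI

Best practices:

For CPU-only environments, pin --backend pipeline --device cpu on the CLI and do not switch the backend in the WebUI
If you need to switch backends in the WebUI, do not pass backend on the CLI and set device through an environment variable

In essence: mineru-gradio has no deduplication mechanism; CLI and WebUI parameters are merged, and a duplicate raises an error.

To further customize the default backend (for example, to make the WebUI default to pipeline), change the default value to pipeline in the front-end code, or modify the default option manually with the browser developer tools.

If you have further questions, feel free to post more logs or commands.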

leavesosoon Jan 15, 2026
Author

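I used your option B: no backend on the CLI, device set through an environment variable, and pipeline selected as the backend in the WebUI. The command is:
docker run -it --rm \
  -p 7860:7860 \
  -v /root/data/disk/mineru/input:/app/data \
  -e MINERU_DEVICE_MODE=cpu \
  mineru-myself-build-cpu:2.7.1 \
  mineru-gradio --server-name 0.0.0.0 --server-port 7860

But startup still reports Device string must not be empty. I also uploaded a PDF and selected pipeline in the front end to parse it. The file previews fine but cannot be converted; conversion fails with TypeError: cannot unpack non-iterable NoneType object.

Here is the detailed log: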

2026-01-14 19:45:38.177 | INFO | mineru.utils.engine_utils:get_vlm_engine:32 - Using vllm-async-engine as the inference engine for VLM.
2026-01-14 19:45:38.178 | INFO | mineru.cli.gradio_app:main:388 - Start init vllm-async-engine...
INFO 01-14 19:45:49 [__init__.py:245] No platform detected, vLLM is running on UnspecifiedPlatform
2026-01-14 19:45:51.927 | INFO | mineru.backend.vlm.utils:enable_custom_logits_processors:22 - CUDA not available, disabling custom_logits_processors
2026-01-14 19:45:51.958 | ERROR | mineru.cli.gradio_app:main:399 - Device string must not be empty
Traceback (most recent call last):

File "/usr/local/bin/mineru-gradio", line 7, in <module>
sys.exit(main())
│ │ └
│ └
└ <module 'sys' (built-in)>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1442, in __call__
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function Command.main at 0x7f8261ff9800>
└
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1363, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x7f82623487a0>
│ └ <function Command.invoke at 0x7f8261ff94e0>
└
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 1226, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ │ │ │ └ <click.core.Context object at 0x7f82623487a0>
│ │ │ └ <function main at 0x7f81472bf560>
│ │ └
│ └ <function Context.invoke at 0x7f8261ff8720>
└ <click.core.Context object at 0x7f82623487a0>
File "/usr/local/lib/python3.12/dist-packages/click/core.py", line 794, in invoke
return callback(*args, **kwargs)
│ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ └ ()
└ <function main at 0x7f81472bf560>
File "/usr/local/lib/python3.12/dist-packages/click/decorators.py", line 34, in new_func
return f(get_current_context(), *args, **kwargs)
│ │ │ └ {'server_name': '0.0.0.0', 'server_port': 7860, 'example_enable': True, 'http_client_enable': False, 'api_enable': True, 'max...
│ │ └ ()
│ └ <function get_current_context at 0x7f8261fa3240>
└ <function main at 0x7f81472bf9c0>

File "/usr/local/lib/python3.12/dist-packages/mineru/cli/gradio_app.py", line 391, in main
predictor = model_singleton.get_model(
│ └ <function ModelSingleton.get_model at 0x7f81472bd4e0>
└ <mineru.backend.vlm.vlm_analyze.ModelSingleton object at 0x7f81529dd8e0>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/vlm/vlm_analyze.py", line 125, in get_model
vllm_async_llm = AsyncLLM.from_engine_args(AsyncEngineArgs(**kwargs))
│ │ │ └ {'gpu_memory_utilization': 0.5, 'model': '/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B'}
│ │ └ <class 'vllm.engine.arg_utils.AsyncEngineArgs'>
│ └ <classmethod(<function AsyncLLM.from_engine_args at 0x7f810b302b60>)>
└ <class 'vllm.v1.engine.async_llm.AsyncLLM'>
File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 198, in from_engine_args
vllm_config = engine_args.create_engine_config(usage_context)
│ │ └ <UsageContext.ENGINE_CONTEXT: 'ENGINE_CONTEXT'>
│ └ <function EngineArgs.create_engine_config at 0x7f8128f40ea0>
└ AsyncEngineArgs(model='/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2___5-2509-1___2B', served_model_name=None, token...
File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1055, in create_engine_config
device_config = DeviceConfig(
└ <class 'vllm.config.DeviceConfig'>
File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 123, in __init__
s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
│ │ │ │ │ │ └ DeviceConfig(device='', device_type='')
│ │ │ │ │ └ {'device': ''}
│ │ │ │ └ ()
│ │ │ └ <class 'pydantic_core._pydantic_core.ArgsKwargs'>
│ │ └ <method 'validate_python' of 'pydantic_core._pydantic_core.SchemaValidator' objects>
│ └ SchemaValidator(title="DeviceConfig", validator=Dataclass(
│ DataclassValidator {
│ strict: false,
│ validator:...
└ DeviceConfig(device='', device_type='')
File "/usr/local/lib/python3.12/dist-packages/vllm/config/__init__.py", line 1900, in __post_init__
self.device = torch.device(self.device_type)
│ │ │ │ │ └ ''
│ │ │ │ └ DeviceConfig(device='', device_type='')
│ │ │ └ <class 'torch.device'>
│ │ └ <module 'torch' from '/usr/local/lib/python3.12/dist-packages/torch/__init__.py'>
│ └ ''
└ DeviceConfig(device='', device_type='')

RuntimeError: Device string must not be empty

Running on local URL: http://0.0.0.0:7860
To create a public link, set share=True in launch().
2026-01-14 19:46:15.960 | WARNING | mineru.utils.pdf_page_id:get_end_page_id:8 - end_page_id is out of range, use images length
Creating new Ultralytics Settings v0.0.6 file ✅
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
2026-01-14 19:46:19.965 | INFO | mineru.backend.pipeline.pipeline_analyze:doc_analyze:129 - Batch 1/1: 15 pages/15 pages
2026-01-14 19:46:20.057 | INFO | mineru.backend.pipeline.pipeline_analyze:batch_image_analyze:189 - GPU Memory: 1 GB, Batch Ratio: 1.
2026-01-14 19:46:20.057 | INFO | mineru.backend.pipeline.model_init:__init__:209 - DocAnalysis init, this may take some times......
2026-01-14 19:46:22.009 | ERROR | mineru.cli.gradio_app:parse_pdf:60 - unable to mmap 810036696 bytes from file </root/.cache/modelscope/hub/models/OpenDataLab/PDF-Extract-Kit-1___0/models/MFR/unimernet_hf_small_2503/model.safetensors>: Cannot allocate memory (12)
Traceback (most recent call last):

File "/usr/lib/python3.12/threading.py", line 1032, in _bootstrap
self._bootstrap_inner()
│ └ <function Thread._bootstrap_inner at 0x7f82622732e0>
└ <Thread(Thread-3 (run), started daemon 140192184960576)>
File "/usr/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
self.run()
│ └ <function Thread.run at 0x7f8262272fc0>
└ <Thread(Thread-3 (run), started daemon 140192184960576)>
File "/usr/lib/python3.12/threading.py", line 1012, in run
self._target(*self._args, **self._kwargs)
│ │ │ │ │ └ {}
│ │ │ │ └ <Thread(Thread-3 (run), started daemon 140192184960576)>
│ │ │ └ ()
│ │ └ <Thread(Thread-3 (run), started daemon 140192184960576)>
│ └ <bound method Server.run of <gradio.http_server.Server object at 0x7f8109abc140>>
└ <Thread(Thread-3 (run), started daemon 140192184960576)>
File "/usr/local/lib/python3.12/dist-packages/uvicorn/server.py", line 67, in run
return asyncio.run(self.serve(sockets=sockets))
│ │ │ │ └ None
│ │ │ └ <function Server.serve at 0x7f810972d080>
│ │ └ <gradio.http_server.Server object at 0x7f8109abc140>
│ └ <function run at 0x7f82618be840>
└ <module 'asyncio' from '/usr/lib/python3.12/asyncio/__init__.py'>
File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
return runner.run(main)
│ │ └ <coroutine object Server.serve at 0x7f8109705620>
│ └ <function Runner.run at 0x7f8261908220>
└ <asyncio.runners.Runner object at 0x7f8128e8a4b0>
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
│ │ │ └ <Task pending name='Task-1' coro=<Server.serve() running at /usr/local/lib/python3.12/dist-packages/uvicorn/server.py:71> wai...
│ │ └ <cyfunction Loop.run_until_complete at 0x7f8109791220>
│ └ <uvloop.Loop running=True closed=False debug=False>
└ <asyncio.runners.Runner object at 0x7f8128e8a4b0>
File "/usr/local/lib/python3.12/dist-packages/gradio/queueing.py", line 759, in process_events
response = await route_utils.call_process_api(
│ └ <function call_process_api at 0x7f8257fc2020>
└ <module 'gradio.route_utils' from '/usr/local/lib/python3.12/dist-packages/gradio/route_utils.py'>
File "/usr/local/lib/python3.12/dist-packages/gradio/route_utils.py", line 354, in call_process_api
output = await app.get_blocks().process_api(
│ └ <function App.get_blocks at 0x7f8257cca160>
└ <gradio.routes.App object at 0x7f8109a6ed50>
File "/usr/local/lib/python3.12/dist-packages/gradio/blocks.py", line 2116, in process_api
result = await self.call_function(
│ └ <function Blocks.call_function at 0x7f82579a20c0>
└ Gradio Blocks instance: 5 backend functions
-------------------------------------------
fn_index=0
inputs:
|-<gradio.compon...
File "/usr/local/lib/python3.12/dist-packages/gradio/blocks.py", line 1621, in call_function
prediction = await fn(*processed_input)
│ └ ['/tmp/gradio/fa82dadca25df0ddc96df17c3ceadae3d9b94f937be9052a38d935d47de1c174/中电信新云运202528号关于印发中国电信新疆公司合作方云网生产作业管理办法的通知.pdf'...
└ <function to_markdown at 0x7f8108aeb7e0>
File "/usr/local/lib/python3.12/dist-packages/gradio/utils.py", line 882, in async_wrapper
response = await f(*args, **kwargs)
│ │ └ {}
│ └ ('/tmp/gradio/fa82dadca25df0ddc96df17c3ceadae3d9b94f937be9052a38d935d47de1c174/中电信新云运202528号关于印发中国电信新疆公司合作方云网生产作业管理办法的通知.pdf'...
└ <function to_markdown at 0x7f81472bf2e0>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/gradio_app.py", line 118, in to_markdown
local_md_dir, file_name = await parse_pdf(file_path, './output', end_pages - 1, is_ocr, formula_enable, table_enable, language, backend, url)
│ │ │ │ │ │ │ │ └ 'http://localhost:30000'
│ │ │ │ │ │ │ └ 'pipeline'
│ │ │ │ │ │ └ 'ch'
│ │ │ │ │ └ True
│ │ │ │ └ True
│ │ │ └ False
│ │ └ 1000
│ └ '/tmp/gradio/fa82dadca25df0ddc96df17c3ceadae3d9b94f937be9052a38d935d47de1c174/中电信新云运202528号关于印发中国电信新疆公司合作方云网生产作业管理办法的通知.pdf'
└ <function parse_pdf at 0x7f8262332840>

File "/usr/local/lib/python3.12/dist-packages/mineru/cli/gradio_app.py", line 46, in parse_pdf
await aio_do_parse(
└ <function aio_do_parse at 0x7f81472bf060>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/common.py", line 513, in aio_do_parse
_process_pipeline(
└ <function _process_pipeline at 0x7f81472beb60>
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/common.py", line 193, in _process_pipeline
pipeline_doc_analyze(
└ <function doc_analyze at 0x7f80dc3c7e20>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/pipeline_analyze.py", line 133, in doc_analyze
batch_results = batch_image_analyze(batch_image, formula_enable, table_enable)
│ │ │ └ True
│ │ └ True
│ └ [(<PIL.Image.Image image mode=RGB size=1654x2339 at 0x7F80DC119F70>, False, 'ch'), (<PIL.Image.Image image mode=RGB size=1654...
└ <function batch_image_analyze at 0x7f80dc3c7ec0>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/pipeline_analyze.py", line 202, in batch_image_analyze
results = batch_model(images_with_extra_info)
│ └ [(<PIL.Image.Image image mode=RGB size=1654x2339 at 0x7F80DC119F70>, False, 'ch'), (<PIL.Image.Image image mode=RGB size=1654...
└ <mineru.backend.pipeline.batch_analyze.BatchAnalyze object at 0x7f80dc5e7e00>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/batch_analyze.py", line 39, in __call__
self.model = self.model_manager.get_model(
│ │ │ └ <function ModelSingleton.get_model at 0x7f80dc3c7d80>
│ │ └ <mineru.backend.pipeline.pipeline_analyze.ModelSingleton object at 0x7f81097a4890>
│ └ <mineru.backend.pipeline.batch_analyze.BatchAnalyze object at 0x7f80dc5e7e00>
└ <mineru.backend.pipeline.batch_analyze.BatchAnalyze object at 0x7f80dc5e7e00>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/pipeline_analyze.py", line 35, in get_model
self._models[key] = custom_model_init(
│ │ │ └ <function custom_model_init at 0x7f80dc632340>
│ │ └ (None, True, True)
│ └ {}
└ <mineru.backend.pipeline.pipeline_analyze.ModelSingleton object at 0x7f81097a4890>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/pipeline_analyze.py", line 62, in custom_model_init
custom_model = MineruPipelineModel(**model_input)
│ └ {'device': 'cpu', 'table_config': {'enable': True}, 'formula_config': {'enable': True}, 'lang': None}
└ <class 'mineru.backend.pipeline.model_init.MineruPipelineModel'>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/model_init.py", line 233, in __init__
self.mfr_model = atom_model_manager.get_atom_model(
│ │ └ <function AtomModelSingleton.get_atom_model at 0x7f80dc611260>
│ └ <mineru.backend.pipeline.model_init.AtomModelSingleton object at 0x7f80dc11bef0>
└ <mineru.backend.pipeline.model_init.MineruPipelineModel object at 0x7f80dc118140>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/model_init.py", line 151, in get_atom_model
self._models[key] = atom_model_init(model_name=atom_model_name, **kwargs)
│ │ │ │ │ └ {'mfr_weight_dir': '/root/.cache/modelscope/hub/models/OpenDataLab/PDF-Extract-Kit-1___0/models/MFR/unimernet_hf_small_2503',...
│ │ │ │ └ 'mfr'
│ │ │ └ <function atom_model_init at 0x7f80dc611120>
│ │ └ 'mfr'
│ └ {'mfd': <mineru.model.mfd.yolo_v8.YOLOv8MFDModel object at 0x7f80ded8c710>}
└ <mineru.backend.pipeline.model_init.AtomModelSingleton object at 0x7f80dc11bef0>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/model_init.py", line 167, in atom_model_init
atom_model = mfr_model_init(
└ <function mfr_model_init at 0x7f80dc610f40>
File "/usr/local/lib/python3.12/dist-packages/mineru/backend/pipeline/model_init.py", line 83, in mfr_model_init
mfr_model = UnimernetModel(weight_dir, device)
│ │ └ 'cpu'
│ └ '/root/.cache/modelscope/hub/models/OpenDataLab/PDF-Extract-Kit-1___0/models/MFR/unimernet_hf_small_2503'
└ <class 'mineru.model.mfr.unimernet.Unimernet.UnimernetModel'>
File "/usr/local/lib/python3.12/dist-packages/mineru/model/mfr/unimernet/Unimernet.py", line 29, in __init__
self.model = UnimernetModel.from_pretrained(weight_dir)
│ │ │ └ '/root/.cache/modelscope/hub/models/OpenDataLab/PDF-Extract-Kit-1___0/models/MFR/unimernet_hf_small_2503'
│ │ └ <classmethod(<function VisionEncoderDecoderModel.from_pretrained at 0x7f80d7f59c60>)>
│ └ <class 'mineru.model.mfr.unimernet.unimernet_hf.modeling_unimernet.UnimernetModel'>
└ <mineru.model.mfr.unimernet.Unimernet.UnimernetModel object at 0x7f80dc8672c0>
File "/usr/local/lib/python3.12/dist-packages/transformers/models/vision_encoder_decoder/modeling_vision_encoder_decoder.py", line 285, in from_pretrained
return super().from_pretrained(pretrained_model_name_or_path, *model_args, **kwargs)
│ │ └ {}
│ └ ()
└ '/root/.cache/modelscope/hub/models/OpenDataLab/PDF-Extract-Kit-1___0/models/MFR/unimernet_hf_small_2503'
File "/usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py", line 317, in _wrapper
return func(*args, **kwargs)
│ │ └ {}
│ └ (<class 'mineru.model.mfr.unimernet.unimernet_hf.modeling_unimernet.UnimernetModel'>, '/root/.cache/modelscope/hub/models/Ope...
└ <function PreTrainedModel.from_pretrained at 0x7f812afbe0c0>
File "/usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py", line 4948, in from_pretrained
with safe_open(checkpoint_files[0], framework="pt") as f:
│ └ ['/root/.cache/modelscope/hub/models/OpenDataLab/PDF-Extract-Kit-1___0/models/MFR/unimernet_hf_small_2503/model.safetensors']
└ <class 'builtins.safe_open'>

RuntimeError: unable to mmap 810036696 bytes from file </root/.cache/modelscope/hub/models/OpenDataLab/PDF-Extract-Kit-1___0/models/MFR/unimernet_hf_small_2503/model.safetensors>: Cannot allocate memory (12)
Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/gradio/queueing.py", line 759, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/gradio/route_utils.py", line 354, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/gradio/blocks.py", line 2116, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/gradio/blocks.py", line 1621, in call_function
prediction = await fn(*processed_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/gradio/utils.py", line 882, in async_wrapper
response = await f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/mineru/cli/gradio_app.py", line 118, in to_markdown
local_md_dir, file_name = await parse_pdf(file_path, './output', end_pages - 1, is_ocr, formula_enable, table_enable, language, backend, url)
^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot unpack non-iterable NoneType object

dosubot[bot] bot Jan 15, 2026

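Your problem has two core causes:

“Device string must not be empty”
Even with -e MINERU_DEVICE_MODE=cpu, mineru-gradio still tries to initialize vllm-async-engine (the VLM path) at startup, because the CLI entry code pre-initializes the VLM engine once regardless of which backend you select (per source code analysis).
In a CPU-only environment, vllm-async-engine initialization is bound to fail with the device string error, but this does not affect normal use of the pipeline backend; it only leaves an error in the log. As long as you select pipeline in the WebUI afterwards, parsing takes the pipeline path and vllm is no longer involved.
PDF conversion error “TypeError: cannot unpack non-iterable NoneType object”
The root cause of this error is a model loading failure. The log contains unable to mmap ... Cannot allocate memory (12), which means the server does not have enough memory to load the unimernet_hf_small_2503 model weights from PDF-Extract-Kit-1.0 (the single weights file alone is over 800 MB, and loading needs more memory than that). After the model fails to load, the parsing function returns None, and the subsequent unpacking fails (similar cases have been reported).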
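
The second failure follows a common Python pattern: a function logs an exception and returns None, and the caller then tries to unpack the result into two names. The sketch below reproduces it with simplified stand-ins. The names parse_pdf and to_markdown are taken from the traceback, but the bodies here are hypothetical and not MinerU's code.

```python
from typing import Optional, Tuple


def parse_pdf(path: str) -> Optional[Tuple[str, str]]:
    """Hypothetical parser that logs a failure and returns None."""
    try:
        # Simulate the model failing to load.
        raise RuntimeError("unable to mmap model weights: Cannot allocate memory")
    except Exception as exc:
        print(f"ERROR: {exc}")
        return None


def to_markdown_unsafe(path: str) -> str:
    # Unpacking None raises TypeError and hides the real cause.
    out_dir, name = parse_pdf(path)
    return f"{out_dir}/{name}.md"


def to_markdown_checked(path: str) -> str:
    result = parse_pdf(path)
    if result is None:
        # Surface a message that points back to the logged error.
        raise RuntimeError(f"parsing failed for {path}; see the logged error")
    out_dir, name = result
    return f"{out_dir}/{name}.md"


if __name__ == "__main__":
    for fn in (to_markdown_unsafe, to_markdown_checked):
        try:
            fn("example.pdf")
        except (TypeError, RuntimeError) as exc:
            print(f"{fn.__name__}: {type(exc).__name__}: {exc}")
```

In other words, the TypeError is a follow-on symptom; the error logged just before it is the one to investigate.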

Suggestions:

Pin the pipeline backend and the device on the CLI in the start command to avoid conflicts with parameters passed by the WebUI:

docker run -it --rm \
  -p 7860:7860 \
  -v /root/data/disk/mineru/input:/app/data \
  mineru-myself-build-cpu:2.7.1 \
  mineru-gradio --server-name 0.0.0.0 --server-port 7860 --backend pipeline --device cpu

Do not switch the backend in the WebUI, otherwise the parameters will conflict.

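The “Device string must not be empty” error can be ignored; as long as you actually use the pipeline backend, parsing is not affected.
Key point: the server is short of memory, so the model cannot be loaded via mmap.
- Check physical memory; at least 4 GB is recommended, preferably 8 GB or more, with sufficient swap.
- Try reducing the number of PDF pages processed at once, or switch to a smaller model (if available).
- For offline deployments, make sure the model files are complete, the paths are correct, and the container has sufficient permission to access them.
If the memory shortage cannot be resolved, consider running on a higher-spec machine, or process only small files with the CLI tool.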
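
A quick way to compare available memory with the size of the weights file is sketched below in Python. It assumes a Linux host where /proc/meminfo exposes MemAvailable. The function names and the 2x headroom factor are illustrative assumptions, and 810036696 is the byte count from the mmap error in the log.

```python
from pathlib import Path
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field: bytes or plain count}."""
    info: Dict[str, int] = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if not parts or not parts[0].isdigit():
            continue
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024  # the kernel reports these fields in kibibytes
        info[key.strip()] = value
    return info


def fits_in_memory(weights_bytes: int, meminfo: Dict[str, int],
                   headroom: float = 2.0) -> bool:
    """Return True if MemAvailable covers the weights with some headroom."""
    return meminfo.get("MemAvailable", 0) >= weights_bytes * headroom


if __name__ == "__main__":
    meminfo_path = Path("/proc/meminfo")
    if meminfo_path.exists():
        meminfo = parse_meminfo(meminfo_path.read_text())
        weights = 810036696  # size reported in the mmap error
        print("enough memory:", fits_in_memory(weights, meminfo))
```

Inside a container, /proc/meminfo typically reports the host's memory rather than any cgroup limit, so the result is best read as an upper bound.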

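If you have further questions about the memory configuration, model paths, and so on, post the output of free -h and ls -lh so the problem can be located further.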
