
[Bug]: ValueError: Call to add_lora method failed: Loading lora ["table_qc failed: No adapter found for /data/home/zhizhehui/dev/train/table_ocr/Reward_model_train/output/qwen_lora_output/checkpoint-57"] #30690

@dgzxx-2000

Description


Your current environment

(vllm) [zhizhehui@localhost ~]$ python collect_env.py
Collecting environment information...

    System Info

==============================
OS : Kylin Linux Advanced Server V10 (Halberd) (x86_64)
GCC version : (GCC) 7.3.0
Clang version : Could not collect
CMake version : version 3.16.5
Libc version : glibc-2.28

==============================
PyTorch Info

PyTorch version : 2.9.0+cu128
Is debug build : False
CUDA used to build PyTorch : 12.8
ROCM used to build PyTorch : N/A

==============================
Python Environment

Python version : 3.12.12 | packaged by Anaconda, Inc. | (main, Oct 21 2025, 20:16:04) [GCC 11.2.0] (64-bit runtime)
Python platform : Linux-4.19.90-89.11.v2401.ky10.x86_64-x86_64-with-glibc2.28

==============================
CUDA / GPU Info

Is CUDA available : True
CUDA runtime version : 12.8.61
CUDA_MODULE_LOADING set to :
GPU models and configuration :
GPU 0: NVIDIA RTX A6000
GPU 1: NVIDIA RTX A6000
GPU 2: NVIDIA RTX A6000
GPU 3: NVIDIA RTX A6000

Nvidia driver version : 580.105.08
cuDNN version : Probably one of the following:
/usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudnn.so.8.9.7
/usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.9.7
/usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.9.7
/usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.9.7
/usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.9.7
/usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.9.7
/usr/local/cuda-12.8/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.9.7
HIP runtime version : N/A
MIOpen runtime version : N/A
Is XNNPACK available : True

==============================
CPU Info

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 48 bits physical, 48 bits virtual
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 2
Core(s) per socket: 32
Socket(s): 2
NUMA node(s): 8
Vendor ID: HygonGenuine
CPU family: 24
Model: 0
Model name: Hygon C86 7285 32-core Processor
Stepping: 2
CPU MHz: 2000.105
BogoMIPS: 4000.21
Virtualization: AMD-V
L1d cache: 2 MiB
L1i cache: 4 MiB
L2 cache: 32 MiB
L3 cache: 128 MiB
NUMA node0 CPU(s): 0-7,64-71
NUMA node1 CPU(s): 8-15,72-79
NUMA node2 CPU(s): 16-23,80-87
NUMA node3 CPU(s): 24-31,88-95
NUMA node4 CPU(s): 32-39,96-103
NUMA node5 CPU(s): 40-47,104-111
NUMA node6 CPU(s): 48-55,112-119
NUMA node7 CPU(s): 56-63,120-127
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Mitigation; untrained return thunk; SMT vulnerable
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca

==============================
Versions of relevant libraries

[pip3] flashinfer-python==0.5.2
[pip3] numpy==2.2.6
[pip3] nvidia-cublas-cu12==12.8.4.1
[pip3] nvidia-cuda-cupti-cu12==12.8.90
[pip3] nvidia-cuda-nvrtc-cu12==12.8.93
[pip3] nvidia-cuda-runtime-cu12==12.8.90
[pip3] nvidia-cudnn-cu12==9.10.2.21
[pip3] nvidia-cudnn-frontend==1.16.0
[pip3] nvidia-cufft-cu12==11.3.3.83
[pip3] nvidia-cufile-cu12==1.13.1.3
[pip3] nvidia-curand-cu12==10.3.9.90
[pip3] nvidia-cusolver-cu12==11.7.3.90
[pip3] nvidia-cusparse-cu12==12.5.8.93
[pip3] nvidia-cusparselt-cu12==0.7.1
[pip3] nvidia-cutlass-dsl==4.3.2
[pip3] nvidia-ml-py==13.590.44
[pip3] nvidia-nccl-cu12==2.27.5
[pip3] nvidia-nvjitlink-cu12==12.8.93
[pip3] nvidia-nvshmem-cu12==3.3.20
[pip3] nvidia-nvtx-cu12==12.8.90
[pip3] pyzmq==27.1.0
[pip3] torch==2.9.0
[pip3] torchaudio==2.9.0
[pip3] torchvision==0.24.0
[pip3] transformers==4.57.3
[pip3] triton==3.5.0
[conda] flashinfer-python 0.5.2 pypi_0 pypi
[conda] numpy 2.2.6 pypi_0 pypi
[conda] nvidia-cublas-cu12 12.8.4.1 pypi_0 pypi
[conda] nvidia-cuda-cupti-cu12 12.8.90 pypi_0 pypi
[conda] nvidia-cuda-nvrtc-cu12 12.8.93 pypi_0 pypi
[conda] nvidia-cuda-runtime-cu12 12.8.90 pypi_0 pypi
[conda] nvidia-cudnn-cu12 9.10.2.21 pypi_0 pypi
[conda] nvidia-cudnn-frontend 1.16.0 pypi_0 pypi
[conda] nvidia-cufft-cu12 11.3.3.83 pypi_0 pypi
[conda] nvidia-cufile-cu12 1.13.1.3 pypi_0 pypi
[conda] nvidia-curand-cu12 10.3.9.90 pypi_0 pypi
[conda] nvidia-cusolver-cu12 11.7.3.90 pypi_0 pypi
[conda] nvidia-cusparse-cu12 12.5.8.93 pypi_0 pypi
[conda] nvidia-cusparselt-cu12 0.7.1 pypi_0 pypi
[conda] nvidia-cutlass-dsl 4.3.2 pypi_0 pypi
[conda] nvidia-ml-py 13.590.44 pypi_0 pypi
[conda] nvidia-nccl-cu12 2.27.5 pypi_0 pypi
[conda] nvidia-nvjitlink-cu12 12.8.93 pypi_0 pypi
[conda] nvidia-nvshmem-cu12 3.3.20 pypi_0 pypi
[conda] nvidia-nvtx-cu12 12.8.90 pypi_0 pypi
[conda] pyzmq 27.1.0 pypi_0 pypi
[conda] torch 2.9.0 pypi_0 pypi
[conda] torchaudio 2.9.0 pypi_0 pypi
[conda] torchvision 0.24.0 pypi_0 pypi
[conda] transformers 4.57.3 pypi_0 pypi
[conda] triton 3.5.0 pypi_0 pypi

==============================
vLLM Info

ROCM Version : Could not collect
vLLM Version : 0.11.2
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled
GPU Topology:
GPU0 GPU1 GPU2 GPU3 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X PIX SYS SYS 40-47,104-111 5 N/A
GPU1 PIX X SYS SYS 40-47,104-111 5 N/A
GPU2 SYS SYS X PIX 48-55,112-119 6 N/A
GPU3 SYS SYS PIX X 48-55,112-119 6 N/A

Legend:

X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks

==============================
Environment Variables

LD_LIBRARY_PATH=/data/home/zhizhehui/dev/app/minconda/lib:/usr/local/cuda-12.8/lib64:
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1

🐛 Describe the bug

python -m vllm.entrypoints.openai.api_server --model /data/home/zhizhehui/dev/models/Qwen3-VL-8B-Instruct --enable-lora --max-lora-rank 32 --lora-dtype float16 --dtype float16 --tensor-parallel-size 1 --gpu-memory-utilization 0.8 --port 8000 --max-model-len 8888 --host 0.0.0.0 --served-model-name qwen3-vl --lora-modules '["table_qc=/data/home/zhizhehui/dev/train/table_ocr/Reward_model_train/output/qwen_lora_output/checkpoint-57"]'
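Note the `--lora-modules` value: the single-quoted `'["table_qc=…"]'` reaches vLLM as one literal string, brackets and quotes included, which is why `["` and `"]` appear inside the adapter name and path in the errors below. For the plain `name=path` form that `--lora-modules` documents, a likely-correct invocation (same flags, just an unbracketed value) would be:

```shell
python -m vllm.entrypoints.openai.api_server \
  --model /data/home/zhizhehui/dev/models/Qwen3-VL-8B-Instruct \
  --enable-lora --max-lora-rank 32 --lora-dtype float16 --dtype float16 \
  --tensor-parallel-size 1 --gpu-memory-utilization 0.8 \
  --port 8000 --host 0.0.0.0 --max-model-len 8888 --served-model-name qwen3-vl \
  --lora-modules table_qc=/data/home/zhizhehui/dev/train/table_ocr/Reward_model_train/output/qwen_lora_output/checkpoint-57
```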

(EngineCore_DP0 pid=1511501) WARNING 12-15 18:56:44 [utils.py:250] Using default LoRA kernel configs
Capturing CUDA graphs (mixed prefill-decode, PIECEWISE): 100%|████████████████████████████████████| 102/102 [00:15<00:00, 6.77it/s]
Capturing CUDA graphs (decode, FULL): 100%|█████████████████████████████████████████████████████████| 70/70 [00:10<00:00, 6.93it/s]
(EngineCore_DP0 pid=1511501) INFO 12-15 18:57:09 [gpu_model_runner.py:4244] Graph capturing finished in 26 secs, took 1.34 GiB
(EngineCore_DP0 pid=1511501) INFO 12-15 18:57:09 [core.py:250] init engine (profile, create kv cache, warmup model) took 69.12 seconds
(APIServer pid=1510937) INFO 12-15 18:57:10 [api_server.py:1725] Supported tasks: ['generate']
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] Invocation of add_lora method failed
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] Traceback (most recent call last):
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/lora/worker_manager.py", line 102, in _load_adapter
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] peft_helper = PEFTHelper.from_local_dir(
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/lora/peft_helper.py", line 108, in from_local_dir
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] with open(lora_config_path) as f:
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] FileNotFoundError: [Errno 2] No such file or directory: '/data/home/zhizhehui/dev/train/table_ocr/Reward_model_train/output/qwen_lora_output/checkpoint-57"]/adapter_config.json'
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918]
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] The above exception was the direct cause of the following exception:
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918]
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] Traceback (most recent call last):
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 915, in _handle_client_request
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] result = method(*self._convert_msgspec_args(method, args))
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 498, in add_lora
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] return self.model_executor.add_lora(lora_request)
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 277, in add_lora
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] return all(self.collective_rpc("add_lora", args=(lora_request,)))
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/uniproc_executor.py", line 75, in collective_rpc
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/v1/serial_utils.py", line 479, in run_method
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] return func(*args, **kwargs)
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 612, in add_lora
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] return self.model_runner.add_lora(lora_request)
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/v1/worker/lora_model_runner_mixin.py", line 201, in add_lora
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] return self.lora_manager.add_adapter(lora_request)
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/lora/worker_manager.py", line 263, in add_adapter
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] lora = self._load_adapter(lora_request)
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/lora/worker_manager.py", line 138, in _load_adapter
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] raise ValueError(
(EngineCore_DP0 pid=1511501) ERROR 12-15 18:57:10 [core.py:918] ValueError: Loading lora ["table_qc failed: No adapter found for /data/home/zhizhehui/dev/train/table_ocr/Reward_model_train/output/qwen_lora_output/checkpoint-57"]
[rank0]:[W1215 18:57:10.424580246 ProcessGroupNCCL.cpp:1524] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
(APIServer pid=1510937) Traceback (most recent call last):
(APIServer pid=1510937) File "<frozen runpy>", line 198, in _run_module_as_main
(APIServer pid=1510937) File "<frozen runpy>", line 88, in _run_code
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 2096, in <module>
(APIServer pid=1510937) uvloop.run(run_server(args))
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=1510937) return __asyncio.run(
(APIServer pid=1510937) ^^^^^^^^^^^^^^
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=1510937) return runner.run(main)
(APIServer pid=1510937) ^^^^^^^^^^^^^^^^
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=1510937) return self._loop.run_until_complete(task)
(APIServer pid=1510937) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1510937) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=1510937) return await main
(APIServer pid=1510937) ^^^^^^^^^^
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 2024, in run_server
(APIServer pid=1510937) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 2050, in run_server_worker
(APIServer pid=1510937) await init_app_state(engine_client, app.state, args)
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1760, in init_app_state
(APIServer pid=1510937) await state.openai_serving_models.init_static_loras()
(APIServer pid=1510937) File "/data/home/zhizhehui/dev/app/minconda/envs/vllm/lib/python3.12/site-packages/vllm/entrypoints/openai/serving_models.py", line 90, in init_static_loras
(APIServer pid=1510937) raise ValueError(load_result.error.message)
(APIServer pid=1510937) ValueError: Call to add_lora method failed: Loading lora ["table_qc failed: No adapter found for /data/home/zhizhehui/dev/train/table_ocr/Reward_model_train/output/qwen_lora_output/checkpoint-57"]
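Both errors show an adapter name beginning with `["` and a path ending with `"]`, which suggests the shell-quoted JSON-style list was taken literally and split on the first `=` into a name and a path. A minimal sketch of that effect (the split-on-`=` behavior is an assumption about vLLM's plain `name=path` parsing, not confirmed from its source):

```python
# The literal argument --lora-modules receives after shell quote removal:
token = '["table_qc=/data/home/zhizhehui/dev/train/table_ocr/Reward_model_train/output/qwen_lora_output/checkpoint-57"]'

# Hypothetical name=path split, as in the plain --lora-modules format:
name, path = token.split("=", 1)

print(name)  # ["table_qc  -> matches the adapter name in the ValueError
print(path)  # ...checkpoint-57"]  -> matches the path vLLM tried to open,
             #    hence FileNotFoundError on ...checkpoint-57"]/adapter_config.json
```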

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
