Description
System and versions
Linux 25.04
multi-arc-bmg-offline-installer-25.38.4.1
Docker command
sudo docker run -td \
  --privileged \
  --net=host \
  --device=/dev/dri \
  --name=Qwen3VL-2B \
  -v "/root/.cache/modelscope/hub/models/Qwen/":/llm/models/ \
  --shm-size="32g" \
  --entrypoint /bin/bash \
  intel/llm-scaler-vllm:1.1-preview
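
Before launching, a quick sanity check that both GPUs are visible inside the container (a minimal sketch; sycl-ls and a PyTorch XPU build are assumed to be present in this image):

sudo docker exec -it Qwen3VL-2B bash
# inside the container: list SYCL devices
sycl-ls
# or query PyTorch's XPU backend directly
python3 -c "import torch; print(torch.xpu.device_count())"

Both Arc cards should show up here before setting ZE_AFFINITY_MASK=0,1 below.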
Launch command
export ZE_AFFINITY_MASK=0,1
export TORCH_LLM_ALLREDUCE=1
export VLLM_USE_V1=1
export CCL_ZE_IPC_EXCHANGE=pidfd
export VLLM_ALLOW_LONG_MAX_MODEL_LEN=1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
python3 -m vllm.entrypoints.openai.api_server \
  --model /llm/models/Qwen3-VL-2B-Instruct \
  --served-model-name Qwen3-VL-2B-Instruct \
  --dtype=float16 \
  --enforce-eager \
  --port 8000 \
  --host 0.0.0.0 \
  --trust-remote-code \
  --gpu-memory-util=0.9 \
  --no-enable-prefix-caching \
  --max-num-batched-tokens=8192 \
  --disable-log-requests \
  --max-model-len=8192 \
  --block-size 64 \
  -tp=2
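
For reference, once the server is up, a minimal smoke test of the OpenAI-compatible endpoint would look like this (the model name must match --served-model-name above):

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen3-VL-2B-Instruct", "messages": [{"role": "user", "content": "Hello"}]}'

In this case the server never reaches that point; startup fails during model config validation: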
Error output
[W1024 03:43:06.029903221 OperatorEntry.cpp:218] Warning: Warning only once for all operators, other operators may also be overridden.
Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::geometric_(Tensor(a!) self, float p, *, Generator? generator=None) -> Tensor(a!)
registered at /pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
dispatch key: XPU
previous kernel: registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:37
new kernel: registered at /root/workspace/frameworks.ai.pytorch.ipex-gpu/build/Release/csrc/gpu/csrc/gpu/xpu/ATen/RegisterXPU_0.cpp:172 (function operator())
INFO 10-24 03:43:07 [__init__.py:235] Automatically detected platform xpu.
INFO 10-24 03:43:08 [api_server.py:1755] vLLM API server version 0.10.1.dev0+g6d8d0a24c.d20250902
INFO 10-24 03:43:08 [cli_args.py:261] non-default args: {'host': '0.0.0.0', 'model': '/llm/models/Qwen3-VL-2B-Instruct', 'trust_remote_code': True, 'dtype': 'float16', 'max_model_len': 8192, 'enforce_eager': True, 'served_model_name': ['Qwen3-VL-2B-Instruct'], 'tensor_parallel_size': 2, 'block_size': 64, 'enable_prefix_caching': False, 'max_num_batched_tokens': 8192, 'disable_log_requests': True}
Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in _run_code
File "/usr/local/lib/python3.12/dist-packages/vllm-0.10.1.dev0+g6d8d0a24c.d20250902.xpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line
uvloop.run(run_server(args))
File "/usr/local/lib/python3.12/dist-packages/uvloop/init.py", line 109, in run
return __asyncio.run(
^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "/usr/local/lib/python3.12/dist-packages/uvloop/init.py", line 61, in wrapper
return await main
^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/vllm-0.10.1.dev0+g6d8d0a24c.d20250902.xpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line
await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
File "/usr/local/lib/python3.12/dist-packages/vllm-0.10.1.dev0+g6d8d0a24c.d20250902.xpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line
async with build_async_engine_client(args, client_config) as engine_client:
File "/usr/lib/python3.12/contextlib.py", line 210, in aenter
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/vllm-0.10.1.dev0+g6d8d0a24c.d20250902.xpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line _client
async with build_async_engine_client_from_engine_args(
File "/usr/lib/python3.12/contextlib.py", line 210, in aenter
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/vllm-0.10.1.dev0+g6d8d0a24c.d20250902.xpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line _client_from_engine_args
vllm_config = engine_args.create_engine_config(usage_context=usage_context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/vllm-0.10.1.dev0+g6d8d0a24c.d20250902.xpu-py3.12-linux-x86_64.egg/vllm/engine/arg_utils.py", line 1004, in crea
model_config = self.create_model_config()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/vllm-0.10.1.dev0+g6d8d0a24c.d20250902.xpu-py3.12-linux-x86_64.egg/vllm/engine/arg_utils.py", line 872, in creat
return ModelConfig(
^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 123, in init
s.pydantic_validator.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
Value error, The checkpoint you are trying to load has model type qwen3_vl but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git [type=value_error, input_value=ArgsKwargs((), {'model': ...attention_dtype': None}), input_type=ArgsKwargs]
For further information visit https://errors.pydantic.dev/2.11/v/value_error
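
Per the message above, the root cause is that the Transformers version bundled in the image predates the qwen3_vl architecture. The check, and the workaround the error itself suggests, run inside the container (a sketch only; upgrading Transformers inside a vendored vLLM image may conflict with its pinned dependencies, and the offline installer may have no PyPI access):

# check the bundled version first
python3 -c "import transformers; print(transformers.__version__)"
pip install --upgrade transformers
# if the checkpoint is newer than the latest release, install from source instead:
pip install git+https://github.com/huggingface/transformers.git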