
DeepSeek-V3.2-AWQ model downloaded from ModelScope fails to launch #4519

@n13151125

Description


Model page: https://www.modelscope.cn/models/QuantTrio/DeepSeek-V3.2-AWQ/summary
After downloading the model into the Docker container, launching it with the xinference command below fails.
Command: xinference launch --model_path /root/.cache/modelscope/hub/models/QuantTrio/DeepSeek-V3.2-AWQ --model-engine vLLM -n DeepSeek-V3.2-AWQ

Error:
2026-01-21 19:02:56,281 xinference.core.worker 139 INFO [request d9d01f40-f73e-11f0-b408-caa1fd778a80] Enter launch_builtin_model, args: <xinference.core.worker.WorkerActor object at 0x7b1e5d92f2f0>, kwargs: model_uid=DeepSeek-V3.2-AWQ-0,model_name=DeepSeek-V3.2-AWQ,model_size_in_billions=None,model_format=None,quantization=None,model_engine=vLLM,model_type=LLM,n_gpu=auto,request_limits=None,peft_model_config=None,gpu_idx=[0],download_hub=None,model_path=/root/.cache/modelscope/hub/models/QuantTrio/DeepSeek-V3.2-AWQ,enable_virtual_env=None,virtual_env_packages=None,envs=None,xavier_config=None,trust_remote_code=True
2026-01-21 19:02:56,281 xinference.core.worker 139 INFO You specify to launch the model: DeepSeek-V3.2-AWQ on GPU index: [0] of the worker: 0.0.0.0:35896, xinference will automatically ignore the n_gpu option.
2026-01-21 19:02:57,236 xinference.core.worker 139 ERROR Failed to load model DeepSeek-V3.2-AWQ-0
Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/xinference/core/worker.py", line 1546, in launch_builtin_model
model = await asyncio.to_thread(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xinference/model/core.py", line 44, in create_model_instance
return create_llm_model_instance(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xinference/model/llm/core.py", line 243, in create_llm_model_instance
raise ValueError(
ValueError: Model not found, name: DeepSeek-V3.2-AWQ, format: None, size: None, quantization: None
2026-01-21 19:02:57,248 xinference.core.worker 139 ERROR [request d9d01f40-f73e-11f0-b408-caa1fd778a80] Leave launch_builtin_model, error: Model not found, name: DeepSeek-V3.2-AWQ, format: None, size: None, quantization: None, elapsed time: 0 s
Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/xinference/core/utils.py", line 94, in wrapped
ret = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xinference/core/worker.py", line 1546, in launch_builtin_model
model = await asyncio.to_thread(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xinference/model/core.py", line 44, in create_model_instance
return create_llm_model_instance(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xinference/model/llm/core.py", line 243, in create_llm_model_instance
raise ValueError(
ValueError: Model not found, name: DeepSeek-V3.2-AWQ, format: None, size: None, quantization: None
2026-01-21 19:02:57,249 xinference.core.supervisor 139 ERROR Failed to launch replica DeepSeek-V3.2-AWQ-0: [address=0.0.0.0:35896, pid=139] Model not found, name: DeepSeek-V3.2-AWQ, format: None, size: None, quantization: None
Task exception was never retrieved
future: <Task finished name='Task-15834' coro=<SupervisorActor.launch_builtin_model.._launch_model() done, defined at /usr/local/lib/python3.12/dist-packages/xinference/core/supervisor.py:1144> exception=ValueError()>
Traceback (most recent call last):
File "/usr/local/lib/python3.12/dist-packages/xinference/core/supervisor.py", line 1270, in _launch_model
raise result
File "/usr/local/lib/python3.12/dist-packages/xinference/core/supervisor.py", line 1107, in _launch_one_model
subpool_address = await worker_ref.launch_builtin_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xoscar/backends/context.py", line 262, in send
return self._process_result_message(result)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xoscar/backends/context.py", line 111, in _process_result_message
raise message.as_instanceof_cause()
File "/usr/local/lib/python3.12/dist-packages/xoscar/backends/pool.py", line 689, in send
result = await self._run_coro(message.message_id, coro)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xoscar/backends/pool.py", line 389, in _run_coro
return await coro
^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xoscar/api.py", line 418, in on_receive
return await super().on_receive(message) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "xoscar/core.pyx", line 564, in on_receive
raise ex
File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
async with self._lock:
File "xoscar/core.pyx", line 527, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
File "xoscar/core.pyx", line 532, in xoscar.core._BaseActor.on_receive
result = await result
File "/usr/local/lib/python3.12/dist-packages/xinference/core/utils.py", line 94, in wrapped
ret = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xinference/core/worker.py", line 1546, in launch_builtin_model
model = await asyncio.to_thread(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xinference/model/core.py", line 44, in create_model_instance
return create_llm_model_instance(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/xinference/model/llm/core.py", line 243, in create_llm_model_instance
raise ValueError(
ValueError: [address=0.0.0.0:35896, pid=139] Model not found, name: DeepSeek-V3.2-AWQ, format: None, size: None, quantization: None
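For context, the ValueError suggests that the name passed via `-n` is still looked up in xinference's registry of builtin/registered model families, even when `--model_path` points at a local download; an unrecognized name then fails before the local path is ever read. A minimal illustrative sketch of that lookup (the function body and family set here are hypothetical, not xinference's actual internals):

```python
# Hypothetical subset of registered model family names, for illustration only.
BUILTIN_LLM_FAMILIES = {"deepseek-v3", "deepseek-r1"}


def create_llm_model_instance(model_name, model_format=None,
                              model_size_in_billions=None, quantization=None):
    """Sketch of the failing lookup: the name must match a known family."""
    if model_name.lower() not in BUILTIN_LLM_FAMILIES:
        # Mirrors the error text seen in the traceback above.
        raise ValueError(
            f"Model not found, name: {model_name}, format: {model_format}, "
            f"size: {model_size_in_billions}, quantization: {quantization}"
        )
    return f"<instance of {model_name}>"


# "DeepSeek-V3.2-AWQ" is not a known family name, so the lookup raises:
try:
    create_llm_model_instance("DeepSeek-V3.2-AWQ")
except ValueError as e:
    print(e)
```

If this is indeed the cause, registering the model as a custom model first (e.g. via `xinference register`) or passing a model name that xinference already knows may avoid the error; whether DeepSeek-V3.2 is builtin depends on the installed xinference version.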
