Description
The container was created from the image ghcr.io/modeltc/lightllm:main with the command below:
docker run -d --privileged --runtime nvidia --gpus all -p 9012:8000 \
-v /root/lightllm/:/app/ \
-v /root/models/:/data/ \
--name lightllm-qwen \
--entrypoint sleep ghcr.io/modeltc/lightllm:main infinity
Command used to start the LLM server inside the container:
python -m lightllm.server.api_server --model_dir /data/Qwen-14B-Chat-Int8 --tp 1 --max_total_token_num 10240 --trust_remote_code --tokenizer_mode=auto --eos_id 151643
The process fails with the error below:
root@0577e1aecb3d:~# python -m lightllm.server.api_server --model_dir /data/Qwen-14B-Chat-Int8 --tp 1 --max_total_token_num 10240 --trust_remote_code --tokenizer_mode=auto --eos_id 151643
INFO 02-20 06:53:17 [tokenizer.py:78] Using a slow tokenizer. This might cause a significant slowdown. Consider using a fast tokenizer instead.
INFO 02-20 06:53:22 [tokenizer.py:78] Using a slow tokenizer. This might cause a significant slowdown. Consider using a fast tokenizer instead.
ERROR 02-20 06:53:41 [model_rpc.py:146] load model error: 'QwenTransformerLayerWeight' object has no attribute 'q_weight_' 'QwenTransformerLayerWeight' object has no attribute 'q_weight_' <class 'AttributeError'>
Traceback (most recent call last):
File "/lightllm/lightllm/server/router/model_infer/model_rpc.py", line 105, in exposed_init_model
self.model = QWenTpPartModel(model_kvargs)
File "/lightllm/lightllm/models/qwen/model.py", line 28, in __init__
super().__init__(kvargs)
File "/lightllm/lightllm/models/llama/model.py", line 35, in __init__
super().__init__(kvargs)
File "/lightllm/lightllm/common/basemodel/basemodel.py", line 50, in __init__
self._init_weights()
File "/lightllm/lightllm/models/llama/model.py", line 99, in _init_weights
[weight.verify_load() for weight in self.trans_layers_weight]
File "/lightllm/lightllm/models/llama/model.py", line 99, in <listcomp>
[weight.verify_load() for weight in self.trans_layers_weight]
File "/lightllm/lightllm/models/qwen/layer_weights/transformer_layer_weight.py", line 82, in verify_load
self.q_weight_,
AttributeError: 'QwenTransformerLayerWeight' object has no attribute 'q_weight_'
Process Process-1:
ERROR 02-20 06:53:41 [start_utils.py:24] init func start_router_process : Traceback (most recent call last):
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/server/router/manager.py", line 379, in start_router_process
ERROR 02-20 06:53:41 [start_utils.py:24] asyncio.run(router.wait_to_model_ready())
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/opt/conda/lib/python3.9/asyncio/runners.py", line 44, in run
ERROR 02-20 06:53:41 [start_utils.py:24] return loop.run_until_complete(main)
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/server/router/manager.py", line 83, in wait_to_model_ready
ERROR 02-20 06:53:41 [start_utils.py:24] await asyncio.gather(*init_model_ret)
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/server/router/model_infer/model_rpc.py", line 395, in init_model
ERROR 02-20 06:53:41 [start_utils.py:24] ans : rpyc.AsyncResult = self._init_model(kvargs)
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/server/router/model_infer/model_rpc.py", line 149, in exposed_init_model
ERROR 02-20 06:53:41 [start_utils.py:24] raise e
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/server/router/model_infer/model_rpc.py", line 105, in exposed_init_model
ERROR 02-20 06:53:41 [start_utils.py:24] self.model = QWenTpPartModel(model_kvargs)
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/models/qwen/model.py", line 28, in __init__
ERROR 02-20 06:53:41 [start_utils.py:24] super().__init__(kvargs)
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/models/llama/model.py", line 35, in __init__
ERROR 02-20 06:53:41 [start_utils.py:24] super().__init__(kvargs)
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/common/basemodel/basemodel.py", line 50, in __init__
ERROR 02-20 06:53:41 [start_utils.py:24] self._init_weights()
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/models/llama/model.py", line 99, in _init_weights
ERROR 02-20 06:53:41 [start_utils.py:24] [weight.verify_load() for weight in self.trans_layers_weight]
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/models/llama/model.py", line 99, in <listcomp>
ERROR 02-20 06:53:41 [start_utils.py:24] [weight.verify_load() for weight in self.trans_layers_weight]
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] File "/lightllm/lightllm/models/qwen/layer_weights/transformer_layer_weight.py", line 82, in verify_load
ERROR 02-20 06:53:41 [start_utils.py:24] self.q_weight_,
ERROR 02-20 06:53:41 [start_utils.py:24]
ERROR 02-20 06:53:41 [start_utils.py:24] AttributeError: 'QwenTransformerLayerWeight' object has no attribute 'q_weight_'
ERROR 02-20 06:53:41 [start_utils.py:24]
GPU (nvidia-smi output):
Tue Feb 20 15:04:21 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4090 Off | 00000000:00:05.0 Off | Off |
| 30% 45C P0 N/A / 450W | 2MiB / 24564MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
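The traceback shows verify_load failing because q_weight_ was never set on QwenTransformerLayerWeight, which suggests the loader did not find the tensor it expected for the attention projection. One possible cause, not confirmed anywhere in this thread, is that the Int8 checkpoint stores that projection under quantized tensor names rather than a plain weight tensor. The sketch below is a diagnostic only: it lists which tensors exist under the layer-0 attention projection. The index file name model.safetensors.index.json and the prefix transformer.h.0.attn.c_attn. are assumptions about the checkpoint layout, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_tensor_names(model_dir):
    """Return tensor names from a sharded-safetensors index file, or [] if absent."""
    # Assumed file name; adjust if the checkpoint uses a different index file.
    index_path = Path(model_dir) / "model.safetensors.index.json"
    if not index_path.exists():
        return []
    with index_path.open() as f:
        return sorted(json.load(f)["weight_map"])


def attn_suffixes(tensor_names, prefix="transformer.h.0.attn.c_attn."):
    """Collect the tensor suffixes stored under one attention projection prefix."""
    # The default prefix is an assumed Qwen-style name for the fused QKV projection.
    return sorted(n[len(prefix):] for n in tensor_names if n.startswith(prefix))


def looks_prequantized(tensor_names):
    """True when the projection has a 'qweight' tensor but no plain 'weight' tensor."""
    suffixes = attn_suffixes(tensor_names)
    return "qweight" in suffixes and "weight" not in suffixes


if __name__ == "__main__":
    names = load_tensor_names("/data/Qwen-14B-Chat-Int8")
    print("layer-0 c_attn tensors:", attn_suffixes(names))
    print("pre-quantized layout:", looks_prequantized(names))
```

If the last line prints True, the checkpoint has no plain weight tensor for that projection, which would be consistent with the AttributeError above. It would not by itself prove that this is the cause.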
Activity
QuanhuiGuan commented on May 7, 2024
I ran into the same problem. Could anyone suggest how to fix this?
wxjttxs commented on Sep 30, 2024
Me too. How can this problem be solved?