Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
Describe the bug
model_source: hf_model
WARNING: Can not find tokenizer.json. It may take long time to initialize the tokenizer.
WARNING: Can not find tokenizer.json. It may take long time to initialize the tokenizer.
model_config:
{
"model_name": "internlm-chat-7b",
"tensor_para_size": 1,
"head_num": 32,
"kv_head_num": 32,
"vocab_size": 103168,
"num_layer": 32,
"inter_size": 11008,
"norm_eps": 1e-06,
"attn_bias": 1,
"start_id": 1,
"end_id": 2,
"session_len": 2056,
"weight_type": "fp16",
"rotary_embedding": 128,
"rope_theta": 10000.0,
"size_per_head": 128,
"group_size": 0,
"max_batch_size": 64,
"max_context_token_num": 1,
"step_length": 1,
"cache_max_entry_count": 0.5,
"cache_block_seq_len": 128,
"cache_chunk_size": 1,
"use_context_fmha": 1,
"quant_policy": 0,
"max_position_embeddings": 2048,
"rope_scaling_factor": 0.0,
"use_logn_attn": 0
}
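For reference, the KV-cache footprint implied by this config can be estimated from `num_layer`, `kv_head_num`, and `size_per_head`. This is a back-of-the-envelope sketch, not lmdeploy's exact accounting:

```python
# Rough KV-cache size per token for the config above (fp16 = 2 bytes/element).
# The leading factor of 2 accounts for the separate K and V tensors.
num_layer = 32
kv_head_num = 32
size_per_head = 128
bytes_per_elem = 2  # "weight_type": "fp16"

kv_bytes_per_token = 2 * num_layer * kv_head_num * size_per_head * bytes_per_elem
print(kv_bytes_per_token)  # 524288 bytes, i.e. 0.5 MiB per cached token
```

With `cache_max_entry_count: 0.5`, roughly half of free GPU memory would be reserved for this cache.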
get 323 model params
Exception in thread Thread-4 (_create_model_instance):
Traceback (most recent call last):
File "/mnt/bigdata/chatglm2/miniconda3/envs/xtuner-env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/mnt/bigdata/chatglm2/miniconda3/envs/xtuner-env/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/mnt/bigdata/chatglm2/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/lmdeploy/turbomind/turbomind.py", line 434, in _create_model_instance
model_inst = self.tm_model.model_comm.create_model_instance(
RuntimeError: [TM][ERROR] CUDA runtime error: operation not supported /lmdeploy/src/turbomind/utils/allocator.h:169
session 1
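One common cause of `CUDA runtime error: operation not supported` from an allocator is a driver that lacks stream-ordered memory pools (`cudaMallocAsync`). That is an assumption about this particular failure, not something confirmed by the log; a minimal sketch to check whether the device/driver advertises memory-pool support (attribute 115, `cudaDevAttrMemoryPoolsSupported`, in `cuda_runtime_api.h`):

```python
import ctypes

def memory_pools_supported(device=0):
    """Return True/False for cudaDevAttrMemoryPoolsSupported,
    or None if the CUDA runtime library cannot be loaded/queried."""
    for name in ("libcudart.so", "libcudart.so.12", "libcudart.so.11.0"):
        try:
            cudart = ctypes.CDLL(name)
            break
        except OSError:
            continue
    else:
        return None  # no CUDA runtime found on this machine
    val = ctypes.c_int(0)
    # 115 == cudaDevAttrMemoryPoolsSupported
    err = cudart.cudaDeviceGetAttribute(ctypes.byref(val), 115, device)
    if err != 0:
        return None  # query itself failed
    return bool(val.value)

print(memory_pools_supported())
```

If this prints `False`, updating the NVIDIA driver (rather than the CUDA toolkit alone) may resolve the error.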
Reproduction
lmdeploy chat turbomind internlm-chat-7b --model-name internlm-chat-7b
Environment
lmdeploy-0.1.0
cuda11.7
torch2.1.1
python 3.10
Error traceback
No response