Open
Description
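Hugging Face hosts two models, qwen3_4B base and qwen3_4B. Their generation config files differ: the former only requires transformers 4.37.0, while the latter requires 4.51.0. They also differ in rope scaling for long sequences.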
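When I run inference with vllm on the base model to use its 128k long-sequence capability, with rope_scaling enabled, it fails with the following error:
LLVM ERROR: Failed to compute parent layout for slice layout.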
LLVM ERROR: Failed to compute parent layout for slice layout.
LLVM ERROR: Failed to compute parent layout for slice layout.
LLVM ERROR: Failed to compute parent layout for slice layout.
/usr/local/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Aborted (core dumped)
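For reference, here is a rough sketch of the kind of rope_scaling setting involved. It is not the exact configuration from my run. It assumes YaRN-style scaling and a native context length of 32768 tokens, neither of which is stated above, and `yarn_rope_scaling` is a hypothetical helper name used only for illustration.

```python
import json


def yarn_rope_scaling(target_len: int, native_len: int) -> dict:
    """Build a YaRN-style rope_scaling dict (hypothetical helper).

    target_len: context length we want to reach (e.g. 128k tokens).
    native_len: context length the model was trained with (assumed here).
    """
    if target_len <= native_len:
        raise ValueError("target_len must exceed native_len for scaling to apply")
    return {
        "rope_type": "yarn",                             # assumption: YaRN scaling
        "factor": target_len / native_len,               # how far the context is stretched
        "original_max_position_embeddings": native_len,  # assumed native window
    }


# 128k target over an assumed 32k native window gives a scaling factor of 4.0.
cfg = yarn_rope_scaling(131072, 32768)
print(json.dumps(cfg, indent=2))
```

A dictionary of this shape is what the `rope_scaling` field of a model's `config.json` typically holds. How it is passed to vllm depends on the vllm version, so that part is left out here.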