LLaMA-Factory: after full fine-tuning of Qwen2.5-VL, LLaMA-Factory evaluation results and vLLM model-serving inference results do not match #6986
Unanswered · Lauriecando asked this question in Q&A · Replies: 1 comment, 3 replies
-
We recommend using vLLM in both cases, rather than do_predict.
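A minimal sketch of what that can look like in practice, assuming the fine-tuned checkpoint is served with something like `vllm serve /path/to/qwen2.5-vl-full-sft --port 8000` (path and port are placeholders) and queried through vLLM's OpenAI-compatible chat API, so that vLLM itself handles the image placeholders and sampling is disabled:

```python
# Sketch only: model path, port, and image path below are placeholders.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("example.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    # Must match the model name the server registers (by default, the path passed to `vllm serve`).
    model="/path/to/qwen2.5-vl-full-sft",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            {"type": "text", "text": "user input."},
        ]},
    ],
    temperature=0,  # turn off sampling randomness for the comparison
)
print(resp.choices[0].message.content)
```

Sending the image through the chat API, instead of rendering a prompt string from the raw chat_template by hand, lets vLLM expand <|image_pad|> to the correct number of tokens on the server side.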
-
LLaMA-Factory evaluation: batch inference testing by setting do_predict = True.
vLLM: the model is served from the command line and called through the API for inference. The two sets of results differ a lot, even with sampling randomness disabled.
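For context, a minimal sketch of the kind of do_predict run being compared here, assuming a LLaMA-Factory predict-style YAML (the file name, paths, dataset name, and the qwen2_vl template name are placeholders or assumptions to check against your LLaMA-Factory version):

```yaml
# predict_qwen2_5_vl.yaml (hypothetical)
model_name_or_path: /path/to/qwen2.5-vl-full-sft   # fully fine-tuned checkpoint
stage: sft
do_predict: true
finetuning_type: full
template: qwen2_vl                 # assumed template name for Qwen2.5-VL
eval_dataset: my_vl_eval_set       # placeholder entry from dataset_info.json
cutoff_len: 2048
per_device_eval_batch_size: 1
predict_with_generate: true
do_sample: false                   # disable sampling so the run is deterministic
output_dir: saves/qwen2.5-vl/full/predict
```

launched with something like `llamafactory-cli train predict_qwen2_5_vl.yaml`.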
I noticed a diff between the prompts produced by the two methods:
LLaMA-Factory repeats <|image_pad|> according to the number of image tokens, so there are many of them, whereas vLLM follows the chat_template literally and emits only one.
For example:
LLaMA-Factory prompt:
<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n<|vision_start|><|image_pad|><|image_pad|><|image_pad|> ... many more ... <|image_pad|><|vision_end|>user input.<|im_end|>\n<|im_start|>assistant\n
vLLM prompt:
<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>user input.<|im_end|>\n<|im_start|>assistant\n
This inconsistency in how <|image_pad|> is expanded is presumably the source of the diff in results, right?
Also, how should a problem like this be resolved?
Looking for pointers from anyone who has run into this. Many thanks 🙇
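For reference, a minimal sketch (assuming the Hugging Face transformers AutoProcessor, with the public Qwen/Qwen2.5-VL-7B-Instruct checkpoint standing in for the fine-tuned model) of where the extra <|image_pad|> tokens come from: the chat template emits a single placeholder, and the processor expands it once the actual image is supplied.

```python
# Sketch only: model id and image path are placeholders.
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")

messages = [
    {"role": "user",
     "content": [{"type": "image"}, {"type": "text", "text": "user input."}]},
]

# The chat template itself contains exactly one <|image_pad|> placeholder per image.
templated = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)
print(templated.count("<|image_pad|>"))  # -> 1

# Once the real image goes through the processor, that placeholder is expanded to
# one <|image_pad|> per visual token (the exact count depends on the image resolution),
# which is the long prompt that LLaMA-Factory's do_predict path feeds to the model.
image = Image.open("example.jpg")
inputs = processor(text=[templated], images=[image], return_tensors="pt")
image_pad_id = processor.tokenizer.convert_tokens_to_ids("<|image_pad|>")
print((inputs["input_ids"][0] == image_pad_id).sum().item())  # -> many
```

If the vLLM server is queried through its chat API with the image attached (as in the sketch under the reply above), vLLM should perform the same expansion internally, so the two prompts end up equivalent.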