Fix for the inference-time warning "Some weights of Qwen2ForCausalLM were not initialized from the model checkpoint" #28

Models such as Qwen2.5 0.5B use tied word embeddings: lm_head.weight and model.embed_tokens.weight are the same tensor. Inspecting the official weight files confirms this: they store only model.embed_tokens.weight, with no separate lm_head.weight.
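The check described above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `diagnose_head` is hypothetical, the tensor names are assumed to have already been read from the checkpoint into a list, and `example_keys` is a hand-written subset, not a dump of the real file.

```python
from typing import Iterable

EMBED_KEY = "model.embed_tokens.weight"
HEAD_KEY = "lm_head.weight"


def diagnose_head(checkpoint_keys: Iterable[str]) -> str:
    """Classify a checkpoint by which of the two weight names it stores.

    Hypothetical helper: returns "separate" if the output head has its own
    tensor, "tied" if only the embedding is stored, "unknown" otherwise.
    """
    keys = set(checkpoint_keys)
    if HEAD_KEY in keys:
        return "separate"
    if EMBED_KEY in keys:
        return "tied"
    return "unknown"


# Illustrative subset of tensor names (assumption, not the full checkpoint).
example_keys = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.norm.weight",
]
status = diagnose_head(example_keys)
print(status)
```

A result of "tied" means the output head has to be re-linked to the embedding after loading, because the checkpoint holds no values for it.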

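When we define a custom VLM, Qwen2's weight-tying code is not executed, so llm_model.lm_head.weight is re-initialized with random values by default. Because we also freeze all parameters of the base LLM, this tensor can neither be trained nor restored to the official weights, and inference produces garbage output (for example, a string of exclamation marks).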

The simplest fix is to manually tie the output head to the word embeddings at inference time:

pretrain_model.llm_model.lm_head.weight = pretrain_model.llm_model.get_input_embeddings().weight
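To see what this assignment does, here is a minimal sketch on a toy PyTorch module. `TinyLM`, `tie_output_head`, and `is_tied` are hypothetical names used only for illustration; the toy model stands in for `pretrain_model.llm_model` and is not the real Qwen2 class.

```python
import torch
from torch import nn


class TinyLM(nn.Module):
    """Hypothetical stand-in for an LLM with an embedding and an output head."""

    def __init__(self, vocab_size: int = 16, hidden_size: int = 8):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)

    def get_input_embeddings(self) -> nn.Embedding:
        return self.embed_tokens


def tie_output_head(model: nn.Module) -> None:
    # Make lm_head use the very same Parameter object as the input embedding.
    # Both weights have shape (vocab_size, hidden_size), so no transpose is needed.
    model.lm_head.weight = model.get_input_embeddings().weight


def is_tied(model: nn.Module) -> bool:
    # Tied weights share the same underlying storage.
    head_ptr = model.lm_head.weight.data_ptr()
    embed_ptr = model.get_input_embeddings().weight.data_ptr()
    return head_ptr == embed_ptr


llm = TinyLM()
tied_before = is_tied(llm)  # head was initialized independently
tie_output_head(llm)
tied_after = is_tied(llm)   # head now aliases the embedding matrix
print(tied_before, tied_after)
```

Because the head ends up aliasing the embedding rather than holding a copy, the fix works even when the base LLM is frozen: no training step is needed to give the head meaningful values.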
