qwen2.5 modeling support + conversion back to hf ckpt format #1107

uralik · 2025-04-12T01:20:00Z

What does this PR do? Please describe:

Adds support for Qwen models that do not require tensor parallelism. All loading is done from HF safetensors, with the state dicts remapped to the fs2 format.
Adds Hugging Face tokenizer support; the Qwen model uses an HF-based tokenizer.
Adds a Qwen checkpoint conversion command that saves the model back in HF format.
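
The state dict remapping described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual key map: the pattern table `HF_TO_FS2_KEY_MAP`, the fs2-side key names, and the helpers `remap_key` and `remap_state_dict` are all hypothetical, and values are treated as opaque objects so the sketch runs without torch.

```python
import re
from typing import Any, Dict

# Hypothetical (pattern -> replacement) table. The target names on the right
# are assumptions for illustration only, not the key map used in this PR.
HF_TO_FS2_KEY_MAP: Dict[str, str] = {
    r"^model\.embed_tokens\.": "decoder_frontend.embed.",
    r"^model\.layers\.([0-9]+)\.self_attn\.": r"decoder.layers.\1.self_attn.",
    r"^model\.norm\.": "decoder.layer_norm.",
    r"^lm_head\.": "final_proj.",
}


def remap_key(key: str, key_map: Dict[str, str]) -> str:
    """Rewrite one key using the first matching pattern; keep it if none match."""
    for pattern, replacement in key_map.items():
        new_key, count = re.subn(pattern, replacement, key)
        if count > 0:
            return new_key
    return key


def remap_state_dict(state_dict: Dict[str, Any], key_map: Dict[str, str]) -> Dict[str, Any]:
    """Return a new dict with every key remapped; values are passed through untouched."""
    return {remap_key(key, key_map): value for key, value in state_dict.items()}


if __name__ == "__main__":
    hf_state = {"model.layers.3.self_attn.q_proj.weight": "w_q", "lm_head.weight": "w_out"}
    print(remap_state_dict(hf_state, HF_TO_FS2_KEY_MAP))
```

Converting back to HF format would apply an inverse table in the same way, which is one reason to keep the mapping as data rather than hard-coding it in the loader.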

All transformers imports are wrapped in try/except, since transformers is not a mandatory dependency (yet).

Confirmed that this works by running SFT training on a 7B model, converting the checkpoint back to HF format, and using it with vLLM.

Ilia Kulikov and others added 7 commits April 12, 2025 00:40

hf tokenizer support added

1c477e3

model files added

1dfd840

rope fix

80d64a0

qwen25 model working

15045ff

qwen2.5 conversion ckpt works

16b3955

formatting

70dcf68

adapting to latest main changes

40a9196

uralik requested a review from cbalioglu as a code owner April 12, 2025 01:20

facebook-github-bot added the CLA Signed label Apr 12, 2025

name fix

8dc50ba
