Add qwen3‑4B‑instruct‑2507.sh model args script for Qwen3‑4B‑Instruct‑2507 #702

Closed
YangJL2003 wants to merge 1 commit into THUDM:main from YangJL2003:qwen3-4b-inst

Conversation

@YangJL2003

Description:
While using the Qwen3‑4B‑Instruct‑2507 model, I noticed that the Slime framework has no model_args entry for this variant. The existing model_args for Qwen3‑4B cannot be reused as-is because the rotary_base parameter differs:

  • Qwen3‑4B uses rotary_base = 1 000 000
  • Qwen3‑4B‑Instruct‑2507 uses rotary_base = 5 000 000

To address this, I created a new script at scripts/models/qwen3‑4B‑instruct‑2507.sh that defines the correct model args for this variant.
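
A minimal sketch of the difference described above, not the actual contents of the script in this PR. It assumes the model args are passed as Megatron-style command-line flags and that the relevant flag is `--rotary-base`. The helper `rotary_base_for` is hypothetical and exists only for illustration. Only the argument that differs between the two variants is shown; the remaining model args are omitted.

```shell
# Hypothetical helper (not part of Slime): map a model variant to its
# rotary base, using the two values stated in the description above.
rotary_base_for() {
  case "$1" in
    Qwen3-4B) echo 1000000 ;;
    Qwen3-4B-Instruct-2507) echo 5000000 ;;
    *) echo "unknown model variant: $1" >&2; return 1 ;;
  esac
}

# Assumption: the flag name is --rotary-base. All other model args
# (layer count, hidden size, and so on) are intentionally left out.
ROTARY_BASE="$(rotary_base_for Qwen3-4B-Instruct-2507)"
MODEL_ARGS="--rotary-base ${ROTARY_BASE}"
echo "${MODEL_ARGS}"
```

Keeping the value in one place makes it harder to launch the Instruct‑2507 checkpoint with the base model's rotary_base by accident.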

@fzyzcjy
Collaborator

fzyzcjy commented Nov 11, 2025

Already added in #661.

@zhuzilin zhuzilin closed this Nov 15, 2025