
[model] add mimo7b #7946


Open · wants to merge 2 commits into main

Conversation

Kuangdd01
Collaborator

What does this PR do?

Fixes #7939 (issue)

Tested only in chat mode and LoRA fine-tuning so far.
[screenshots: chat-mode and LoRA fine-tuning test results]
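For reference, a chat-mode smoke test outside LLaMA-Factory might look like the sketch below. It is only illustrative: it assumes the Hugging Face repo `XiaomiMiMo/MiMo-7B-SFT` (linked in the next paragraph) and that the repo's custom `modeling_mimo.py` requires `trust_remote_code=True`; the prompt is arbitrary.

```python
# Hedged sketch of a chat-mode smoke test with plain transformers (not the PR's test script).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XiaomiMiMo/MiMo-7B-SFT"  # assumption: SFT checkpoint from the link below
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Hello! Briefly introduce yourself."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
# Print only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```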

Besides, please comment out this line https://huggingface.co/XiaomiMiMo/MiMo-7B-SFT/blob/main/modeling_mimo.py#L64 for DDP training, because it is not used during training.
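For context on the DDP failure mode being worked around: `DistributedDataParallel` errors out when registered parameters never receive gradients, e.g. a submodule built in `__init__` but never called in `forward()`. The toy sketch below is not MiMo code; the module name is hypothetical and merely stands in for the line linked above.

```python
# Toy illustration (not MiMo code) of why an unused submodule breaks DDP training.
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)
        # Hypothetical stand-in for the module created at modeling_mimo.py#L64:
        # its parameters are registered but never used when computing the loss,
        # so DDP complains that they "were not used in producing loss".
        self.inference_only_head = nn.Linear(8, 8)

    def forward(self, x):
        return self.backbone(x)  # inference_only_head is never called

# Workarounds: comment out the unused module (as suggested above), or wrap with
# nn.parallel.DistributedDataParallel(model, find_unused_parameters=True)
# (ddp_find_unused_parameters: true when using the HF Trainer), at the cost of
# an extra graph traversal per step.
```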

Before submitting

hiyouga (Owner) left a comment

See the above comments

Kuangdd01 (Collaborator, Author) commented on May 3, 2025

Reasoning data && General data pretraining -> MiMo-7B-Base
MiMo-7B-Base + SFT -> MiMo-7B-SFT
MiMo-7B-SFT + GRPO -> MiMo-7B-RL
MiMo-7B-Base + GRPO -> MiMo-7B-RL-Zero

So does this fix that mismatch? Also, there is no hard switch for thinking mode like Qwen3 has.
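For readers unfamiliar with the Qwen3 switch being referenced: Qwen3's chat template accepts an `enable_thinking` flag through `apply_chat_template` (per the Qwen3 model card), whereas MiMo's template exposes no such toggle. A rough sketch, assuming the `Qwen/Qwen3-8B` tokenizer:

```python
# Sketch of the Qwen3 "hard switch" for thinking mode; MiMo has no equivalent toggle.
from transformers import AutoTokenizer

qwen3_tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
prompt = qwen3_tokenizer.apply_chat_template(
    [{"role": "user", "content": "hi"}],
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # Qwen3-specific template kwarg; absent for MiMo
)
print(prompt)
```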

Successfully merging this pull request may close these issues: Support Mimo (#7939)