Merged
51 commits
052b1f3
update callback
Jintao-Huang Jan 14, 2026
b7a2662
update
Jintao-Huang Jan 14, 2026
8e0cce8
Merge branch 'main' into refactor_pipelines_callbacks
Jintao-Huang Jan 14, 2026
451112b
update
Jintao-Huang Jan 15, 2026
30d4205
update
Jintao-Huang Jan 15, 2026
4d99d07
update
Jintao-Huang Jan 15, 2026
130692a
use tuner_type
Jintao-Huang Jan 15, 2026
cbb76a5
update
Jintao-Huang Jan 15, 2026
d3f6223
add length
Jintao-Huang Jan 15, 2026
72eb710
Merge branch 'add_length' into refactor_pipelines_callbacks
Jintao-Huang Jan 15, 2026
e5fa514
update
Jintao-Huang Jan 15, 2026
6c1e9b0
update
Jintao-Huang Jan 15, 2026
32fb713
update
Jintao-Huang Jan 15, 2026
5520ce1
merge
Jintao-Huang Jan 15, 2026
67e1d5e
update
Jintao-Huang Jan 15, 2026
ce6393e
fix
Jintao-Huang Jan 16, 2026
6865e7a
update
Jintao-Huang Jan 16, 2026
09ce5a7
update
Jintao-Huang Jan 16, 2026
937d62e
fix
Jintao-Huang Jan 16, 2026
516c56f
Merge branch 'main' into refactor_pipelines_callbacks
Jintao-Huang Jan 16, 2026
ea45ddc
update
Jintao-Huang Jan 16, 2026
3cbd8e2
update
Jintao-Huang Jan 16, 2026
fe686fb
Merge remote-tracking branch 'refs/remotes/origin/refactor_pipelines_…
Jintao-Huang Jan 16, 2026
1a17bd1
fix tuner_plugin
Jintao-Huang Jan 16, 2026
ca55041
update
Jintao-Huang Jan 16, 2026
82997d2
update
Jintao-Huang Jan 16, 2026
ae02aed
FIX
Jintao-Huang Jan 16, 2026
c210e87
update
Jintao-Huang Jan 16, 2026
036b11c
fix
Jintao-Huang Jan 16, 2026
fd97855
update
Jintao-Huang Jan 16, 2026
b4b9697
update
Jintao-Huang Jan 16, 2026
29cf801
update
Jintao-Huang Jan 16, 2026
990ee88
update
Jintao-Huang Jan 16, 2026
4120c46
lint pass
Jintao-Huang Jan 16, 2026
058cecb
update
Jintao-Huang Jan 16, 2026
affe8d3
train_type -> tuner_type
Jintao-Huang Jan 16, 2026
b3a8feb
Merge branch 'main' into refactor_pipelines_callbacks
Jintao-Huang Jan 16, 2026
a705255
Merge branch 'main' into refactor_pipelines_callbacks
Jintao-Huang Jan 16, 2026
d242f78
fix
Jintao-Huang Jan 16, 2026
b835533
update
Jintao-Huang Jan 16, 2026
108e8f2
lint pass
Jintao-Huang Jan 16, 2026
686792e
update
Jintao-Huang Jan 16, 2026
a3d5046
fix
Jintao-Huang Jan 16, 2026
9e519ca
update
Jintao-Huang Jan 16, 2026
8be3a04
update
Jintao-Huang Jan 16, 2026
2a251bf
update
Jintao-Huang Jan 16, 2026
1e347ac
fix
Jintao-Huang Jan 16, 2026
c3ee412
Merge branch 'main' into refactor_pipelines_callbacks
Jintao-Huang Jan 16, 2026
89495a2
fix ci
Jintao-Huang Jan 17, 2026
12b67b0
fix
Jintao-Huang Jan 17, 2026
bc03275
fix
Jintao-Huang Jan 17, 2026
23 changes: 13 additions & 10 deletions README.md
@@ -161,7 +161,7 @@ For more optional dependencies, you can refer to [here](https://github.com/model
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model Qwen/Qwen2.5-7B-Instruct \
--train_type lora \
--tuner_type lora \
--dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \
'AI-ModelScope/alpaca-gpt4-data-en#500' \
'swift/self-cognition#500' \
@@ -248,25 +248,28 @@ ms-swift also supports training and inference using Python. Below is pseudocode
Training:

```python
from swift import get_model_processor, get_template, Swift, load_dataset, EncodePreprocessor, Seq2SeqTrainer
from peft import LoraConfig, get_peft_model
from swift import get_model_processor, get_template, load_dataset, EncodePreprocessor
from swift.trainers import Seq2SeqTrainer, Seq2SeqTrainingArguments
# Retrieve the model and template, and add a trainable LoRA module
model, tokenizer = get_model_processor(model_id_or_path, ...)
template = get_template(tokenizer, ...)
model = Swift.prepare_model(model, lora_config)
lora_config = LoraConfig(...)
model = get_peft_model(model, lora_config)

# Download and load the dataset, and encode the text into tokens
train_dataset, val_dataset = load_dataset(dataset_id_or_path, ...)
train_dataset = EncodePreprocessor(template=template)(train_dataset, num_proc=num_proc)
val_dataset = EncodePreprocessor(template=template)(val_dataset, num_proc=num_proc)

# Train the model
training_args = Seq2SeqTrainingArguments(...)
trainer = Seq2SeqTrainer(
model=model,
args=training_args,
data_collator=template.data_collator,
template=template,
train_dataset=train_dataset,
eval_dataset=val_dataset,
template=template,
)
trainer.train()
```
@@ -329,7 +332,7 @@ swift pt \
--model Qwen/Qwen2.5-7B \
--dataset swift/chinese-c4 \
--streaming true \
--train_type full \
--tuner_type full \
--deepspeed zero2 \
--output_dir output \
--max_steps 10000 \
@@ -341,7 +344,7 @@ Fine-tuning:
CUDA_VISIBLE_DEVICES=0 swift sft \
--model Qwen/Qwen2.5-7B-Instruct \
--dataset AI-ModelScope/alpaca-gpt4-data-en \
--train_type lora \
--tuner_type lora \
--output_dir output \
...
```
@@ -352,7 +355,7 @@ CUDA_VISIBLE_DEVICES=0 swift rlhf \
--rlhf_type dpo \
--model Qwen/Qwen2.5-7B-Instruct \
--dataset hjh0119/shareAI-Llama3-DPO-zh-en-emoji \
--train_type lora \
--tuner_type lora \
--output_dir output \
...
```
@@ -379,7 +382,7 @@ NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 megatron sft \
--load_safetensors true \
--save_safetensors true \
--dataset AI-ModelScope/alpaca-gpt4-data-zh \
--train_type lora \
--tuner_type lora \
--save output \
...
```
@@ -404,7 +407,7 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 NPROC_PER_NODE=4 \
swift rlhf \
--rlhf_type grpo \
--model Qwen/Qwen2.5-7B-Instruct \
--train_type lora \
--tuner_type lora \
--use_vllm true \
--vllm_mode colocate \
--dataset AI-MO/NuminaMath-TIR#10000 \
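Review note: every README.md hunk above makes the same substitution, `--train_type` becomes `--tuner_type`, with the value (`lora`, `full`) left unchanged. For scripts written against the old flag, a bulk rename could look like the sketch below. It assumes the change is a pure rename, which is all these hunks show; `migrate_flags` is a hypothetical helper name and not part of ms-swift.

```python
import re

# Old CLI flag. The lookbehind rejects a preceding word character or dash,
# and the trailing \b rejects longer names such as --train_type_x.
_OLD_FLAG = re.compile(r"(?<![\w-])--train_type\b")

# Old keyword argument or attribute, only when followed by "=" (this also
# covers "==" comparisons). Names like my_train_type are left alone.
_OLD_KWARG = re.compile(r"\btrain_type(?=\s*=)")


def migrate_flags(text: str) -> str:
    """Return `text` with the old flag/kwarg spelling replaced by the new one.

    Hypothetical helper for illustration; it rewrites names only and never
    touches the value that follows.
    """
    text = _OLD_FLAG.sub("--tuner_type", text)
    return _OLD_KWARG.sub("tuner_type", text)
```

Applied line by line to a shell script or a Python launcher, this leaves everything except the two old spellings untouched, so the result can be reviewed as a plain diff before committing.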
23 changes: 13 additions & 10 deletions README_CN.md
@@ -155,7 +155,7 @@ pip install -e .
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model Qwen/Qwen2.5-7B-Instruct \
--train_type lora \
--tuner_type lora \
--dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \
'AI-ModelScope/alpaca-gpt4-data-en#500' \
'swift/self-cognition#500' \
@@ -236,25 +236,28 @@ ms-swift also supports training and inference using Python. The following gives the training

Training:
```python
from swift import get_model_processor, get_template, Swift, load_dataset, EncodePreprocessor, Seq2SeqTrainer
from peft import LoraConfig, get_peft_model
from swift import get_model_processor, get_template, load_dataset, EncodePreprocessor
from swift.trainers import Seq2SeqTrainer, Seq2SeqTrainingArguments
# Retrieve the model and template, and add a trainable LoRA module
model, tokenizer = get_model_processor(model_id_or_path, ...)
template = get_template(tokenizer, ...)
model = Swift.prepare_model(model, lora_config)
lora_config = LoraConfig(...)
model = get_peft_model(model, lora_config)

# Download and load the dataset, and encode the text into tokens
train_dataset, val_dataset = load_dataset(dataset_id_or_path, ...)
train_dataset = EncodePreprocessor(template=template)(train_dataset, num_proc=num_proc)
val_dataset = EncodePreprocessor(template=template)(val_dataset, num_proc=num_proc)

# Train the model
training_args = Seq2SeqTrainingArguments(...)
trainer = Seq2SeqTrainer(
model=model,
args=training_args,
data_collator=template.data_collator,
template=template,
train_dataset=train_dataset,
eval_dataset=val_dataset,
template=template,
)
trainer.train()
```
@@ -317,7 +320,7 @@ swift pt \
--model Qwen/Qwen2.5-7B \
--dataset swift/chinese-c4 \
--streaming true \
--train_type full \
--tuner_type full \
--deepspeed zero2 \
--output_dir output \
--max_steps 10000 \
@@ -329,7 +332,7 @@ swift pt \
CUDA_VISIBLE_DEVICES=0 swift sft \
--model Qwen/Qwen2.5-7B-Instruct \
--dataset AI-ModelScope/alpaca-gpt4-data-zh \
--train_type lora \
--tuner_type lora \
--output_dir output \
...
```
@@ -340,7 +343,7 @@ CUDA_VISIBLE_DEVICES=0 swift rlhf \
--rlhf_type dpo \
--model Qwen/Qwen2.5-7B-Instruct \
--dataset hjh0119/shareAI-Llama3-DPO-zh-en-emoji \
--train_type lora \
--tuner_type lora \
--output_dir output \
...
```
@@ -366,7 +369,7 @@ NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 megatron sft \
--load_safetensors true \
--save_safetensors true \
--dataset AI-ModelScope/alpaca-gpt4-data-zh \
--train_type lora \
--tuner_type lora \
--save output \
...
```
@@ -391,7 +394,7 @@ CUDA_VISIBLE_DEVICES=0,1,2,3 NPROC_PER_NODE=4 \
swift rlhf \
--rlhf_type grpo \
--model Qwen/Qwen2.5-7B-Instruct \
--train_type lora \
--tuner_type lora \
--use_vllm true \
--vllm_mode colocate \
--dataset AI-MO/NuminaMath-TIR#10000 \
4 changes: 2 additions & 2 deletions docs/source/BestPractices/GRPO-Code-Training.md
@@ -62,7 +62,7 @@ swift rlhf \
--use_vllm true \
--vllm_server_host 127.0.0.1 \
--vllm_server_port 8000 \
--train_type lora \
--tuner_type lora \
--lora_rank 16 \
--lora_alpha 32 \
--torch_dtype bfloat16 \
@@ -114,7 +114,7 @@ swift rlhf \
--use_vllm true \
--vllm_server_host 127.0.0.1 \
--vllm_server_port 8000 \
--train_type lora \
--tuner_type lora \
--torch_dtype bfloat16 \
--dataset 'open-r1/verifiable-coding-problems-python-10k' \
--load_from_cache_file true \
6 changes: 3 additions & 3 deletions docs/source/BestPractices/GRPO-Multi-Modal-Training.md
@@ -126,7 +126,7 @@ swift rlhf \
--vllm_mode server \
--vllm_server_host 127.0.0.1 \
--vllm_server_port 8000 \
--train_type full \
--tuner_type full \
--torch_dtype bfloat16 \
--dataset 'AI-ModelScope/clevr_cogen_a_train' \
--load_from_cache_file true \
@@ -201,7 +201,7 @@ swift rlhf \
--vllm_mode server \
--vllm_server_host 127.0.0.1 \
--vllm_server_port 8000 \
--train_type full \
--tuner_type full \
--torch_dtype bfloat16 \
--dataset 'AI-ModelScope/GEOQA_R1V_Train_8K' \
--load_from_cache_file true \
@@ -269,7 +269,7 @@ swift rlhf \
--vllm_mode server \
--vllm_server_host 127.0.0.1 \
--vllm_server_port 8000 \
--train_type full \
--tuner_type full \
--torch_dtype bfloat16 \
--dataset 'lmms-lab/multimodal-open-r1-8k-verified' \
--load_from_cache_file true \
2 changes: 1 addition & 1 deletion docs/source/BestPractices/GRPO.md
@@ -131,7 +131,7 @@ swift rlhf \
--vllm_mode server \
--vllm_server_host 127.0.0.1 \
--vllm_server_port 8000 \
--train_type full \
--tuner_type full \
--torch_dtype bfloat16 \
--dataset 'zouxuhong/Countdown-Tasks-3to4#50000' \
--load_from_cache_file true \
4 changes: 2 additions & 2 deletions docs/source/BestPractices/MLLM-Registration.md
@@ -543,7 +543,7 @@ if __name__ == '__main__':
template='my_qwen2_5_omni',
load_from_cache_file=True,
split_dataset_ratio=0.01,
train_type='lora',
tuner_type='lora',
torch_dtype='bfloat16',
attn_impl='flash_attn',
padding_free=True,
@@ -589,7 +589,7 @@ swift sft \
'swift/VideoChatGPT:all#2000' \
--load_from_cache_file true \
--split_dataset_ratio 0.01 \
--train_type lora \
--tuner_type lora \
--torch_dtype bfloat16 \
--attn_impl flash_attn \
--padding_free true \
10 changes: 5 additions & 5 deletions docs/source/BestPractices/NPU-support.md
@@ -153,7 +153,7 @@ Legend:

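## Fine-tuning

The following describes LoRA fine-tuning; for full-parameter fine-tuning, simply set `--train_type full`. For **more training scripts**, see [here](https://github.com/modelscope/ms-swift/tree/main/examples/ascend/train).
The following describes LoRA fine-tuning; for full-parameter fine-tuning, simply set `--tuner_type full`. For **more training scripts**, see [here](https://github.com/modelscope/ms-swift/tree/main/examples/ascend/train).

| Model size | Number of NPUs | DeepSpeed type | Peak memory usage |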
| -------- | ------- | ------------- | -------------- |
@@ -181,7 +181,7 @@ swift sft \
--dataset AI-ModelScope/blossom-math-v2 \
--split_dataset_ratio 0.01 \
--num_train_epochs 5 \
--train_type lora \
--tuner_type lora \
--output_dir output \
--learning_rate 1e-4 \
--gradient_accumulation_steps 16 \
@@ -206,7 +206,7 @@ swift sft \
--dataset AI-ModelScope/blossom-math-v2 \
--split_dataset_ratio 0.01 \
--num_train_epochs 5 \
--train_type lora \
--tuner_type lora \
--output_dir output \
...
```
@@ -227,7 +227,7 @@ swift sft \
--dataset AI-ModelScope/blossom-math-v2 \
--split_dataset_ratio 0.01 \
--num_train_epochs 5 \
--train_type lora \
--tuner_type lora \
--output_dir output \
--deepspeed zero2 \
...
@@ -246,7 +246,7 @@ swift sft \
--dataset AI-ModelScope/blossom-math-v2 \
--split_dataset_ratio 0.01 \
--num_train_epochs 5 \
--train_type lora \
--tuner_type lora \
--output_dir output \
--deepspeed zero3 \
...
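Review note: the NPU-support.md hunks above switch every example to `--tuner_type`. Nothing in this diff shows whether the old `--train_type` spelling is still accepted. If a downstream wrapper script had to accept both spellings during a transition, one option is a deprecated alias in `argparse`. The following is a minimal sketch under that assumption; `build_parser` and `_DeprecatedAlias` are hypothetical names, the `lora` default is illustrative only, and none of this is ms-swift's own argument handling.

```python
import argparse
import warnings


class _DeprecatedAlias(argparse.Action):
    """Store the value under the shared destination and warn about the old flag."""

    def __call__(self, parser, namespace, values, option_string=None):
        warnings.warn(
            f"{option_string} is deprecated; use --tuner_type instead",
            DeprecationWarning,
            stacklevel=2,
        )
        setattr(namespace, self.dest, values)


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical wrapper parser that accepts both spellings."""
    parser = argparse.ArgumentParser(prog="wrapper")
    # New spelling; the default value is an assumption for this sketch.
    parser.add_argument("--tuner_type", default="lora")
    # Old spelling writes to the same destination. SUPPRESS means the alias
    # contributes no default of its own and stays out of --help.
    parser.add_argument(
        "--train_type",
        dest="tuner_type",
        action=_DeprecatedAlias,
        default=argparse.SUPPRESS,
        help=argparse.SUPPRESS,
    )
    return parser
```

With this parser, `build_parser().parse_args(["--train_type", "full"])` yields a namespace whose `tuner_type` is `"full"` and emits a `DeprecationWarning`, so old scripts keep working while callers are nudged toward the new flag.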
6 changes: 3 additions & 3 deletions docs/source/BestPractices/Qwen3-Best-Practice.md
@@ -142,7 +142,7 @@ swift infer \
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model Qwen/Qwen3-8B \
--train_type lora \
--tuner_type lora \
--dataset 'swift/Qwen3-SFT-Mixin#2000' \
'swift/self-cognition:qwen3#600' \
--load_from_cache_file true \
@@ -221,7 +221,7 @@ NPROC_PER_NODE=4 \
CUDA_VISIBLE_DEVICES=0,1,2,3 \
swift sft \
--model Qwen/Qwen3-8B \
--train_type full \
--tuner_type full \
--dataset '<your-dataset>' \
--load_from_cache_file true \
--split_dataset_ratio 0.01 \
@@ -292,7 +292,7 @@ NPROC_PER_NODE=8 \
swift rlhf \
--rlhf_type grpo \
--model Qwen/Qwen3-8B \
--train_type full \
--tuner_type full \
--dataset 'AI-MO/NuminaMath-TIR#5000' \
--load_from_cache_file true \
--torch_dtype bfloat16 \
2 changes: 1 addition & 1 deletion docs/source/BestPractices/Qwen3-VL-Best-Practice.md
@@ -192,7 +192,7 @@ swift sft \
'swift/VideoChatGPT:Generic#2000' \
--load_from_cache_file true \
--split_dataset_ratio 0.01 \
--train_type lora \
--tuner_type lora \
--torch_dtype bfloat16 \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
4 changes: 2 additions & 2 deletions docs/source/BestPractices/Rapidly-Training-VL-model.md
@@ -112,7 +112,7 @@ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
swift sft \
--model /path/to/new_vl_model \
--model_type qwen2_5_vl \
--train_type full \
--tuner_type full \
--dataset xxx \
--load_from_cache_file true \
--split_dataset_ratio 0.01 \
@@ -149,7 +149,7 @@ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
swift sft \
--model /path/to/stage1_checkpoint \
--model_type qwen2_5_vl \
--train_type full \
--tuner_type full \
--dataset xxx \
--load_from_cache_file true \
--split_dataset_ratio 0.01 \
4 changes: 2 additions & 2 deletions docs/source/Customization/Pluginization.md
@@ -7,7 +7,7 @@

## Callbacks

An example is [here](https://github.com/modelscope/ms-swift/blob/main/swift/plugins/callback.py).
An example is [here](https://github.com/modelscope/ms-swift/blob/main/swift/callbacks).

The `callback` mechanism is a training customization mechanism in the transformers Trainer. Developers can control the training process from within a callback. A custom callback typically looks like this:
```python
@@ -114,7 +114,7 @@ An example is [here](https://github.com/modelscope/ms-swift/blob/main/swift/agent_

## Custom tuners

An example is [here](https://github.com/modelscope/ms-swift/blob/main/swift/plugins/tuner.py).
An example is [here](https://github.com/modelscope/ms-swift/blob/main/swift/tuner_plugin).
- For multimodal models, train the ViT with full parameters and the LLM with LoRA; see [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/multimodal/lora_llm_full_vit)
- For Phi4-multimodal, train its existing LoRA directly without attaching an additional LoRA; see [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/tuner_phi4_mm.sh)

2 changes: 1 addition & 1 deletion docs/source/GetStarted/Quick-start.md
@@ -36,7 +36,7 @@ For ms-swift installation, see the [installation documentation](./SWIFT-installation.md).
CUDA_VISIBLE_DEVICES=0 \
swift sft \
--model Qwen/Qwen2.5-7B-Instruct \
--train_type lora \
--tuner_type lora \
--dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \
'AI-ModelScope/alpaca-gpt4-data-en#500' \
'swift/self-cognition#500' \