I have a few questions:
1. Does the released code support multi-node, multi-GPU incremental (continued) pretraining? If so, how should it be set up? I don't see anywhere to configure multiple machines. This is the command I'm currently using:
deepspeed \
--include="localhost:0,1,2,3" \
./train_clm.py \
--deepspeed ./ds_config/ds_config_zero3.json \
--model_name_or_path TigerResearch/tigerbot-7b-base \
--dataset_name TigerResearch/dev_pretrain \
--do_train \
--output_dir ./ckpt-clm \
--overwrite_output_dir \
--preprocess_num_workers 8 \
--num_train_epochs 5 \
--learning_rate 1e-5 \
--evaluation_strategy steps \
--eval_steps 10 \
--bf16 True \
--save_strategy steps \
--save_steps 10 \
--save_total_limit 2 \
--logging_steps 10 \
--tf32 True \
--per_device_train_batch_size 2 \
--per_device_eval_batch_size 2
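My current understanding is that multi-node launching would go through the standard DeepSpeed launcher with a hostfile, roughly as sketched below. This is only a guess at how your script would be driven: the hostnames worker-1/worker-2, the slot counts, and the path ./hostfile are placeholders I made up, and it assumes passwordless SSH between nodes plus an identical environment and code checkout on each node. Please correct me if your code expects something different.

# ./hostfile: one line per node, "hostname slots=<num_gpus>"
# worker-1 slots=4
# worker-2 slots=4

deepspeed \
--hostfile=./hostfile \
--master_addr=worker-1 \
--master_port=29500 \
./train_clm.py \
--deepspeed ./ds_config/ds_config_zero3.json \
--model_name_or_path TigerResearch/tigerbot-7b-base \
... (remaining arguments as in the single-node command above)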
2. For continued incremental pretraining of the 70B model, how many machines are needed at a minimum?
3. Is there a tutorial for multi-node, multi-GPU training?
Thanks in advance for your reply.