I ran the following train.sh on Mistral-7B:
accelerate launch finetune.py \
--output-dir output/yarn-mistral-7b-64k \
--model mistralai/Mistral-7B-v0.1 \
--architecture mistral \
--scaling-factor 8 \
--max-position-embeddings 4096 \
--dataset emozilla/yarn-train-tokenized-16k-mistral \
--sliding-window-attention-schedule 65536 \
--lr-schedule constant \
--learning-rate 0.000001 \
--max-train-steps 1000
with the following accelerate config:
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: MULTI_GPU
downcast_bf16: 'no'
gpu_ids: 2,3,4,5,6,7
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 6
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
but I encountered an OutOfMemory error on my 80 GB A800s:

I don't know if there's something wrong with my distributed training configuration. 🥺
I hope someone can help me. 🙏🙏
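
For context, here is a rough back-of-the-envelope estimate of per-GPU memory for full-parameter finetuning of a ~7B model under plain DDP (distributed_type: MULTI_GPU replicates weights, gradients, and optimizer states on every GPU). The parameter count and the fp32 AdamW state assumption below are illustrative guesses, not measurements from finetune.py:

GIB = 1024 ** 3
n_params = 7.24e9                 # approximate Mistral-7B parameter count (assumption)

weights = n_params * 2            # bf16 weights, 2 bytes per parameter
grads = n_params * 2              # bf16 gradients, 2 bytes per parameter
adam_states = n_params * 2 * 4    # exp_avg + exp_avg_sq, assuming fp32 optimizer states

static_total = weights + grads + adam_states
print(f"weights + grads + optimizer states: {static_total / GIB:.0f} GiB per GPU")
# -> roughly 81 GiB before counting activations for long-sequence batches,
#    which already exceeds an 80 GB A800

If finetune.py shards these states in some way this estimate would not apply, but with everything replicated per process it would be consistent with the OOM regardless of how many GPUs are used.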