Open
Description
I see the following error when running train.sh.
The data==0.4 package is installed.
I am using your latest prebuilt Docker image and launching the script via bash training/train.sh.
[2024-09-26 18:05:30,229] [INFO] [launch.py:256:main] process 1734 spawned with command: ['/usr/bin/python', '-u', './training/finetune_w8a16.py', '--local_rank=0', '--deepspeed=./training/deepspeed_w8a16.json', '--context_length=2048', '--per_device_train_batch_size=2', '--gradient_accumulation_steps=2', '--per_device_eval_batch_size=4', '--output_dir=output', '--dataset_path=./training/sample_train.jsonl', '--model_name_or_path', '/models/Meta-Llama-3-8B//Meta-Llama-3.1-405B-Instruct', '--conv_config=./training/conversation.json', '--do_eval', '--save_steps', '20', '--max_steps', '60', '--learning_rate', '2e-5', '--lr_scheduler_type', 'linear', '--warmup_ratio', '0.02']
[2024-09-26 18:05:31,638] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "/workspace/gh200-llm/./training/finetune_w8a16.py", line 18, in <module>
    from data.finetune_data import filter_long
ModuleNotFoundError: No module named 'data.finetune_data'
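In case it helps with triage, here is a small diagnostic sketch I could run inside the container to see which `data` package Python actually resolves. It rests on an unconfirmed assumption: that the installed data==0.4 package is found in place of the repository's own `data` directory. The helper names `describe_module` and `has_submodule` are hypothetical and not part of the repository; only the standard-library `importlib.util.find_spec` is used.

```python
import importlib.util


def describe_module(name):
    """Return the locations Python would load top-level module `name` from.

    Returns None if the module cannot be found on sys.path.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages expose their directories; plain modules expose a single file.
    if spec.submodule_search_locations is not None:
        return list(spec.submodule_search_locations)
    return [spec.origin]


def has_submodule(package, submodule):
    """Return True if `package.submodule` can be located."""
    try:
        return importlib.util.find_spec(f"{package}.{submodule}") is not None
    except ModuleNotFoundError:
        # Raised when the parent is missing or is not a package.
        return False


if __name__ == "__main__":
    # Names taken from the traceback above.
    print("data resolves to:", describe_module("data"))
    print("data.finetune_data found:", has_submodule("data", "finetune_data"))
```

If the first line printed a site-packages path, that would be consistent with the installed package being picked up. If it printed None, the `data` directory is simply not on `sys.path` for this launch.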