I have created a dataset in the following format:
- Dataset_folder
  - videos
    - video1.mp4
    - video2.mp4
  - train.json
train.json is in the following format:
[
    {
        "video": "videos/calling.mp4",
        "QA": [
            {
                "i": "Go through the video and understand all the actions performed in the video",
                "q": "Describe the video",
                "a": "The person is making phone call and talking on the phone"
            }
        ]
    }
]
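Before training, it can help to confirm that the annotation file parses and that every referenced video exists. The following is a minimal sketch, not part of the repository: `validate_annotations` is a hypothetical helper, and it treats the keys `video`, `QA`, `i`, `q`, and `a` as required only because the sample above shows them. Whether the training code needs all of them is an assumption.

```python
import json
import os


def validate_annotations(label_file, data_root):
    """Return a list of problems found in an annotation file of the format above.

    Hypothetical helper: the required keys mirror the sample in this post,
    not a confirmed specification of the training code.
    """
    with open(label_file, "r") as f:
        entries = json.load(f)  # raises json.JSONDecodeError on malformed JSON

    if not isinstance(entries, list):
        return ["top-level value must be a list"]

    problems = []
    for idx, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {idx}: must be an object")
            continue
        video = entry.get("video")
        if not video:
            problems.append(f"entry {idx}: missing 'video'")
        elif not os.path.isfile(os.path.join(data_root, video)):
            # Video paths are assumed to be relative to data_root.
            problems.append(f"entry {idx}: video not found: {video}")
        qa_list = entry.get("QA")
        if not isinstance(qa_list, list) or not qa_list:
            problems.append(f"entry {idx}: 'QA' must be a non-empty list")
            continue
        for qa_idx, qa in enumerate(qa_list):
            for key in ("i", "q", "a"):
                if not isinstance(qa, dict) or not isinstance(qa.get(key), str):
                    problems.append(f"entry {idx}, QA {qa_idx}: missing '{key}'")
    return problems
```

Calling `validate_annotations("Dataset_folder/train.json", "Dataset_folder")` returns an empty list when every entry is well formed and every video file is present.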
How should I prepare a custom dataset, and what changes do I need to make in order to train on it for stage3 fine-tuning?
I have set the train_file variable in config_7b_stage3.py to the path of this train.json, and I get the following error:
2024-12-07T07:52:41 | __main__: train_file: /home/ubuntu/Custom_Data/train.json
2024-12-07T07:52:41 | __main__: Creating dataset for it
2024-12-07T07:52:41 | dataset.it_dataset: Load json file
Traceback (most recent call last):
  File "/home/ubuntu/Ask-Anything/video_chat2/tasks/train_it.py", line 221, in <module>
    main(cfg)
  File "/home/ubuntu/Ask-Anything/video_chat2/tasks/train_it.py", line 138, in main
    train_loaders, train_media_types = setup_dataloaders(
  File "/home/ubuntu/Ask-Anything/video_chat2/tasks/train_it.py", line 105, in setup_dataloaders
    train_datasets = create_dataset(f"{mode}_train", config)
  File "/home/ubuntu/Ask-Anything/video_chat2/dataset/__init__.py", line 174, in create_dataset
    datasets.append(dataset_cls(**dataset_kwargs))
  File "/home/ubuntu/Ask-Anything/video_chat2/dataset/it_dataset.py", line 37, in __init__
    with open(self.label_file, 'r') as f:
IsADirectoryError: [Errno 21] Is a directory: '/'
Could you please help me understand the steps and changes required to train on a custom dataset?
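The path in the error message is '/', which is also the first character of the configured train_file. One possible explanation, stated here as an assumption because the traceback does not show the relevant code, is that the dataset class expects each train_file entry to be a sequence such as [label_file, data_root, media_type] and unpacks the first two items. A plain string would then be sliced character by character. The sketch below illustrates only that Python behavior; `unpack_annotation` is a hypothetical stand-in, and the three-item layout should be checked against it_dataset.py and the dataset entries already defined in the repository.

```python
def unpack_annotation(ann_file):
    """Hypothetical stand-in: split an annotation entry into (label_file, data_root)."""
    label_file, data_root = ann_file[:2]
    return label_file, data_root


# A plain string is sliced by character, so the "file" to open becomes "/".
print(unpack_annotation("/home/ubuntu/Custom_Data/train.json"))
# ('/', 'h')

# A list keeps both paths intact (the third item is an assumed media-type tag).
print(unpack_annotation([
    "/home/ubuntu/Custom_Data/train.json",
    "/home/ubuntu/Custom_Data",
    "video",
]))
# ('/home/ubuntu/Custom_Data/train.json', '/home/ubuntu/Custom_Data')
```

If that assumption holds, setting train_file to a list of this shape, rather than to a bare path string, would avoid opening '/' as the label file.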