Open
Description
Hi, nice work!
I would like to try SFT, but the data processing pipeline appears to be incomplete. The files referenced in sft_stage_1.sh (train_s12w24_with_seeks.jsonl, train_livecc_with_seeks.jsonl, valid_s12w24_with_seeks.jsonl, and valid_livecc_with_seeks.jsonl) do not exist in the downloaded dataset. Could you please advise how I can obtain these files?
In addition, I would like to understand why train_s12w24_with_seeks.jsonl and valid_s12w24_with_seeks.jsonl each appear twice in sft_stage_1.sh.
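For anyone who wants to reproduce the check, here is a minimal sketch of how the missing and repeated file names could be verified. It is only an illustration: the dataset directory `./dataset` is a hypothetical path, the helper names `missing_files` and `repeated_files` are made up for this example, and the list below is typed by hand to mirror the description above rather than parsed from sft_stage_1.sh.

```python
from collections import Counter
from pathlib import Path

# File names as mentioned in this issue. The order and the repeated entries
# are illustrative assumptions, not copied from sft_stage_1.sh.
REFERENCED = [
    "train_s12w24_with_seeks.jsonl",
    "train_livecc_with_seeks.jsonl",
    "train_s12w24_with_seeks.jsonl",
    "valid_s12w24_with_seeks.jsonl",
    "valid_livecc_with_seeks.jsonl",
    "valid_s12w24_with_seeks.jsonl",
]


def missing_files(names, dataset_dir):
    """Return the unique names that are not found anywhere under dataset_dir."""
    root = Path(dataset_dir)
    # Search recursively, since the files may sit in a subdirectory.
    present = {p.name for p in root.rglob("*.jsonl")} if root.is_dir() else set()
    return sorted(set(names) - present)


def repeated_files(names):
    """Return a mapping of name -> count for names listed more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__":
    dataset_dir = "./dataset"  # hypothetical location of the downloaded dataset
    print("missing:", missing_files(REFERENCED, dataset_dir))
    print("repeated:", repeated_files(REFERENCED))
```

If the files were generated by a preprocessing step that is not in the repository, the first list would contain all four names, which matches what I see locally.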