Description
- Explain that the expressions used for splitting shards into train/val during prepare are regexes
- Explain how the seed offset can be used to get new random orders, and that by default the order is always the same
- Mention SkipSample and explain when to use it
- Update the list of sample types in `basic/data_prep`. Some are missing.
- Add a short how-to on automated (non-interactive) dataset preparation
- Metadataset options are not fully documented (e.g. `dataset_config`)
- Explain the difference between `get_val_dataset` and `get_val_datasets`, and fix their docstrings
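
For the first item, something like the following could illustrate why it matters that the split expressions are regexes and not globs. This is only a sketch using Python's standard `re` module: the shard names, the `split_patterns` dict, and the `split_shards` helper are hypothetical, and the use of `re.fullmatch` is an assumption, not a statement about how the prepare step actually matches names.

```python
import re

# Hypothetical shard names, not taken from any real dataset.
shards = [
    "shards/train_000.tar",
    "shards/train_001.tar",
    "shards/val_000.tar",
]

# Hypothetical split patterns written as regexes: note ".*" instead of glob "*".
split_patterns = {
    "train": r"shards/train_.*\.tar",
    "val": r"shards/val_.*\.tar",
}


def split_shards(names, patterns):
    """Assign each shard name to the first split whose regex matches it fully.

    Illustrative helper only; full-match semantics are an assumption.
    """
    result = {split: [] for split in patterns}
    for name in names:
        for split, pattern in patterns.items():
            if re.fullmatch(pattern, name):
                result[split].append(name)
                break
    return result


splits = split_shards(shards, split_patterns)
print(splits)

# A glob-style pattern read as a regex behaves differently:
# "_*" means "zero or more underscores", so nothing matches here.
glob_like = r"shards/train_*.tar"
glob_like_matches = [s for s in shards if re.fullmatch(glob_like, s)]
print(glob_like_matches)
```

The last part is the pitfall worth calling out in the docs: a pattern that looks right as a glob can silently match no shards when interpreted as a regex.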