Various documentation improvements needed #82

@philipp-fischer

Description

  • Explain that the expressions used for splitting shards into train/val during prepare are regexes
  • Explain how the seed offset can be used to get new random orders, and that by default the order is always the same
  • Mention SkipSample and explain when to use it
  • Update the list of sample types in basic/data_prep; some are missing
  • Add a short how-to on automated (non-interactive) dataset preparation
  • Document the Metadataset options in full (e.g. dataset_config)
  • Explain the difference between get_val_dataset and get_val_datasets, and fix their docstrings
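
For the first two points, the docs could carry something like the following sketch. It is not Energon's code: the shard names, `select_shards`, and `shuffled_order` are hypothetical, full-string matching is an assumption about how the split patterns are applied, and the shuffle only illustrates the general idea of a fixed seed plus an offset.

```python
import random
import re

# Hypothetical shard names, for illustration only.
SHARDS = ["train-000.tar", "train-001.tar", "val-000.tar"]


def select_shards(pattern, names):
    """Return the names that fully match `pattern`, read as a regex.

    Assumption: the pattern must match the whole name; the real tool
    may apply its patterns differently.
    """
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


# A glob-style pattern is still a valid regex, but it means something else:
# "val", then zero or more "-", then any single character, then "tar".
glob_style = select_shards("val-*.tar", SHARDS)

# The regex spelling of "everything that starts with val- and ends in .tar".
regex_style = select_shards(r"val-.*\.tar", SHARDS)


def shuffled_order(n, seed=0, seed_offset=0):
    """Return a permutation of range(n) determined by seed + seed_offset.

    Generic illustration only, not how Energon derives its sample order.
    """
    rng = random.Random(seed + seed_offset)
    order = list(range(n))
    rng.shuffle(order)
    return order


print(glob_style)   # the glob-style pattern selects nothing here
print(regex_style)  # the regex selects the validation shard
print(shuffled_order(5) == shuffled_order(5))  # same seed, same order
```

The glob-style pattern fails silently by selecting no shards, which is why the docs should say explicitly that the patterns are regexes. With the shuffle, repeated runs give the same order until the offset changes.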
