- improve documentation
- contributing guidelines still mention a pre-hook from the old template
- switch comments and default configs from vaani to a dataset that can be downloaded from hf
- docs mention using callbacks for metrics, but this applies only in training mode. make this clear.
- mention clearly that intrinsic datasets are not yet uploaded to hf
- pfer computation should load jsonl directly, without the intermediate jsonl2json script.
- upload intrinsic datasets to hf in kaldi format. reuse kaldi dataset for intrinsic evals
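
For the pfer item above, a minimal sketch of reading a jsonl file directly with the standard library, so no conversion step is needed. `load_jsonl` is a hypothetical helper name, not something that exists in the repo, and the sketch makes no assumptions about which fields each record contains.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file and return a list with one parsed object per line.

    Hypothetical helper: the name and error handling are illustrative only.
    """
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                # Skip blank lines, e.g. a trailing newline at end of file.
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as err:
                # Report the offending line so bad records are easy to find.
                raise ValueError(f"{path}:{line_no}: invalid JSON") from err
    return records
```

The pfer code could then iterate over the returned list instead of reading the converted json file.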
Thanks @SanderGi for bringing up these issues.