The [XTREME-S](https://huggingface.co/datasets/google/xtreme_s) dataset covers dozens of languages, with many hours of speech audio.