Downloading large or all splits of the dataset fails intermittently due to network stability issues #65

@kinanmartin

Description

We have seen frequent failures when downloading the "large" and "all" splits of the ReazonSpeech dataset, apparently caused by network issues (such as TimeOutErrors).

The kotoba-tech team ran into the same problem, described here, and wrote a manual downloader to try to sidestep it.

We should perhaps include scripts that download the dataset in smaller pieces, or publish the larger splits as smaller chunks, so that a single network failure is less likely to abort the whole download.
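As a rough illustration of the "smaller pieces" idea, here is a minimal sketch of a per-shard downloader with retries and resume. It is not the kotoba-tech downloader and not part of reazonspeech. The names `download_with_retries`, `download_shards`, and the `fetch` callable are hypothetical, and the sketch assumes the split can be listed as individual shard files that a user-supplied `fetch(name, dest)` function knows how to retrieve.

```python
import time
from pathlib import Path
from typing import Callable, Iterable, List

# Hypothetical signature: fetch(name, dest) downloads one shard to dest.
Fetch = Callable[[str, Path], None]


def download_with_retries(
    fetch: Fetch,
    name: str,
    dest: Path,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call fetch(name, dest), retrying with exponential backoff on OSError.

    TimeoutError, ConnectionError and urllib's URLError are all subclasses
    of OSError, so this covers the usual network failures.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            fetch(name, dest)
            return
        except OSError as exc:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"{name}: attempt {attempt} failed ({exc!r}); retrying in {delay}s")
            sleep(delay)


def download_shards(
    fetch: Fetch,
    shard_names: Iterable[str],
    out_dir: Path,
    **retry_kwargs,
) -> List[Path]:
    """Download shards one at a time, skipping any that already completed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    finished: List[Path] = []
    for name in shard_names:
        dest = out_dir / name
        if not dest.exists():
            # Write to a temporary ".part" file so that an interrupted
            # download is never mistaken for a complete shard.
            tmp = dest.with_name(dest.name + ".part")
            download_with_retries(fetch, name, tmp, **retry_kwargs)
            tmp.replace(dest)
        finished.append(dest)
    return finished
```

With this shape, a failure only costs the shard that was in flight: rerunning `download_shards` with the same `out_dir` skips shards that are already on disk. What `fetch` actually does (for example, wrapping an HTTP client or a hub download call) is left open here, since the file layout of the dataset is not described in this issue.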

Labels

bug: Something isn't working
