Open
Description
So far we've only been dealing with public bitext, and we haven't set up the pipeline for downloading and processing mined data.
There is ~450 GB of data in this dataset: https://huggingface.co/datasets/allenai/nllb
Create a pipeline for downloading and analyzing this data.
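As a possible starting point for the analysis half, here is a minimal sketch of a single-pass streaming filter. At ~450 GB the data cannot be held in memory, so it works on any iterable of pairs. Everything in it is an assumption for illustration, not an agreed design: the `(source, target, score)` tuple shape, the names `BitextStats` and `filter_pairs`, and the default thresholds, which are arbitrary placeholders. The download step and the mapping from the dataset's actual record fields to these tuples are left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

# Hypothetical record shape: (source sentence, target sentence, mining score).
Pair = Tuple[str, str, float]


@dataclass
class BitextStats:
    """Running counters, updated while the pairs are streamed."""
    pairs: int = 0       # pairs seen
    kept: int = 0        # pairs that passed every filter
    src_tokens: int = 0  # whitespace tokens in kept source sentences
    tgt_tokens: int = 0  # whitespace tokens in kept target sentences

    @property
    def keep_rate(self) -> float:
        return self.kept / self.pairs if self.pairs else 0.0


def filter_pairs(
    pairs: Iterable[Pair],
    stats: BitextStats,
    min_score: float = 1.06,     # placeholder threshold, to be tuned
    max_len_ratio: float = 3.0,  # placeholder threshold, to be tuned
) -> Iterator[Pair]:
    """Yield pairs that pass the score and length-ratio filters.

    Reads the input once and never materializes it, so it can be fed
    from a streaming reader.
    """
    for src, tgt, score in pairs:
        stats.pairs += 1
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if n_src == 0 or n_tgt == 0:
            continue  # empty side
        if score < min_score:
            continue  # low mining score
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_len_ratio:
            continue  # lengths too different to be a likely translation
        stats.kept += 1
        stats.src_tokens += n_src
        stats.tgt_tokens += n_tgt
        yield src, tgt, score
```

Because `filter_pairs` is a generator, the counters in `stats` are only complete once the output has been fully consumed, for example by writing the kept pairs to disk.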
Metadata
Assignees
Projects
Status
Todo