slow dataset loading

Loading a dataset from the Hugging Face Hub through Unitxt seems to be much slower than loading it directly with the `datasets` package.


Compare this:
```python
from datasets import load_dataset
from time import time
import os
from uuid import uuid4

# Fresh, unique cache directory so the dataset is downloaded again on every run
path = os.path.join("cache", str(uuid4()))

t0 = time()
ds = load_dataset("PrimeQA/clapnq_passages", cache_dir=path)
t1 = time()

print(t1-t0)

print(len(ds))
```

To:
```python
from time import time

from unitxt import load_dataset


t0 = time()
ds = load_dataset('card=cards.rag.documents.clap_nq.en')
t1 = time()

print(t1-t0)

print(len(ds))

```

The Unitxt version takes about 5 times longer.

In both cases a fresh copy is downloaded.
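
For anyone trying to reproduce this, a small timing helper might make the comparison easier to repeat. This is only a sketch: `time_call` and `slowdown` are hypothetical names, not part of Unitxt or `datasets`, and the two `fake_*_loader` functions are stand-ins so the snippet runs offline. In a real run they would be replaced by the two `load_dataset` calls above. It uses `time.perf_counter`, which is better suited to measuring intervals than `time.time`.

```python
from time import perf_counter, sleep


def time_call(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed_seconds)."""
    start = perf_counter()
    result = fn(*args, **kwargs)
    return result, perf_counter() - start


def slowdown(baseline_seconds, candidate_seconds):
    """Return how many times longer the candidate took than the baseline."""
    return candidate_seconds / baseline_seconds


# Hypothetical stand-ins for the two loaders, so the sketch needs no network.
def fake_fast_loader():
    sleep(0.01)
    return {"train": []}


def fake_slow_loader():
    sleep(0.05)
    return {"train": []}


_, fast_seconds = time_call(fake_fast_loader)
_, slow_seconds = time_call(fake_slow_loader)
print(f"slowdown: {slowdown(fast_seconds, slow_seconds):.1f}x")
```

Running each loader a few times and comparing the medians would probably give a more stable ratio than a single measurement, since download speed can vary between runs.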



slow dataset loading #1559