Open
Describe the bug
Hi,
An issue related to #4760: after loading a single file from a dataset, I am unable to access it in offline mode afterwards.
Steps to reproduce the bug
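import os
# First run: leave the next line commented out so the file is downloaded and cached.
# Second run: uncomment it to reproduce the error in offline mode.
# os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["HF_TOKEN"] = "xxxxxxxxxxxxxx"
import datasets
dataset_name = "uonlp/CulturaX"
# Load a single parquet file rather than the whole dataset.
data_files = "fr/fr_part_00038.parquet"
ds = datasets.load_dataset(dataset_name, split="train", data_files=data_files)
print(f"Dataset loaded : {ds}")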
Once the file has been cached, I rerun with HF_HUB_OFFLINE activated and get this error:
ValueError: Couldn't find cache for uonlp/CulturaX for config 'default-1e725f978350254e'
Available configs in the cache: ['default-2935e8cdcc21c613']
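To compare the config name the loader asks for with what is actually on disk, the cache folder can be inspected directly. The following is a minimal sketch that uses only the standard library. It assumes the cache root is given by HF_DATASETS_CACHE or defaults to ~/.cache/huggingface/datasets, and that each cached config is a sub-directory of the dataset's folder. The helper names and the folder name "uonlp___cultura_x" are illustrative guesses, not taken from the error output.

```python
import os
from pathlib import Path


def list_cached_configs(dataset_cache_dir):
    """Return the sorted names of the sub-directories of a dataset cache folder."""
    root = Path(dataset_cache_dir)
    if not root.is_dir():
        return []
    # Skip plain files (for example lock files) and keep only directories.
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def default_datasets_cache():
    """Cache root: HF_DATASETS_CACHE if set, else ~/.cache/huggingface/datasets (assumed default)."""
    fallback = Path.home() / ".cache" / "huggingface" / "datasets"
    return Path(os.environ.get("HF_DATASETS_CACHE", fallback))


if __name__ == "__main__":
    # Hypothetical folder name; check the cache root for the actual one.
    dataset_dir = default_datasets_cache() / "uonlp___cultura_x"
    print(list_cached_configs(dataset_dir))
```

If the listing shows only the config from the online run, the offline run is computing a different config name for the same data_files argument, which matches the error above.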
Expected behavior
Should be able to access the previously cached files
Environment info
- datasets version: 3.2.0
- Platform: Linux-5.4.0-215-generic-x86_64-with-glibc2.31
- Python version: 3.12.0
- huggingface_hub version: 0.27.0
- PyArrow version: 19.0.0
- Pandas version: 2.2.2
- fsspec version: 2024.3.1