Caching does not work when using python3.14 #7813

@intexcor

Describe the bug

Traceback (most recent call last):
  File "/workspace/ctn.py", line 8, in <module>
    ds = load_dataset(f"naver-clova-ix/synthdog-{lang}")  # or "synthdog-zh" for Chinese
  File "/workspace/.venv/lib/python3.14/site-packages/datasets/load.py", line 1397, in load_dataset
    builder_instance = load_dataset_builder(
        path=path,
    ...<10 lines>...
        **config_kwargs,
    )
  File "/workspace/.venv/lib/python3.14/site-packages/datasets/load.py", line 1185, in load_dataset_builder
    builder_instance._use_legacy_cache_dir_if_possible(dataset_module)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/datasets/builder.py", line 612, in _use_legacy_cache_dir_if_possible
    self._check_legacy_cache2(dataset_module) or self._check_legacy_cache() or None
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/datasets/builder.py", line 485, in _check_legacy_cache2
    config_id = self.config.name + "-" + Hasher.hash({"data_files": self.config.data_files})
                                         ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/datasets/fingerprint.py", line 188, in hash
    return cls.hash_bytes(dumps(value))
                          ~~~~~^^^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/datasets/utils/_dill.py", line 120, in dumps
    dump(obj, file)
    ~~~~^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/datasets/utils/_dill.py", line 114, in dump
    Pickler(file, recurse=True).dump(obj)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/dill/_dill.py", line 428, in dump
    StockPickler.dump(self, obj)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/usr/lib/python3.14/pickle.py", line 498, in dump
    self.save(obj)
    ~~~~~~~~~^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/datasets/utils/_dill.py", line 70, in save
    dill.Pickler.save(self, obj, save_persistent_id=save_persistent_id)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/dill/_dill.py", line 422, in save
    StockPickler.save(self, obj, save_persistent_id)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.14/pickle.py", line 572, in save
    f(self, obj)  # Call unbound method with explicit self
    ~^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.14/site-packages/dill/_dill.py", line 1262, in save_module_dict
    StockPickler.save_dict(pickler, obj)
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/usr/lib/python3.14/pickle.py", line 1064, in save_dict
    self._batch_setitems(obj.items(), obj)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
TypeError: Pickler._batch_setitems() takes 2 positional arguments but 3 were given
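
The last two frames suggest a signature mismatch: pickle.py in Python 3.14 calls self._batch_setitems(obj.items(), obj), while the Pickler subclass in use apparently still defines _batch_setitems(self, items). Which class carries that override is an inference from the error message, not something the traceback shows directly. As a rough sketch of the idea only (not the datasets or dill code), a stdlib-only override that accepts both call shapes could look like this. SortingPickler and dumps_sorted are hypothetical names, and pickle._Pickler is the private pure-Python pickler, used here only so the hook can be overridden.

```python
import io
import pickle


class SortingPickler(pickle._Pickler):  # hypothetical stand-in, not the datasets Pickler
    """Pickle dicts in sorted key order on both old and new pickle signatures."""

    def _batch_setitems(self, items, *args):
        # Python 3.14 calls _batch_setitems(items, obj); earlier versions call
        # _batch_setitems(items). *args forwards whatever extra argument was passed.
        super()._batch_setitems(sorted(items), *args)


def dumps_sorted(obj):
    """Serialize obj so that dict insertion order does not affect the bytes."""
    buffer = io.BytesIO()
    SortingPickler(buffer).dump(obj)
    return buffer.getvalue()


# Two dicts with the same items in a different insertion order.
first = dumps_sorted({"b": 1, "a": 2})
second = dumps_sorted({"a": 2, "b": 1})
print(first == second)
```

An override written as _batch_setitems(self, items) with no extra parameter would raise the TypeError shown above on 3.14, because the stdlib now passes the dict as a second argument.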

Steps to reproduce the bug

ds_train = ds["train"].map(lambda x: {**x, "lang": lang})

Expected behavior

The dataset should load and be cached on Python 3.14 without raising the TypeError above.

Environment info

  • datasets version: 4.2.0
  • Platform: Linux-6.8.0-85-generic-x86_64-with-glibc2.39
  • Python version: 3.14.0
  • huggingface_hub version: 0.35.3
  • PyArrow version: 21.0.0
  • Pandas version: 2.3.3
  • fsspec version: 2025.9.0
