Open
Description
Describe the bug
As stated in the title, the stratify_by_column parameter of train_test_split breaks with NumPy 2.0 and later, apparently because of the change in how NumPy handles copy.
For example, all_dataset.train_test_split(test_size=0.2, stratify_by_column="label") raises a NumPy error.
It works if you downgrade NumPy to a version lower than 2.0.
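The report does not quote the error message, so the following is only a sketch of the NumPy behavior change that is most likely involved. It assumes the failure comes from a call of the form np.array(..., copy=False). The variable names and the helper to_array_copy_if_needed are hypothetical and are not taken from datasets.

```python
import numpy as np

# A Python list can never be wrapped without copying its contents.
labels = [0, 1, 0, 1]

# NumPy < 2.0: copy=False means "copy only if needed", so this succeeds.
# NumPy >= 2.0: copy=False means "never copy", so this raises ValueError.
try:
    np.array(labels, copy=False)
    strict_copy_false = False
except ValueError:
    strict_copy_false = True


def to_array_copy_if_needed(values):
    """Hypothetical helper: convert to ndarray, copying only when required.

    np.asarray has "copy if needed" semantics on both NumPy 1.x and 2.x.
    """
    return np.asarray(values)


arr = to_array_copy_if_needed(labels)
print(np.__version__, "strict copy=False:", strict_copy_false)
```

If this is indeed the cause, it would explain why downgrading NumPy makes the call work again.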
Steps to reproduce the bug
- Install NumPy 2.0 or later.
- Run all_dataset.train_test_split(test_size=0.2, stratify_by_column="label")
Expected behavior
It should return a stratified split, matching the results obtained with NumPy < 2.0.
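To make "stratified split" concrete, here is a minimal pure-Python sketch that splits indices per class so both parts keep roughly the same label proportions. It is a simple per-class shuffle, not the algorithm datasets uses internally, and the function name stratified_split_indices and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split_indices(labels, test_size=0.2, seed=0):
    """Return (train_indices, test_indices) with per-class proportions kept."""
    rng = random.Random(seed)

    # Group row indices by their label value.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        # Take about test_size of each class for the test part.
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    return sorted(train_idx), sorted(test_idx)


labels = [0] * 10 + [1] * 5
train_idx, test_idx = stratified_split_indices(labels, test_size=0.2)
print(len(train_idx), len(test_idx))
```

As a possible stopgap, index lists like these could be passed to Dataset.select to build the two parts by hand, although that has not been verified against the environment in this report.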
Environment info
- datasets version: 2.14.4
- Platform: Linux-6.8.0-85-generic-x86_64-with-glibc2.35
- Python version: 3.13.7
- Huggingface_hub version: 0.34.4
- PyArrow version: 19.0.0
- Pandas version: 2.3.2