This repository was archived by the owner on Jan 12, 2026. It is now read-only.

Commit 9fc9a9e

Mark Ray Dataset as always having enough shards (#275)
Make `assert_enough_shards_for_actors` always pass if Ray Datasets are used, as they will be split to match the number of actors during loading.

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
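
To illustrate why a shard-count check becomes moot once the data is split to match the number of actors, here is a minimal pure-Python sketch. `split_for_actors` is a hypothetical stand-in written for this note; it is not Ray's or xgboost_ray's splitting code, and it only shows that splitting into `num_actors` pieces always yields exactly `num_actors` shards.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_for_actors(rows: Sequence[T], num_actors: int) -> List[List[T]]:
    """Hypothetical helper: split rows into exactly num_actors contiguous shards."""
    if num_actors < 1:
        raise ValueError("num_actors must be at least 1")
    # Each shard gets `base` rows; the first `extra` shards get one more.
    base, extra = divmod(len(rows), num_actors)
    shards: List[List[T]] = []
    start = 0
    for i in range(num_actors):
        size = base + (1 if i < extra else 0)
        shards.append(list(rows[start:start + size]))
        start += size
    return shards


shards = split_for_actors(list(range(10)), 4)
print([len(s) for s in shards])
```

Because the number of shards is chosen to equal the number of actors, a comparison of `num_actors` against the shard count cannot fail in this sketch.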
1 parent e5ccecc commit 9fc9a9e

1 file changed: +5 -0 lines changed

xgboost_ray/matrix.py

Lines changed: 5 additions & 0 deletions
@@ -496,6 +496,11 @@ def get_data_source(self) -> Type[DataSource]:
     def assert_enough_shards_for_actors(self, num_actors: int):
         data_source = self.get_data_source()
 
+        # Ray Datasets will be automatically split to match the number
+        # of actors.
+        if isinstance(data_source, RayDataset):
+            return
+
         max_num_shards = self._cached_n or data_source.get_n(self.data)
         if num_actors > max_num_shards:
             raise RuntimeError(
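
The early-return guard in this hunk can be exercised in isolation with the sketch below. Everything here is a stand-in written for illustration: the `DataSource`, `RayDataset`, and `FileList` classes are minimal hypothetical versions, the function is a free function rather than a method, and the error message is a placeholder because the original is cut off in the diff. The sketch also assumes the data source is passed around as a class, as the `Type[DataSource]` annotation in the hunk header suggests, so it uses `issubclass` for the check; that is an assumption of this sketch, not a statement about how the library behaves.

```python
from typing import Any, Optional, Type


class DataSource:
    """Hypothetical minimal base class for data sources."""

    @staticmethod
    def get_n(data: Any) -> int:
        # Number of shards the data can be divided into.
        return len(data)


class RayDataset(DataSource):
    """Stand-in for the Ray Dataset data source."""


class FileList(DataSource):
    """Hypothetical source where each file is one shard."""


def assert_enough_shards_for_actors(
    data_source: Type[DataSource],
    data: Any,
    num_actors: int,
    cached_n: Optional[int] = None,
) -> None:
    # Ray Datasets are split to match the number of actors,
    # so the shard-count check is skipped for them.
    if issubclass(data_source, RayDataset):
        return

    max_num_shards = cached_n or data_source.get_n(data)
    if num_actors > max_num_shards:
        # Placeholder message; the real one is truncated in the diff.
        raise RuntimeError(
            f"{num_actors} actors requested but only "
            f"{max_num_shards} shards are available."
        )


# A Ray Dataset passes regardless of how many blocks it currently has.
assert_enough_shards_for_actors(RayDataset, ["single_block"], num_actors=8)

# Other sources are still checked.
try:
    assert_enough_shards_for_actors(FileList, ["a.csv", "b.csv"], num_actors=4)
except RuntimeError as exc:
    print("check failed as expected:", exc)
```

Placing the guard before `get_n` is called also means the shard count is never computed for Ray Datasets in this sketch.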
