Commit

Consistent ordering of cached data files between deeplearning and Great Lakes (#1075)

Sort files from cached_data_path when loading them in
timwhite0 authored Oct 7, 2024
1 parent 1baf4d4 commit 0e565b1
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion bliss/cached_dataset.py
@@ -266,7 +266,9 @@ def setup(self, stage: str) -> None:  # noqa: WPS324
         raise RuntimeError(f"setup skips stage {stage}")
 
     def _load_file_paths_and_slices(self):
-        file_names = [f for f in os.listdir(str(self.cached_data_path)) if f.endswith(".pt")]
+        file_names = [
+            f for f in sorted(os.listdir(str(self.cached_data_path))) if f.endswith(".pt")
+        ]
         if self.subset_fraction:
             file_names = file_names[: math.ceil(len(file_names) * self.subset_fraction)]
         self.file_paths = [os.path.join(str(self.cached_data_path), f) for f in file_names]
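
The reason the sort matters: `os.listdir` returns entries in arbitrary, filesystem-dependent order, so slicing the unsorted list by `subset_fraction` can pick a different subset on each machine. A minimal standalone sketch of the same logic follows. The helper name `select_cached_files` and the file names are hypothetical and used only for illustration; they are not part of `bliss/cached_dataset.py`.

```python
import math
import os
import tempfile


def select_cached_files(cached_data_path, subset_fraction=None):
    """Hypothetical helper mirroring the sorted listing in the diff above."""
    # os.listdir gives no ordering guarantee, so sort before slicing
    # to get the same leading subset regardless of the filesystem.
    file_names = [
        f for f in sorted(os.listdir(str(cached_data_path))) if f.endswith(".pt")
    ]
    if subset_fraction:
        file_names = file_names[: math.ceil(len(file_names) * subset_fraction)]
    return file_names


with tempfile.TemporaryDirectory() as tmp_dir:
    # Create the files out of order; the non-.pt file should be ignored.
    for name in ["dataset_2.pt", "dataset_0.pt", "notes.txt", "dataset_1.pt"]:
        open(os.path.join(tmp_dir, name), "w").close()

    all_files = select_cached_files(tmp_dir)
    half = select_cached_files(tmp_dir, subset_fraction=0.5)

print(all_files)
print(half)
```

Note that `sorted` orders names lexicographically, so `dataset_10.pt` would come before `dataset_2.pt`. That order is still identical on every machine, which is what the commit needs.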