Description
It appears that RandomGeoSampler occasionally samples a window from the ChesapeakeCVPR dataset that is either out of bounds or empty. Rasterio cannot handle this and raises "ValueError: Input shapes do not overlap raster." Full stack trace:
Traceback (most recent call last):
  File "/media/share/share/projects/geolayers/train_baseline.py", line 432, in train
    for i, data in enumerate(testloader,0):
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 701, in __next__
    data = self._next_data()
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1445, in _next_data
    return self._process_data(data)
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1491, in _process_data
    data.reraise()
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/_utils.py", line 715, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 4.
Original Traceback (most recent call last):
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/mask.py", line 80, in raster_geometry_mask
    window = geometry_window(dataset, shapes, pad_x=pad_x, pad_y=pad_y)
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/features.py", line 477, in geometry_window
    window = window.intersection(raster_window)
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/windows.py", line 775, in intersection
    return intersection([self, other])
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/windows.py", line 125, in wrapper
    return function(*args[0])
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/windows.py", line 239, in intersection
    return functools.reduce(_intersection, windows)
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/windows.py", line 257, in _intersection
    raise WindowError(f"Intersection is empty {w1} {w2}")
rasterio.errors.WindowError: Intersection is empty Window(col_off=-205, row_off=6158, width=201, height=201) Window(col_off=0, row_off=0, width=4901, height=6511)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 351, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/torchgeo/datasets/chesapeake.py", line 559, in __getitem__
    data, _ = rasterio.mask.mask(
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/mask.py", line 178, in mask
    shape_mask, transform, window = raster_geometry_mask(
  File "/media/share/share/envs/tgeo/lib/python3.10/site-packages/rasterio/mask.py", line 86, in raster_geometry_mask
    raise ValueError('Input shapes do not overlap raster.')
ValueError: Input shapes do not overlap raster.
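The two windows printed in the WindowError can be checked by hand: the requested window starts at column -205 and is 201 pixels wide, so it ends before column 0, where the raster begins. The sketch below reproduces that arithmetic in plain Python. It is only an illustration of the overlap test, not rasterio's implementation, and the `Window` tuple and `windows_overlap` helper are hypothetical names introduced here.

```python
from typing import NamedTuple


class Window(NamedTuple):
    """Pixel window with the same fields that the WindowError message prints."""

    col_off: int
    row_off: int
    width: int
    height: int


def windows_overlap(a: Window, b: Window) -> bool:
    """Return True if the two half-open pixel rectangles share any area."""
    # Length of the shared extent along each axis (negative or zero means no overlap).
    cols = min(a.col_off + a.width, b.col_off + b.width) - max(a.col_off, b.col_off)
    rows = min(a.row_off + a.height, b.row_off + b.height) - max(a.row_off, b.row_off)
    return cols > 0 and rows > 0


# Values copied from the error message in the stack trace.
requested = Window(col_off=-205, row_off=6158, width=201, height=201)
raster = Window(col_off=0, row_off=0, width=4901, height=6511)

# The rows overlap (6158..6358 lies inside 0..6510), but the columns do not:
# the requested window covers columns -205..-5, all left of column 0.
print(windows_overlap(requested, raster))  # False
```

The row range is valid, so only the column offset places the window outside the raster.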
Steps to reproduce
This error is intermittent: it can occur at any iteration of any epoch during training. Steps to reproduce:
- Create a ChesapeakeCVPR dataset:
from torchgeo.datasets import ChesapeakeCVPR
states = ['de', 'md', 'va', 'wv', 'pa', 'ny']
spl_train = [f'{state}-train' for state in states]
spl_val = [f'{state}-val' for state in states]
spl_test = [f'{state}-test' for state in states]
# `modality` (the layers to load) is not defined in this snippet
trainset = ChesapeakeCVPR(root='/share/chesapeake/cvpr_chesapeake_landcover', download=False, cache=True, layers=modality, splits=spl_train, transforms=None)
- Initialize a RandomGeoSampler and dataloader:
import torch
from torchgeo.datasets import stack_samples
from torchgeo.samplers import RandomGeoSampler, Units
# `generator`, `BATCH_SIZE`, and `cfg` are not defined in this snippet
trainsampler = RandomGeoSampler(trainset, size=256, units=Units.PIXELS, generator=generator)
trainloader = torch.utils.data.DataLoader(trainset, sampler=trainsampler, batch_size=BATCH_SIZE, num_workers=cfg['num_workers'], drop_last=False, generator=generator, collate_fn=stack_samples)
- Iterate through the dataloader. The ValueError shown in the stack trace should eventually be raised.
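Because the failure is intermittent, it may help to bypass the DataLoader workers and index the dataset directly with each query the sampler yields, recording the ones that fail. The following is a minimal sketch under that assumption: `find_failing_queries` is a hypothetical helper, not part of torchgeo, and it only assumes that the dataset can be indexed with whatever the sampler produces.

```python
from typing import Any, Iterable, List, Tuple


def find_failing_queries(
    dataset: Any, queries: Iterable[Any], max_queries: int = 10_000
) -> List[Tuple[Any, str]]:
    """Index `dataset` with each query and collect the ones that raise ValueError.

    Returns a list of (query, error message) pairs. At most `max_queries`
    queries are tried, so the scan stops even if the iterable is very long.
    """
    failures: List[Tuple[Any, str]] = []
    for i, query in enumerate(queries):
        if i >= max_queries:
            break
        try:
            dataset[query]
        except ValueError as err:
            # Keep the query so the offending bounding box can be inspected later.
            failures.append((query, str(err)))
    return failures
```

With the objects from the steps above, a call such as `find_failing_queries(trainset, trainsampler)` should return the sampled queries that trigger "Input shapes do not overlap raster.", which makes it possible to inspect them without the worker-process wrapper in the traceback.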
Version
0.7.0.dev0