Merged
2 changes: 2 additions & 0 deletions Changelog.rst
@@ -10,6 +10,8 @@ Version NEXTVERSION
`cfdm.read` (https://github.com/NCAS-CMS/cfdm/issues/355)
* New function `cfdm.dataset_flatten` that replaces the deprecated
`cfdm.netcdf_flatten` (https://github.com/NCAS-CMS/cfdm/issues/355)
* Raise `IndexError` for out-of-range indices in a value-setting
operation on a data array (https://github.com/NCAS-CMS/cfdm/issues/377)
* New function to control the creation of cached elements during data
display: `cfdm.display_data`
(https://github.com/NCAS-CMS/cfdm/issues/363)
15 changes: 13 additions & 2 deletions cfdm/functions.py
@@ -2517,13 +2517,24 @@ def parse_indices(shape, indices, keepdims=True, newaxis=False):
"New axis indices are not allowed"
)

        if keepdims and isinstance(index, Integral):
        # Check that any integer indices are in range for the dimension sizes
        # before integral indices are converted to slices below, for (one for)
        # consistent behaviour between setitem and getitem. Note out-of-range
        # slicing works in Python generally (slices are allowed to extend past
        # end points with clipping applied) so we allow those.
        integral_index = isinstance(index, Integral)
        if integral_index and not -size <= index < size:  # could be negative
            raise IndexError(
                f"Index {index!r} is out of bounds for axis {i} with "
                f"size {size}."
            )

        if integral_index and keepdims:
            # Convert an integral index to a slice
Contributor

I think we need some indentation here?

Contributor

(my fault for not including it in my last suggestion!)

Member Author

Ah, I made a comment about this on the thread and now I don't see it. Not sure why it's disappeared!

Member Author

OK, after checking my tabs, said comment is gone without a trace. Did I just imagine/dream posting it?

What I said (in said dream, or otherwise) is that there is an `elif hasattr(index, "to_dask_array"):` following the original `if keepdims and isinstance(index, Integral):`, which means that switching to a top-level `if isinstance(index, Integral):` actually complicates the flow: we'd need further, somewhat contrived, checking afterwards to get the equivalent behaviour. So actually I am not sure it's a good idea.

That said, there is no need to actually call `isinstance` twice, so I suggest we store the result in a variable to use each time but keep the original conditional structure. What do you think?

Contributor

GitHub having a moment ...

Thanks for spotting this. I think what you say is worthwhile. It may seem like overkill (how long does an `isinstance` really take?), but this is library code, and `parse_indices` gets called a lot.

Member Author

No, not overkill, you made a very good point about the duplication. But I should effectively revert the original suggestion to keep the conditionals simple given that `elif`, whilst getting rid of the second `isinstance` call. Will do that now if you're happy...

Contributor

+1

            if index == -1:
                index = slice(-1, None, None)
            else:
                index = slice(index, index + 1, 1)

        elif hasattr(index, "to_dask_array"):
            to_dask_array = index.to_dask_array
            if callable(to_dask_array):
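For readers following the thread, the structure agreed on above (one `isinstance` call stored in a variable, followed by the original `if`/`elif` shape) can be sketched in isolation. This is only an illustration: `check_and_convert` is a hypothetical helper written for this note, not a function in cfdm, and it covers a single index rather than the full loop in `parse_indices`.

```python
from numbers import Integral


def check_and_convert(index, size, axis, keepdims=True):
    """Hypothetical stand-in for the per-index logic in the hunk above."""
    # Evaluate isinstance once and reuse the result in both conditionals
    integral_index = isinstance(index, Integral)

    # Valid integer indices for an axis of length `size` are -size..size-1
    if integral_index and not -size <= index < size:
        raise IndexError(
            f"Index {index!r} is out of bounds for axis {axis} with "
            f"size {size}."
        )

    if integral_index and keepdims:
        if index == -1:
            # slice(-1, 0, 1) would select nothing, so leave the stop open
            return slice(-1, None, None)
        return slice(index, index + 1, 1)

    # Slices and other index types pass through unchanged in this sketch
    return index
```

With a size-4 axis, `check_and_convert(2, 4, 0)` gives `slice(2, 3, 1)`, while `check_and_convert(4, 4, 0)` raises `IndexError` before any conversion to a slice takes place.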
32 changes: 32 additions & 0 deletions cfdm/test/test_Data.py
@@ -413,6 +413,22 @@ def test_Data__setitem__(self):
d[indices] = value
self.assertEqual((d.array < 0).sum(), 4)

        # Test IndexError emerges for out-of-range indices
        d = cfdm.Data(np.ones((4, 4)))
        for indices in (
            (0, 4),
            (4, 0),
            (100, 100),
            (slice(None), 4),
            (4, slice(None)),
            (0, -5),
            (-5, 0),
            (-100, -100),
            (100, -100),
        ):
            with self.assertRaises(IndexError):
                d[indices] = 2

    def test_Data_apply_masking(self):
        """Test Data.apply_masking."""
        a = np.ma.arange(12).reshape(3, 4)
@@ -2092,6 +2108,22 @@ def test_Data__getitem__(self):
(d[0, [4, 0, 3, 1]].array == [[0.018, 0.007, 0.014, 0.034]]).all()
)

        # Test IndexError emerges for out-of-range indices
        d = cfdm.Data(np.ones((4, 4)))
        for indices in (
            (0, 4),
            (4, 0),
            (100, 100),
            (slice(None), 4),
            (4, slice(None)),
            (0, -5),
            (-5, 0),
            (-100, -100),
            (100, -100),
        ):
            with self.assertRaises(IndexError):
                d[indices]

    def test_Data_BINARY_AND_UNARY_OPERATORS(self):
        """Test arithmetic, logical and comparison operators on Data."""
        a = np.arange(3 * 4 * 5).reshape(3, 4, 5)
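The test cases above use out-of-range integers only, in line with the code comment that out-of-range slices are deliberately allowed. The plain-Python sketch below, which uses a list rather than `cfdm.Data`, illustrates the distinction. The last line shows what is presumably the reason the check has to run before integers are converted to slices: an out-of-range integer rewritten as a slice selects nothing instead of failing.

```python
values = [10, 11, 12, 13]  # stands in for one axis of size 4


def integer_lookup_fails(index):
    """Return True if plain integer indexing raises IndexError."""
    try:
        values[index]
    except IndexError:
        return True
    return False


# Integers outside -4..3 are rejected by ordinary Python indexing
out_of_range = [4, 100, -5, -100]
all_fail = all(integer_lookup_fails(i) for i in out_of_range)

# Slices that extend past either end are clipped rather than rejected
clipped_high = values[2:100]
clipped_low = values[-100:2]

# An out-of-range integer rewritten as a slice silently selects nothing
empty = values[4:5]
```

Here `all_fail` is `True`, the two clipped slices each return two elements, and `empty` is an empty list.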