Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid unsafe casts from float to unsigned int #9964

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion xarray/coding/variables.py
Original file line number Diff line number Diff line change
Expand Up @@ -345,7 +345,7 @@
# otherwise numpy unsigned ints will silently cast to the signed counterpart
fill_value = fill_value.item()
# passes if provided fill value fits in encoded on-disk type
new_fill = encoded_dtype.type(fill_value)

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).

Check warning on line 348 in xarray/coding/variables.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10 min-all-deps

NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of 255 to int8 will fail in the future. For the old behavior, usually: np.array(value).astype(dtype)` will give the desired result (the cast overflows).
except OverflowError:
encoded_kind_str = "signed" if encoded_dtype.kind == "i" else "unsigned"
warnings.warn(
Expand Down Expand Up @@ -426,7 +426,10 @@
if fill_value is not None and has_unsigned:
pop_to(encoding, attrs, "_Unsigned")
# XXX: Is this actually needed? Doesn't the backend handle this?
data = duck_array_ops.astype(duck_array_ops.around(data), dtype)
signed_dtype = np.dtype(f"i{dtype.itemsize}")
data = duck_array_ops.view(
duck_array_ops.astype(duck_array_ops.around(data), signed_dtype), dtype
)
attrs["_FillValue"] = fill_value

return Variable(dims, data, attrs, encoding, fastpath=True)
Expand Down
10 changes: 10 additions & 0 deletions xarray/core/duck_array_ops.py
Original file line number Diff line number Diff line change
Expand Up @@ -231,11 +231,21 @@
xp = get_array_namespace(data)
if xp == np:
# numpy currently doesn't have a astype:
return data.astype(dtype, **kwargs)

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / macos-latest py3.10

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / macos-latest py3.10

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / macos-latest py3.12

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / macos-latest py3.12

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.10

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.12

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / ubuntu-latest py3.12

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / windows-latest py3.10

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / windows-latest py3.10

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / windows-latest py3.10

invalid value encountered in cast

Check warning on line 234 in xarray/core/duck_array_ops.py

View workflow job for this annotation

GitHub Actions / windows-latest py3.10

invalid value encountered in cast
return xp.astype(data, dtype, **kwargs)
return data.astype(dtype, **kwargs)


def view(data, *args, **kwargs):
if hasattr(data, "__array_namespace__"):
xp = get_array_namespace(data)
if xp == np:
# numpy currently doesn't have a view:
return data.view(*args, **kwargs)
return xp.view(data, *args, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which package supports this? I can't find xp.view in the standard https://data-apis.org/array-api/draft/API_specification/index.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't realize this was implementing a standard when @kmuehlbauer brought it up in #9815. As in #9815, it works for NumPy and Dask. I don't know about any other packages other than tests don't seem to fail about it specifically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't check this fully, but to be honest, I also tried it the same way locally.

@Illviljan can you suggest a way forward here, so that it is usable for numpy and dask?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can also try view and fall back to astype(copy=False) if that's more portable. And also not expose view so publicly if desired.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcherian Could you have a look here, too? This would unblock @QuLogic for some Fedora builds. Is the implementation of view the right way?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using hasattr(__array_namespace__) implies the method is included in the array api standard. But the standard does not include a .view. That's why I simply don't think the __array_namespace__ should be used in this case.

But using .astype(copy=False) does indeed seem like a nice way forward, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

view and astype are very different things no? view simply reinterprets the bytes as a different type, astype preserves the value and may change number of bytes to do so, for example. I'm not really sure what to do here. Perhaps just xfail? Or else you'll have to track down what netcdf-java does, which is the ultimate source of this _Unsigned convention IIRC

return data.view(*args, **kwargs)


def asarray(data, xp=np, dtype=None):
converted = data if is_duck_array(data) else xp.asarray(data)

Expand Down
Loading