Support proper custom class reflexive operator applied to xarray objects #9944
Description
Is your feature request related to a problem?
I would like to implement reflexive operator on a custom class applied to xarray objects.
Following is a demo snippet:
import numpy as np
import xarray as xr
class DemoObj:
def __add__(self, other):
print(f'__add__ call: type={other.__class__}, value={other}')
return other
def __radd__(self, other):
print(f'__radd__ call: type={other.__class__}, value={other}')
return other
obj = DemoObj()
da = xr.DataArray(np.arange(8))
print('#### Test __add__ ####')
obj + da
print('\n')
print('#### Test __radd__ ####')
da + obj
Actual Output:
#### Test __add__ ####
__add__ call: type=<class 'xarray.core.dataarray.DataArray'>, value=<xarray.DataArray (dim_0: 8)>
array([0, 1, 2, 3, 4, 5, 6, 7])
Dimensions without coordinates: dim_0
#### Test __radd__ ####
__radd__ call: type=<class 'int'>, value=0
__radd__ call: type=<class 'int'>, value=1
__radd__ call: type=<class 'int'>, value=2
__radd__ call: type=<class 'int'>, value=3
__radd__ call: type=<class 'int'>, value=4
__radd__ call: type=<class 'int'>, value=5
__radd__ call: type=<class 'int'>, value=6
__radd__ call: type=<class 'int'>, value=7
We can see __add__
got called once and received xr.DataArray
obj but __radd__
got called 8 times and received int
s. This causes 2 problems;
- Performance issue on large
xr.DataArray
- No access to
xr.DataArray
coords which is needed in a more realistic use case
Describe the solution you'd like
I would like to have a mechanism so that DemoObj.__radd__
got called only once and received xr.DataArray
instance in the above example.
Describe alternatives you've considered
Option 1:
The most naive approach to workaround this is to call obj.__radd__(da)
to achieve da + obj
which defeats the purpose of implementing the reflexive operator and not offer good readability.
Option 2:
As xr.DataArray._binary_op
replies on numpy's operator resolving mechanism under the hood, I could improve the situation by setting __array_ufunc__ = None
on my class, e.g.:
class DemoObj:
__array_ufunc__ = None
def __add__(self, other):
...
def __radd__(self, other):
...
This will make __radd__
get called once with np.ndarray
instead of 8 times with int
s. This solves the potential perf concern, however, it still doesn't cover the case if xr.Dataarray.coords
is needed.
Additional context
Considering xr.DataArray._binary_op
has already returned NoImplemented
for a list of classes:
https://github.com/pydata/xarray/blob/v2025.01.1/xarray/core/dataarray.py#L4808-L4809
I'm wondering whether we should do the same for classes has __array_ufunc__ = None
, i.e.:
def _binary_op(
self: T_DataArray,
other: Any,
f: Callable,
reflexive: bool = False,
) -> T_DataArray:
if hasattr(other, '__array_ufunc__') and other.__array_ufunc__ is None:
return NotImplementd
...
I'm happy with a similar property if you prefer to make it xarray specific. I'm happy to make the PR as well once you confirmed the mechanism / property name you preferred.
Many thanks in advance!