Description
The raster
functionality in for weights building currently supports DataArray
objects created through xarray.open_rasterio
:
import xarray
from libpysal.weights import Queen
pop = xarray.open_rasterio(
'https://geographicdata.science/book/_downloads/5263090bd0bdbd7d1635505ff7d36d04/ghsl_sao_paulo.tif'
)
w = Queen.from_xarray(pop)
The code above correctly picks up that missing data are specified with -200 and removes them from the calculation of the weights:
w.n
> 97232
This is not the case when we use DataArray
objects build through the (generally recommended) rioxarray.open_rasterio
:
import rioxarray
pop2 = rioxarray.open_rasterio(
'https://geographicdata.science/book/_downloads/5263090bd0bdbd7d1635505ff7d36d04/ghsl_sao_paulo.tif'
)
w2 = Queen.from_xarray(pop2)
which does not pick up the missing data:
w.n
> 194688
My hunch is that this is because the builder picks up missing values only through the attrs
object:
libpysal/libpysal/weights/raster.py
Lines 229 to 233 in 0f03a31
But rioxarray
does not populate that general object, instead logs missing data under pop2.rio.nodata
:
pop.attrs
> {'transform': (250.0, 0.0, -4482000.0, 0.0, -250.0, -2822000.0),
'crs': '+proj=moll +lon_0=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs=True',
'res': (250.0, 250.0),
'is_tiled': 0,
'nodatavals': (-200.0,),
'scales': (1.0,),
'offsets': (0.0,),
'AREA_OR_POINT': 'Area',
'grid_mapping': 'spatial_ref'}
pop2.attrs
> {'_FillValue': -200.0, 'scale_factor': 1.0, 'add_offset': 0.0}
pop2.rio.nodata
> -200.0
I think we should support both formats, read through vanilla xarray
and with rioxarray
loaded in the session too. The former is a more generic approach, the latter is a more robust one for rasters.
Not sure what'd be the ideal way of supporting this. I'm cc'ing here @MgeeeeK since he worked on the original implementation and might have some ideas, and @snowman2 as the main driver behind rioxarray
in case they have some suggestions in terms of best practices.