Xarray and PythonCall.jl

In the Python ecosystem, Xarray is by far the most popular package for working with multidimensional labelled arrays. The main data structures it provides are DataArray and Dataset.

DimensionalData integrates with PythonCall.jl to allow converting these Xarray types to their DimensionalData equivalents, DimArray and DimStack respectively:

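using DimensionalData
import PythonCall: pyconvert

# my_dataarray is an xarray.DataArray held as a Python object.
# By default this will share the underlying array.
my_dimarray = pyconvert(DimArray, my_dataarray)

# my_dataset is an xarray.Dataset; it converts to a DimStack.
my_dimstack = pyconvert(DimStack, my_dataset)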

Here are some things to keep in mind when converting:

  • pyconvert(DimArray, x) is zero-copy by default, i.e. it will share the underlying array with Python and register itself with Python's GC to ensure that the memory isn't garbage-collected prematurely. If you want to make a copy instead, pass copy=true: pyconvert(DimArray, x; copy=true).

  • When doing a zero-copy conversion from x to x_jl, parent(x_jl) will be a PyArray. In most situations this should add no overhead, but note that a PyArray is not a DenseArray, so operations that dispatch on DenseArray (e.g. BLAS calls) may not be performant. See these issues for more information:

    When copy=true, parent(x_jl) will always be a standard Array. However, the type of the parent array is not covered by semver, so this may change in the future.

  • Python stores arrays in row-major order whereas Julia stores them in column-major order, hence the dimensions on a converted DimArray will be in reverse order from the original DataArray. This is done to ensure that the 'fast axis' to iterate over is the same dimension in both Julia and Python.
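The reversed dimension order can be illustrated in plain Julia, without Python installed. The following is a minimal sketch, not how the conversion is implemented: the names `buffer`, `A`, `lazy` and `dense` are hypothetical, and a flat vector stands in for the memory of a 2×3 row-major Python array. It also uses a `PermutedDimsArray` as a stand-in for an array wrapper that is not a `DenseArray`, loosely analogous to the `PyArray` note above.

```julia
# Memory of a Python array [[1, 2, 3], [4, 5, 6]] with shape (2, 3),
# stored in row-major order: the last axis is contiguous.
buffer = [1, 2, 3, 4, 5, 6]

# Julia is column-major, so viewing the same memory without copying
# means reversing the dimensions: (3, 2) instead of (2, 3).
A = reshape(buffer, 3, 2)

# Python's x[i, j] (0-based) corresponds to A[j + 1, i + 1] in Julia.
@assert A[3, 1] == 3   # Python x[0, 2]
@assert A[1, 2] == 4   # Python x[1, 0]

# The contiguous ("fast") axis is the first one in Julia,
# just as it is the last one in Python.
@assert A[:, 1] == [1, 2, 3]

# The Python axis order can be restored lazily, but the wrapper is not
# a DenseArray, so methods that dispatch on DenseArray will not apply.
lazy = PermutedDimsArray(A, (2, 1))
@assert lazy == [1 2 3; 4 5 6]
@assert !(lazy isa DenseArray)

# Materialising the wrapper gives a standard Array, at the cost of a copy.
dense = Array(lazy)
@assert dense isa DenseArray
```

For a converted DimArray the same trade-off would presumably apply: keeping the reversed order avoids a copy, while restoring the original order or obtaining a dense parent requires materialising the data.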