In the Python ecosystem, Xarray is by far the most popular package for working with multidimensional labelled arrays. The main data structures it provides are:
- `DataArray`, analogous to `DimArray`.
- `Dataset`, analogous to `DimStack`.
DimensionalData integrates with PythonCall.jl to allow converting these Xarray types to their DimensionalData equivalents:
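```julia
using DimensionalData
import PythonCall: pyconvert

# `my_dataarray` and `my_dataset` are existing xarray DataArray and Dataset
# objects held through PythonCall.
# By default the conversion shares the underlying array with Python.
my_dimarray = pyconvert(DimArray, my_dataarray)
my_dimstack = pyconvert(DimStack, my_dataset)
```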
Here are some things to keep in mind when converting:
- `pyconvert(DimArray, x)` is zero-copy by default, i.e. it shares the underlying array with Python and registers itself with Python's GC to ensure that the memory isn't garbage-collected prematurely. To make a copy instead, call it as `pyconvert(DimArray, x; copy=true)`.
- When doing a zero-copy conversion from `x` to `x_jl`, `parent(x_jl)` will be a `PyArray`. In most situations this adds no overhead, but note that a `PyArray` is not a `DenseArray`, so some operations that dispatch on `DenseArray`, e.g. BLAS calls, may not be performant. See these issues for more information:

  When `copy=true`, `parent(x_jl)` will always be a standard `Array`. However, we do not consider the type of the parent array to be covered by semver, so this may change in the future.
- Python stores arrays in row-major order whereas Julia stores them in column-major order, hence the dimensions on a converted `DimArray` will be in reverse order from the original `DataArray`. This is done to ensure that the 'fast axis' to iterate over is the same dimension in both Julia and Python.
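
The dimension reversal described in the last point can be illustrated without Python at all. The following is a minimal sketch using only Base Julia; the flat vector `buf` is a hypothetical stand-in for the memory of a row-major array of shape `(2, 3)`, and the names `buf`, `A` and `B` are chosen for illustration only. It is not how DimensionalData performs the conversion internally; it only shows why reversing the dimensions lets the same memory be reused without a copy.

```julia
# Flat memory of a hypothetical row-major (C-order) array with shape (2, 3):
#   [[1, 2, 3],
#    [4, 5, 6]]
buf = [1, 2, 3, 4, 5, 6]

# Julia is column-major, so viewing the same memory without copying
# requires the reversed shape (3, 2). `reshape` shares memory with `buf`.
A = reshape(buf, 3, 2)

# The zero-based Python element x[i, j] corresponds to A[j + 1, i + 1].
# For example, Python's x[1, 0] == 4 is found at A[1, 2].
println(A[1, 2])     # 4

# The first Julia dimension is the contiguous ("fast") axis, which
# corresponds to the last dimension on the Python side.
println(strides(A))  # (1, 3)

# Restoring the Python-style (2, 3) orientation needs an explicit
# permutation, which allocates a new array.
B = permutedims(A)

# Writing through the reshaped array changes the original buffer,
# while the permuted copy is unaffected.
A[1, 1] = 10
println(buf[1])      # 10
println(B[1, 1])     # 1
```

The same trade-off applies to a converted `DimArray`: keeping the reversed dimension order is what makes sharing memory possible, whereas forcing the original order would require permuting, and therefore copying, the data.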