
Support datashader.spatial.points.to_parquet with pyarrow #67

Open
@jonmmease

Description

PR holoviz/datashader#702 introduced support for spatially indexing Dask dataframes and writing them out as parquet files with custom spatial metadata using `datashader.spatial.points.to_parquet`.

To accomplish this, the parquet file is initially written out using dask's dask.dataframe.io.to_parquet function. Then the parquet file is opened with fastparquet directly. The parquet metadata is retrieved using fastparquet, the spatial metadata is added, and then the updated metadata is written back to the file.

In order to support the creation of spatially partitioned parquet files using pyarrow (rather than fastparquet), we would need to work out a similar approach to adding properties to the parquet metadata using the pyarrow parquet API.
