Description
Hello, I am running into an issue where using datashader on a `DaskGeoDataFrame` results in an error. To reproduce, I have the following Poetry environment running on Ubuntu 22.04.5 LTS:
```toml
python = ">=3.12,<3.13"
spatialpandas = "0.5.0"
dask = "2025.3.0"
datashader = "0.17.0"
numpy = "2.1.3"
```
I followed this blog post from HoloViz to set up the `DaskGeoDataFrame`; the code that triggers the error is below:
```python
from pathlib import Path

from datashader import Canvas
from spatialpandas.dask import DaskGeoDataFrame
from spatialpandas.io import read_parquet_dask


def run():
    pq_file = Path(__file__).parent / "data" / "test.parq"
    gdf = read_parquet_dask(pq_file)
    assert isinstance(gdf, DaskGeoDataFrame)
    canvas = Canvas()
    canvas.points(gdf, geometry="geometry")


if __name__ == "__main__":
    run()
```
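To help narrow this down, a check I would try (a sketch; I have not run it on the failing versions) is materializing the frame first and aggregating the in-memory `GeoDataFrame`, which should show whether the regression is specific to datashader's dask code path:

```python
# Sketch: compute() the DaskGeoDataFrame into an in-memory spatialpandas
# GeoDataFrame and aggregate that instead. If this succeeds on the new
# versions, the failure is likely in the dask pipeline rather than the
# points glyph itself.
pdf = gdf.compute()
canvas.points(pdf, geometry="geometry")
```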
This gives the following error:
```
Traceback (most recent call last):
  File "2025-03-27_minimal.py", line 54, in <module>
    run()
  File "2025-03-27_minimal.py", line 50, in run
    canvas.points(gdf, geometry="geometry")
  File "/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/datashader/core.py", line 229, in points
    return bypixel(source, self, glyph, agg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/datashader/core.py", line 1351, in bypixel
    return bypixel.pipeline(source, schema, canvas, glyph, agg, antialias=antialias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/datashader/utils.py", line 121, in __call__
    return lk[cls](head, *rest, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/datashader/data_libraries/dask.py", line 42, in dask_pipeline
    return da.compute(dsk, scheduler=scheduler)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/dask/base.py", line 656, in compute
    results = schedule(dsk, keys, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/dask/local.py", line 455, in get_async
    raise ValueError("Found no accessible jobs in dask")
ValueError: Found no accessible jobs in dask

Process finished with exit code 1
```
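Since the failure comes out of `get_async` in `dask/local.py`, one thing that might help isolate it (a sketch using the standard `dask.config` API; I haven't verified it changes the outcome on the new versions) is forcing the single-threaded scheduler:

```python
import dask

# Force the synchronous (single-threaded) scheduler for the aggregation,
# to rule out thread-pool / local-scheduler effects.
with dask.config.set(scheduler="synchronous"):
    canvas.points(gdf, geometry="geometry")
```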
To get the code to work, I had to revert the packages to the following:
```toml
python = ">=3.12,<3.13"
spatialpandas = "0.4.10"
dask = "2024.12.1"
datashader = "0.17.0"
numpy = "1.26.4"
```
The only output now is a bunch of warnings:
```
/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/dask/dataframe/__init__.py:49: FutureWarning:
Dask dataframe query planning is disabled because dask-expr is not installed.
You can install it with `pip install dask[dataframe]` or `conda install dask`.
This will raise in a future version.

  warnings.warn(msg, FutureWarning)
/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/spatialpandas/io/parquet.py:353: FutureWarning: Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 and will be removed in a future version.
  d = ParquetDataset(
/home/titanium/.cache/pypoetry/virtualenvs/sandbox-datashader2-_RrFaDUd-py3.12/lib/python3.12/site-packages/spatialpandas/io/parquet.py:137: FutureWarning: Passing 'use_legacy_dataset' is deprecated as of pyarrow 15.0.0 and will be removed in a future version.
  dataset = ParquetDataset(
# the same warning is repeated many times

Process finished with exit code 0
```
I wasn't sure how to create an empty `DaskGeoDataFrame`, so to generate the parquet file I downloaded one of the CSV files mentioned in the HoloViz blog post above and ran the script below:
```python
from pathlib import Path

import dask.dataframe as dd
import numpy as np
from dask.diagnostics import ProgressBar
from spatialpandas import GeoDataFrame
from spatialpandas.geometry import PointArray


def lon_lat_to_easting_northing(longitude, latitude):
    # Copied here to avoid a dependency on holoviews.
    # Standard lon/lat (EPSG:4326) -> Web Mercator (EPSG:3857) conversion.
    origin_shift = np.pi * 6378137
    easting = longitude * origin_shift / 180.0
    with np.errstate(divide="ignore", invalid="ignore"):
        northing = (
            np.log(np.tan((90 + latitude) * np.pi / 360.0)) * origin_shift / np.pi
        )
    return easting, northing


def convert_partition(df):
    east, north = lon_lat_to_easting_northing(
        df["LON"].astype("float32"), df["LAT"].astype("float32")
    )
    return GeoDataFrame({"geometry": PointArray((east, north))})


def convert_csv_to_gdf():
    base_dir = Path(__file__).parent / "data"
    csv_files = base_dir / "AIS_2020_01*.csv"
    pq_file = base_dir / "test.parq"
    # Empty frame used as the meta/template for map_partitions.
    example = GeoDataFrame({"geometry": PointArray([], dtype="float32")})
    with ProgressBar():
        print("Reading csv files")
        gdf = dd.read_csv(csv_files, assume_missing=True)
        gdf = gdf.map_partitions(convert_partition, meta=example)
        print("Writing parquet file")
        gdf = gdf.pack_partitions_to_parquet(pq_file, npartitions=64)
    return gdf


if __name__ == "__main__":
    convert_csv_to_gdf()
```
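As a quick sanity check that the conversion itself is sound (this only relies on the standard Web Mercator bound, pi * 6378137 ≈ 20037508.34 m, so it should hold regardless of package versions):

```python
import numpy as np

# lon=180, lat=0 should map to the Web Mercator x-bound with northing ~0.
east, north = lon_lat_to_easting_northing(180.0, 0.0)
assert abs(east - np.pi * 6378137) < 1e-6
assert abs(north) < 1e-6
```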
The conversion script was run with the following versions:
```toml
python = ">=3.12,<3.13"
spatialpandas = "0.4.10"
dask = "2024.12.1"
datashader = "0.17.0"
numpy = "1.26.4"
```
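On generating test data without the CSV download: I believe (an assumption based on the spatialpandas docs, not something the blog post shows) a small `DaskGeoDataFrame` can also be built in memory, since `dd.from_pandas` dispatches spatialpandas frames to the dask subclass:

```python
import dask.dataframe as dd
import numpy as np
from spatialpandas import GeoDataFrame
from spatialpandas.dask import DaskGeoDataFrame
from spatialpandas.geometry import PointArray

# Random points in Web Mercator coordinates. from_pandas on a spatialpandas
# GeoDataFrame should come back as a DaskGeoDataFrame via dask's dispatch.
rng = np.random.default_rng(0)
east = rng.uniform(-2e7, 2e7, 1_000).astype("float32")
north = rng.uniform(-2e7, 2e7, 1_000).astype("float32")
pdf = GeoDataFrame({"geometry": PointArray((east, north))})
ddf = dd.from_pandas(pdf, npartitions=4)
assert isinstance(ddf, DaskGeoDataFrame)
```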
This is not exactly a blocker, but it would be nice to be able to use up-to-date packages. Thank you!