write_zarr fails if a column of floats is object type #1850

@WeilerP

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of anndata.
  • (optional) I have confirmed this bug exists on the master branch of anndata.

Report

In the following example, adata.obs["col_1"] has dtype object, which makes write_zarr fail; if the column has dtype float, everything works as expected.

Code:

import pandas as pd
import numpy as np
from anndata import AnnData

adata = AnnData(X=np.eye(2))
df = pd.DataFrame({"col_1": [0.1, 0.001], "col_2": ["a", "b"]})

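# df.values is a single object-dtype array (mixed float/str columns),
# so both new obs columns end up with dtype object
adata.obs[df.columns] = df.values
adata.write_zarr("dummy.zarr")  # raises the TypeError shown below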
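
For context, a minimal sketch of why col_1 ends up as object in the reproducer, using only pandas. The plain DataFrame `obs` is a hypothetical stand-in for adata.obs, so the snippet does not need anndata:

```python
import pandas as pd

df = pd.DataFrame({"col_1": [0.1, 0.001], "col_2": ["a", "b"]})

# A frame with mixed float/str columns has no common numeric dtype,
# so .values falls back to one 2D array of dtype object.
values = df.values
print(values.dtype)

# Hypothetical stand-in for adata.obs: same index labels, no columns yet.
obs = pd.DataFrame(index=["0", "1"])
obs[df.columns] = df.values

# col_1 holds floats but is stored with dtype object.
print(obs.dtypes)
```

The float column loses its dtype at the `df.values` step, before anything is written to disk.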

Traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/lib/python3.10/site-packages/anndata/_core/anndata.py", line 1928, in write_zarr
    write_zarr(store, self, chunks=chunks)
  File "/lib/python3.10/site-packages/anndata/_io/zarr.py", line 50, in write_zarr
    write_dispatched(f, "/", adata, callback=callback, dataset_kwargs=ds_kwargs)
  File "/lib/python3.10/site-packages/anndata/experimental/_dispatch_io.py", line 77, in write_dispatched
    writer.write_elem(store, key, elem, dataset_kwargs=dataset_kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/utils.py", line 249, in func_wrapper
    return func(*args, **kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 355, in write_elem
    return self.callback(
  File "/lib/python3.10/site-packages/anndata/_io/zarr.py", line 48, in callback
    func(s, k, elem, dataset_kwargs=dataset_kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 71, in wrapper
    result = func(g, k, *args, **kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/methods.py", line 278, in write_anndata
    _writer.write_elem(g, "obs", adata.obs, dataset_kwargs=dataset_kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/utils.py", line 249, in func_wrapper
    return func(*args, **kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 355, in write_elem
    return self.callback(
  File "/lib/python3.10/site-packages/anndata/_io/zarr.py", line 48, in callback
    func(s, k, elem, dataset_kwargs=dataset_kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 71, in wrapper
    result = func(g, k, *args, **kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/methods.py", line 873, in write_dataframe
    _writer.write_elem(
  File "/lib/python3.10/site-packages/anndata/_io/utils.py", line 249, in func_wrapper
    return func(*args, **kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 355, in write_elem
    return self.callback(
  File "/lib/python3.10/site-packages/anndata/_io/zarr.py", line 48, in callback
    func(s, k, elem, dataset_kwargs=dataset_kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/registry.py", line 71, in wrapper
    result = func(g, k, *args, **kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/utils.py", line 308, in func_wrapper
    func(f, k, elem, _writer=_writer, dataset_kwargs=dataset_kwargs)
  File "/lib/python3.10/site-packages/anndata/_io/specs/methods.py", line 528, in write_vlen_string_array_zarr
    f[k][:] = elem
  File "/lib/python3.10/site-packages/zarr/core.py", line 1449, in __setitem__
    self.set_basic_selection(pure_selection, value, fields=fields)
  File "/lib/python3.10/site-packages/zarr/core.py", line 1545, in set_basic_selection
    return self._set_basic_selection_nd(selection, value, fields=fields)
  File "/lib/python3.10/site-packages/zarr/core.py", line 1935, in _set_basic_selection_nd
    self._set_selection(indexer, value, fields=fields)
  File "/lib/python3.10/site-packages/zarr/core.py", line 1988, in _set_selection
    self._chunk_setitem(chunk_coords, chunk_selection, chunk_value, fields=fields)
  File "/lib/python3.10/site-packages/zarr/core.py", line 2261, in _chunk_setitem
    self._chunk_setitem_nosync(chunk_coords, chunk_selection, value, fields=fields)
  File "/lib/python3.10/site-packages/zarr/core.py", line 2271, in _chunk_setitem_nosync
    self.chunk_store[ckey] = self._encode_chunk(cdata)
  File "/lib/python3.10/site-packages/zarr/core.py", line 2387, in _encode_chunk
    chunk = f.encode(chunk)
  File "numcodecs/vlen.pyx", line 104, in numcodecs.vlen.VLenUTF8.encode
TypeError: expected unicode string, found 0.1
Error raised while writing key 'col_1' of <class 'zarr.hierarchy.Group'> to /obs
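
A possible workaround, sketched under the assumption that restoring a float dtype before writing is enough (the report states that a float column writes fine). As above, `obs` is a hypothetical plain-DataFrame stand-in for adata.obs, and the snippet itself does not call anndata:

```python
import pandas as pd

df = pd.DataFrame({"col_1": [0.1, 0.001], "col_2": ["a", "b"]})

# Hypothetical stand-in for adata.obs after the assignment in the report.
obs = pd.DataFrame(index=["0", "1"])
obs[df.columns] = df.values  # col_1 is dtype object here

# Option 1: let pandas re-infer dtypes for object columns.
fixed = obs.infer_objects()

# Option 2: assign column by column so each column keeps its own dtype.
obs_per_column = pd.DataFrame(index=["0", "1"])
for name in df.columns:
    obs_per_column[name] = df[name].to_numpy()

print(fixed.dtypes)
print(obs_per_column.dtypes)
```

Applied to the reproducer, Option 1 would presumably be `adata.obs = adata.obs.infer_objects()` before calling write_zarr. That step has not been verified against anndata here.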

Versions

| Dependency         | Version     |
| ------------------ | ----------- |
| setuptools         | 75.8.0      |
| natsort            | 8.4.0       |
| pyarrow            | 19.0.0      |
| numcodecs          | 0.13.1      |
| asciitree          | 0.3.3       |
| charset-normalizer | 3.4.1       |
| pytz               | 2025.1      |
| python-dateutil    | 2.9.0.post0 |
| h5py               | 3.12.1      |
| zarr               | 2.18.3      |
| six                | 1.17.0      |

| Component | Info                                                                                |
| --------- | ----------------------------------------------------------------------------------- |
| Python    | 3.10.16 \| packaged by conda-forge \| (main, Dec  5 2024, 14:12:04) [Clang 18.1.8 ] |
| OS        | macOS-15.1.1-x86_64-i386-64bit                                                      |
| Updated   | 2025-02-06 20:36                                                                    |
