Please make sure these conditions are met
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of anndata.
- (optional) I have confirmed this bug exists on the main branch of anndata.
Report
Code:
import anndata as ad
import h5py
import zarr
from anndata.experimental import read_lazy
with h5py.File("/lustre/groups/ml01/workspace/100mil/100m_int_indices.h5ad", "r") as f:
    adata_all = ad.AnnData(
        obs=read_lazy(f["obs"]),
        var=read_lazy(f["var"]),
        uns=read_lazy(f["uns"]),
        obsm=read_lazy(f["obsm"]),
    )
adata_all.obs['cell_line'] # fails here (it is categorical data)
This is what the output normally looks like:
0 CVCL_0131
1 CVCL_0480
2 CVCL_0293
3 CVCL_0397
4 CVCL_1097
...
95624329 CVCL_0504
95624330 CVCL_1693
95624331 CVCL_1381
95624332 CVCL_1285
95624333 CVCL_1550
Name: cell_line, Length: 95624334, dtype: category
Categories (50, object): ['CVCL_0023', 'CVCL_0028', 'CVCL_0069', 'CVCL_0099', ..., 'CVCL_1717', 'CVCL_1724', 'CVCL_1731', 'CVCL_C466']
Traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
File /ictstr01/home/icb/selman.ozleyen/.local/share/mamba/envs/lpert/lib/python3.12/site-packages/IPython/core/formatters.py:770, in PlainTextFormatter.__call__(self, obj)
763 stream = StringIO()
764 printer = pretty.RepresentationPrinter(stream, self.verbose,
765 self.max_width, self.newline,
766 max_seq_length=self.max_seq_length,
767 singleton_pprinters=self.singleton_printers,
768 type_pprinters=self.type_printers,
769 deferred_pprinters=self.deferred_printers)
--> 770 printer.pretty(obj)
771 printer.flush()
772 return stream.getvalue()
File /ictstr01/home/icb/selman.ozleyen/.local/share/mamba/envs/lpert/lib/python3.12/site-packages/IPython/lib/pretty.py:411, in RepresentationPrinter.pretty(self, obj)
400 return meth(obj, self, cycle)
401 if (
402 cls is not object
403 # check if cls defines __repr__
(...) 409 and callable(_safe_getattr(cls, "__repr__", None))
410 ):
--> 411 return _repr_pprint(obj, self, cycle)
413 return _default_pprint(obj, self, cycle)
414 finally:
...
File h5py/h5d.pyx:399, in h5py.h5d.DatasetID.get_type()
ValueError: Invalid dataset identifier (identifier is not of specified type)
Error raised while reading key '??' of <class 'h5py._hl.dataset.Dataset'> from /
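The traceback passes through IPython's display formatter, which suggests the failing expression was evaluated at the top level of a cell, after the `with h5py.File(...)` block had already closed the file. If that is the case (an assumption, the report does not confirm it), the lazily loaded column would still hold a reference to a closed handle when its repr is computed. The sketch below illustrates that general failure mode using only the standard library. `LazyColumn` and `demo` are hypothetical names, and this is an analogy, not anndata's or h5py's implementation.

```python
import tempfile
from pathlib import Path


class LazyColumn:
    """Hypothetical stand-in for a lazily loaded column.

    It keeps the open file handle and reads data only on request.
    """

    def __init__(self, handle):
        self._handle = handle  # nothing is read at construction time

    def load(self):
        # The deferred read happens here, so the handle must still be open.
        self._handle.seek(0)
        return self._handle.read().split()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "cell_line.txt"
        path.write_text("CVCL_0131 CVCL_0480 CVCL_0293")

        with open(path) as f:
            column = LazyColumn(f)
            inside = column.load()  # handle still open: the read works

        try:
            column.load()  # handle closed: the deferred read fails
            outside = "ok"
        except ValueError as err:
            outside = f"ValueError: {err}"
    return inside, outside


inside, outside = demo()
print(inside)
print(outside)
```

If this is indeed the cause, keeping the file open for as long as the lazy object is in use would be the first thing to try.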
Versions
| Package | Version |
| ------- | ----------------------- |
| xarray | 2025.9.0 |
| anndata | 0.13.0.dev28+g1ba19458f |
| h5py | 3.14.0 |
| zarr | 3.1.2 |
| pandas | 2.3.2 |
| Dependency | Version |
| ------------------ | --------------------- |
| packaging | 25.0 |
| debugpy | 1.8.12 |
| msgpack | 1.1.1 |
| cupy-cuda12x | 13.6.0 |
| urllib3 | 2.5.0 |
| setuptools | 80.9.0 |
| click | 8.2.1 |
| Pygments | 2.19.2 |
| legacy-api-wrap | 1.4.1 |
| cloudpickle | 3.1.1 |
| donfig | 0.8.1.post1 |
| executing | 2.2.1 |
| certifi | 2025.8.3 (2025.08.03) |
| jupyter_core | 5.8.1 |
| ipywidgets | 8.1.7 |
| tqdm | 4.67.1 |
| jedi | 0.19.2 |
| ipykernel | 6.29.5 |
| crc32c | 2.7.1 |
| MarkupSafe | 3.0.2 |
| attrs | 25.3.0 |
| stack-data | 0.6.3 |
| fastrlock | 0.8.3 |
| jupyter_client | 8.6.3 |
| wcwidth | 0.2.13 |
| prompt_toolkit | 3.0.52 |
| parso | 0.8.5 |
| six | 1.17.0 |
| sortedcontainers | 2.4.0 |
| idna | 3.10 |
| Jinja2 | 3.1.6 |
| requests | 2.32.5 |
| simplejson | 3.20.1 |
| omegaconf | 2.3.0 |
| scipy | 1.15.3 |
| pyarrow | 21.0.0 |
| pure_eval | 0.2.3 |
| tblib | 3.1.0 |
| fsspec | 2025.9.0 |
| pytz | 2025.2 |
| toolz | 1.0.0 |
| comm | 0.2.2 |
| numpy | 2.2.6 |
| ipython | 9.5.0 |
| typing_extensions | 4.15.0 |
| torch | 2.8.0 |
| locket | 1.0.0 |
| numcodecs | 0.16.2 |
| asttokens | 3.0.0 |
| natsort | 8.4.0 |
| dask | 2025.9.1 |
| distributed | 2025.9.1 |
| scanpy | 1.11.4 |
| python-dateutil | 2.9.0.post0 |
| psutil | 7.0.0 |
| PyYAML | 6.0.2 |
| platformdirs | 4.4.0 |
| rich | 14.1.0 |
| traitlets | 5.14.3 |
| tornado | 6.5.2 |
| session-info2 | 0.2.1 |
| zict | 3.0.0 |
| pyzmq | 27.1.0 |
| decorator | 5.2.1 |
| charset-normalizer | 3.4.3 |
| Component | Info |
| --------- | ------------------------------------------------------------------------------ |
| Python | 3.12.11 \| packaged by conda-forge \| (main, Jun 4 2025, 14:45:31) [GCC 13.3.0] |
| OS | Linux-5.14.0-570.25.1.el9_6.x86_64-x86_64-with-glibc2.34 |
| Updated | 2025-10-16 13:10 |