AnnLoader raises TypeError for batch_size=1 #2065

@chillerb

Please make sure these conditions are met

  • (optional) I have confirmed this bug exists on the main branch of anndata.
  • I have confirmed this bug exists on the latest version of anndata.
  • I have checked that this issue has not already been reported.

Report

Code:

import numpy as np

from anndata import AnnData
from anndata.experimental import AnnLoader

samples = [AnnData(np.random.normal(size=(100,100))) for i in range(10)]

loader = AnnLoader(samples, batch_size=1)

for batch in loader:  # Don't wrap in DataLoader!
    print(type(batch))  # Should be AnnData
    print(batch.shape)
    break

Traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 10
      6 samples = [AnnData(np.random.normal(size=(100,100))) for i in range(10)]
      8 loader = AnnLoader(samples, batch_size=1)
---> 10 for batch in loader:
     11     print(type(batch))
     12     print(batch.shape)

File ~/micromamba/envs/dev/lib/python3.12/site-packages/torch/utils/data/dataloader.py:733, in _BaseDataLoaderIter.__next__(self)
    730 if self._sampler_iter is None:
    731     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    732     self._reset()  # type: ignore[call-arg]
--> 733 data = self._next_data()
    734 self._num_yielded += 1
    735 if (
    736     self._dataset_kind == _DatasetKind.Iterable
    737     and self._IterableDataset_len_called is not None
    738     and self._num_yielded > self._IterableDataset_len_called
    739 ):

File ~/micromamba/envs/dev/lib/python3.12/site-packages/torch/utils/data/dataloader.py:789, in _SingleProcessDataLoaderIter._next_data(self)
    787 def _next_data(self):
    788     index = self._next_index()  # may raise StopIteration
--> 789     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    790     if self._pin_memory:
    791         data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)

File ~/micromamba/envs/dev/lib/python3.12/site-packages/torch/utils/data/_utils/fetch.py:55, in _MapDatasetFetcher.fetch(self, possibly_batched_index)
     53 else:
     54     data = self.dataset[possibly_batched_index]
---> 55 return self.collate_fn(data)

File ~/micromamba/envs/dev/lib/python3.12/site-packages/torch/utils/data/_utils/collate.py:398, in default_collate(batch)
    337 def default_collate(batch):
    338     r"""
    339     Take in a batch of data and put the elements within the batch into a tensor with an additional outer dimension - batch size.
    340 
   (...)    396         >>> default_collate(batch)  # Handle `CustomType` automatically
    397     """
--> 398     return collate(batch, collate_fn_map=default_collate_fn_map)

File ~/micromamba/envs/dev/lib/python3.12/site-packages/torch/utils/data/_utils/collate.py:240, in collate(batch, collate_fn_map)
    232         except TypeError:
    233             # The sequence type may not support `copy()` / `__setitem__(index, item)`
    234             # or `__init__(iterable)` (e.g., `range`).
    235             return [
    236                 collate(samples, collate_fn_map=collate_fn_map)
    237                 for samples in transposed
    238             ]
--> 240 raise TypeError(default_collate_err_msg_format.format(elem_type))

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'anndata.experimental.multi_files._anncollection.AnnCollectionView'>
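
The traceback shows the map-style fetcher handing its result to `default_collate`, which accepts only tensors, NumPy arrays, numbers, dicts, or lists and therefore rejects the `AnnCollectionView` it receives. The following is a minimal, dependency-free sketch of that failure mode. It is not PyTorch's or anndata's actual code: `OpaqueView`, `toy_collate`, `passthrough_collate`, and `fetch` are hypothetical names standing in for `AnnCollectionView`, `default_collate`, a custom `collate_fn`, and the fetcher.

```python
from numbers import Number


class OpaqueView:
    """Hypothetical stand-in for an object the collate step does not recognise."""

    def __init__(self, index):
        self.index = index


def toy_collate(batch):
    """Toy type-dispatching collate: only numbers and lists are supported."""
    elem = batch[0]
    if isinstance(elem, Number):
        return list(batch)
    if isinstance(elem, list):
        # Collate position-wise, like transposing a batch of sequences.
        return [toy_collate(list(column)) for column in zip(*batch)]
    raise TypeError(f"toy_collate: unsupported element type {type(elem)}")


def passthrough_collate(batch):
    """Return the single element unchanged instead of trying to stack it."""
    (item,) = batch  # assumes exactly one element per batch
    return item


def fetch(dataset, indices, collate_fn):
    """Mimic a map-style fetch: index each sample, then collate the list."""
    return collate_fn([dataset[i] for i in indices])


dataset = [OpaqueView(i) for i in range(10)]

# The type-dispatching collate fails on the unknown element type.
try:
    fetch(dataset, [0], toy_collate)
    error_message = None
except TypeError as err:
    error_message = str(err)

# A pass-through collate hands the single element back untouched.
first = fetch(dataset, [0], passthrough_collate)
print(error_message)
print(type(first).__name__, first.index)
```

Whether supplying a comparable `collate_fn` to `AnnLoader` avoids the error in practice is untested here; it depends on how `AnnLoader` forwards keyword arguments to `DataLoader`.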

Versions

| Package | Version |
| ------- | ------- |
| anndata | 0.12.1  |
| numpy   | 1.26.4  |

| Dependency        | Version     |
| ----------------- | ----------- |
| typing_extensions | 4.14.1      |
| psutil            | 7.0.0       |
| fsspec            | 2025.7.0    |
| python-dateutil   | 2.9.0.post0 |
| pandas            | 2.3.1       |
| pytz              | 2025.2      |
| executing         | 2.2.0       |
| six               | 1.17.0      |
| wcwidth           | 0.2.13      |
| session-info2     | 0.2         |
| debugpy           | 1.8.15      |
| torch             | 2.7.1       |
| tqdm              | 4.67.1      |
| jupyter_core      | 5.8.1       |
| platformdirs      | 4.3.8       |
| ipython           | 9.4.0       |
| parso             | 0.8.4       |
| numcodecs         | 0.16.1      |
| ipykernel         | 6.30.0      |
| natsort           | 8.4.0       |
| msgpack           | 1.1.1       |
| donfig            | 0.8.1.post1 |
| jedi              | 0.19.2      |
| h5py              | 3.14.0      |
| comm              | 0.2.3       |
| tornado           | 6.5.1       |
| pyzmq             | 27.0.0      |
| prompt_toolkit    | 3.0.51      |
| jupyter_client    | 8.6.3       |
| asttokens         | 3.0.0       |
| Pygments          | 2.19.2      |
| colorama          | 0.4.6       |
| legacy-api-wrap   | 1.4.1       |
| zstandard         | 0.23.0      |
| setuptools        | 80.9.0      |
| stack_data        | 0.6.3       |
| PyYAML            | 6.0.2       |
| zarr              | 3.1.0       |
| crc32c            | 2.7.1       |
| pyarrow           | 20.0.0      |
| scipy             | 1.16.0      |
| packaging         | 25.0        |
| scanpy            | 1.11.2      |
| decorator         | 5.2.1       |
| pure_eval         | 0.2.3       |
| traitlets         | 5.14.3      |
| xarray            | 2025.7.1    |
| cloudpickle       | 3.1.1       |
| ipywidgets        | 8.1.7       |

| Component | Info                                                                             |
| --------- | -------------------------------------------------------------------------------- |
| Python    | 3.12.11 \| packaged by conda-forge \| (main, Jun  4 2025, 14:45:31) [GCC 13.3.0] |
| OS        | Linux-5.15.0-142-generic-x86_64-with-glibc2.36                                   |
| Updated   | 2025-07-30 14:48                                                                 |
