Custom to_memory handling compute #2185

@ilan-gold

Please describe your wishes and possible alternatives to achieve the desired result.

Felix pointed out that .compute() is not well optimized for reading large amounts of data into memory, because it can bring all of the data onto a single node (at least with sparse data, though I'm not sure why the logic would be different for dense). To get around this, we came up with:

https://github.com/scverse/annbatch/pull/75/files#diff-0c2c0ac5efaec3e0ed57fb30c4b5910000492bbb03d61232720e7791de2fa20eR169-R171

I don't know whether this is an issue with dense data as well, but with sparse data it gave huge memory savings and made memory usage predictable.
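For readers who don't want to open the diff, here is a rough sketch of the general idea of avoiding one big `.compute()`. It is not necessarily what the linked annbatch lines do. It assumes a dask array whose chunks are scipy CSR matrices and which is chunked along rows only. The name `to_memory_blockwise` is hypothetical and not part of anndata or annbatch.

```python
import dask.array as da
import numpy as np
import scipy.sparse as sp


def to_memory_blockwise(x: da.Array) -> sp.csr_matrix:
    """Materialize a row-chunked sparse dask array one block at a time.

    Hypothetical helper for illustration only: each row block is computed
    on its own and the finished blocks are stacked with scipy, instead of
    asking dask to assemble the whole array in a single compute() call.
    """
    # Assumption: every row block spans all columns (one chunk along axis 1).
    if x.numblocks[1] != 1:
        raise ValueError("expected a single chunk along the column axis")
    blocks = []
    for i in range(x.numblocks[0]):
        # x.blocks[i] is a dask array holding just the i-th row block.
        blocks.append(x.blocks[i].compute())
    return sp.vstack(blocks, format="csr")


# Small usage example: a dense array wrapped so that each chunk is a CSR matrix.
dense = (np.arange(24, dtype=np.float64).reshape(6, 4) % 3)
lazy = da.from_array(dense, chunks=(2, 4)).map_blocks(
    sp.csr_matrix,
    meta=sp.csr_matrix(np.array([], dtype=dense.dtype)),
)
result = to_memory_blockwise(lazy)
```

Whether this actually saves memory depends on the scheduler and the data; the sketch only shows the block-by-block shape of the approach.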

cc @Intron7, maybe you've seen something like this? I would think this belongs in to_memory() if we were to implement it (https://anndata.readthedocs.io/en/latest/generated/anndata.AnnData.to_memory.html), but I am open to other ideas. Maybe FAU actually @flying-sheep?
