GPU-accelerated I/O for AnnData (h5ad), ETA for Wilcoxon DE (#487), and guidance on get.anndata_to_GPU/CPU #500

@asmlgkj

Hi rsc team — thanks a lot for the great work on rapids-singlecell!

I have three related questions:

  1. GPU-accelerated I/O for AnnData / other formats

Is there a plan or timeline to support GPU-accelerated reading of h5ad (and possibly loom/mtx) so that data land on the device without an extra round-trip?

Even partial acceleration (e.g., reading into GPU-native sparse/dense arrays, or a zero-copy path for .X) would already help a lot for large datasets.

Any recommended interim best practices to minimize host↔device copies when starting from h5ad?

  2. Wilcoxon differential expression (PR #487, "Wilcoxon rank-sum addition to rank_genes_groups")

I noticed that Wilcoxon DE has been added here: #487

What’s the expected release or availability timeline?

Will the API mirror Scanpy’s tl.rank_genes_groups(..., method="wilcoxon") semantics (ties, groups vs. reference, dense/sparse support)?

Any constraints to be aware of (e.g., memory behavior on large CSR inputs, batching, multi-GPU)?

  3. When to use rsc.get.anndata_to_GPU / rsc.get.anndata_to_CPU

I’m a bit unsure about when these calls are required vs. when rsc will handle device placement automatically.

Concretely:

Do rsc functions auto-move .X (and relevant matrices) to GPU if they detect host arrays, or should we always call rsc.get.anndata_to_GPU(adata) first?

What exactly gets moved by these helpers: only .X, or also layers, obsm (e.g., embeddings), and other arrays if present?

Expected GPU dtypes/structures: should .X be on the device as a dense CuPy array or as a cupyx.scipy.sparse.csr_matrix? Any guidance on supported sparsity patterns?

Mixed workflows: If I run a CPU step (e.g., a Scanpy CPU-only function) and then a GPU step, do I need to call get.anndata_to_GPU(adata) again?

When is get.anndata_to_CPU(adata) recommended (e.g., for plotting, exporting, certain Scanpy ops)? Are there safeguards to avoid redundant copies?

A tiny MWE to clarify expectations would be super helpful.
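
To make the redundant-copy question concrete, a rough sketch of such a guard might look like the following. The helper names `array_location` and `ensure_on_gpu` are hypothetical and not part of rapids-singlecell. The check is only a heuristic: it assumes that device arrays are CuPy-backed, so their types live under the `cupy` or `cupyx` packages. Whether rsc already performs an equivalent check internally is exactly what is being asked.

```python
def array_location(x):
    """Return "device" for CuPy-backed objects and "host" for everything else.

    Heuristic (an assumption of this sketch): look at the top-level package
    that defines type(x), so cupy itself never has to be imported here.
    """
    top_level = type(x).__module__.split(".")[0]
    return "device" if top_level in ("cupy", "cupyx") else "host"


def ensure_on_gpu(adata, to_gpu):
    """Call to_gpu(adata) only when adata.X is still a host array.

    `to_gpu` is passed in by the caller (for example rsc.get.anndata_to_GPU),
    so this sketch has no hard dependency on rapids-singlecell.
    Returns True if a transfer was triggered, False if it was skipped.
    """
    if array_location(adata.X) == "host":
        to_gpu(adata)
        return True
    return False
```

In a mixed workflow this would be called as `ensure_on_gpu(adata, rsc.get.anndata_to_GPU)` after any CPU-only step. Note that it only inspects `.X`; layers and `obsm` entries would need the same check.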