Open
Description
This is a curiosity question, but could pycytominer work on backed data? Would this have speed-up benefits? (Probably worth discussing in the context of a narwhals upgrade in 2.0 as well.)
Originally posted by @gwaybio in #573 (comment)
We could implement lazy file reader capabilities to enhance efficiency by avoiding full data reads unless absolutely necessary.
Examples / inspirations include:
- `anndata.experimental.read_lazy` (which enables lazy reads of anndata)
- `pyarrow.parquet.ParquetFile` (which abstracts Parquet data reads)
- `pl.LazyFrame` (for common interfacing with these data)
- `narwhals.LazyFrame` (for a common abstraction layer for multiple dataframe types)
- duckdb `pl.LazyFrame` support (another possible abstraction layer for lazyframes from a number of data sources)