Handle None values when writing and avoid AnnData file corruption. #12

@SeppeDeWinter

Hi

Thank you for the impressive work!

When a None value is written to an AnnData file, the program crashes and leaves the AnnData file corrupted.

For example:

Polars allows None values (see https://docs.pola.rs/user-guide/expressions/missing-data/). When a Polars Series of type str that contains a None is written to the AnnData file, the program panics, leaving a corrupted HDF5 file:

thread '<unnamed>' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/anndata-0.4.1/src/data/array/dataframe.rs:261:28: called `Option::unwrap()` on a `None` value

Reproducible example

import polars as pl
import snapatac2 as snap

adata = snap.AnnData(
    filename="/tmp/test.h5ad",
    obs=pl.DataFrame({"Index": ["foo1", "foo2", "foo3", "foo4", "foo5"]}),
)

# The last value is None, which triggers the panic on write.
adata.obs["test"] = pl.Series(name="example", values=["bar1", "bar2", "bar3", "bar4", None])

thread '<unnamed>' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/anndata-0.4.1/src/data/array/dataframe.rs:261:28:
called `Option::unwrap()` on a `None` value
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Aborted (core dumped)
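
Until None is handled by the library, one possible workaround is to replace missing values before building the Series. The following is only a minimal sketch using plain Python: `fill_missing` and the `"NA"` placeholder are hypothetical names chosen for illustration, not part of snapatac2 or Polars, and I have not verified this against every code path that writes a dataframe.

```python
from typing import List, Optional, Sequence


def fill_missing(values: Sequence[Optional[str]], placeholder: str = "") -> List[str]:
    """Return a copy of `values` with every None replaced by `placeholder`."""
    return [placeholder if v is None else v for v in values]


raw = ["bar1", "bar2", "bar3", "bar4", None]
cleaned = fill_missing(raw, placeholder="NA")

# `cleaned` no longer contains None, so it could then be wrapped in
# pl.Series(name="example", values=cleaned) and assigned to adata.obs["test"]
# (assumption: a string Series without nulls does not reach the failing unwrap).
```

This changes the data (missing values become a sentinel string), so it is a stopgap rather than a fix.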

File corruption

>>> import snapatac2 as snap
>>> snap.read("/tmp/test.h5ad")
thread '<unnamed>' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyanndata-0.4.1/src/anndata.rs:36:60:
called `Result::unwrap()` on an `Err` value: H5Fopen(): unable to open file: bad object header version number
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 snap.read("/tmp/test.h5ad")

PanicException: called `Result::unwrap()` on an `Err` value: H5Fopen(): unable to open file: bad object header version number
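
Because the process aborts instead of raising a Python exception, a try/except around the assignment cannot protect the file. A defensive copy taken before modifying a backed AnnData may limit the damage in the meantime. This is a sketch using only the standard library: `backup_file` and the `.bak` suffix are hypothetical, and it assumes the copy is taken while the file is not open for writing.

```python
import shutil
from pathlib import Path


def backup_file(path: str, suffix: str = ".bak") -> Path:
    """Copy `path` to a sibling file named `<name><suffix>` and return the copy's path."""
    src = Path(path)
    dst = src.with_name(src.name + suffix)
    shutil.copy2(src, dst)  # copies the contents and, where possible, file metadata
    return dst


# Hypothetical usage, before opening the file and writing to it:
#   backup = backup_file("/tmp/test.h5ad")
# If a later write aborts and corrupts the original, the backup can be copied back.
```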

This is caused by unchecked `unwrap()` calls on `Option` values at:

https://github.com/kaizhang/anndata-rs/blob/a7752429597126c4eb548ebb064dda717294c7e8/anndata/src/data/array/dataframe.rs#L175-L275

All the best,

Seppe
