diff --git a/DESCRIPTION b/DESCRIPTION index df6c53d5..2331dea6 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: anndataR Title: AnnData interoperability in R -Version: 0.1.0.9011 +Version: 0.2.0 Authors@R: c( person("Robrecht", "Cannoodt", , "robrecht@data-intuitive.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-3641-729X", github = "rcannood")), @@ -48,7 +48,7 @@ Suggests: knitr, processx, reticulate (>= 1.41.1), - rhdf5, + rhdf5 (>= 2.52.1), rmarkdown, S4Vectors, Seurat, diff --git a/NEWS.md b/NEWS.md index 3cf0ce21..10504fa5 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,59 +1,56 @@ -# anndataR devel +# anndataR 0.2.0 -## anndataR 0.1.0.9011 +## Breaking changes -- Updates for compatibility with Python **anndata** >= 0.12.0 (PR #305, Fixes #304) +- Switch the HDF5 back end to use the **{rhdf5}** package instead of **{hdf5r}** + (PR #283, Fixes #272, #175, #299) + - This addresses various issues related to H5AD files and allows better + integration with Bioconductor. Most of the previous known issues have now + been resolved. + - It also greatly improves compatibility with H5AD files written by Python + **anndata** + - **NOTE:** Make sure to install **{rhdf5}** instead of **{hdf5r}** to be able + to read and write H5AD files! 
+ +## Major changes + +- Updates for compatibility with Python **anndata** >= 0.12.0 (PR #305, + Fixes #304) - Add helpers for reading/writing `NULL` values to/from H5AD files - Writing of `NULL` values can be disabled by setting `option(anndataR.write_null = FALSE)` to allow the files to be read by Python **anndata** < 0.12.0 -- Fix a bug where string arrays were not transposed correctly when writing to - H5AD files (PR #305) -- Fix a bug where the dimenions of dense arrays were not properly conserved - when reading from H5AD (PR #305) -- Remove workarounds and skipping of `none` values in roundtrip tests (PR #305) - -## anndataR 0.1.0.9010 - -- Switch HDF5 back end from **{hdf5r}** to **{rhdf5}** (PR #283, Fixes #272, #175, #299) - - Includes improved compatibility with H5AD files written by Python **anndata** -- Improvements to roundtrip testing (PR #283) - -## anndataR 0.1.0.9009 - -- Fix execution of roundtrip tests (PR #293) - -## anndataR 0.1.0.9008 - -- Add Bioconductor installation instructions in preparation of submission (PR #297) - -## anndataR 0.1.0.9007 - +- A `counts` or `data` layer is no longer required during `Seurat` conversion + (PR #284) + - There will still be a warning if neither of these is present as it may + affect compatibility with **{Seurat}** functions + +## Minor changes + +- Use accessor functions/methods instead of direct slot access where possible + (PR #291) - Refactor superfluous for loops (PR #298) -- -## anndataR 0.1.0.9006 - +- Change uses of `sapply()` to `vapply()` (PR #294) - Ignore `development_status.Rmd` vignette when building package (PR #296) +- Remove `anndataR.Rproj` file from repository (PR #292) -## anndataR 0.1.0.9005 - -- Bypass requiring a `counts` or `data` layer during `Seurat` conversion (PR #284) - -## anndataR 0.1.0.9004 +## Bug fixes -- Use accessors instead of direct slot access where possible (PR #291) +- Fix a bug where string arrays were not transposed correctly when writing to + H5AD files (PR #305) +- 
Fix a bug where the dimensions of dense arrays were not properly conserved + when reading from H5AD (PR #305) -## anndataR 0.1.0.9003 +## Documentation - Simplify and update vignettes (PR #282) +- Add Bioconductor installation instructions in preparation for submission (PR #297) -## anndataR 0.1.0.9002 - -- Remove `anndataR.Rproj` file from repository (PR #292) - -## anndataR 0.1.0.9001 +## Testing -- Change uses of `sapply()` to `vapply()` (PR #294) +- Improvements to round trip testing (PR #283, PR #293, PR #305) + - Most round trip tests are now enabled and pass successfully + - Conversion helpers have been added to assist with **{reticulate}** tests # anndataR 0.1.0 (inital release candidate) diff --git a/R/write_h5ad.R b/R/write_h5ad.R index 69853c35..3c977cbf 100644 --- a/R/write_h5ad.R +++ b/R/write_h5ad.R @@ -17,9 +17,20 @@ #' @param ... Additional arguments passed to [as_AnnData()] #' #' @details +#' +#' ## Compression +#' #' Compression is currently not supported for Boolean arrays, they will be #' written uncompressed. #' +#' ## `NULL` values +#' +#' For compatibility with changes in Python **anndata** 0.12.0, `NULL` values +#' in `uns` are written to H5AD files as a `NULL` dataset (instead of not being +#' written at all). To disable this behaviour, set +#' `options(anndataR.write_null = FALSE)`. This may be required to allow the file +#' to be read by older versions of Python **anndata**. 
+#' #' @return `path` invisibly #' @export #' diff --git a/README.md b/README.md index 1a0e4d3e..84465e95 100644 --- a/README.md +++ b/README.md @@ -1,31 +1,22 @@ # {anndataR}: An R package for working with AnnData objects anndataR logo [![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental) -[![CRAN status](https://www.r-pkg.org/badges/version/anndataR.png)](https://CRAN.R-project.org/package=anndataR) [![R-CMD-check](https://github.com/scverse/anndataR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/scverse/anndataR/actions/workflows/R-CMD-check.yaml) -**{anndataR}** aims to make the AnnData format a first-class citizen in -the R ecosystem, and to make it easy to work with AnnData files in R, -either directly or by converting them to a SingleCellExperiment or Seurat -object. +**{anndataR}** aims to make the `AnnData` format a first-class citizen in the R ecosystem, and to make it easy to work with AnnData files in R, either directly or by converting them to a `SingleCellExperiment` or `Seurat` object. **{anndataR}** is an scverse® community project maintained by [Data Intuitive](https://data-intuitive.com/), and is fiscally sponsored by the [Chan Zuckerberg Initiative](https://chanzuckerberg.com/). - ## Features of {anndataR} -- Provide an `R6` class to work with AnnData objects in R (either in-memory or on-disk). +- Provide an `R6` class to work with `AnnData` objects in R (either in-memory or on-disk) - Read/write `*.h5ad` files natively - Convert to/from `SingleCellExperiment` objects - Convert to/from `Seurat` objects -> [!WARNING] -> -> This package is still in the experimental stage, and may not work as -> expected. You can find the status of development of anndataR on the -> [feature tracking page](https://anndatar.data-intuitive.com/articles/design.html#feature-tracking) -> of the website. 
Please [report](https://github.com/scverse/anndataR/issues) any issues you encounter. +You can find the status of development of **{anndataR}** on the [feature tracking page](https://anndatar.data-intuitive.com/articles/design.html#feature-tracking) of the package website. +Please [report](https://github.com/scverse/anndataR/issues) any issues you encounter. ## Installation @@ -74,17 +65,13 @@ pak::pak("scverse/anndataR") Take note that you need all suggested dependencies available, and that building them can take some time. -- **Getting started**: An introduction to the package and its features. +- [**Getting started**](https://anndatar.data-intuitive.com/articles/anndataR.html): An introduction to the package and its features. `vignette("anndataR", package = "anndataR")` -- **Reading and writing H5AD files**: How to read and write `*.h5ad` files. - `vignette("usage_h5ad", package = "anndataR")` -- **Converting to/from Seurat objects**: How to convert between `AnnData` and `Seurat` objects. +- [**Read/write `Seurat` objects**](https://anndatar.data-intuitive.com/articles/usage_seurat.html): How to convert between `AnnData` and `Seurat` objects. `vignette("usage_seurat", package = "anndataR")` -- **Converting to/from SingleCellExperiment objects**: How to convert between `AnnData` and `SingleCellExperiment` objects. +- [**Read/write `SingleCellExperiment` objects**](https://anndatar.data-intuitive.com/articles/usage_singlecellexperiment.html): How to convert between `AnnData` and `SingleCellExperiment` objects `vignette("usage_singlecellexperiment", package = "anndataR")` -- **Software Design**: An overview of the design of the package. - `vignette("software_design", package = "anndataR")` -- **Development Status**: An overview of the development status of the package. - `vignette("development_status", package = "anndataR")` -- **Known Isses**: An overview of known issues with the package. 
+- [**Software Design**](https://anndatar.data-intuitive.com/articles/software_design.html): An overview of the design of the package +- [**Development Status**](https://anndatar.data-intuitive.com/articles/development_status.html): An overview of the development status of the package +- [**Known Issues**](https://anndatar.data-intuitive.com/articles/known_issues.html): An overview of known issues with the package. `vignette("known_issues", package = "anndataR")` diff --git a/_pkgdown.yml b/_pkgdown.yml index 96b70dac..bb785f30 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -9,8 +9,6 @@ navbar: text: Articles menu: - text: Usage - - text: Read/write H5AD files - href: articles/anndataR.html - text: Read/write Seurat objects href: articles/usage_seurat.html - text: Read/write SingleCellExperiment objects diff --git a/man/write_h5ad.Rd b/man/write_h5ad.Rd index 4ea0f8e1..78786406 100644 --- a/man/write_h5ad.Rd +++ b/man/write_h5ad.Rd @@ -39,9 +39,21 @@ file. Can be one of \code{"none"}, \code{"gzip"} or \code{"lzf"}. Defaults to \c Write an H5AD file } \details{ +\subsection{Compression}{ + Compression is currently not supported for Boolean arrays, they will be written uncompressed. } + +\subsection{\code{NULL} values}{ + +For compatibility with changes in Python \strong{anndata} 0.12.0, \code{NULL} values +in \code{uns} are written to H5AD files as a \code{NULL} dataset (instead of not being +written at all). To disable this behaviour, set +\code{options(anndataR.write_null = FALSE)}. This may be required to allow the file +to be read by older versions of Python \strong{anndata}. 
+} +} \examples{ adata <- AnnData( X = matrix(1:5, 3L, 5L), diff --git a/vignettes/anndataR.Rmd b/vignettes/anndataR.Rmd index 70b6e17b..56842aaf 100644 --- a/vignettes/anndataR.Rmd +++ b/vignettes/anndataR.Rmd @@ -28,13 +28,11 @@ library(SingleCellExperiment) ## Introduction **{anndataR}** allows users to work with `.h5ad` files, access various slots in the datasets and convert these files to `SingleCellExperiment` objects or `Seurat` objects, and vice versa. This enables users to move data easily between the different programming languages and analysis ecosystems needed to perform single-cell data analysis. -This package differs from [zellkonverter](https://bioconductor.org/packages/release/bioc/html/zellkonverter.html) because it reads and writes these `.h5ad` files natively in R, and allows conversion to and from `Seurat` objects as well. - -Check out `?anndataR` for a full list of the functions provided by this package. +This package differs from [**{zellkonverter}**](https://bioconductor.org/packages/release/bioc/html/zellkonverter.html) because it reads and writes these `.h5ad` files natively in R, and allows conversion to and from `Seurat` objects as well as `SingleCellExperiment`. 
## Installation -Install using either **BiocManager** or from GitHub using **pak**: +Install using either **{BiocManager}** or from GitHub using **{pak}**: ```{r, eval = FALSE} if (!requireNamespace("BiocManager", quietly = TRUE)) { @@ -62,13 +60,25 @@ library(anndataR) h5ad_path <- system.file("extdata", "example.h5ad", package = "anndataR") ``` -Read an h5ad file in memory: +By default, an H5AD file is read into an in-memory `AnnData` object: ```{r read-in-memory} adata <- read_h5ad(h5ad_path) ``` -Read an h5ad file on disk: +It can also be read as a `SingleCellExperiment` object: + +```{r read-as-SingleCellExperiment} +sce <- read_h5ad(h5ad_path, as = "SingleCellExperiment") +``` + +Or as a `Seurat` object: + +```{r read-as-Seurat} +obj <- read_h5ad(h5ad_path, as = "Seurat") +``` + +There is also an HDF5-backed `AnnData` object: ```{r read-on-disk} adata <- read_h5ad(h5ad_path, as = "HDF5AnnData") @@ -80,7 +90,7 @@ View structure: adata ``` -Access AnnData slots: +Access `AnnData` slots: ```{r access-slots} dim(adata$X) @@ -90,35 +100,35 @@ adata$var[1:5, 1:6] ## Interoperability -Convert the AnnData object to a SingleCellExperiment object: +Convert the `AnnData` object to a `SingleCellExperiment` object: ```{r as-SingleCellExperiment} sce <- adata$as_SingleCellExperiment() sce ``` -Convert the AnnData object to a Seurat object: +Convert the `AnnData` object to a `Seurat` object: ```{r as-Seurat} obj <- adata$as_Seurat() obj ``` -Convert a SingleCellExperiment object to an AnnData object: +Convert a `SingleCellExperiment` object to an `AnnData` object: ```{r as-AnnData-from-SingleCellExperiment} adata <- as_AnnData(sce) adata ``` -Convert a Seurat object to an AnnData object: +Convert a `Seurat` object to an `AnnData` object: ```{r as-AnnData-from-Seurat} adata <- as_AnnData(obj) adata ``` -## Manually create an object +## Manually create an `AnnData` object ```{r manually-create-object} adata <- AnnData( @@ -136,21 +146,21 @@ adata ## Write to disk: -Write an AnnData 
object to disk: +Write an `AnnData` object to disk: ```{r write-to-disk} tmpfile <- tempfile(fileext = ".h5ad") write_h5ad(adata, tmpfile) ``` -Write an SCE object to disk: +Write a `SingleCellExperiment` object to disk: ```{r write-SingleCellExperiment-to-disk} tmpfile <- tempfile(fileext = ".h5ad") write_h5ad(sce, tmpfile) ``` -Write a Seurat object to disk: +Write a `Seurat` object to disk: ```{r write-Seurat-to-disk} tmpfile <- tempfile(fileext = ".h5ad") diff --git a/vignettes/usage_seurat.Rmd b/vignettes/usage_seurat.Rmd index 9be53a69..c5c7c17e 100644 --- a/vignettes/usage_seurat.Rmd +++ b/vignettes/usage_seurat.Rmd @@ -18,27 +18,24 @@ knitr::opts_chunk$set( This vignette demonstrates how to read and write `Seurat` objects using the **{anndataR}** package, leveraging the interoperability between `Seurat` and the `AnnData` format. -Check out `?anndataR` for a full list of the functions provided by this package. - ## Introduction -Seurat is a widely used toolkit for single-cell analysis in R. - **{anndataR}** enables conversion between `Seurat` objects and `AnnData` objects, -allowing you to leverage the strengths of both the scverse and Seurat ecosystems. +**{Seurat}** is a widely used toolkit for single-cell analysis in R. + **{anndataR}** enables conversion between `Seurat` objects and `AnnData` objects, allowing you to leverage the strengths of both the **scverse** and **{Seurat}** ecosystems. ## Prerequisites -Before you begin, make sure you have both Seurat and **{anndataR}** installed. You can install them using the following code: +This vignette requires the **{Seurat}** package in addition to **{anndataR}**. 
+You can install them using the following code: ```r if (!requireNamespace("pak", quietly = TRUE)) { install.packages("pak") } pak::pak("Seurat") -pak::pak("scverse/anndataR") ``` -## Converting an AnnData Object to a Seurat Object +## Converting an `AnnData` Object to a `Seurat` Object Using an example `.h5ad` file included in the package, we will demonstrate how to read an `.h5ad` file and convert it to a `Seurat` object. @@ -49,7 +46,7 @@ library(Seurat) h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR") ``` -Read the `.h5ad` file and convert it to a `Seurat` object: +Read the `.h5ad` file as a `Seurat` object: ```{r read_data} seurat_obj <- read_h5ad(h5ad_file, as = "Seurat") @@ -64,17 +61,18 @@ seurat_obj <- adata$as_Seurat() seurat_obj ``` -Note that there is no one-to-one mapping possible between the AnnData and SeuratObject data structures, -so some information might be lost during conversion. It is recommended to carefully inspect the converted object -to ensure that all necessary information has been transferred. +Note that there is no one-to-one mapping possible between the `AnnData` and `Seurat` data structures, +so some information might be lost during conversion. +It is recommended to carefully inspect the converted object to ensure that all necessary information has been transferred. ### Customizing the conversion -You can customize the conversion process by providing specific mappings for each slot in the Seurat object. +You can customize the conversion process by providing specific mappings for each slot in the `Seurat` object. + Each of the mapping arguments can be provided with one of the following: - `TRUE`: all items in the slot will be copied using the default mapping - `FALSE`: the slot will not be copied -- a (named) character vector: the names are the names of the slot in the Seurat object, the values are the names of the slot in the AnnData object. 
+- A (named) character vector: the names are the names of the slot in the `Seurat` object, the values are the names of the slot in the `AnnData` object. See `?as_Seurat` for more details on how to customize the conversion process. For instance: @@ -93,7 +91,9 @@ seurat_obj <- adata$as_Seurat( seurat_obj ``` -## Convert a Seurat Object to an AnnData Object +The mapping arguments can also be passed directly to `read_h5ad()`. + +## Convert a `Seurat` object to an `AnnData` object The reverse conversion is also possible, allowing you to convert the `Seurat` object back to an `AnnData` object, or to just write out the `Seurat` object as an `.h5ad` file. @@ -102,6 +102,7 @@ write_h5ad(seurat_obj, tempfile(fileext = ".h5ad")) ``` This is equivalent to converting the `Seurat` object to an `AnnData` object and then writing it out: + ```{r convert_to_anndata} adata <- as_AnnData(seurat_obj) write_h5ad(adata, tempfile(fileext = ".h5ad")) @@ -111,6 +112,7 @@ You can again customize the conversion process by providing specific mappings fo For more details, see `?as_AnnData`. Here's an example: + ```{r customize_anndata_conversion} adata <- as_AnnData( seurat_obj, @@ -126,6 +128,8 @@ adata <- as_AnnData( adata ``` +The mapping arguments can also be passed directly to `write_h5ad()`. + ## Session info ```{r} diff --git a/vignettes/usage_singlecellexperiment.Rmd b/vignettes/usage_singlecellexperiment.Rmd index f084de27..841530cc 100644 --- a/vignettes/usage_singlecellexperiment.Rmd +++ b/vignettes/usage_singlecellexperiment.Rmd @@ -18,26 +18,24 @@ knitr::opts_chunk$set( This vignette demonstrates how to read and write `SingleCellExperiment` objects using the **{anndataR}** package, leveraging the interoperability between `SingleCellExperiment` and the `AnnData` format. -Check out `?anndataR` for a full list of the functions provided by this package. 
- ## Introduction -SingleCellExperiment is a widely used class for storing single-cell data in R, especially within the Bioconductor ecosystem. -**{anndataR}** enables conversion between `SingleCellExperiment` objects and `AnnData` objects, allowing you to leverage the strengths of both the scverse and Bioconductor ecosystems. +`SingleCellExperiment` is a widely used class for storing single-cell data in R, especially within the **Bioconductor** ecosystem. +**{anndataR}** enables conversion between `SingleCellExperiment` objects and `AnnData` objects, allowing you to leverage the strengths of both the **scverse** and **Bioconductor** ecosystems. ## Prerequisites -Before you begin, make sure you have both SingleCellExperiment and **{anndataR}** installed. You can install them using the following code: +This vignette requires **{SingleCellExperiment}** in addition to **{anndataR}**. +You can install them using the following code: ```r -if (!requireNamespace("pak", quietly = TRUE)) { - install.packages("pak") +if (!requireNamespace("BiocManager", quietly = TRUE)) { + install.packages("BiocManager") } -pak::pak(c("SingleCellExperiment", "SummarizedExperiment")) -pak::pak("scverse/anndataR") +BiocManager::install("SingleCellExperiment") ``` -## Converting an AnnData Object to a SingleCellExperiment Object +## Converting an `AnnData` object to a `SingleCellExperiment` object Using an example `.h5ad` file included in the package, we will demonstrate how to read an `.h5ad` file and convert it to a `SingleCellExperiment` object. 
@@ -48,28 +46,32 @@ library(SingleCellExperiment) h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR") ``` -Read the `.h5ad` file: +Read the `.h5ad` file as a `SingleCellExperiment` object: -```{r read_h5ad} -adata <- read_h5ad(h5ad_file) -adata +```{r read_data} +sce_obj <- read_h5ad(h5ad_file, as = "SingleCellExperiment") +sce_obj ``` -Convert to a `SingleCellExperiment` object: +This is equivalent to reading in the `.h5ad` file and explicitly converting. -```{r convert_implicit} -sce_obj <- adata$as_SingleCellExperiment() -sce_obj +```{r read_h5ad} +adata <- read_h5ad(h5ad_file) +sce <- adata$as_SingleCellExperiment() +sce ``` -Note that there is no one-to-one mapping possible between the AnnData and SingleCellExperiment data structures, so some information might be lost during conversion. It is recommended to carefully inspect the converted object to ensure that all necessary information has been transferred. +Note that there is no one-to-one mapping possible between the `AnnData` and `SingleCellExperiment` data structures, so some information might be lost during conversion. +It is recommended to carefully inspect the converted object to ensure that all necessary information has been transferred. ### Customizing the conversion + You can customize the conversion process by providing specific mappings for each slot in the `SingleCellExperiment` object. + Each of the mapping arguments can be provided with one of the following: - `TRUE`: all items in the slot will be copied using the default mapping - `FALSE`: the slot will not be copied -- a (named) character vector: the names are the names of the slot in the `SingleCellExperiment` object, the values are the names of the slot in the `AnnData` object. +- A (named) character vector: the names are the names of the slot in the `SingleCellExperiment` object, the values are the names of the slot in the `AnnData` object. 
See `?as_SingleCellExperiment` for more details on how to customize the conversion process. For instance: @@ -90,7 +92,9 @@ sce_obj <- adata$as_SingleCellExperiment( sce_obj ``` -## Convert a SingleCellExperiment Object to an AnnData Object +The mapping arguments can also be passed directly to `read_h5ad()`. + +## Convert a `SingleCellExperiment` object to an `AnnData` object The reverse conversion is also possible, allowing you to convert a `SingleCellExperiment` object back to an `AnnData` object, or to just write out the `SingleCellExperiment` object as an `.h5ad` file. @@ -99,6 +103,7 @@ write_h5ad(sce_obj, tempfile(fileext = ".h5ad")) ``` This is equivalent to converting the `SingleCellExperiment` object to an `AnnData` object and then writing it out: + ```{r convert_and_write} adata <- as_AnnData(sce_obj) write_h5ad(adata, tempfile(fileext = ".h5ad")) @@ -107,6 +112,7 @@ write_h5ad(adata, tempfile(fileext = ".h5ad")) You can again customize the conversion process by providing specific mappings for each slot in the `AnnData` object. For more details, see `?as_AnnData`. Here's an example: + ```{r customize_anndata_conversion} as_AnnData( sce_obj, @@ -120,6 +126,8 @@ as_AnnData( ) ``` +The mapping arguments can also be passed directly to `write_h5ad()`. + ## Session info ```{r}