This repository houses tutorial notebooks for running GPU-accelerated single-cell analysis workflows using RAPIDS-singlecell, a GPU-accelerated library developed by scverse®. The goal of this repository is to help users try out and explore different capabilities of RAPIDS-singlecell on datasets ranging from 250 thousand to 11 million cells. To make this as easy as possible, we set up two different GPU environments on Brev that are designed to get you working with GPU-accelerated single-cell workflows as quickly as possible (see Quickstart). We've also provided instructions to run these notebooks on your own CUDA-enabled GPU systems (see Bring your own compute).
These notebooks will be valuable for single-cell scientists who want to quickly evaluate ease of use and explore the biological interpretability of RAPIDS-singlecell results. Scientists will also find value in learning to apply these methods to very large datasets. This repository is also broadly useful for any data scientist or developer who wants to run and evaluate single-cell methods with RAPIDS-singlecell. The datasets used in these tutorials were made publicly available by 10x Genomics and CZ CELLxGENE. The base container is the 26.02 RAPIDSAI Notebooks Container, which you can freely get from NVIDIA's NGC Catalog following the instructions below.
If you like these notebooks and this GPU-accelerated capability, and want to support scverse's efforts, please learn more about scverse and consider joining their community.
The quickest way to get started is to use one of our pre-configured NVIDIA Brev resources.
- Select your resource size, and click "Deploy Now".
- Click Deploy Launchable on the Brev.dev Launchable page.
- Wait for the Container status to show Ready (this can take up to 8 minutes). Then, click Access GPU.
- On the Instance page, click Open Notebook.
You should drop into a fully installed and populated JupyterLab environment. Open up your desired notebook from the list below, and have a great time!
This repository contains a diverse set of notebooks to help get anyone started using RAPIDS-singlecell developed by scverse.
The outline below is a suggested exploration flow. Unless otherwise noted, you can choose any notebook to get started, as long as you have the GPU resources to run the notebook.
If you are new to single-cell analysis, 01_scRNA_analysis_preprocessing.ipynb is the best place to start. It walks you through an end-to-end analysis covering data preprocessing, cleanup, visualization, and investigation.
| Notebook | Description | Instance Type |
|---|---|---|
| 01_scRNA_analysis_preprocessing.ipynb | End-to-end workflow, where we understand the cells, run ETL on the dataset, then visualize and explore the results. This tutorial is good for all users. | Standard Instance |
| 02_scRNA_analysis_extended.ipynb | This notebook continues from the outputs of 01_scRNA_analysis_preprocessing.ipynb as an overview of methods that can be used to investigate transcriptional regulation | Standard Instance |
| 03_scRNA_analysis_with_pearson_residuals.ipynb | End-to-end workflow, like 01_scRNA_analysis_preprocessing.ipynb, but uses Pearson residuals for normalization. | Standard Instance |
| 04_scRNA_analysis_dask_out_of_core.ipynb | This notebook scales the analysis to up to 11M cells using Dask and out-of-core processing. | Advanced Instance |
| 05_spatial_demo.ipynb | GPU-accelerated spatial analysis using rapids-singlecell and Squidpy. Covers spatial autocorrelation (Moran's I and Geary's C) and co-occurrence analysis to reveal cell-type co-localization and tissue organization patterns. | Standard Instance |
| 06_scRNA_analysis_1.0M_brain_example.ipynb | In this notebook, we scale up the analysis of the 01_scRNA_analysis_preprocessing.ipynb example to 1 million brain cells. | Advanced Instance |
| 07_perturbation_analysis_invivo_brain_example.ipynb | GPU-accelerated perturbation analysis on a whole-brain single-nucleus CRISPR atlas (~3.5M cells, ~2,000 target genes). Computes pairwise E-distances between perturbation groups and non-targeting controls across neuronal cell types to build a global perturbation-response map. | Advanced Instance |
You can find more detail on each notebook in the Notebooks README.
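Notebook 03 in the table above normalizes with Pearson residuals. As rough intuition for what that normalization computes, here is a minimal pure-Python sketch of analytic Pearson residuals on a tiny count matrix. It is an illustration only, not the RAPIDS-singlecell implementation: the function name `pearson_residuals`, the overdispersion value `theta=100.0`, and the clipping at the square root of the number of cells are assumptions chosen for this example.

```python
import math


def pearson_residuals(counts, theta=100.0):
    """Analytic Pearson residuals for a cells x genes count matrix.

    counts: list of rows (one row per cell, one column per gene).
    theta:  assumed negative-binomial overdispersion (illustrative default).
    """
    n_cells = len(counts)
    n_genes = len(counts[0])
    cell_totals = [sum(row) for row in counts]
    gene_totals = [sum(row[j] for row in counts) for j in range(n_genes)]
    grand_total = sum(cell_totals)
    clip = math.sqrt(n_cells)  # assumed clipping threshold

    residuals = []
    for i, row in enumerate(counts):
        out = []
        for j, observed in enumerate(row):
            # Expected count if every cell had the same gene proportions.
            mu = cell_totals[i] * gene_totals[j] / grand_total
            if mu == 0:
                out.append(0.0)
                continue
            # Deviation from expectation, scaled by the model's standard deviation.
            r = (observed - mu) / math.sqrt(mu + mu * mu / theta)
            out.append(max(-clip, min(clip, r)))
        residuals.append(out)
    return residuals


# Two cells with identical gene proportions: nothing deviates from expectation.
print(pearson_residuals([[2, 4], [1, 2]]))
# Two cells expressing different genes: residuals move away from zero.
print(pearson_residuals([[1, 0], [0, 1]]))
```

Genes whose counts match what sequencing depth alone would predict get residuals near zero, while genes that deviate from that expectation stand out, which is why this normalization is useful ahead of feature selection and dimensionality reduction.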
Note
To ensure you have the maximum GPU memory available, please remember to shut down your completed notebook's kernel before starting a new notebook. If you don't, you may experience Out Of Memory (OOM) errors. To fix that, simply shut down all the kernels, and then restart only the kernel for the notebook you want to run.
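If you are unsure whether an earlier kernel is still holding GPU memory, one way to check is to query `nvidia-smi` before starting the next notebook. The snippet below is a minimal sketch, assuming `nvidia-smi` is on your `PATH`; the helper names `parse_gpu_memory` and `gpu_memory` are made up for this example and are not part of RAPIDS-singlecell.

```python
import shutil
import subprocess

# Ask nvidia-smi for used and total memory (in MiB), one CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(text):
    """Turn 'used, total' lines into a list of (used_mib, total_mib) tuples."""
    usage = []
    for line in text.strip().splitlines():
        try:
            used, total = (int(value) for value in line.split(","))
        except ValueError:
            continue  # skip lines that do not hold two integers (e.g. "[N/A]")
        usage.append((used, total))
    return usage


def gpu_memory():
    """Return per-GPU memory usage, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError:
        return []
    return parse_gpu_memory(result.stdout)


for index, (used, total) in enumerate(gpu_memory()):
    print(f"GPU {index}: {used} / {total} MiB in use")
```

If the reported usage is high before you have run anything in the current notebook, another kernel is probably still holding memory and should be shut down first.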
The goal of this repository is to make it easy to try GPU-accelerated single-cell analysis workflows on different compute environments and datasets. Our preferred environment is NVIDIA Brev, but you can also run these in your own GPU-connected environment. We've provided a few tutorials below on how to set this up, and the easiest place to start is to follow the Quickstart instructions.
Follow our Quickstart Instructions above.
If you want to try a compute environment on Brev that's not one of the Quickstart Launchables, you will need to create a new Launchable or Standalone Compute Instance. This will let you select your desired cloud provider and compute resource. Note that we have not tested every combination of cloud provider and instance type, so your experience may vary.
If you're interested in trying this out, please follow the instructions here: Setting up your Custom Brev Launchable
Some people may want to run these notebooks off of Brev, on their own hardware. Great! We wrote a (somewhat) easy tutorial here: Bring your own compute
If you have any questions about these notebooks or need support, please open an Issue on this repository and we will respond there.


