To install the dependencies, run these commands from the project root:

    BiocManager::install("sva")
    Rscript -e "devtools::install_deps()"

To create the necessary data folders, run:

    Rscript -e "dataorganizer::CreateFolders()"

The notebooks are organized by dataset in the Analyses folder. Within each dataset folder, run the numbered notebooks first, in numerical order. Non-numbered notebooks can then be run in any order.
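The run order described above can be sketched as a small shell helper that lists a dataset folder's notebooks with the numbered ones first. This is only an illustration: the function name `list_notebooks_in_order`, the `.Rmd` extension, and the `Analyses/BC` path in the example are assumptions, not something the repository defines.

```shell
# Print the notebooks of one dataset folder in the order they should be run:
# numbered notebooks first (ascending by their leading number), then the rest.
# Assumption: numbered notebooks have file names that start with digits,
# e.g. "1_preprocessing.Rmd" (hypothetical name).
list_notebooks_in_order() {
  dir="$1"
  # Numbered notebooks, sorted numerically so that 2 comes before 10.
  ls "$dir" | grep -E '^[0-9]+' | sort -n || true
  # Non-numbered notebooks; their relative order does not matter.
  ls "$dir" | grep -Ev '^[0-9]+' || true
}

# Example (hypothetical folder name):
#   list_notebooks_in_order Analyses/BC
```

Sorting with `sort -n` rather than relying on the default lexical order avoids running a notebook numbered 10 before one numbered 2.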
The notebooks assume that the data is stored in ./data/spatial_data. See R/data_loading.R for details.
This section explains how to download and prepare the data required for the notebooks.
To download the data, use the following links:
- Human Breast Cancer (BC)
  - Xenium: 10x Genomics Preview. Only need 'In Situ Sample 1, Replicate 1', 'Xenium Output Bundle', which should be unzipped into the root dataset folder.
  - scRNA-seq: GEO GSM7782698. Need GSM7782698_count_raw_feature_bc_matrix.h5 (from GEO) and umap_annotation.csv from Source Data -> "Fig. 3E".
- Mouse Hypothalamus (Brain)
  - MERFISH: Dryad
  - scRNA-seq: GEO GSE113576. Only need regional_sampling_UMIcounts.txt and mouse_int_meta.csv files.
  - scRNA-seq metadata from the paper supplementary Table 1 should be saved in mouse_hypothalamus_rna/scRNA_metadata.xlsx.
  - There is also merfish_barcodes.csv with the molecule information, which wasn't published in the original study.
- Mouse Ileum (Gut)
  - MERFISH: Dryad
  - scRNA-seq: GEO GSE92332
- Human NSCLC (NSCLC)
  - CosMx: Nanostring FFPE.
  - scRNA-seq: GEO GSE127465
  - DIALOGUE results: GitHub. Put DIALOGUE1_LungCancer.SMI.rds in NSCLC/CosmX.
  - iTALK LR database: GitHub. Put LR_database.rda in NSCLC/CosmX.
- Human Ovarian Cancer (OC)
  - Xenium: 10x Genomics
  - scRNA-seq: 10x Genomics scFFPE
- Human Pancreatic Cancer (Pancreas)
  - Xenium: 10x Genomics
  - snRNA-seq: Single Cell Portal
The code assumes that the data is stored in ../data with a separate subfolder per dataset and per modality. See data_mapping.yml for details. There, you can also change the paths to match your local setup.
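Because several files have to be placed by hand, it can help to check the layout before running the notebooks. The following is a minimal sketch, not part of the repository: `check_data_files` is a hypothetical helper, and it assumes that the paths named in this section (for example NSCLC/CosmX/LR_database.rda) are relative to the data root configured in data_mapping.yml.

```shell
# Report which manually placed files are present under a data root.
# Usage: check_data_files DATA_ROOT RELATIVE_PATH...
# Returns 0 if every file exists, 1 if at least one is missing.
check_data_files() {
  root="$1"
  shift
  missing=0
  for rel in "$@"; do
    if [ -e "$root/$rel" ]; then
      echo "ok       $rel"
    else
      echo "MISSING  $rel"
      missing=1
    fi
  done
  return "$missing"
}

# Example with files named in this section. The data root "../data" and the
# assumption that these paths are relative to it should be checked against
# your data_mapping.yml:
#   check_data_files ../data \
#     NSCLC/CosmX/DIALOGUE1_LungCancer.SMI.rds \
#     NSCLC/CosmX/LR_database.rda \
#     mouse_hypothalamus_rna/scRNA_metadata.xlsx
```

If you changed the paths in data_mapping.yml, pass the matching root and relative paths instead.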