Code to reproduce the figures and results from the cytomarker paper.
Prior to running, the following data files need to be placed in the `data` directory:
- `singlecells.csv` from the cytomarker paper data (download from Zenodo here: https://zenodo.org/records/13891857)
- The NYGC multimodal PBMC data set from https://datasets.cellxgene.cziscience.com/de42a173-458a-429c-b129-c26bcd3adb3b.h5ad, named as `nygc-pbmc.h5ad`
- The transcriptome and proteome data from Nicolet et al. 2022 from https://doi.org/10.1371/journal.pone.0276294.s006
- The protein-RNA correlation table from the Gygi Lab here: https://gygi.hms.harvard.edu/data/ccle/Table_S4_Protein_RNA_Correlation_and_Enrichments.xlsx, put into the `depmap` sub-directory in `data`
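
Before launching the workflow, it can help to confirm that the input files are where they are expected. The following is a minimal sketch, not part of the repository: the helper name `find_missing_inputs` is hypothetical, and it assumes the Gygi Lab table keeps the file name from its download URL. The Nicolet et al. file is left out because its local file name is not stated here.

```python
from pathlib import Path

# Paths are relative to the repository root.
# Assumption: the Gygi Lab table keeps the file name from its download URL.
# The Nicolet et al. 2022 file is omitted because its local name is not given.
REQUIRED_INPUTS = [
    "data/singlecells.csv",
    "data/nygc-pbmc.h5ad",
    "data/depmap/Table_S4_Protein_RNA_Correlation_and_Enrichments.xlsx",
]


def find_missing_inputs(root=".", required=REQUIRED_INPUTS):
    """Return the entries of `required` that are not regular files under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing_inputs()
    if missing:
        print("Missing input files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All listed input files are present.")
```

Run it from the repository root; any path it prints still needs to be downloaded or renamed.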
To run all:
cd 2023-cytomarkerpaper-analysis
snakemake --cores all # user can specify number of cores
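
The core count in the command above can be changed, and Snakemake's `--dry-run` flag lists the jobs that would run without executing them. The sketch below wraps both options in Python; the helper `build_snakemake_command` is hypothetical and only assembles the command line, assuming `snakemake` is on the `PATH` and the script is run from the repository directory.

```python
import subprocess


def build_snakemake_command(cores="all", dry_run=False):
    """Assemble a snakemake command line as a list of arguments."""
    cmd = ["snakemake", "--cores", str(cores)]
    if dry_run:
        # --dry-run prints the planned jobs without running them
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Preview the workflow with 4 cores; remove dry_run=True to execute it.
    subprocess.run(build_snakemake_command(cores=4, dry_run=True), check=True)
```

A dry run is a cheap way to check that all inputs are found before committing to the full analysis.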

The workflow produces the following files:
- `data/sce_screen_full.rds` and `data/sce_screen_subsample.rds`
- `/figs/screen_cluster_heatmap_unscaled.png` and `/figs/screen_cluster_heatmap_scaled.png`, to be used in cluster interpretation. The user should then create `data/cluster-interpretation-nov23.xlsx` (which clusters are which cell types)
- `results/nygc_pbmc_subsampled.rds`
- `figs/screen-vs-scrna.pdf`
- `figs/rna-protein-scatter.pdf` and `figs/cytomarker_sens_spec_no_ms.pdf`
- `figs/heatmap_mammary_single_cell.pdf`