sandhya212/Sparcle_for_spot_reassignments

Sparcle for spot reassignments in transcriptomic images

This is the code base for Sparcle: Spatial reassignment of spots to cells via maximum likelihood estimation in transcriptomic images. The Sparcle paper is published in Bioinformatics Advances (2022).

Abstract

Background

Imaging-based spatial transcriptomics has the power to reveal patterns of single-cell gene expression by detecting mRNA transcripts as individually resolved spots in multiplexed images. However, molecular quantification has been severely limited by the computational challenges of segmenting poorly outlined, overlapping cells, and of overcoming technical noise; the majority of transcripts are routinely discarded because they fall outside the segmentation boundaries. This lost information leads to less accurate gene count matrices and weakens downstream analyses, such as cell type or gene program identification.

Results

Here, we present Sparcle, a probabilistic model that reassigns transcripts to cells based on gene covariation patterns and incorporates spatial features such as distance to nucleus. We demonstrate its utility on both multiplexed error-robust fluorescence in situ hybridization (MERFISH) and single-molecule FISH (smFISH) data.

Conclusions

Sparcle improves transcript assignment, providing more realistic per-cell quantification of each gene, better delineation of cell boundaries, and improved cluster assignments. Critically, our approach does not require an accurate segmentation and is agnostic to technological platform.

Graphical abstract

Sparcle iteratively recovers dangling mRNA transcripts.

  • a. A MERFISH exemplar FoV showing the DAPI channel with cell segments (red lines) and transcripts (points).
  • b. A zoomed-in section showing neuronal and non-neuronal cells with dangling mRNAs. The neuron at the center shows partial cell segmentation of the nucleus (dark brown region), which contains 4 mRNA transcripts (in pink and green) accounting for 2 genes. The dangling mRNAs outside the cell segment are completely ignored by current downstream computational approaches. Dotted lines between dangling mRNAs and cells denote potential mRNA-to-cell assignments.
  • c. A further zoomed-in section shows nuclear mRNAs as dots and dangling mRNAs as crosses. Each color represents one of the 140 genes. Also shown is a mockcell (blue circle) centered at a dangling mRNA (pink cross).
  • d. A count matrix of genes x cells is created using cell segments from all FoVs. This matrix is clustered to give a set of cell types along with cluster moments.
  • e. Sparcle builds a weighted mockcell for each dangling mRNA and, using MLE, assigns the mockcell to the nearest cell that shares the mockcell's cluster.
  • f. The count matrix is updated for the affected cells and genes based on the newly assigned dangling mRNAs. The count matrix is then re-clustered, and the process repeats for a fixed number of iterations.
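Steps d to f can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in this repository: the function names, the linear distance weighting of the mockcell, the diagonal Gaussian likelihood, and the variance floor are all hypothetical choices made for the example. Cluster labels are taken as given, every cluster is assumed to contain at least one cell, and re-clustering between iterations is left to the caller.

```python
import numpy as np

def count_matrix(spot_gene, spot_cell, n_genes, n_cells):
    """Genes x cells count matrix from spots that are assigned to a cell (cell id >= 0)."""
    counts = np.zeros((n_genes, n_cells))
    assigned = spot_cell >= 0
    np.add.at(counts, (spot_gene[assigned], spot_cell[assigned]), 1)
    return counts

def cluster_moments(counts, labels, n_clusters, var_floor=1e-3):
    """Per-cluster mean and diagonal variance of the per-cell expression profiles."""
    means = np.stack([counts[:, labels == k].mean(axis=1) for k in range(n_clusters)])
    variances = np.stack([counts[:, labels == k].var(axis=1) + var_floor
                          for k in range(n_clusters)])
    return means, variances

def mockcell(idx, spot_xy, spot_gene, n_genes, radius):
    """Weighted expression profile of all spots within `radius` of dangling spot `idx`.
    Assumption: weights fall off linearly with distance."""
    d = np.linalg.norm(spot_xy - spot_xy[idx], axis=1)
    near = d <= radius
    profile = np.zeros(n_genes)
    np.add.at(profile, spot_gene[near], 1.0 - d[near] / radius)
    return profile

def log_lik(profile, mean, var):
    """Diagonal Gaussian log-likelihood of a mockcell profile under one cluster."""
    return -0.5 * np.sum((profile - mean) ** 2 / var + np.log(2 * np.pi * var))

def reassign_once(spot_xy, spot_gene, spot_cell, cell_xy, labels,
                  n_genes, n_clusters, radius):
    """One pass: pick the most likely cluster for each dangling spot's mockcell,
    then assign the spot to the nearest cell of that cluster within `radius`."""
    counts = count_matrix(spot_gene, spot_cell, n_genes, len(cell_xy))
    means, variances = cluster_moments(counts, labels, n_clusters)
    updated = spot_cell.copy()
    for i in np.flatnonzero(spot_cell < 0):
        profile = mockcell(i, spot_xy, spot_gene, n_genes, radius)
        k = int(np.argmax([log_lik(profile, means[c], variances[c])
                           for c in range(n_clusters)]))
        candidates = np.flatnonzero(labels == k)
        dist = np.linalg.norm(cell_xy[candidates] - spot_xy[i], axis=1)
        j = int(np.argmin(dist))
        if dist[j] <= radius:
            updated[i] = candidates[j]
    return updated

# Toy data: two cells of different types, six segmented spots, two dangling spots (-1).
spot_xy = np.array([[0, 0], [0.5, 0], [0, 0.5], [10, 0], [9.5, 0], [10, 0.5],
                    [1.5, 0], [8.5, 0]], dtype=float)
spot_gene = np.array([0, 0, 0, 1, 1, 1, 0, 1])
spot_cell = np.array([0, 0, 0, 1, 1, 1, -1, -1])
cell_xy = np.array([[0, 0], [10, 0]], dtype=float)
labels = np.array([0, 1])  # cluster label per cell

updated = reassign_once(spot_xy, spot_gene, spot_cell, cell_xy, labels,
                        n_genes=2, n_clusters=2, radius=3.0)
new_counts = count_matrix(spot_gene, updated, n_genes=2, n_cells=2)
```

In a full loop, `new_counts` would be re-clustered to refresh `labels` before the next call to `reassign_once`, and the loop would stop after a fixed number of iterations.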

Datasets used

  1. MERFISH
  2. Allen smFISH VISp:
  3. STARmap:
  4. pciSeq/ISS:
  5. Vizgen FoV 75:

Code to extract the dangling mRNA for smFISH, STARmap, ISS and Vizgen will be made available upon request.

Installation

  1. Download this code repository, or open a terminal and use git clone:

$ git clone https://github.com/sandhya212/Sparcle_for_spot_reassignments

  2. The folder 'Code_HPC' contains the Python code implementing Sparcle:
  • Sparcle_submit.sh: Shell script to submit the code to the cluster
  • start_file.py: Set the current working directory and the path variables for the data here
  • init_file.py: Initialise the variables for Sparcle here
  • Sparcle.py: Sparcle's engine, which iterates over FoVs to recover dangling mRNAs, processing the FoVs in parallel within each iteration
  3. Submit the code using:

bsub -W 2:00 -R 'rusage[mem=25]' < Sparcle_submit.sh -o sparcle_out_file

  4. For general reference, an equivalent of 'Code_HPC' is provided as a Python notebook (Sparcle_ver_1.ipynb) and as a Python script (Sparcle_ver_1.py).

  5. For another example of Sparcle, a standalone folder with data and code for pciSeq_ISS is provided.
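The control flow described for Sparcle.py (FoVs processed in parallel within an iteration, with iterations run one after another) can be illustrated with a small Python sketch. It is not taken from the repository: `recover_fov` is a hypothetical placeholder that pretends each pass recovers half of a FoV's remaining dangling mRNAs, and a thread pool is used only to keep the example portable. On a cluster, a process pool or one job per FoV would play the same role.

```python
from concurrent.futures import ThreadPoolExecutor

def recover_fov(fov_id, remaining):
    """Hypothetical stand-in for one FoV's reassignment pass.
    Pretends half of the FoV's remaining dangling mRNAs are recovered."""
    return fov_id, remaining[fov_id] // 2

def run_iterations(dangling_per_fov, n_iterations, max_workers=4):
    """Run a fixed number of iterations; FoVs are processed in parallel within
    each iteration, and results are merged before the next iteration starts."""
    remaining = dict(dangling_per_fov)
    for _ in range(n_iterations):
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # Workers only read `remaining`; it is updated after all FoVs finish.
            results = list(pool.map(lambda f: recover_fov(f, remaining),
                                    list(remaining)))
        for fov_id, recovered in results:
            remaining[fov_id] -= recovered
    return remaining

left = run_iterations({"fov_0": 8, "fov_1": 5}, n_iterations=2)
```

Merging results only after every FoV has finished gives each iteration a consistent snapshot, which is where a shared count matrix could be updated and re-clustered.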
