- Tangram is a Python package, written in [PyTorch](https://pytorch.org/) and based on [scanpy](https://scanpy.readthedocs.io/en/stable/), for mapping single-cell (or single-nucleus) gene expression data onto spatial gene expression data. The single-cell dataset and the spatial dataset should be collected from the same anatomical region/tissue type, ideally from a biological replicate, and need to share a set of genes. Tangram aligns the single-cell data in space by fitting gene expression on the shared genes. The best way to familiarize yourself with Tangram is to check out [our tutorial](example/1_tutorial_tangram.ipynb).
+ Tangram is a Python package, written in [PyTorch](https://pytorch.org/) and based on [scanpy](https://scanpy.readthedocs.io/en/stable/), for mapping single-cell (or single-nucleus) gene expression data onto spatial gene expression data. The single-cell dataset and the spatial dataset should be collected from the same anatomical region/tissue type, ideally from a biological replicate, and need to share a set of genes. Tangram aligns the single-cell data in space by fitting gene expression on the shared genes. The best way to familiarize yourself with Tangram is to check out [our tutorial](https://github.com/broadinstitute/Tangram/blob/master/example/1_tutorial_tangram.ipynb). [](https://colab.research.google.com/drive/1SVLUIZR6Da6VUyvX_2RkgVxbPn8f62ge?usp=sharing)
Tangram has been tested on various types of transcriptomic data (10Xv3, Smart-seq2 and SHARE-seq for single-cell data; MERFISH, Visium, Slide-seq, smFISH and STARmap as spatial data). In our [preprint](https://www.biorxiv.org/content/10.1101/2020.08.29.272831v1), we used Tangram to reveal spatial maps of cell types and gene expression at single-cell resolution in the adult mouse brain. More recently, we have applied our method to different tissue types, including human lung, human kidney, developmental mouse brain and metastatic breast cancer.
***
## How to run Tangram
- To install Tangram, make sure you have [PyTorch](https://pytorch.org/) and [scanpy](https://scanpy.readthedocs.io/en/stable/) installed. If you need more details on the dependences, look at the `environment.yml` file. Then clone this repo, and import as follows:
+ To install Tangram, make sure you have [PyTorch](https://pytorch.org/) and [scanpy](https://scanpy.readthedocs.io/en/stable/) installed. If you need more details on the dependencies, see the `environment.yml` file. To install and import Tangram, use the following code:
```
- import sys
- sys.path.append("/home/tbiancal/git/Tangram")
+ pip install tangram-sc
import tangram as tg
```
@@ -38,7 +39,7 @@ The returned AnnData,`ad_map`, is a cell-by-voxel structure where `ad_map.X[i, j
The returned `ad_ge` is a voxel-by-gene AnnData, similar to the spatial data `ad_sp`, but with gene expression projected from the single cells. This makes it possible to extend gene throughput, or to correct for dropouts, if the single-cell data have higher quality (or more genes) than the spatial data. It can also be used to transfer cell types onto space.
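The projection described above can be illustrated with a minimal sketch. It assumes that projecting onto space amounts to multiplying the transposed cell-by-voxel mapping by a cell-by-feature matrix (gene expression, or one-hot cell-type labels). The helper `project_onto_space` and the toy values are hypothetical and are not part of the Tangram API; plain Python lists are used only to keep the example dependency-free.

```python
def project_onto_space(mapping, cell_values):
    """Return mapping^T @ cell_values as a voxels-by-features nested list.

    mapping:     cells-by-voxels probabilities (hypothetical toy input)
    cell_values: cells-by-features values (genes or one-hot cell types)
    """
    n_cells = len(mapping)
    n_voxels = len(mapping[0])
    n_features = len(cell_values[0])
    return [
        [
            sum(mapping[i][j] * cell_values[i][f] for i in range(n_cells))
            for f in range(n_features)
        ]
        for j in range(n_voxels)
    ]


# Toy mapping for 3 cells and 2 voxels; each row sums to 1
# (assumed here: a cell's probabilities over voxels).
mapping = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

# Toy single-cell expression, cells-by-genes.
expression = [[4.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
projected = project_onto_space(mapping, expression)  # voxels-by-genes

# Transferring cell types: project one-hot labels instead of expression.
# Columns are two made-up types, "A" and "B".
one_hot_types = [[1, 0], [0, 1], [1, 0]]
projected_types = project_onto_space(mapping, one_hot_types)  # voxels-by-types

print(projected)
print(projected_types)
```

In practice these matrices would be NumPy arrays or PyTorch tensors stored in `AnnData` objects, and the product would be a single matrix multiplication.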
- For more details on how to use Tangram check out [our tutorial](example/1_tutorial_tangram.ipynb).
+ For more details on how to use Tangram check out [our tutorial](https://github.com/broadinstitute/Tangram/blob/master/example/1_tutorial_tangram.ipynb).[](https://colab.research.google.com/drive/1SVLUIZR6Da6VUyvX_2RkgVxbPn8f62ge?usp=sharing)
***
## How Tangram works under the hood
@@ -48,7 +49,7 @@ Tangram instantiates a `Mapper` object passing the following arguments:
Then, Tangram searches for a mapping matrix _M_, with shape cells-by-voxels, where the element _M\_ij_ is the probability that cell _i_ is located in voxel _j_. Tangram computes the matrix _M_ by minimizing the following loss:
where cos_sim is the cosine similarity. The loss function expresses that the gene expression of the mapped single cells should be as similar as possible to the spatial data _G_, in the cosine-similarity sense.
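As an illustration of a cosine-similarity loss of this kind, here is a minimal sketch. It assumes, without confirmation from the text above, that the loss is the negative sum over genes of the cosine similarity between each column of the projected expression (_M_ transposed times a cells-by-genes matrix, called `S` here) and the matching column of _G_. The names `S`, `cos_sim`, `project` and `mapping_loss` are hypothetical, and this is not Tangram's actual implementation.

```python
import math


def cos_sim(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def project(M, S):
    """Return M^T S (voxels-by-genes) for M cells-by-voxels and S cells-by-genes."""
    n_cells, n_voxels, n_genes = len(M), len(M[0]), len(S[0])
    return [
        [sum(M[i][j] * S[i][k] for i in range(n_cells)) for k in range(n_genes)]
        for j in range(n_voxels)
    ]


def mapping_loss(M, S, G):
    """Assumed loss: minus the summed per-gene cosine similarity to G."""
    predicted = project(M, S)
    total = 0.0
    for k in range(len(G[0])):
        predicted_gene = [row[k] for row in predicted]  # gene k across voxels
        observed_gene = [row[k] for row in G]
        total += cos_sim(predicted_gene, observed_gene)
    return -total


# Toy data: 2 cells, 2 voxels, 2 genes.
S = [[1.0, 0.0], [0.0, 2.0]]          # cells-by-genes
G = [[1.0, 0.0], [0.0, 2.0]]          # voxels-by-genes
M_correct = [[1.0, 0.0], [0.0, 1.0]]  # each cell placed in its true voxel
M_swapped = [[0.0, 1.0], [1.0, 0.0]]  # cells placed in the wrong voxels

loss_correct = mapping_loss(M_correct, S, G)
loss_swapped = mapping_loss(M_swapped, S, G)
print(loss_correct, loss_swapped)
```

Under this assumed form, the correct placement reaches the lowest possible loss for two genes, while the swapped placement scores no similarity at all, which is the behaviour a minimizer would exploit.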
@@ -79,7 +80,9 @@ If you have questions, please contact the authors of the method:
example/1_tutorial_tangram.ipynb (13 additions, 14 deletions)
@@ -9,10 +9,18 @@
"- The notebook uses data from mouse brain cortex, although different from those used in the manuscript (which cannot be released before May 2020).\n",
"- Written - Dec 30th 2020 by Tommaso Biancalani <[email protected]>"
]
},
+ {
+ "source": [
+ "First, make sure `tangram-sc` is installed (`pip install tangram-sc`) and that the environment is set up according to `environment.yml`."
+ ],
+ "cell_type": "markdown",
+ "metadata": {}
+ },
{
"cell_type": "code",
"execution_count": 1,
@@ -25,9 +33,6 @@
"import matplotlib.pyplot as plt\n",
"import scanpy as sc\n",
"import torch\n",
- "\n",
- "# add path of Tangram repository for importing it\n",
- "sys.path.append(\"./..\") \n",
"import tangram as tg"
]
},
@@ -182,7 +187,7 @@
"\n",
"- By single cell data, we generally mean either scRNAseq or snRNAseq.\n",
"- We start by mapping the MOp 10Xv3 dataset, which contains single nuclei collected from a posterior region of the primary motor cortex.\n",
- "- They are approximately 53k profiled cells with 28k genes."
+ "- They are approximately 26k profiled cells with 28k genes."
]
},
{
@@ -453,9 +458,6 @@
"import matplotlib.pyplot as plt\n",
"import scanpy as sc\n",
"import torch\n",
- "\n",
- "# add path of Tangram repository for importing it\n",
- "sys.path.append(\"./..\") \n",
"import tangram as tg"
]
},
@@ -477,7 +479,7 @@
"- Mapping should be interrupted after the score plateaus, which can be controlled by passing the `num_epochs` parameter. \n",
"- The score measures the similarity between the gene expression of the mapped cells vs spatial data: higher score means \n",
"- Note that we have obtained excellent mappings even when Tangram converges to a low score (the typical case is when the spatial data are very sparse): we use the score merely to assess convergence.\n",
- "- If you are running Tangram with a GPU, uncomment`device=cuda: 0` and comment the line `device=cpu`. On a MacBook Pro 2018, it takes ~1h to run. On a P100 GPU it should be done in a few minutes.\n",
+ "- If you are running Tangram with a GPU, uncomment `device=cuda: 0` and comment out the line `device=cpu`. On a MacBook Pro 2018, it takes ~1h to run. On a P100 GPU it should be done in a few minutes.\n",
"- For this basic mapping, we do not use regularizers (hence the `NaN`). More sophisticated loss functions are available in the Tangram library (refer to the manuscript or dive into the code)."
]
},
@@ -568,9 +570,6 @@
"import matplotlib.pyplot as plt\n",
"import scanpy as sc\n",
"import torch\n",
- "\n",
- "# add path of Tangram repository for importing it\n",
- "sys.path.append(\"./..\") \n",
"import tangram as tg"
]
},
@@ -579,7 +578,7 @@
"metadata": {},
"source": [
"- We load the single cell data, the spatial data and the mapping results.\n",
- "- We load the original datasets, rather than the `AnnData`s pre-processed with `pp_adatas`, as we would like to "
+ "- We load the original datasets, rather than the `AnnData` pre-processed with `pp_adatas`, as we would like to "
]
},
{
@@ -1370,7 +1369,7 @@
"metadata": {},
"source": [
"- We can again use `plot_genes` to visualize gene patterns.\n",
- "- Interestingly, the agreement for genes `Atp1b1` or `Apt1a3`, seems less good that that for `Ctgf` and `Nefh`, despite the scores are higher for the former genes. This is because even though the latter gene patterns are localized correctly, their expression values are not so well correlated (for instance, in `Ctgf` the \"bright yellow spot\" is in different part of layer 6b). In contrast, for `Atpb1` the gene expression pattern is largely recover, even though the overall gene expression in the spatial data is more dim."
+ "- Interestingly, the agreement for genes `Atp1b1` or `Atp1a3` seems worse than that for `Ctgf` and `Nefh`, even though the scores are higher for the former genes. This is because even though the latter gene patterns are localized correctly, their expression values are not so well correlated (for instance, in `Ctgf` the \"bright yellow spot\" is in a different part of layer 6b). In contrast, for `Atp1b1` the gene expression pattern is largely recovered, even though the overall gene expression in the spatial data is dimmer."