Merge pull request #8 from broadinstitute/release-v0.2.1

Tommaso Biancalani · web-flow · commit bcfbfb3822ed · 2021-01-31T11:39:56.000-05:00
Release v0.2.1
diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
@@ -0,0 +1,30 @@
+name: Release to PyPI
+
+on:
+  release:
+    types: [released]
+
+jobs:
+  release:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python 3.6
+      uses: actions/setup-python@v2
+      with:
+        python-version: 3.6
+    - name: Install Tools
+      run: |
+        python3 -m pip install --user --upgrade setuptools wheel
+        python3 -m pip install --user --upgrade twine
+    - name: Package and Upload
+      env:
+        TANGRAM_VERSION: ${{ github.event.release.tag_name }}
+        TWINE_USERNAME: __token__
+        TWINE_PASSWORD: ${{ secrets.PYPI_APIKEY }}
+      run: |
+        python3 setup.py sdist bdist_wheel
+        python3 -m twine upload dist/*
+
diff --git a/README.md b/README.md
@@ -1,18 +1,19 @@
-<img src="figures/tangram_large.png" width="400">
+<img src="https://raw.githubusercontent.com/broadinstitute/Tangram/master/figures/tangram_large.png" width="400"> 
 
-Tangram is a Python package, written in [PyTorch](https://pytorch.org/) and based on [scanpy](https://scanpy.readthedocs.io/en/stable/), for mapping single-cell (or single-nucleus) gene expression data onto spatial gene expression data. The single-cell dataset and the spatial dataset should be collected from the same anatomical region/tissue type, ideally from a biological replicate, and need to share a set of genes. Tangram aligns the single-cell data in space by fitting gene expression on the shared genes. The best way to familiarize yourself with Tangram is to check out [our tutorial](example/1_tutorial_tangram.ipynb).
+[![PyPI version](https://badge.fury.io/py/tangram-sc.svg)](https://badge.fury.io/py/tangram-sc)
 
-![Tangram_overview](figures/tangram_overview.png)
+Tangram is a Python package, written in [PyTorch](https://pytorch.org/) and based on [scanpy](https://scanpy.readthedocs.io/en/stable/), for mapping single-cell (or single-nucleus) gene expression data onto spatial gene expression data. The single-cell dataset and the spatial dataset should be collected from the same anatomical region/tissue type, ideally from a biological replicate, and need to share a set of genes. Tangram aligns the single-cell data in space by fitting gene expression on the shared genes. The best way to familiarize yourself with Tangram is to check out [our tutorial](https://github.com/broadinstitute/Tangram/blob/master/example/1_tutorial_tangram.ipynb). [![colab tutorial](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1SVLUIZR6Da6VUyvX_2RkgVxbPn8f62ge?usp=sharing)
+
+![Tangram_overview](https://raw.githubusercontent.com/broadinstitute/Tangram/master/figures/tangram_overview.png)
 Tangram has been tested on various types of transcriptomic data (10Xv3, Smart-seq2 and SHARE-seq for single cell data; MERFISH, Visium, Slide-seq, smFISH and STARmap as spatial data). In our [preprint](https://www.biorxiv.org/content/10.1101/2020.08.29.272831v1), we used Tangram to reveal spatial maps of cell types and gene expression at single cell resolution in the adult mouse brain. More recently, we have applied our method to different tissue types including human lung, human kidney, developmental mouse brain and metastatic breast cancer.
 
 ***
 ## How to run Tangram
 
-To install Tangram, make sure you have [PyTorch](https://pytorch.org/) and [scanpy](https://scanpy.readthedocs.io/en/stable/) installed. If you need more details on the dependences, look at the `environment.yml` file. Then clone this repo, and import as follows:
+To install Tangram, make sure you have [PyTorch](https://pytorch.org/) and [scanpy](https://scanpy.readthedocs.io/en/stable/) installed. If you need more details on the dependencies, look at the `environment.yml` file. To install and import Tangram, use the following code:
 
 ```
-    import sys
-    sys.path.append("/home/tbiancal/git/Tangram") 
+    pip install tangram-sc
     import tangram as tg
 ```
 
@@ -38,7 +39,7 @@ The returned AnnData,`ad_map`, is a cell-by-voxel structure where `ad_map.X[i, j
 
 The returned `ad_ge` is a voxel-by-gene AnnData, similar to spatial data `ad_sp`, but where gene expression has been projected from the single cells. This makes it possible to extend gene throughput, or correct for dropouts, if the single cells have higher quality (or more genes) than the spatial data. It can also be used to transfer cell types onto space. 
 
-For more details on how to use Tangram check out [our tutorial](example/1_tutorial_tangram.ipynb).
+For more details on how to use Tangram check out [our tutorial](https://github.com/broadinstitute/Tangram/blob/master/example/1_tutorial_tangram.ipynb). [![colab tutorial](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1SVLUIZR6Da6VUyvX_2RkgVxbPn8f62ge?usp=sharing)
 
 ***
 ## How Tangram works under the hood
@@ -48,7 +49,7 @@ Tangram instantiates a `Mapper` object passing the following arguments:
 
 Then, Tangram searches for a mapping matrix _M_, with shape cells-by-voxels, where the element _M\_ij_ signifies the probability that cell _i_ is in spot _j_. Tangram computes the matrix _M_ by minimizing the following loss:
 
-<img src="figures/tangram_loss.gif" width="400">
+<img src="https://raw.githubusercontent.com/broadinstitute/Tangram/master/figures/tangram_loss.gif" width="400">
 
 where cos_sim is the cosine similarity. The loss function expresses that the gene expression of the mapped single cells should be as similar as possible to the spatial data _G_, as measured by cosine similarity.
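
The idea behind the loss can be illustrated with a small pure-Python sketch. This is not Tangram's implementation: the function names (`cos_sim`, `project`, `mapping_score`) are hypothetical, the matrices are plain nested lists, and the sketch assumes `M` is laid out cells-by-voxels, `S` cells-by-genes and `G` voxels-by-genes. It only shows how a per-gene cosine similarity between projected and measured expression could be averaged into one score.

```python
import math


def cos_sim(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def project(M, S):
    """Return M^T S (voxels-by-genes), assuming M is cells-by-voxels and S is cells-by-genes."""
    n_cells, n_voxels, n_genes = len(M), len(M[0]), len(S[0])
    return [
        [sum(M[i][j] * S[i][k] for i in range(n_cells)) for k in range(n_genes)]
        for j in range(n_voxels)
    ]


def mapping_score(M, S, G):
    """Mean over genes of the cosine similarity between projected and measured voxel profiles."""
    P = project(M, S)
    n_genes = len(G[0])
    sims = [
        cos_sim([row[k] for row in P], [row[k] for row in G])
        for k in range(n_genes)
    ]
    return sum(sims) / n_genes


# Two cells, two voxels, two genes: a perfect mapping versus a swapped one.
S = [[1.0, 0.0], [0.0, 2.0]]
G = [[1.0, 0.0], [0.0, 2.0]]
perfect = mapping_score([[1.0, 0.0], [0.0, 1.0]], S, G)
swapped = mapping_score([[0.0, 1.0], [1.0, 0.0]], S, G)
```

Under these assumptions a higher score means a better match, so an optimizer would minimize its negative.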
 
@@ -79,7 +80,9 @@ If you have questions, please contact the authors of the method:
 - Tommaso Biancalani - <tbiancal@broadinstitute.org>  
 - Gabriele Scalia - <gscalia@broadinstitute.org>
 
-The artwork has been curated by:
-- Anna Hupalowska <ahupalow@broadinstitute.org>
-
+PyPI maintainers:
+- Tommaso Biancalani - <tbiancal@broadinstitute.org>
+- Ziqing Lu - <lu.ziq@northeastern.edu>
 
+The artwork has been curated by:
+- Anna Hupalowska <ahupalow@broadinstitute.org>
diff --git a/example/1_tutorial_tangram.ipynb b/example/1_tutorial_tangram.ipynb
@@ -9,10 +9,18 @@
     "- The notebook uses data from mouse brain cortex, although different than those adopted in the manuscript (which need to wait May 2020 before being released).\n",
     "\n",
     "#### Changelog\n",
+    "- Fixes - Jan 31 2021 by Ziqing Lu <lu.ziq@northeastern.edu>\n",
     "- Fixes - Jan 4th 2021 by Ziqing Lu <lu.ziq@northeastern.edu>\n",
     "- Written - Dec 30th 2020 by Tommaso Biancalani <tbiancal@broadinstitute.org>"
    ]
   },
+  {
+   "source": [
+    "First, make sure tangram-sc is installed (`pip install tangram-sc`) and the environment is set up according to `environment.yml`."
+   ],
+   "cell_type": "markdown",
+   "metadata": {}
+  },
   {
    "cell_type": "code",
    "execution_count": 1,
@@ -25,9 +33,6 @@
     "import matplotlib.pyplot as plt\n",
     "import scanpy as sc\n",
     "import torch\n",
-    "\n",
-    "# add path of Tangram repository for importing it\n",
-    "sys.path.append(\"./..\") \n",
     "import tangram as tg"
    ]
   },
@@ -182,7 +187,7 @@
     "\n",
     "- By single cell data, we generally mean either scRNAseq or snRNAseq.\n",
     "- We start by mapping the MOp 10Xv3 dataset, which contains single nuclei collected from a posterior region of the primary motor cortex.\n",
-    "- They are approximately 53k profiled cells with 28k genes."
+    "- There are approximately 26k profiled cells with 28k genes."
    ]
   },
   {
@@ -453,9 +458,6 @@
     "import matplotlib.pyplot as plt\n",
     "import scanpy as sc\n",
     "import torch\n",
-    "\n",
-    "# add path of Tangram repository for importing it\n",
-    "sys.path.append(\"./..\") \n",
     "import tangram as tg"
    ]
   },
@@ -477,7 +479,7 @@
     "- Mapping should be interrupted after the score plateaus, which can be controlled by passing the `num_epochs` parameter. \n",
     "- The score measures the similarity between the gene expression of the mapped cells vs spatial data: higher score means \n",
     "- Note that we obtained excellent mapping even if Tangram converges to a low score (the typical case is when the spatial data are very sparse): we use the score merely to assess convergence.\n",
-    "- If you are running Tangram with a GPU, uncomment`device=cuda: 0` and comment the line `device=cpu`. On a MacBook Pro 2018, it takes ~1h to run. On a P100 GPU it should be done in a few minutes.\n",
+    "- If you are running Tangram with a GPU, uncomment `device=cuda: 0` and comment the line `device=cpu`. On a MacBook Pro 2018, it takes ~1h to run. On a P100 GPU it should be done in a few minutes.\n",
     "- For this basic mapping, we do not use regularizers (hence the `NaN`). More sophisticated loss functions can be used using the Tangram library (refer to manuscript or dive into the code)."
    ]
   },
@@ -568,9 +570,6 @@
     "import matplotlib.pyplot as plt\n",
     "import scanpy as sc\n",
     "import torch\n",
-    "\n",
-    "# add path of Tangram repository for importing it\n",
-    "sys.path.append(\"./..\") \n",
     "import tangram as tg"
    ]
   },
@@ -579,7 +578,7 @@
    "metadata": {},
    "source": [
     "- We load the single cell data, the spatial data and the mapping results.\n",
-    "- We load the original datasets, rather than the `AnnData`s pre-processed with `pp_adatas`, as we would like to "
+    "- We load the original datasets, rather than the `AnnData` pre-processed with `pp_adatas`, as we would like to "
    ]
   },
   {
@@ -1370,7 +1369,7 @@
    "metadata": {},
    "source": [
     "- We can use again `plot_genes` to visualize gene patterns.\n",
-    "- Interestingly, the agreement for genes `Atp1b1` or `Apt1a3`, seems less good that that for `Ctgf` and `Nefh`, despite the scores are higher for the former genes. This is because even though the latter gene patterns are localized correctly, their expression values are not so well correlated (for instance, in `Ctgf` the \"bright yellow spot\" is in different part of layer 6b). In contrast, for `Atpb1` the gene expression pattern is largely recover, even though the overall gene expression in the spatial data is more dim."
+    "- Interestingly, the agreement for genes `Atp1b1` or `Apt1a3` seems less good than that for `Ctgf` and `Nefh`, even though the scores are higher for the former genes. This is because even though the latter gene patterns are localized correctly, their expression values are not so well correlated (for instance, in `Ctgf` the \"bright yellow spot\" is in a different part of layer 6b). In contrast, for `Atpb1` the gene expression pattern is largely recovered, even though the overall gene expression in the spatial data is dimmer."
    ]
   },
   {
@@ -1429,4 +1428,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
diff --git a/setup.py b/setup.py
@@ -0,0 +1,40 @@
+#!/usr/bin/env python3
+
+import setuptools
+
+with open("README.md","r", encoding="utf-8") as fh:
+    long_description = fh.read()
+
+d = {}
+with open('tangram/_version.py') as f:  exec(f.read(), d)
+
+setuptools.setup(
+    name="tangram-sc",
+    version=d['__version__'],
+    author="Tommaso Biancalani, Gabriele Scalia",
+    author_email="tommaso.biancalani@gmail.com",
+    description="Spatial alignment of single cell transcriptomic data.",
+    long_description=long_description,
+    long_description_content_type="text/markdown",
+    url="https://github.com/broadinstitute/Tangram",
+    packages=setuptools.find_packages(),
+    classifiers=[
+        "Programming Language :: Python :: 3.6",
+        "Operating System :: MacOS",
+    ],
+    python_requires='>=3.6',
+    install_requires=[
+        "pip>=19.0.0",
+        "torch>=1.4.0",
+        "pandas>=1.1.0",
+        "numpy>=1.19.1",
+        "scipy>=1.5.2",
+        "matplotlib>=3.0.0",
+        "seaborn>=0.10.1",
+        "scanpy==1.6.0",
+        # "jupyterlab>=2.2.6",
+    ]
+)
+
+
+
diff --git a/tangram/__init__.py b/tangram/__init__.py
@@ -1,3 +1,4 @@
+from ._version import __version__
 from .mapping_utils import *
 from .utils import *
 from .plot_utils import *
diff --git a/tangram/_version.py b/tangram/_version.py
@@ -0,0 +1,2 @@
+import os
+__version__ = os.environ.get('TANGRAM_VERSION')
