Commit f05e7d2

✨ Add Hubmap to atlas docs (#220)
* ✨ Add hubmap
* 🎨 Add hubmap
* 🎨 Polish
* 🎨 Polish
1 parent 3545b81 commit f05e7d2

3 files changed: +220 -0 lines changed

docs/atlases.md

+1
@@ -9,5 +9,6 @@ The following use cases demonstrate how to use LaminDB to query popular atlases.
 
 cellxgene
 arc-virtual-cell-atlas
+hubmap
 rxrx
 ```

docs/hubmap.ipynb

+218
@@ -0,0 +1,218 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Hubmap: scRNA-seq"
8+
]
9+
},
10+
{
11+
"cell_type": "markdown",
12+
"metadata": {},
13+
"source": [
14+
"The [HubMAP (Human BioMolecular Atlas Program) consortium](https://hubmapconsortium.org/) is an initiative mapping human cells to create a comprehensive atlas, with its [Data Portal](https://portal.hubmapconsortium.org/) serving as the platform where researchers can access, visualize, and download (single-cell) tissue data.\n",
15+
"\n",
16+
"Lamin mirrors most of the datasets for simplified access here: [laminlabs/hubmap](https://lamin.ai/laminlabs/hubmap).\n",
17+
"\n",
18+
"If you use the data academically, please cite the original publication [Jain et al. 2023](https://www.nature.com/articles/s41556-023-01194-w).\n",
19+
"\n",
20+
"Here, we show how the HubMAP instance is structured and how datasets and be queried and accessed."
21+
]
22+
},
23+
{
24+
"cell_type": "markdown",
25+
"metadata": {},
26+
"source": [
27+
"Connect to the source instance:"
28+
]
29+
},
30+
{
31+
"cell_type": "code",
32+
"execution_count": null,
33+
"metadata": {
34+
"tags": [
35+
"hide-output"
36+
]
37+
},
38+
"outputs": [],
39+
"source": [
40+
"# pip install 'lamindb[jupyter,bionty,wetlab]'\n",
41+
"!lamin connect laminlabs/hubmap"
42+
]
43+
},
44+
{
45+
"cell_type": "markdown",
46+
"metadata": {},
47+
"source": [
48+
"```{note}\n",
49+
"\n",
50+
"If you want to transfer artifacts or metadata into your own instance, use `.using(\"laminlabs/hubmap\")` when accessing registries and then `.save()` ({doc}`/transfer`).\n",
51+
"\n",
52+
"```"
53+
]
54+
},
55+
{
56+
"cell_type": "code",
57+
"execution_count": null,
58+
"metadata": {
59+
"tags": [
60+
"hide-output"
61+
]
62+
},
63+
"outputs": [],
64+
"source": [
65+
"import lamindb as ln"
66+
]
67+
},
68+
{
69+
"cell_type": "markdown",
70+
"metadata": {},
71+
"source": [
72+
"## Getting HubMAP datasets and data products"
73+
]
74+
},
75+
{
76+
"cell_type": "markdown",
77+
"metadata": {},
78+
"source": [
79+
"HubMAP associates several data products, which are the single raw datasets, into higher level datasets.\n",
80+
"For example, the dataset [HBM983.LKMP.544](https://portal.hubmapconsortium.org/browse/dataset/20ee458e5ee361717b68ca72caf6044e) has three data products:\n",
81+
"\n",
82+
"1. [raw_expr.h5ad](https://assets.hubmapconsortium.org/f6eb890063d13698feb11d39fa61e45a/raw_expr.h5ad)\n",
83+
"1. [expr.h5ad](https://assets.hubmapconsortium.org/f6eb890063d13698feb11d39fa61e45a/expr.h5ad)\n",
84+
"2. [secondary_analysis.h5ad](https://assets.hubmapconsortium.org/f6eb890063d13698feb11d39fa61e45a/secondary_analysis.h5ad)\n",
85+
"3. [scvelo_annotated.h5ad](https://assets.hubmapconsortium.org/f6eb890063d13698feb11d39fa61e45a/scvelo_annotated.h5ad)\n",
86+
"\n",
87+
"The [laminlabs/hubmap](https://lamin.ai/laminlabs/hubmap) instance registers these data products as {class}`~lamindb.Artifact` that jointly form a {class}`~lamindb.Collection`."
88+
]
89+
},
90+
{
91+
"cell_type": "markdown",
92+
"metadata": {},
93+
"source": [
94+
"The `key` attribute of `ln.Artifact` and `ln.Collection` corresponds to the IDs of the URLs.\n",
95+
"For example, the id in the URL [https://portal.hubmapconsortium.org/browse/dataset/20ee458e5ee361717b68ca72caf6044e](https://portal.hubmapconsortium.org/browse/dataset/20ee458e5ee361717b68ca72caf6044e) is the `key` of the corresponding collection:"
96+
]
97+
},
98+
{
99+
"cell_type": "code",
100+
"execution_count": null,
101+
"metadata": {
102+
"tags": [
103+
"hide-output"
104+
]
105+
},
106+
"outputs": [],
107+
"source": [
108+
"small_intenstine_collection = ln.Collection.get(key=\"20ee458e5ee361717b68ca72caf6044e\")\n",
109+
"small_intenstine_collection"
110+
]
111+
},
112+
{
113+
"cell_type": "markdown",
114+
"metadata": {},
115+
"source": [
116+
"We can get all associated data products like:"
117+
]
118+
},
119+
{
120+
"cell_type": "code",
121+
"execution_count": null,
122+
"metadata": {
123+
"tags": [
124+
"hide-output"
125+
]
126+
},
127+
"outputs": [],
128+
"source": [
129+
"small_intenstine_collection.artifacts.all().df()"
130+
]
131+
},
132+
{
133+
"cell_type": "markdown",
134+
"metadata": {},
135+
"source": [
136+
"Note the key of these three `Artifacts` which corresponds to the assets URL.\n",
137+
"For example, [https://assets.hubmapconsortium.org/f6eb890063d13698feb11d39fa61e45a/expr.h5ad](https://assets.hubmapconsortium.org/f6eb890063d13698feb11d39fa61e45a/expr.h5ad) is the direct URL to the `expr.h5ad` data product.\n",
138+
"\n",
139+
"Artifacts can be directly loaded:"
140+
]
141+
},
142+
{
143+
"cell_type": "code",
144+
"execution_count": null,
145+
"metadata": {
146+
"tags": [
147+
"hide-output"
148+
]
149+
},
150+
"outputs": [],
151+
"source": [
152+
"small_intenstine_af = (\n",
153+
" small_intenstine_collection.artifacts.filter(key__icontains=\"raw_expr.h5ad\")\n",
154+
" .distinct()\n",
155+
" .one()\n",
156+
")\n",
157+
"adata = small_intenstine_af.load()\n",
158+
"adata"
159+
]
160+
},
161+
{
162+
"cell_type": "markdown",
163+
"metadata": {},
164+
"source": [
165+
"## Querying single-cell datasets"
166+
]
167+
},
168+
{
169+
"cell_type": "markdown",
170+
"metadata": {},
171+
"source": [
172+
"Currently, only the `Artifacts` of the `raw_expr.h5ad` data products are labeled with metadata.\n",
173+
"The available metadata includes `ln.Reference`, `bt.Tissue`, `bt.Disease`, `bt.ExperimentalFactor`, and many more.\n",
174+
"Please have a look at [the instance](https://lamin.ai/laminlabs/hubmap) for more details."
175+
]
176+
},
177+
{
178+
"cell_type": "code",
179+
"execution_count": null,
180+
"metadata": {
181+
"tags": [
182+
"hide-output"
183+
]
184+
},
185+
"outputs": [],
186+
"source": [
187+
"# Get one dataset with a specific type of heart failure\n",
188+
"heart_failure_adata = (\n",
189+
" ln.Artifact.filter(diseases__name=\"heart failure with reduced ejection fraction\")\n",
190+
" .first()\n",
191+
" .load()\n",
192+
")\n",
193+
"heart_failure_adata"
194+
]
195+
}
196+
],
197+
"metadata": {
198+
"kernelspec": {
199+
"display_name": "lamindb",
200+
"language": "python",
201+
"name": "python3"
202+
},
203+
"language_info": {
204+
"codemirror_mode": {
205+
"name": "ipython",
206+
"version": 3
207+
},
208+
"file_extension": ".py",
209+
"mimetype": "text/x-python",
210+
"name": "python",
211+
"nbconvert_exporter": "python",
212+
"pygments_lexer": "ipython3",
213+
"version": "3.12.8"
214+
}
215+
},
216+
"nbformat": 4,
217+
"nbformat_minor": 2
218+
}
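
Note on the key convention in docs/hubmap.ipynb: the notebook states that the `key` of a collection is the ID found in the portal URL, and that artifact keys correspond to the assets URL. The following is a minimal sketch of that mapping using only the Python standard library. The helper names `dataset_id_from_portal_url` and `asset_path_from_assets_url` are hypothetical and not part of LaminDB, and the assumption that an artifact key is derived from the `<id>/<filename>` path of the assets URL is not confirmed by the notebook, which only says the two correspond.

```python
from urllib.parse import urlparse


def dataset_id_from_portal_url(url: str) -> str:
    """Return the last path segment of a HubMAP portal dataset URL.

    Hypothetical helper: the notebook uses this ID as the collection key.
    """
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def asset_path_from_assets_url(url: str) -> str:
    """Return the '<id>/<filename>' part of a HubMAP assets URL.

    Assumption: the artifact key is derived from this path in some form.
    """
    return urlparse(url).path.lstrip("/")


portal_url = (
    "https://portal.hubmapconsortium.org/browse/dataset/"
    "20ee458e5ee361717b68ca72caf6044e"
)
assets_url = (
    "https://assets.hubmapconsortium.org/"
    "f6eb890063d13698feb11d39fa61e45a/expr.h5ad"
)

collection_key = dataset_id_from_portal_url(portal_url)
artifact_path = asset_path_from_assets_url(assets_url)

print(collection_key)  # 20ee458e5ee361717b68ca72caf6044e
print(artifact_path)   # f6eb890063d13698feb11d39fa61e45a/expr.h5ad
```

The value of `collection_key` is the same string the notebook passes to `ln.Collection.get(key=...)`. Whether `artifact_path` matches a stored artifact key exactly should be checked against the instance, for example with the `key__icontains` filter used in the notebook.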

noxfile.py

+1
@@ -269,6 +269,7 @@ def run_nbs(session):
     run_notebooks("docs/tutorial.ipynb")
     run_notebooks("docs/tutorial2.ipynb")
     run_notebooks("docs/arc-virtual-cell-atlas.ipynb")
+    run_notebooks("docs/hubmap.ipynb")
 
 
 @nox.session
