Commit a4501c1

update netzoo notebook
1 parent 42dce31 commit a4501c1

1 file changed: +104 -32 lines changed
netbooks/netZooPy/ccle_analysis.ipynb

@@ -15,8 +15,9 @@
 "metadata": {},
 "source": [
 "# Introduction\n",
-"The Cancer Cell Line Encyclopedia (CCLE) has collected various omic data for more than a thousand cancer cell lines, representative of many lineages and tissue type. In this analysis, we will first use DRAGON to find associations between multiomic data types, and second, we will use PANDA-LIONESS-MONSTER to model a transition from primary to metastatic melanoma and identify drivers of this transition. \n",
-"# Import packages"
+"The Cancer Cell Line Encyclopedia (CCLE) has collected various omic data for more than a thousand cancer cell lines, representative of many lineages and tissue types. In this analysis, we will first use DRAGON to find associations between multiomic data types, and second, we will use PANDA-LIONESS-MONSTER to model a transition from primary to metastatic melanoma and identify drivers of this transition.<sup>1</sup>\n",
+"# Import packages\n",
+"First, we load the packages required for the analysis."
 ]
 },
 {
@@ -27,9 +28,9 @@
 "source": [
 "import numpy as np\n",
 "from scipy.stats import skew\n",
-"import matplotlib.pyplot as plt\n",
+"import matplotlib.pyplot as plt # For plotting\n",
 "import os\n",
-"import pandas as pd\n",
+"import pandas as pd # To load data\n",
 "import seaborn as sns # To plot results\n",
 "from netZooPy import dragon # To import dragon"
 ]
@@ -99,6 +100,13 @@
 " return r_exp_ppi, adj_p_vals_exp_ppi, p_vals_exp_ppi"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"This is a simple scaling function that subtracts the mean and divides by the standard deviation."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -112,6 +120,13 @@
 " return (X_temp - X_mean) / X_std"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Because we will use DRAGON to find associations between pairs of multiomic data types, we need to align any two omic data types so that they contain the same samples. The following function does this by matching their cell line names."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
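
The `Scale` return line in the hunk above is consistent with column-wise standardization. As a minimal sketch of that idea, assuming samples are in rows and features in columns (the name `scale_columns`, the NaN handling, and the guard for constant columns are assumptions for illustration, not the notebook's actual code):

```python
import numpy as np
import pandas as pd


def scale_columns(X):
    """Standardize each column to zero mean and unit variance (hypothetical helper)."""
    X_temp = np.asarray(X, dtype=float)
    X_mean = np.nanmean(X_temp, axis=0)  # per-column mean, ignoring NaN
    X_std = np.nanstd(X_temp, axis=0)    # per-column standard deviation
    # Assumption: leave constant columns at zero instead of dividing by zero.
    X_std = np.where(X_std == 0, 1.0, X_std)
    scaled = (X_temp - X_mean) / X_std
    if isinstance(X, pd.DataFrame):
        # Preserve row and column labels when a DataFrame is passed in.
        return pd.DataFrame(scaled, index=X.index, columns=X.columns)
    return scaled
```

Returning a DataFrame when one is passed in keeps the column labels available for plotting calls such as `sns.swarmplot(data=...)`.
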
@@ -130,6 +145,13 @@
 " return expression, methyl"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"This function converts cell line names to a standard DepMap ID."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
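
The alignment step described in the added markdown cell (matching two omic tables on cell line names) can be sketched as follows. This is an illustration under stated assumptions, not the notebook's function: both tables are assumed to be pandas DataFrames indexed by unique cell line names, and `align_by_cell_line` and the toy labels are hypothetical.

```python
import pandas as pd


def align_by_cell_line(omic_a, omic_b):
    """Keep only the cell lines (rows) present in both tables, in the same order."""
    shared = omic_a.index.intersection(omic_b.index)
    return omic_a.loc[shared], omic_b.loc[shared]


# Toy data: three cell lines in one table, two (in a different order) in the other.
expression = pd.DataFrame({"gene_1": [1.0, 2.0, 3.0]},
                          index=["line_a", "line_b", "line_c"])
methyl = pd.DataFrame({"site_1": [0.2, 0.9]}, index=["line_c", "line_a"])

expr_aligned, methyl_aligned = align_by_cell_line(expression, methyl)
```

After alignment both tables have the same number of rows in the same order, which is what a pairwise analysis of two data types needs.
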
@@ -147,6 +169,13 @@
 " return methyl"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"This function processes gene dependency data from CRISPR screens."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -160,6 +189,13 @@
 " return dep"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"A processing function for miRNA expression data."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -176,6 +212,13 @@
 " return mirna"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"A processing function for drug viability data."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -203,6 +246,13 @@
 " return drugs"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"A processing function for proteomic data."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -237,6 +287,13 @@
 " return ppi"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"A processing function for metabolomic data."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -319,6 +376,13 @@
 "plt.plot(sortedarray,'o',mfc='none', alpha=0.1, color='slategrey')"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"This plot shows the correlations between gene dependency and miRNA expression. A correlation might imply that the miRNA regulates the corresponding target gene."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -331,11 +395,14 @@
 "numcol=tdindices[1][2]\n",
 "print(mir_dep_edges.iloc[numindex,numcol])\n",
 "print(mir_dep_edges.index[numindex])\n",
-"print(mir_dep_edges.columns[numcol])\n",
-"#for -1:-1 http://mirdb.org/cgi-bin/search.cgi\n",
-"#for 0:0\n",
-"# mirdb: http://mirdb.org/cgi-bin/search.cgi?searchType=miRNA&searchBox=hsa-miR-664a-3p&full=1\n",
-"# targetscan: http://www.targetscan.org/cgi-bin/targetscan/vert_71/targetscan.cgi?mirg=hsa-miR-664a-3p"
+"print(mir_dep_edges.columns[numcol])"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"We find that the pair GSR and miR-664a-3p has a strong negative correlation (negative dependency being associated with decreased cell survival). This pair is listed in [TargetScan](http://www.targetscan.org/cgi-bin/targetscan/vert_71/targetscan.cgi?mirg=hsa-miR-664a-3p) as a possible interaction based on various features."
 ]
 },
 {
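
The cell in the hunk above looks up one entry of `mir_dep_edges` by row and column position and prints its value and labels. A minimal sketch of one way to locate the most negative entry of such an edge table, assuming it is a pandas DataFrame of floats (the helper `most_negative_edge` and the toy labels are hypothetical, not the notebook's `tdindices` logic):

```python
import numpy as np
import pandas as pd


def most_negative_edge(edges):
    """Return (row label, column label, value) of the smallest entry in an edge table."""
    values = edges.to_numpy()
    # nanargmin works on the flattened array; unravel_index maps it back to (row, col).
    row, col = np.unravel_index(np.nanargmin(values), values.shape)
    return edges.index[row], edges.columns[col], values[row, col]


# Toy edge table with made-up labels and weights.
toy_edges = pd.DataFrame([[0.1, -0.4], [0.3, -0.7]],
                         index=["row_a", "row_b"],
                         columns=["col_x", "col_y"])
print(most_negative_edge(toy_edges))
```
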
@@ -393,6 +460,13 @@
 "sns_plot = sns.boxplot(oncdep_drugs_edges['dabrafenib'], orient='v',width=.6,flierprops=flierprops)"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"We find that the gene dependencies correlated with Dabrafenib are BRAF, MAPK1, and MAPK2, which are in the same pathway that Dabrafenib targets."
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -434,34 +508,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"ppi_met_edges=estimateprotmet(cellNames)\n",
-"c=ppi_met_edges.loc['LDHA',].sort_values()\n",
-"d=ppi_met_edges.loc['LDHB',].sort_values()\n",
-"##warburg effect: neg corr g3p, PEP with LDHA and low corr with fumarate/maleate shows that TCA is not used\n",
-"sns_plot = sns.swarmplot([Scale(c.values),Scale(d.values], orient='v')"
+"f = {'LDHA': c.values, 'LDHB': d.values}\n",
+"dff=pd.DataFrame(data=f)\n",
+"sns_plot = sns.swarmplot(data=Scale(dff), orient='v')"
 ]
 },
 {
-"cell_type": "code",
-"execution_count": null,
+"cell_type": "markdown",
 "metadata": {},
-"outputs": [],
 "source": [
-"f = {'LDHA': c.values, 'LDHB': d.values}\n",
-"dff=pd.DataFrame(data=f)\n",
-"sns_plot = sns.swarmplot(data=Scale(dff), orient='v')\n",
-"\n",
-"##warburg effect: neg corr g3p, PEP with LDHA and low corr with fumarate/maleate shows that TCA is not used\n",
-"\n",
-"\n",
-"np.where(d.index=='lactate')\n",
-"ll=np.zeros(225)\n",
-"ll[156]=1\n",
-"f = {'LDHA': c.values, 'LDHB': d.values, 'lactate':ll}\n",
-"dff=pd.DataFrame(data=f)\n",
-"sns_plot = sns.swarmplot(data=Scale(dff), y='LDHB', x=np.ones(len(dff)), hue=\"lactate\")\n",
-"\n",
-"ddd[0]=[3.7052299080735225e-05]"
+"We find that metabolites such as fumarate/maleate, PEP, and g3p have a negative correlation with LDHA levels, indicating production of lactate. We also see that LDHB levels have a positive partial correlation (3.705e-05) with lactate, which indicates that LDHB works in the same direction as LDHA, further supporting lactate production in cancer cells (Warburg effect)."
 ]
 },
 {
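
The markdown cell above reports a partial correlation between LDHB and lactate. For readers unfamiliar with the quantity, the sketch below shows only the textbook relationship between a precision (inverse covariance) matrix and partial correlations. It is not netZooPy's estimator; the function name `partial_correlations` and the toy matrix are assumptions for illustration.

```python
import numpy as np


def partial_correlations(precision):
    """Convert a precision (inverse covariance) matrix into partial correlations."""
    theta = np.asarray(precision, dtype=float)
    d = np.sqrt(np.diag(theta))
    # Off-diagonal entries: rho_ij = -theta_ij / sqrt(theta_ii * theta_jj)
    pcor = -theta / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Toy precision matrix for a chain x - y - z: x and z are linked only through y.
toy_precision = [[2.0, -1.0, 0.0],
                 [-1.0, 2.0, -1.0],
                 [0.0, -1.0, 2.0]]
pcor = partial_correlations(toy_precision)
```

In the toy chain, the partial correlation between the first and third variables is zero because they are related only through the second. This is the kind of indirect association that partial correlations are meant to filter out.
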
@@ -526,6 +582,13 @@
 "sns_plot = sns.boxplot(data=Scale(c.values))"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"We find that 2-HG disrupts binding of TP73, PPARg, and GLI4. These TFs have various roles in cancer; TP73 is a tumor suppressor, PPARg mediates several oncogenic signaling processes, and GLI4 is a glioma-inducing oncogene. GLI4 is particularly interesting because glioma is the cancer subtype where 2-HG induces a hypermethylator phenotype."
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -539,6 +602,15 @@
 "metadata": {},
 "outputs": [],
 "source": []
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"# References\n",
+"\n",
+"1- Guebila, Marouen Ben, et al. \"The Network Zoo: a multilingual package for the inference and analysis of biological networks.\" bioRxiv (2022)."
+]
 }
 ],
 "metadata": {