Commit bd22b18

Merge pull request #23 from soelmicheletti/main
Corrected file path in cobra
2 parents 7763a55 + e01b1df commit bd22b18

1 file changed: +33 -3 lines changed

netbooks/netZooPy/cobra.ipynb

@@ -2,6 +2,7 @@
 "cells": [
 {
 "cell_type": "markdown",
+"id": "7e26ed88",
 "metadata": {},
 "source": [
 "# Decomposing gene co-expression networks with COBRA (Python version)\n",
@@ -12,6 +13,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "8520e757",
 "metadata": {},
 "source": [
 "## 1. Introduction\n",
@@ -29,6 +31,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "fe567bf2",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -37,6 +40,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "c27aff27",
 "metadata": {},
 "source": [
 "On the server, we need to change the working directory to the `data` folder of the current user."
@@ -45,6 +49,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "7e5dc5ad",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -55,6 +60,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "402463e2",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -66,18 +72,20 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "1f43ac8c",
 "metadata": {},
 "outputs": [],
 "source": [
 "gene_expression = pd.read_csv(ppath+\"gene_expression_thca.csv\", index_col = 0).to_numpy()\n",
-"metadata = pd.read_csv(ppath+\"data/thca_metadata.csv\", index_col = 0)\n",
+"metadata = pd.read_csv(ppath+\"thca_metadata.csv\", index_col = 0)\n",
 "batch = metadata['batch'].to_numpy()\n",
 "cancer = metadata['status'].to_numpy()\n",
 "sex = metadata['sex'].to_numpy()"
 ]
 },
 {
 "cell_type": "markdown",
+"id": "f01301b9",
 "metadata": {},
 "source": [
 "Here gene_expression is a gene expression matrix for 19711 genes and 572 samples. Batch, cancer, and sex are sample-specific metadata as vectors of length 572."
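
The one substantive change in this hunk drops a `data/` segment from the metadata path, presumably because `ppath` already points at the `data` folder. A minimal sketch of how this kind of path slip can be caught before `pd.read_csv` is called. The helper `data_file` is hypothetical, and a temporary directory with an empty placeholder file stands in for the real data folder:

```python
from pathlib import Path
import tempfile


def data_file(base, name):
    """Return base/name, raising early if the file does not exist."""
    path = Path(base) / name
    if not path.is_file():
        raise FileNotFoundError(f"expected data file at {path}")
    return path


# Temporary directory standing in for the real data folder (assumption).
with tempfile.TemporaryDirectory() as tmp:
    # Placeholder file with a header only; not the real THCA metadata.
    (Path(tmp) / "thca_metadata.csv").write_text("sample,batch,status,sex\n")

    # Path as written after the fix.
    good_name = data_file(tmp, "thca_metadata.csv").name

    # Path as written before the fix: the extra "data/" segment does not exist.
    try:
        data_file(tmp, "data/thca_metadata.csv")
        old_path_found = True
    except FileNotFoundError:
        old_path_found = False
```

Building paths with `pathlib` instead of string concatenation also removes the dependence on `ppath` ending with a separator.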
@@ -86,6 +94,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "eefe741a",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -97,6 +106,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "e23e09b2",
 "metadata": {},
 "source": [
 "## 2. Applications of COBRA\n",
@@ -114,6 +124,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "56c0ce1f",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -122,6 +133,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "19e30957",
 "metadata": {},
 "source": [
 "For batch correction, the design matrix must contain an intercept in the first column, and the batches (encoded using dummy coding for identifiability) in the remaining columns. "
@@ -130,6 +142,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "008a3832",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -140,6 +153,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "db8d69f1",
 "metadata": {},
 "source": [
 "We get a design matrix with 17 covariates (an intercept and 16 for the dummy coding) for the 572 samples in our study. "
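
The notebook cell that builds this matrix is not part of the diff. A minimal sketch of one way to get an intercept plus dummy-coded batches with the stated shape, using synthetic batch labels (an assumption, not the THCA metadata) and `pandas.get_dummies`:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_samples, n_batches = 572, 17  # 16 dummy columns imply 17 batch levels

# Synthetic labels: every batch appears at least once, the rest are random.
labels = np.concatenate([
    np.arange(n_batches),
    rng.integers(0, n_batches, n_samples - n_batches),
])
batch = pd.Series(labels).map(lambda b: f"batch_{int(b):02d}")

# Dummy coding: drop the first level so the batch columns are not
# collinear with the intercept (this is the identifiability point).
dummies = pd.get_dummies(batch, drop_first=True).to_numpy(dtype=float)

# Intercept in the first column, batch indicators in the remaining ones.
X = np.column_stack([np.ones(n_samples), dummies])
rank = np.linalg.matrix_rank(X)
```

Keeping all 17 indicator columns next to the intercept would give 18 columns of rank 17, which is why one level is dropped.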
@@ -148,6 +162,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "45b0a40d",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -156,6 +171,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "9556f9b9",
 "metadata": {},
 "source": [
 "We are now ready to fit COBRA"
@@ -164,6 +180,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "c0b3776b",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -172,6 +189,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "f07cae8c",
 "metadata": {},
 "source": [
 "The batch-corrected network considers only the mean effect after removing the contribution of the batch variables. It is computed as follows. "
@@ -180,6 +198,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "feb8b5a5",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -188,6 +207,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "3c577a75",
 "metadata": {},
 "source": [
 "### 3.2 Differential co-expression analysis\n",
@@ -197,6 +217,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "fc0a5747",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -205,6 +226,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "21a412df",
 "metadata": {},
 "source": [
 "In this case, the design matrix contains an intercept and a second column with an indicator for cancer/healthy. The additional columns are for the variables we want to adjust for. As before, we consider the batch variable. "
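
The column layout described above can be sketched on a toy table. The metadata values below are made up for illustration, and the `"cancer"`/`"healthy"` coding of `status` is an assumption; the diff does not show the actual values:

```python
import numpy as np
import pandas as pd

# Toy metadata (made-up values, not the THCA data).
meta = pd.DataFrame({
    "status": ["cancer", "healthy", "cancer", "healthy", "cancer", "healthy"],
    "batch":  ["a", "a", "b", "b", "c", "c"],
})

intercept = np.ones(len(meta))
# 1.0 for cancer samples, 0.0 for healthy ones.
is_cancer = (meta["status"] == "cancer").to_numpy(dtype=float)
# Adjustment variables: dummy-coded batch with the first level dropped.
batch_dummies = pd.get_dummies(meta["batch"], drop_first=True).to_numpy(dtype=float)

# Column order: intercept, cancer indicator, then the batch columns.
X = np.column_stack([intercept, is_cancer, batch_dummies])
cancer_column = 1  # zero-based index of the second column
```

The position of the indicator matters because the component of interest is later selected by column index.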
@@ -213,6 +235,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "d26518b1",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -224,6 +247,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "0df93493",
 "metadata": {},
 "source": [
 "We are now ready to fit COBRA and extract the component corresponding to the differential co-expression. Since the indicator variable for cancer is the second column in our design matrix, the COBRA-adjusted differential co-expression network corresponds to the second component of COBRA's decomposition. "
@@ -232,6 +256,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "3887dba5",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -241,6 +266,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "15b8e757",
 "metadata": {},
 "source": [
 "### 3.3 Identifying the component for a covariate of interest\n",
@@ -251,6 +277,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "c3f136eb",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -261,6 +288,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "78f33e7b",
 "metadata": {},
 "source": [
 "With this design, the last component of COBRA's decomposition describes the sex differences in cancer between males and females. "
@@ -269,6 +297,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
+"id": "a55644d4",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -278,6 +307,7 @@
 },
 {
 "cell_type": "markdown",
+"id": "8286994c",
 "metadata": {},
 "source": [
 "## Reference\n",
@@ -288,7 +318,7 @@
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
@@ -302,7 +332,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.9.7"
+"version": "3.10.12"
 }
 },
 "nbformat": 4,