Commit d671ec3

committed
update TB tutorial
1 parent b6e9a63 commit d671ec3

5 files changed

+273
-130
lines changed

netbooks/Welcome_to_netBooks.ipynb

Lines changed: 8 additions & 1 deletion
@@ -31,7 +31,14 @@
3131
"However, since tokens are temporary user IDs, the working space and all the files will be deleted as soon as you log out from netbooks or after your session has been idle for 1 hour. Therefore, please don't import work on netbooks as the user files are not persistent on disk. netbooks is meant for learning and exploring application cases of network biology and to promote reproducible analyses by providing containerized software tools. Please check how to run netbooks locally or consider [Google colab](https://colab.research.google.com/notebooks/intro.ipynb#recent=true) for persistent work spaces.\n",
3232
"\n",
3333
"#### Run netbooks locally\n",
34-
"To run the netbooks on your local machine, you can clone the netbooks [GitHub repository](https://github.com/netZoo/netbooks) and install the dependencies required in the beginning of each tutorial. We've also provided links to an [AWS S3 bucket](https://aws.amazon.com/) to download all the data needed to run the analysis.\n",
34+
"To run the netbooks on your local machine, you can clone the netbooks [GitHub repository](https://github.com/netZoo/netbooks) and install the dependencies listed at the beginning of each tutorial. We also provide links to a public AWS S3 bucket that hosts all the data needed to run the analysis. These files can be downloaded using the file URLs in the netbook as follows: `curl -O urlToFile`\n",
35+
"\n",
36+
"#### netbooks\n",
37+
"There are two types of netbooks:\n",
38+
"\n",
39+
"- **Vignettes** are brief code samples that demonstrate how the methods can be used, their inputs, and how to interpret their output\n",
40+
"\n",
41+
"- **Case studies and published studies** are investigations that provide biological or methodological insights. Published-study netbooks allow users to reproduce the numerical results and figures of published papers.\n",
3542
"\n",
3643
"### Requirements\n",
3744
"Netbooks works best on Google Chrome. Some network visualization features are only available through Google Chrome.\n",

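For readers who prefer Python to the shell, the `curl -O urlToFile` step mentioned in the Welcome netbook can be sketched as follows. This is a minimal sketch and not part of the commit: the helper names `local_name` and `fetch` are hypothetical, and the example URL is the `Example_2comm.txt` location that appears in the ALPACA netbook of this same commit.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_name(url):
    """Return the file name `curl -O` would save to: the last path segment of the URL."""
    return os.path.basename(urlparse(url).path)


def fetch(url, dest_dir="."):
    """Download `url` into `dest_dir`, keeping the remote file name (needs network access)."""
    target = os.path.join(dest_dir, local_name(url))
    urllib.request.urlretrieve(url, target)
    return target


url = "https://netzoo.s3.us-east-2.amazonaws.com/netZooR/tutorial_datasets/Example_2comm.txt"
print(local_name(url))  # Example_2comm.txt
```

Calling `fetch(url)` would then write the file into the current working directory, which is where a locally run netbook would look for it.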
netbooks/netZooPy/Building_a_regulation_prior_network.ipynb

Lines changed: 5 additions & 4 deletions
@@ -52,7 +52,7 @@
5252
"\n",
5353
"- Or you can directly use the final computed networks that we provide at the end.\n",
5454
"\n",
55-
"To demonstrate the output of the analysis, we will set a limit to the computation by defining the following parameter:"
55+
"To demonstrate the output of the analysis, we will limit the computation by setting the `iterlimit` parameter to 50. To run the analysis on the full set of TFs, set `iterlimit` to -1."
5656
]
5757
},
5858
{
@@ -148,7 +148,7 @@
148148
" namelist=[]\n",
149149
" for fasta in fasta_sequences:\n",
150150
" k=k+1\n",
151-
" if k>iterlimit:\n",
151+
"        if iterlimit>-1 and k>iterlimit:\n",
152152
" break\n",
153153
" name, sequence = fasta.id, str(fasta.seq)\n",
154154
" new_sequence = reduceSequence(sequence)\n",
@@ -240,7 +240,7 @@
240240
" k=0 # iteration counter\n",
241241
" for file in os.listdir():\n",
242242
" k=k+1\n",
243-
" if k>iterlimit:\n",
243+
"        if iterlimit>-1 and k>iterlimit:\n",
244244
" break\n",
245245
" bashCommand = \"/home/ubuntu/meme/libexec/meme-5.4.1/matrix2meme <\" + file + \"> \" + file + \".meme\"\n",
246246
" res=os.system(bashCommand)\n",
@@ -584,7 +584,8 @@
584584
"outputs": [],
585585
"source": [
586586
"computeNetworks=1 # can be set to zero to skip the computation and go to the next section\n",
587-
"nTFs=iterlimit # Number of TFs is set ot the iteration limit defined earlier\n",
587+
"if iterlimit > -1:\n",
588+
"    nTFs=iterlimit # Number of TFs is set to the iteration limit defined earlier\n",
588589
"numPool=2 # the number of parallel workers"
589590
]
590591
},
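
The `iterlimit` convention used in this notebook (a non-negative cap on the number of iterations, or -1 for no limit) can be sketched in isolation as below. This is a minimal sketch under stated assumptions: the helper names `limit_reached` and `take_limited` and the placeholder TF names are hypothetical and do not appear in the notebook. Note that in Python `&` binds more tightly than `>`, so a condition written as `iterlimit > -1 & k > iterlimit` is parsed as the chained comparison `iterlimit > (-1 & k) > iterlimit`, which is never true; the sketch uses the boolean `and` instead.

```python
def limit_reached(k, iterlimit):
    """True once the 1-based counter k exceeds iterlimit; iterlimit == -1 disables the limit."""
    return iterlimit > -1 and k > iterlimit


def take_limited(items, iterlimit):
    """Collect items until the iteration limit is reached (all items if iterlimit is -1)."""
    kept = []
    k = 0  # iteration counter
    for item in items:
        k = k + 1
        if limit_reached(k, iterlimit):
            break
        kept.append(item)
    return kept


tfs = ["TF%d" % i for i in range(1, 101)]  # hypothetical placeholder names
print(len(take_limited(tfs, 50)))  # 50
print(len(take_limited(tfs, -1)))  # 100
```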

netbooks/netZooR/ALPACA.ipynb

Lines changed: 107 additions & 30 deletions
@@ -12,57 +12,69 @@
1212
},
1313
{
1414
"cell_type": "markdown",
15-
"metadata": {
16-
"lines_to_next_cell": 0
17-
},
15+
"metadata": {},
16+
"source": [
17+
"# Introduction\n",
18+
"\n",
19+
"ALtered Partitions Across Community Architectures (ALPACA)<sup>1</sup> is a method for comparing a case network and a control network by finding differences in their community structure. In this vignette, we will explore the structure of the input data to ALPACA and the interpretation of the output."
20+
]
21+
},
22+
{
23+
"cell_type": "markdown",
24+
"metadata": {},
1825
"source": [
19-
"Install and load netZooR package"
26+
"Set the `runserver` parameter to 1 if you are running this vignette on the server. If the vignette is run locally, set this parameter to 0."
2027
]
2128
},
2229
{
2330
"cell_type": "code",
2431
"execution_count": null,
25-
"metadata": {
26-
"eval": false
27-
},
32+
"metadata": {},
2833
"outputs": [],
2934
"source": [
30-
"# install.packages(\"devtools\") \n",
31-
"#library(devtools)\n",
32-
"# install netZooR pkg with vignettes, otherwise remove the \"build_vignettes = TRUE\" argument.\n",
33-
"#devtools::install_github(\"netZoo/netZooR\", build_vignettes = TRUE)"
35+
"runserver=1"
36+
]
37+
},
38+
{
39+
"cell_type": "markdown",
40+
"metadata": {},
41+
"source": [
42+
"If you're running this vignette locally, you need to install ALPACA through the netZooR package by running the following lines."
3443
]
3544
},
3645
{
3746
"cell_type": "code",
3847
"execution_count": null,
39-
"metadata": {
40-
"message": false,
41-
"warning": false
42-
},
48+
"metadata": {},
4349
"outputs": [],
4450
"source": [
45-
"library(netZooR)"
51+
"if (runserver==0){\n",
52+
" is_netZooR_available <- require(\"netZooR\")\n",
53+
" if (is_netZooR_available==0){\n",
54+
" install.packages(\"remotes\") \n",
55+
" library(remotes)\n",
56+
" remotes::install_github(\"netZoo/netZooR\", build_vignettes = TRUE)\n",
57+
" }\n",
58+
" ppath=''\n",
59+
"}else{\n",
60+
" ppath='/opt/data/'\n",
61+
"}"
4662
]
4763
},
4864
{
4965
"cell_type": "markdown",
50-
"metadata": {
51-
"lines_to_next_cell": 0
52-
},
66+
"metadata": {},
5367
"source": [
54-
"This vignettes can be accessed in R by using below line. when netZoooR was installed with arguments *\"build_vignettes = TRUE\"*."
68+
"Then, we need to load the `netZooR` package to use ALPACA."
5569
]
5670
},
5771
{
5872
"cell_type": "code",
5973
"execution_count": null,
60-
"metadata": {
61-
"eval": false
62-
},
74+
"metadata": {},
6375
"outputs": [],
6476
"source": [
65-
"#vignette(\"ALPACA\",package=\"netZooR\")"
77+
"library(netZooR)"
6678
]
6779
},
6880
{
@@ -74,7 +86,27 @@
7486
"## A simple example with two node groups\n",
7587
"We will show how ALPACA can find changes in modular structure between two simulated networks. The networks both have 20 regulator nodes and 80 target nodes. The baseline network consists of two groups that are strongly connected to each other, whereas the perturbed network has weaker connections between the two groups. The two groups consist of nodes {A1-A10,B1-B40} and {A11-A20,B41-B80}. Contrasting the two networks using ALPACA identifies these two groups as being the modules that best characterize the perturbation.\n",
7688
"\n",
77-
"These simulated networks is available in our public AWS S3 bucket. Change the preferred working directory to store the Example_2comm.txt file, otherwise the store directory is current working directory."
89+
"These simulated networks are available in the netbooks public AWS S3 bucket (s3://netzoo). Change to your preferred working directory to store the Example_2comm.txt file; otherwise, the file is stored in the current working directory.\n",
90+
"\n",
91+
"If you are running the netbook locally, run the following command to download the file from AWS."
92+
]
93+
},
94+
{
95+
"cell_type": "code",
96+
"execution_count": null,
97+
"metadata": {},
98+
"outputs": [],
99+
"source": [
100+
"if (runserver==0){\n",
101+
" system(\"curl -O https://netzoo.s3.us-east-2.amazonaws.com/netZooR/tutorial_datasets/Example_2comm.txt\")\n",
102+
"}"
103+
]
104+
},
105+
{
106+
"cell_type": "markdown",
107+
"metadata": {},
108+
"source": [
109+
"On the server, the file can be loaded as follows:"
78110
]
79111
},
80112
{
@@ -85,8 +117,39 @@
85117
},
86118
"outputs": [],
87119
"source": [
88-
"#system(\"curl -O https://netzoo.s3.us-east-2.amazonaws.com/netZooR/tutorial_datasets/Example_2comm.txt\")\n",
89-
"simp.mat <- read.table(\"/opt/data/Example_2comm.txt\",header=T) "
120+
"simp.mat <- read.table(paste0(ppath,\"Example_2comm.txt\"),header=T) \n",
121+
"simp.mat"
122+
]
123+
},
124+
{
125+
"cell_type": "markdown",
126+
"metadata": {},
127+
"source": [
128+
"The input to ALPACA, `simp.mat` in this case, is a 4-column data frame that includes:\n",
129+
"- Source nodes in column 1 (TFs)\n",
130+
"- Target nodes in column 2 (Genes)\n",
131+
"- Edge weights in the control network (network 1) in column 3\n",
132+
"- Edge weights in the case network (network 2) in column 4\n",
133+
"\n",
134+
"Now, we can run ALPACA on these two networks."
135+
]
136+
},
137+
{
138+
"cell_type": "code",
139+
"execution_count": null,
140+
"metadata": {},
141+
"outputs": [],
142+
"source": [
143+
"simp.alp <- alpaca(simp.mat,NULL,verbose=F)\n",
144+
"simp.alp"
145+
]
146+
},
147+
{
148+
"cell_type": "markdown",
149+
"metadata": {},
150+
"source": [
151+
"The result list `simp.alp` contains 2 slots. The first is a community assignment for each node, and the second is a modularity score for each node, which indicates that node's contribution to the modularity of the community it belongs to.\n",
152+
"In the first slot, the nodes are not labeled; therefore, we need to label them as follows:"
90153
]
91154
},
92155
{
@@ -99,13 +162,27 @@
99162
},
100163
"outputs": [],
101164
"source": [
102-
"simp.alp <- alpaca(simp.mat,NULL,verbose=F)\n",
103165
"simp.alp2 <- simp.alp[[1]]\n",
104166
"simp.memb <- as.vector(simp.alp2)\n",
105167
"names(simp.memb) <- names(simp.alp2)\n",
106-
"\n",
107168
"simp.memb"
108169
]
170+
},
171+
{
172+
"cell_type": "markdown",
173+
"metadata": {},
174+
"source": [
175+
"`simp.memb` holds the labeled node memberships; for example, `A1` belongs to community 1 and `A19` belongs to community 2."
176+
]
177+
},
178+
{
179+
"cell_type": "markdown",
180+
"metadata": {},
181+
"source": [
182+
"# References\n",
183+
"\n",
184+
"1- Padi, Megha, and John Quackenbush. \"Detecting phenotype-driven transitions in regulatory network structure.\" NPJ systems biology and applications 4.1 (2018): 1-12."
185+
]
109186
}
110187
],
111188
"metadata": {
@@ -125,7 +202,7 @@
125202
"mimetype": "text/x-r-source",
126203
"name": "R",
127204
"pygments_lexer": "r",
128-
"version": "3.6.2"
205+
"version": "4.1.1"
129206
}
130207
},
131208
"nbformat": 4,

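The four-column input layout described in the ALPACA netbook (source node, target node, control edge weight, case edge weight) can be illustrated with a small Python sketch. This only shows how one table encodes two networks over the same edges; it is not how ALPACA processes its input. The rows and weights are made up for illustration, and the helper name `split_networks` is hypothetical.

```python
# Hypothetical rows in the 4-column layout:
# (source TF, target gene, control weight, case weight)
rows = [
    ("A1", "B1", 1.0, 0.9),
    ("A1", "B41", 0.8, 0.1),
    ("A11", "B41", 1.0, 1.0),
]


def split_networks(rows):
    """Split the combined table into two edge-weight maps keyed by (source, target)."""
    control = {(tf, gene): w1 for tf, gene, w1, _ in rows}
    case = {(tf, gene): w2 for tf, gene, _, w2 in rows}
    return control, case


control, case = split_networks(rows)
print(control[("A1", "B41")], case[("A1", "B41")])  # 0.8 0.1
```

In this made-up example, the edge between `A1` and `B41` is much weaker in the case network than in the control network, which is the kind of between-group difference the simulated data in the vignette is built around.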