This `.tree` file _can_ be opened with dendroscope (if you downloaded it locally) or [iTOL](https://itol.embl.de/) or another tree-visualization software **HOWEVER** the full terL tree we will produce will have 5000+ leaves, which will be difficult to load into tree-viewers and might crash your computer. Instead, we will post-process the tree using the `ete3` library in Python. We've built two python scripts using this library to shrink the size of tree outputs. The locations of the tree-pruning scripts will be in your python_scripts directory `python_scripts/1.5_collapse_non_target_clades.py` and `python_scripts/1.5_trim_tree_to_500_neighbors.py`. Below are explanations of these scripts.
76
+
This `.tree` file _can_ be opened with Dendroscope (if you downloaded it locally), [iTOL](https://itol.embl.de/), or other tree-visualization software. **However**, the full terL tree we will produce will have 5000+ leaves, which will be difficult to load into tree viewers and might crash your computer when you try to find your viruses.
77
77
78
-
### Post-process large tree --> smaller tree
78
+
Instead, you could post-process the tree using the `ete3` library in Python.
79
79
80
-
#### Prune the tree
81
-
82
-
`# 1.5_trim_tree_to_500_neighbors.py`
83
-
84
-
**User inputs:**
85
-
1. Newick `.tree` file
86
-
2. A user-selected viral contig of interest
87
-
88
-
**What the script does:**
89
-
Prunes the terL tree around your selected contig by selecting 500 of the closest neighbours (based on branch lengths)
80
+
>## Exercise - Post-process large tree --> smaller tree
81
+
>
82
+
> There are a few ways to post-process trees to make them viewable or to highlight our viruses of interest.
83
+
> One way would be to build a script that takes a single contig and the tree file as input and prunes the tree around the contig to a certain number of the closest reference viruses. Pruning a tree means that only certain clades or leaves are kept.
84
+
> Instead of pruning, you can also "collapse" clades to make the tree manageable to view. Each collapsed clade is replaced by a single leaf. In very large trees, you will probably encounter large clades that are far away from your sequences of interest and that can be collapsed and replaced. For this, you might want to give all your viruses of interest as input.
85
+
> The `ete3` library is a really useful resource to process trees.
86
+
>{: .source}
87
+
{: .challenge}
90
88
91
-
**Script outputs:**
92
-
1. A pruned tree with only 500 leaves
93
-
2. An itol annotation file
89
+
We've built two Python scripts using this library to shrink the size of tree outputs. Both are in your `python_scripts` directory: `python_scripts/1.5_collapse_non_target_clades.py` and `python_scripts/1.5_trim_tree_to_500_neighbors.py`.
90
+
`1.5_trim_tree_to_500_neighbors.py` will trim your tree around a contig of your choice. `1.5_collapse_non_target_clades.py` will collapse clades that don't contain viruses from our dataset. These scripts also include a section to generate iTOL annotation files.
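
To give a feel for what trimming around a contig can look like, here is a minimal sketch using `ete3`. It is not the code of `1.5_trim_tree_to_500_neighbors.py`. The function names, the default output file name, and the choice to keep the contig itself plus `n_neighbors` neighbours are all assumptions made for illustration.

```python
import heapq


def closest_names(distances, n_keep):
    """Return the n_keep names with the smallest distance (ties broken by name)."""
    ranked = heapq.nsmallest(
        n_keep, distances.items(), key=lambda item: (item[1], item[0])
    )
    return [name for name, _ in ranked]


def prune_around_contig(tree_path, contig, n_neighbors=500, out_path="pruned.tree"):
    """Keep `contig` and its n_neighbors closest leaves (hypothetical helper)."""
    # Imported here so closest_names() can be used without ete3 installed.
    from ete3 import Tree

    # format=1 reads internal node labels as names (assumption about the input tree).
    tree = Tree(tree_path, format=1)
    target = tree.get_leaves_by_name(contig)[0]

    # Branch-length distance from the contig to every other leaf.
    distances = {
        leaf.name: target.get_distance(leaf)
        for leaf in tree.iter_leaves()
        if leaf is not target
    }

    keep = [contig] + closest_names(distances, n_neighbors)
    # preserve_branch_length keeps distances between the remaining leaves intact.
    tree.prune(keep, preserve_branch_length=True)
    tree.write(format=1, outfile=out_path)
    return keep
```

A call such as `prune_around_contig("terL.tree", "contig_0001_CDS_0001")` would write the smaller tree to `pruned.tree`; both file names here are placeholders.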
2. A file `user_leaves.txt` containing a list of contig ids of interest like below
151
-
152
-
```
153
-
# user_leaves.txt
154
-
contig_0001_CDS_0001
155
-
contig_0002_CDS_0005
156
-
contig_0003_CDS_0023
157
-
```
158
-
159
-
**What the script does:**
160
-
Collapses clades in the tree that are over 100 leaves and do not contain your contigs. *Note: these clades are no longer available to view in the collapsed tree.*
161
-
162
-
**Script outputs:**
163
-
1. A tree with collapsed clades
164
-
2. An itol annotation file for the user-selected contigs
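
The collapsing step described above could be sketched with `ete3` roughly as follows. This is an illustration only, not the code of `1.5_collapse_non_target_clades.py`: the function names, the `collapsed_<n>_leaves` label, the default output file name, and the handling of `#` lines in `user_leaves.txt` are assumptions.

```python
def should_collapse(leaf_names, targets, min_leaves=100):
    """True if the clade has more than min_leaves leaves and none is a target."""
    return len(leaf_names) > min_leaves and targets.isdisjoint(leaf_names)


def read_targets(path):
    """Read contig ids, one per line, skipping blank lines and '#' lines (assumption)."""
    with open(path) as handle:
        return {
            line.strip()
            for line in handle
            if line.strip() and not line.startswith("#")
        }


def collapse_non_target_clades(tree_path, targets, min_leaves=100,
                               out_path="collapsed.tree"):
    """Replace large clades without target contigs by a single placeholder leaf."""
    # Imported here so the helpers above can be used without ete3 installed.
    from ete3 import Tree

    # format=1 reads internal node labels as names (assumption about the input tree).
    tree = Tree(tree_path, format=1)

    # Walk from the root down; once a clade is collapsed we do not descend into it.
    stack = [tree]
    while stack:
        node = stack.pop()
        if node.is_leaf():
            continue
        names = node.get_leaf_names()
        if should_collapse(names, targets, min_leaves):
            for child in list(node.children):
                child.detach()  # the collapsed leaves are no longer in the tree
            node.name = f"collapsed_{len(names)}_leaves"
        else:
            stack.extend(node.children)

    tree.write(format=1, outfile=out_path)
    return tree
```

Walking from the root means the largest clade without any contig of interest is collapsed first, so nested clades inside it never need to be checked.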
> Use the sbatch script below to make your tree and post-process it. You will have to change the contig name in the sbatch script. **Note:** See exercise question #10 below - you might want to plot this 'pet contig' here, or the one from yesterday!
242
210
> **Please include a graphic of the tree in your lab books with your own contigs labeled or highlighted somehow.**
211
+
>
212
+
> Please pause once you have made the trees! We will visualize the tree together with a demo in iTOL.
Please stop here! We will visualize the tree together with a demo in itol.
294
-
295
264
### vConTACT3
296
265
297
266
vConTACT3 has an underlying assumption that the fraction of shared genes between two viruses represents their evolutionary relationship. The vConTACT3 gene-sharing network closely correlates with the ICTV taxonomy.