Commit da67055

Swap steps 5 and 6 in README.md
1 parent 669f612 commit da67055

1 file changed, +29 -29 lines changed

README.md

+29-29
@@ -102,35 +102,7 @@ The <code>--bind</code> (Singularity) or <code>--volume</code> (Docker) paramete
 
 <li>
 <p>
-Inside the container, we will first set up our local instance of the <a href="https://imputationserver.readthedocs.io/en/latest/">Michigan Imputation Server</a>.
-</p>
-<p>
-The <code>setup-hadoop</code> command will start a <a href="https://hadoop.apache.org/">Hadoop</a> instance on your computer, which consists of four background processes. When you are finished processing all your samples, you can stop them with the <code>stop-hadoop</code> command. If you are using Docker, then these processes will be stopped automatically when you exit the container shell.
-</p>
-<p>
-The <code>setup-imputationserver</code> script will then verify that the Hadoop instance works, and then install the <a href="https://imputationserver.readthedocs.io/en/latest/reference-panels/#1000-genomes-phase-3-version-5">1000 Genomes Phase 3 v5</a> genome reference that will be used for imputation (around 15 GB of data, so it may take a while).
-</p>
-<p>
-If you are resuming analyses in an existing working directory, and do not still have the Hadoop background processes running, then you should re-run the setup commands. If they are still running, then you can skip this step.
-</p>
-
-```bash
-setup-hadoop --n-cores 8
-setup-imputationserver
-```
-
-<p>
-If you encounter any warnings or messages while running these commands, you should consult with an expert to find out what they mean and if they may be important. However, processing will usually complete without issues, even if some warnings occur.
-</p>
-<p>
-If something important goes wrong, then you will usually see a clear error message that contains the word "error". Please note that if the commands take more than an hour to run the setup, then that may also indicate that an error occurred.
-</p>
-
-</li>
-
-<li>
-<p>
-Next, go to <code>/data/mds</code>, and run the script <code>enigma-mds</code> for your <code>.bed</code> file set. The script creates the files <code>mdsplot.pdf</code> and <code>HM3_b37mds2R.mds.csv</code>, which are summary statistics that you will need to share with your working group as per the <a href="https://enigma.ini.usc.edu/wp-content/uploads/2020/02/ENIGMA-1KGP_p3v5-Cookbook_20170713.pdf">ENIGMA Imputation Protocol</a>.
+Inside the container, we will first go to <code>/data/mds</code>, and run the script <code>enigma-mds</code> for your <code>.bed</code> file set. The script creates the files <code>mdsplot.pdf</code> and <code>HM3_b37mds2R.mds.csv</code>, which are summary statistics that you will need to share with your working group as per the <a href="https://enigma.ini.usc.edu/wp-content/uploads/2020/02/ENIGMA-1KGP_p3v5-Cookbook_20170713.pdf">ENIGMA Imputation Protocol</a>.
 </p>
 <p>
 Note that this script will create all output files in the current folder, so you should use <code>cd</code> to change to the <code>/data/mds/sample</code> folder before running it.
@@ -167,6 +139,34 @@ enigma-mds --bfile /data/raw/sample_b
 
 </li>
 
+<li>
+<p>
+Next, we will set up our local instance of the <a href="https://imputationserver.readthedocs.io/en/latest/">Michigan Imputation Server</a>.
+</p>
+<p>
+The <code>setup-hadoop</code> command will start a <a href="https://hadoop.apache.org/">Hadoop</a> instance on your computer, which consists of four background processes. When you are finished processing all your samples, you can stop them with the <code>stop-hadoop</code> command. If you are using Docker, then these processes will be stopped automatically when you exit the container shell.
+</p>
+<p>
+The <code>setup-imputationserver</code> script will then verify that the Hadoop instance works, and then install the <a href="https://imputationserver.readthedocs.io/en/latest/reference-panels/#1000-genomes-phase-3-version-5">1000 Genomes Phase 3 v5</a> genome reference that will be used for imputation (around 15 GB of data, so it may take a while).
+</p>
+<p>
+If you are resuming analyses in an existing working directory, and do not still have the Hadoop background processes running, then you should re-run the setup commands. If they are still running, then you can skip this step.
+</p>
+
+```bash
+setup-hadoop --n-cores 8
+setup-imputationserver
+```
+
+<p>
+If you encounter any warnings or messages while running these commands, you should consult with an expert to find out what they mean and if they may be important. However, processing will usually complete without issues, even if some warnings occur.
+</p>
+<p>
+If something important goes wrong, then you will usually see a clear error message that contains the word "error". Please note that if the commands take more than an hour to run the setup, then that may also indicate that an error occurred.
+</p>
+
+</li>
+
 <li>
 <p>
 Next, go to <code>/data/qc</code>, and run <code>enigma-qc</code> for your <code>.bed</code> file sets. This will drop any strand ambiguous SNPs, then screen for low minor allele frequency, missingness and *Hardy-Weinberg equilibrium*, then remove duplicate SNPs (if necessary), and finally convert the data to sorted <code>.vcf.gz</code> format for imputation.
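
The moved setup step says that a serious failure usually shows up as a message containing the word "error". A minimal sketch of how that check could be scripted, assuming the output of the setup commands has been saved to a log file. The function name `log_has_error` and the file name `setup.log` are hypothetical and not part of the container, and matching the word case-insensitively is an assumption.

```bash
# Hypothetical helper, not shipped with the container.
# Prints every line of the given log file that contains "error"
# (case-insensitive, which is an assumption) together with its line number.
# Exit status is 0 if at least one such line exists and 1 if none does.
log_has_error() {
  local log_file="$1"
  grep -i -n "error" "$log_file"
}

# Possible usage inside the container (the log file name is arbitrary):
#   setup-hadoop --n-cores 8 2>&1 | tee setup.log
#   setup-imputationserver 2>&1 | tee -a setup.log
#   if log_has_error setup.log; then
#     echo "Review the lines listed above before continuing"
#   fi
```

Warnings that do not contain the word "error" are not flagged by this sketch, which matches the README's note that processing usually completes even if some warnings occur.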
