
Commit 8bc41f1

v0.3.2 - bug fix and input check
1 parent 54c7b15 commit 8bc41f1

31 files changed, +1012 -788 lines changed

README.md

Lines changed: 8 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -15,7 +15,7 @@ The Personal Cancer Genome Reporter (PCGR) is a stand-alone software package int
1515
[![Documentation Status](https://readthedocs.org/projects/pcgr/badge/?version=latest)](http://pcgr.readthedocs.io/en/latest/?badge=latest)
1616

1717

18-
### Annotation resources included in PCGR (v0.3)
18+
### Annotation resources included in PCGR (v0.3.2)
1919

2020
* [VEP v85](http://www.ensembl.org/info/docs/tools/vep/index.html) - Variant Effect Predictor release 85 (GENCODE v19 as the gene reference dataset)
2121
* [COSMIC v80](http://cancer.sanger.ac.uk/cosmic/) - Catalogue of somatic mutations in cancer (February 2017)
@@ -53,16 +53,16 @@ A local installation of Python (it has been tested with [version 2.7.13](https:/
5353

5454
#### STEP 2: Download PCGR
5555

56-
<font color="red"><b>April 14th 2017</b>: New release (0.3.1)</font>
56+
<font color="red"><b>April 19th 2017</b>: New release (0.3.2)</font>
5757

58-
1. Download and unpack the [latest release (0.3.1)](https://github.com/sigven/pcgr/releases/latest)
58+
1. Download and unpack the [latest release (0.3.2)](https://github.com/sigven/pcgr/releases/latest)
5959
2. Download and unpack the data bundle (approx. 17Gb) in the PCGR directory
60-
* Download [the latest data bundle](https://drive.google.com/file/d/0B8aYD2TJ472mQjZOMmg4djZfT1k/) from Google Drive to `~/pcgr-X.X` (replace _X.X_ with the version number, e.g `~/pcgr-0.3.1`)
60+
* Download [the latest data bundle](https://drive.google.com/file/d/0B8aYD2TJ472mQjZOMmg4djZfT1k/) from Google Drive to `~/pcgr-X.X` (replace _X.X_ with the version number, e.g `~/pcgr-0.3.2`)
6161
* Unpack the data bundle, e.g. through the following Unix command: `gzip -dc pcgr.databundle.GRCh37.YYYYMMDD.tgz | tar xvf -`
6262

6363
A _data/_ folder within the _pcgr-X.X_ software folder should now have been produced
64-
3. Pull the [PCGR Docker image (0.3.1)](https://hub.docker.com/r/sigven/pcgr/) from DockerHub (3.1Gb):
65-
* `docker pull sigven/pcgr:0.3.1` (PCGR annotation engine)
64+
3. Pull the [PCGR Docker image (0.3.2)](https://hub.docker.com/r/sigven/pcgr/) from DockerHub (3.1Gb):
65+
* `docker pull sigven/pcgr:0.3.2` (PCGR annotation engine)
6666

6767
#### STEP 3: Input preprocessing
6868

@@ -112,7 +112,7 @@ A tumor sample report is generated by calling the Python script __pcgr.py__ in t
112112

113113
positional arguments:
114114
pcgr_dir PCGR base directory with accompanying data directory,
115-
e.g. ~/pcgr-0.3.1
115+
e.g. ~/pcgr-0.3.2
116116
output_dir Output directory
117117
sample_id Tumor sample/cancer genome identifier - prefix for
118118
output files
@@ -146,7 +146,7 @@ A tumor sample report is generated by calling the Python script __pcgr.py__ in t
146146

147147
The _examples_ folder contain sample files from TCGA. A report for a colorectal tumor case can be generated through the following command:
148148

149-
`python pcgr.py --input_vcf tumor_sample.COAD.vcf.gz --input_cna_segments tumor_sample.COAD.cna.tsv ~/pcgr-0.3.1 ~/pcgr-0.3.1/examples tumor_sample.COAD`
149+
`python pcgr.py --input_vcf tumor_sample.COAD.vcf.gz --input_cna_segments tumor_sample.COAD.cna.tsv ~/pcgr-0.3.2 ~/pcgr-0.3.2/examples tumor_sample.COAD`
150150

151151
This command will run the Docker-based PCGR workflow and produce the following output files in the _examples_ folder:
152152

-4 Bytes
Binary file not shown.
0 Bytes
Binary file not shown.
4 Bytes
Binary file not shown.
1.4 KB
Binary file not shown.

docs/_build/html/_sources/annotation_resources.rst.txt

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -79,19 +79,19 @@ A requirement for all variant annotation datasets used in PCGR is that
7979
they have been mapped unambiguously to the human genome (GRCh37). For
8080
most datasets this is already the case (i.e. dbSNP, COSMIC, ClinVar
8181
etc.). A significant proportion of variants in the annotation datasets
82-
related to clinical interpretation, CIViC and CBMDB, are however not
82+
related to clinical interpretation, CIViC and CBMDB, is however not
8383
mapped to the genome. Whenever possible, we have utilized
8484
`TransVar <http://bioinformatics.mdanderson.org/transvarweb/>`__ to
8585
identify the actual genomic variants (e.g. *g.chr7:140453136A>T*) that
86-
corresponds to variants reported with other HGVS nomenclature (e.g.
86+
correspond to variants reported with other HGVS nomenclature (e.g.
8787
*p.V600E*).
8888

8989
Other data quality concerns
9090
~~~~~~~~~~~~~~~~~~~~~~~~~~~
9191

9292
**Clinical biomarkers**
9393

94-
Clinical biomarkers included in PCGR is limited to the following:
94+
Clinical biomarkers included in PCGR are limited to the following:
9595

9696
- Markers reported at the variant level (e.g. **BRAF p.V600E**)
9797
- Markers reported at the codon level (e.g. **KRAS p.G12**)

docs/_build/html/_sources/getting_started.rst.txt

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -42,18 +42,18 @@ terminal window.
4242
Download PCGR
4343
^^^^^^^^^^^^^
4444

45-
**April 14th 2017**: New release (0.3.1)
45+
**April 19th 2017**: New release (0.3.2)
4646

4747
- Download and unpack the `latest release
48-
(0.3.1) <https://github.com/sigven/pcgr/releases/latest>`__
48+
(0.3.2) <https://github.com/sigven/pcgr/releases/latest>`__
4949

5050
- Download and unpack the data bundle (approx. 17Gb) in the PCGR
5151
directory
5252

5353
- Download `the latest data
5454
bundle <https://drive.google.com/file/d/0B8aYD2TJ472mQjZOMmg4djZfT1k/>`__
5555
from Google Drive to ``~/pcgr-X.X`` (replace *X.X* with the
56-
version number, e.g. ``~/pcgr-0.3.1``)
56+
version number, e.g. ``~/pcgr-0.3.2``)
5757
- Decompress and untar the bundle, e.g. through the following Unix
5858
command:
5959
``gzip -dc pcgr.databundle.GRCh37.YYYYMMDD.tgz | tar xvf -``
@@ -62,10 +62,10 @@ Download PCGR
6262
have been produced
6363

6464
- Pull the `PCGR Docker image -
65-
0.3.1 <https://hub.docker.com/r/sigven/pcgr/>`__ from DockerHub
65+
0.3.2 <https://hub.docker.com/r/sigven/pcgr/>`__ from DockerHub
6666
(3.1Gb) :
6767

68-
- ``docker pull sigven/pcgr:0.3.1`` (PCGR annotation engine)
68+
- ``docker pull sigven/pcgr:0.3.2`` (PCGR annotation engine)
6969

7070
Run test - generation of clinical report for a cancer genome
7171
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -89,7 +89,7 @@ A tumor sample report is generated by calling the Python script
8989

9090
positional arguments:
9191
pcgr_dir PCGR base directory with accompanying data directory,
92-
e.g. ~/pcgr-0.3
92+
e.g. ~/pcgr-0.3.2
9393
output_dir Output directory
9494
sample_id Tumor sample/cancer genome identifier - prefix for
9595
output files
@@ -125,7 +125,7 @@ sequenced within TCGA. A report for a colorectal tumor case can be
125125
generated by running the following command in your terminal window:
126126

127127
``python pcgr.py --input_vcf examples/tumor_sample.COAD.vcf.gz --input_cna_segments``
128-
``examples/tumor_sample.COAD.cna.tsv ~/pcgr-0.3.1 ~/pcgr-0.3.1/examples tumor_sample.COAD``
128+
``examples/tumor_sample.COAD.cna.tsv ~/pcgr-0.3.2 ~/pcgr-0.3.2/examples tumor_sample.COAD``
129129

130130
This command will run the Docker-based PCGR workflow and produce the
131131
following output files in the *examples* folder:

docs/_build/html/_sources/output.rst.txt

Lines changed: 13 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -36,17 +36,24 @@ work properly:
3636
`tabix <http://www.htslib.org/doc/tabix.html>`__
3737
- 'chr' must be stripped from the chromosome names
3838

39-
**IMPORTANT NOTE**: Considering the VCF output for the `numerous somatic
40-
SNV/InDel callers <https://www.biostars.org/p/19104/>`__ that have been
41-
developed, we have a experienced a general lack of uniformity and
42-
robustness for the representation of somatic variant genotype data (e.g.
43-
variant allelic depths (tumor/normal), genotype quality etc.). In the
44-
output results provided within the current version of PCGR, we are
39+
**IMPORTANT NOTE 1**: Considering the VCF output for the `numerous
40+
somatic SNV/InDel callers <https://www.biostars.org/p/19104/>`__ that
41+
have been developed, we have a experienced a general lack of uniformity
42+
and robustness for the representation of somatic variant genotype data
43+
(e.g. variant allelic depths (tumor/normal), genotype quality etc.). In
44+
the output results provided within the current version of PCGR, we are
4545
considering PASSed variants only, and variant genotype data (i.e. as
4646
found in the VCF SAMPLE columns) are not handled or parsed. As improved
4747
standards for this matter may emerge, we will strive to include this
4848
information in the annotated output files.
4949

50+
**IMPORTANT NOTE 2**: PCGR generates a number of VCF INFO annotation
51+
tags that is appended to the query VCF. We will therefore encourage the
52+
users to submit query VCF files that have not been subject to
53+
annotations by other means, but rather a VCF file that comes directly
54+
from variant calling. If not, there are likely to be INFO tags in the
55+
query VCF file that coincide with those produced by PCGR.
56+
5057
Copy number segments
5158
^^^^^^^^^^^^^^^^^^^^
5259

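Note on IMPORTANT NOTE 2 in `output.rst.txt`: the advice to submit VCF files that have not already been annotated can be checked before running PCGR. The following is a minimal sketch, not part of this commit, that reads the `##INFO` declarations from a VCF header and reports any IDs that also appear in a caller-supplied set of reserved tags. The function names and the tag names `EXAMPLE_TAG` and `OTHER_TAG` are hypothetical placeholders; the actual INFO tags written by PCGR are not listed in this diff and would have to be taken from the PCGR documentation.

```python
import re

# Matches the ID field of a VCF header line such as
# ##INFO=<ID=DP,Number=1,Type=Integer,Description="...">
_INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")


def info_ids(lines):
    """Return the set of INFO tag IDs declared in the VCF header."""
    ids = set()
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends at the first variant record
        match = _INFO_ID.match(line)
        if match:
            ids.add(match.group(1))
    return ids


def colliding_tags(lines, reserved):
    """Return, sorted, the header INFO IDs that are also in `reserved`."""
    return sorted(info_ids(lines) & set(reserved))


# Toy header; EXAMPLE_TAG and OTHER_TAG are placeholders, not real PCGR tags.
header = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">',
    '##INFO=<ID=EXAMPLE_TAG,Number=.,Type=String,Description="Placeholder">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tT\t.\tPASS\tDP=10",
]
print(colliding_tags(header, {"EXAMPLE_TAG", "OTHER_TAG"}))
```

For a real bgzipped query file, the same functions could be fed the lines of `gzip.open(path, "rt")`, since bgzip output is readable as ordinary gzip. An empty result suggests that the file declares none of the reserved tags.
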
docs/_build/html/annotation_resources.html

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -234,17 +234,17 @@ <h2>Genome mapping<a class="headerlink" href="#genome-mapping" title="Permalink
234234
they have been mapped unambiguously to the human genome (GRCh37). For
235235
most datasets this is already the case (i.e. dbSNP, COSMIC, ClinVar
236236
etc.). A significant proportion of variants in the annotation datasets
237-
related to clinical interpretation, CIViC and CBMDB, are however not
237+
related to clinical interpretation, CIViC and CBMDB, is however not
238238
mapped to the genome. Whenever possible, we have utilized
239239
<a class="reference external" href="http://bioinformatics.mdanderson.org/transvarweb/">TransVar</a> to
240240
identify the actual genomic variants (e.g. <em>g.chr7:140453136A&gt;T</em>) that
241-
corresponds to variants reported with other HGVS nomenclature (e.g.
241+
correspond to variants reported with other HGVS nomenclature (e.g.
242242
<em>p.V600E</em>).</p>
243243
</div>
244244
<div class="section" id="other-data-quality-concerns">
245245
<h2>Other data quality concerns<a class="headerlink" href="#other-data-quality-concerns" title="Permalink to this headline"></a></h2>
246246
<p><strong>Clinical biomarkers</strong></p>
247-
<p>Clinical biomarkers included in PCGR is limited to the following:</p>
247+
<p>Clinical biomarkers included in PCGR are limited to the following:</p>
248248
<ul class="simple">
249249
<li>Markers reported at the variant level (e.g. <strong>BRAF p.V600E</strong>)</li>
250250
<li>Markers reported at the codon level (e.g. <strong>KRAS p.G12</strong>)</li>

docs/_build/html/getting_started.html

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -189,18 +189,18 @@ <h3>Python<a class="headerlink" href="#python" title="Permalink to this headline
189189
</div>
190190
<div class="section" id="download-pcgr">
191191
<h3>Download PCGR<a class="headerlink" href="#download-pcgr" title="Permalink to this headline"></a></h3>
192-
<p><strong>April 14th 2017</strong>: New release (0.3.1)</p>
192+
<p><strong>April 19th 2017</strong>: New release (0.3.2)</p>
193193
<ul>
194194
<li><p class="first">Download and unpack the <a class="reference external" href="https://github.com/sigven/pcgr/releases/latest">latest release
195-
(0.3.1)</a></p>
195+
(0.3.2)</a></p>
196196
</li>
197197
<li><p class="first">Download and unpack the data bundle (approx. 17Gb) in the PCGR
198198
directory</p>
199199
<ul class="simple">
200200
<li>Download <a class="reference external" href="https://drive.google.com/file/d/0B8aYD2TJ472mQjZOMmg4djZfT1k/">the latest data
201201
bundle</a>
202202
from Google Drive to <code class="docutils literal"><span class="pre">~/pcgr-X.X</span></code> (replace <em>X.X</em> with the
203-
version number, e.g. <code class="docutils literal"><span class="pre">~/pcgr-0.3.1</span></code>)</li>
203+
version number, e.g. <code class="docutils literal"><span class="pre">~/pcgr-0.3.2</span></code>)</li>
204204
<li>Decompress and untar the bundle, e.g. through the following Unix
205205
command:
206206
<code class="docutils literal"><span class="pre">gzip</span> <span class="pre">-dc</span> <span class="pre">pcgr.databundle.GRCh37.YYYYMMDD.tgz</span> <span class="pre">|</span> <span class="pre">tar</span> <span class="pre">xvf</span> <span class="pre">-</span></code></li>
@@ -209,10 +209,10 @@ <h3>Download PCGR<a class="headerlink" href="#download-pcgr" title="Permalink to
209209
have been produced</p>
210210
</li>
211211
<li><p class="first">Pull the <a class="reference external" href="https://hub.docker.com/r/sigven/pcgr/">PCGR Docker image -
212-
0.3.1</a> from DockerHub
212+
0.3.2</a> from DockerHub
213213
(3.1Gb) :</p>
214214
<ul class="simple">
215-
<li><code class="docutils literal"><span class="pre">docker</span> <span class="pre">pull</span> <span class="pre">sigven/pcgr:0.3.1</span></code> (PCGR annotation engine)</li>
215+
<li><code class="docutils literal"><span class="pre">docker</span> <span class="pre">pull</span> <span class="pre">sigven/pcgr:0.3.2</span></code> (PCGR annotation engine)</li>
216216
</ul>
217217
</li>
218218
</ul>
@@ -236,7 +236,7 @@ <h2>Run test - generation of clinical report for a cancer genome<a class="header
236236

237237
<span class="n">positional</span> <span class="n">arguments</span><span class="p">:</span>
238238
<span class="n">pcgr_dir</span> <span class="n">PCGR</span> <span class="n">base</span> <span class="n">directory</span> <span class="k">with</span> <span class="n">accompanying</span> <span class="n">data</span> <span class="n">directory</span><span class="p">,</span>
239-
<span class="n">e</span><span class="o">.</span><span class="n">g</span><span class="o">.</span> <span class="o">~/</span><span class="n">pcgr</span><span class="o">-</span><span class="mf">0.3</span>
239+
<span class="n">e</span><span class="o">.</span><span class="n">g</span><span class="o">.</span> <span class="o">~/</span><span class="n">pcgr</span><span class="o">-</span><span class="mf">0.3</span><span class="o">.</span><span class="mi">2</span>
240240
<span class="n">output_dir</span> <span class="n">Output</span> <span class="n">directory</span>
241241
<span class="n">sample_id</span> <span class="n">Tumor</span> <span class="n">sample</span><span class="o">/</span><span class="n">cancer</span> <span class="n">genome</span> <span class="n">identifier</span> <span class="o">-</span> <span class="n">prefix</span> <span class="k">for</span>
242242
<span class="n">output</span> <span class="n">files</span>
@@ -272,7 +272,7 @@ <h2>Run test - generation of clinical report for a cancer genome<a class="header
272272
sequenced within TCGA. A report for a colorectal tumor case can be
273273
generated by running the following command in your terminal window:</p>
274274
<p><code class="docutils literal"><span class="pre">python</span> <span class="pre">pcgr.py</span> <span class="pre">--input_vcf</span> <span class="pre">examples/tumor_sample.COAD.vcf.gz</span> <span class="pre">--input_cna_segments</span></code>
275-
<code class="docutils literal"><span class="pre">examples/tumor_sample.COAD.cna.tsv</span> <span class="pre">~/pcgr-0.3.1</span> <span class="pre">~/pcgr-0.3.1/examples</span> <span class="pre">tumor_sample.COAD</span></code></p>
275+
<code class="docutils literal"><span class="pre">examples/tumor_sample.COAD.cna.tsv</span> <span class="pre">~/pcgr-0.3.2</span> <span class="pre">~/pcgr-0.3.2/examples</span> <span class="pre">tumor_sample.COAD</span></code></p>
276276
<p>This command will run the Docker-based PCGR workflow and produce the
277277
following output files in the <em>examples</em> folder:</p>
278278
<ol class="arabic simple">
