Commit 1a258be

0.9.1 release
1 parent 575295b commit 1a258be

229 files changed: 10994 additions, 6262 deletions

README.md

Lines changed: 36 additions & 26 deletions
@@ -6,7 +6,7 @@
 - [News](#news)
 - [Example reports](#example-reports)
 - [PCGR Documentation](#documentation)
-- [Annotation resources](#annotation-resources-included-in-pcgr---0.9.0)
+- [Annotation resources](#annotation-resources-included-in-pcgr---0.9.1)
 - [Getting started](#getting-started)
 - [FAQ](#faq)
 - [Contact](#contact)
@@ -21,6 +21,9 @@ A few screenshots of the dashboard-type HTML output (new in 0.9.0) is shown belo
 ![PCGR overview](pcgr_dashboard_views.png)
 
 ### News
+* _November 30th 2020_: **0.9.1 release**
+* Data bundle updates (CIViC, ClinVar, CancerMine, UniProt KB)
+* [CHANGELOG](http://pcgr.readthedocs.io/en/latest/CHANGELOG.html)
 * _Sep 24th 2020_: **0.9.0rc release**
 * Major data bundle updates (CIViC, ClinVar, CancerMine, UniProt KB, Open Targets Platform, Pfam, DisGeNET, GENCODE)
 * VEP v101
@@ -49,10 +52,10 @@ A few screenshots of the dashboard-type HTML output (new in 0.9.0) is shown belo
 
 ### Example reports
 
-* [Cervical cancer sample (tumor-only)](http://insilico.hpc.uio.no/pcgr/example_reports/0.9.0rc/TCGA-FU-A3HZ-01A_TO.pcgr_acmg.grch37.flexdb.html)
-* [Lung cancer sample (tumor-control)](http://insilico.hpc.uio.no/pcgr/example_reports/0.9.0rc/TCGA-95-7039-01A.pcgr_acmg.grch37.flexdb.html)
-* [Breast cancer sample (tumor-control)](http://insilico.hpc.uio.no/pcgr/example_reports/0.9.0rc/TCGA-EW-A1J5-01A.pcgr_acmg.grch37.flexdb.html)
-* [Brain cancer sample (tumor-control)](http://insilico.hpc.uio.no/pcgr/example_reports/0.9.0rc/TCGA-14-0866-01B.pcgr_acmg.grch37.flexdb.html)
+* [Cervical cancer sample (tumor-only)](http://insilico.hpc.uio.no/pcgr/example_reports/0.9.1/TCGA-EA-A410-01A_TO.pcgr_acmg.grch37.flexdb.html)
+* [Lung cancer sample (tumor-control)](http://insilico.hpc.uio.no/pcgr/example_reports/0.9.1/TCGA-05-4427-01A.pcgr_acmg.grch37.flexdb.html)
+* [Colorectal cancer sample (tumor-control)](http://insilico.hpc.uio.no/pcgr/example_reports/0.9.1/TCGA-AD-5900-01A.pcgr_acmg.grch37.flexdb.html)
+* [Brain cancer sample (tumor-control)](http://insilico.hpc.uio.no/pcgr/example_reports/0.9.1/TCGA-QH-A6CU-01A.pcgr_acmg.grch37.flexdb.html)
 
 (to view the rmarkdown-based reports, simply remove _.flexdb._ in the file names for the flexdashboard reports)
 
@@ -68,23 +71,22 @@ A few screenshots of the dashboard-type HTML output (new in 0.9.0) is shown belo
 
 Sigve Nakken, Ghislain Fournous, Daniel Vodák, Lars Birger Aaasheim, Ola Myklebost, and Eivind Hovig. __Personal Cancer Genome Reporter: variant interpretation report for precision oncology__ (2017). _Bioinformatics_. 34(10):1778–1780. doi:[10.1093/bioinformatics/btx817](https://doi.org/10.1093/bioinformatics/btx817)
 
-### Annotation resources included in PCGR - 0.9.0
+### Annotation resources included in PCGR - 0.9.1
 
 * [VEP](http://www.ensembl.org/info/docs/tools/vep/index.html) - Variant Effect Predictor v101 (GENCODE v35/v19 as the gene reference dataset)
-* [CIViC](http://civic.genome.wustl.edu) - Clinical interpretations of variants in cancer (September 20th 2020)
-* [ClinVar](http://www.ncbi.nlm.nih.gov/clinvar/) - Database of variants with clinical significance (August 2020)
+* [CIViC](http://civic.genome.wustl.edu) - Clinical interpretations of variants in cancer (November 18th 2020)
+* [ClinVar](http://www.ncbi.nlm.nih.gov/clinvar/) - Database of variants with clinical significance (November 2020)
 * [DoCM](http://docm.genome.wustl.edu) - Database of curated mutations (v3.2, Apr 2016)
 * [CGI](http://www.cancergenomeinterpreter.org/biomarkers) - Cancer Biomarkers database (Jan 17th 2018)
-* [DisGeNET](http://www.disgenet.org) - Database of gene-tumor type associations (v7.0, May 2020)
 * [Cancer Hotspots](http://cancerhotspots.org) - Resource for statistically significant mutations in cancer (v2 - 2017)
 * [dBNSFP](https://sites.google.com/site/jpopgen/dbNSFP) - Database of non-synonymous functional predictions (v4.1, June 2020)
 * [TCGA](https://portal.gdc.cancer.gov/) - somatic mutations discovered across 33 tumor type cohorts (The Cancer Genome Atlas (TCGA), release 25, July 2020)
 * [CHASMplus](https://karchinlab.github.io/CHASMplus/) - predicted driver mutations across 33 tumor type cohorts in TCGA
-* [UniProt/SwissProt KnowledgeBase](http://www.uniprot.org) - Resource on protein sequence and functional information (2020_04, August 2020)
+* [UniProt/SwissProt KnowledgeBase](http://www.uniprot.org) - Resource on protein sequence and functional information (2020_05, October 2020)
 * [Pfam](http://pfam.xfam.org) - Database of protein families and domains (v33, May 2020)
-* [Open Targets Platform](https://targetvalidation.org) - Target-disease and target-drug associations (2020_04, June 2020)
+* [Open Targets Platform](https://targetvalidation.org) - Target-disease and target-drug associations (2020_09, September 2020)
 * [ChEMBL](https://www.ebi.ac.uk/chembl/) - Manually curated database of bioactive molecules (v27, May 2020)
-* [CancerMine](https://zenodo.org/record/3472758#.XZjCqeczaL4) - Literature-mined database of tumor suppressor genes/proto-oncogenes (v28, September 2020)
+* [CancerMine](https://zenodo.org/record/4270451#.X7t43qpKiHE) - Literature-mined database of tumor suppressor genes/proto-oncogenes (v30, November 2020)
 
 
 ### Getting started
9092
### Getting started
@@ -95,6 +97,10 @@ An installation of Python (version _3.6_) is required to run PCGR. Check that Py
 
 pip install toml
 
+
+**IMPORTANT NOTE**: STEP 1 & 2 below outline installation guidelines for running PCGR with Docker. If you want to install and run PCGR without the use of Docker (i.e. through Conda), follow [these instructions](install_no_docker/README.md)
+
+
 #### STEP 1: Installation of Docker
 
 1. [Install the Docker engine](https://docs.docker.com/engine/installation/) on your preferred platform
@@ -107,33 +113,35 @@ An installation of Python (version _3.6_) is required to run PCGR. Check that Py
 - CPUs: minimum 4
 - [How to - Mac OS X](https://docs.docker.com/docker-for-mac/#advanced)
 
+
 #### STEP 2: Download PCGR and data bundle
 
 ##### Development version
 
 a. Clone the PCGR GitHub repository (includes run script and default configuration file): `git clone https://github.com/sigven/pcgr.git`
 
 b. Download and unpack the latest data bundles in the PCGR directory
-* [grch37 data bundle - 20200920](http://insilico.hpc.uio.no/pcgr/pcgr.databundle.grch37.20200920.tgz) (approx 17Gb)
-* [grch38 data bundle - 20200920](http://insilico.hpc.uio.no/pcgr/pcgr.databundle.grch38.20200920.tgz) (approx 18Gb)
+* [grch37 data bundle - 20201123](http://insilico.hpc.uio.no/pcgr/pcgr.databundle.grch37.20201123.tgz) (approx 17Gb)
+* [grch38 data bundle - 20201123](http://insilico.hpc.uio.no/pcgr/pcgr.databundle.grch38.20201123.tgz) (approx 18Gb)
 * *Unpacking*: `gzip -dc pcgr.databundle.grch37.YYYYMMDD.tgz | tar xvf -`
 
-c. Pull the [PCGR Docker image (*dev*)](https://hub.docker.com/r/sigven/pcgr/) from DockerHub (approx 6.8Gb):
+c. Pull the [PCGR Docker image (*dev*)](https://hub.docker.com/r/sigven/pcgr/) from DockerHub (approx 5.1Gb):
 * `docker pull sigven/pcgr:dev` (PCGR annotation engine)
 
 ##### Latest release
 
-a. Download and unpack the [latest software release (0.9.0rc)](https://github.com/sigven/pcgr/releases/tag/v0.9.0rc)
+a. Download and unpack the [latest software release (0.9.1)](https://github.com/sigven/pcgr/releases/tag/v0.9.1)
 
 b. Download and unpack the assembly-specific data bundle in the PCGR directory
-* [grch37 data bundle - 20200920](http://insilico.hpc.uio.no/pcgr/pcgr.databundle.grch37.20200920.tgz) (approx 17Gb)
-* [grch38 data bundle - 20200920](http://insilico.hpc.uio.no/pcgr/pcgr.databundle.grch38.20200920.tgz) (approx 18Gb)
+* [grch37 data bundle - 20201123](http://insilico.hpc.uio.no/pcgr/pcgr.databundle.grch37.20201123.tgz) (approx 17Gb)
+* [grch38 data bundle - 20201123](http://insilico.hpc.uio.no/pcgr/pcgr.databundle.grch38.20201123.tgz) (approx 18Gb)
 * *Unpacking*: `gzip -dc pcgr.databundle.grch37.YYYYMMDD.tgz | tar xvf -`
 
 A _data/_ folder within the _pcgr-X.X_ software folder should now have been produced
 
-c. Pull the [PCGR Docker image (0.9.0rc)](https://hub.docker.com/r/sigven/pcgr/) from DockerHub (approx 6.8Gb):
-* `docker pull sigven/pcgr:0.9.0rc` (PCGR annotation engine)
+c. Pull the [PCGR Docker image (0.9.1)](https://hub.docker.com/r/sigven/pcgr/) from DockerHub (approx 5.1Gb):
+* `docker pull sigven/pcgr:0.9.1` (PCGR annotation engine)
+
 
 #### STEP 3: Input preprocessing
 
@@ -270,6 +278,8 @@ A tumor sample report is generated by calling the Python script __pcgr.py__, whi
 Predict microsatellite instability status from patterns of somatic mutations/indels, default: False
 --estimate_signatures
 Estimate relative contributions of reference mutational signatures in query sample and detect potential kataegis events), default: False
+--tmb_algorithm {all_coding,nonsyn}
+Method for calculation of TMB, all coding variants (Chalmers et al., Genome Medicine, 2017), or non-synonymous variants only, default: all_coding
 --min_mutations_signatures MIN_MUTATIONS_SIGNATURES
 Minimum number of SNVs required for reconstruction of mutational signatures (SBS) by MutationalPatterns (default: 200, minimum n = 100)
 --all_reference_signatures
@@ -285,15 +295,15 @@ A tumor sample report is generated by calling the Python script __pcgr.py__, whi
 
 The _examples_ folder contain input VCF files from two tumor samples sequenced within TCGA (**GRCh37** only). It also contains a PCGR configuration file customized for these VCFs. A report for a colorectal tumor case can be generated by running the following command in your terminal window:
 
-python ~/pcgr-0.9.0rc/pcgr.py
---pcgr_dir ~/pcgr-0.9.0rc
---output_dir ~/pcgr-0.9.0rc
+python ~/pcgr-0.9.1/pcgr.py
+--pcgr_dir ~/pcgr-0.9.1
+--output_dir ~/pcgr-0.9.1
 --sample_id tumor_sample.COAD
 --genome_assembly grch37
---conf ~/pcgr-0.9.0rc/examples/example_COAD.toml
---input_vcf ~/pcgr-0.9.0rc/examples/tumor_sample.COAD.vcf.gz
+--conf ~/pcgr-0.9.1/examples/example_COAD.toml
+--input_vcf ~/pcgr-0.9.1/examples/tumor_sample.COAD.vcf.gz
 --tumor_site 9
---input_cna ~/pcgr-0.9.0rc/examples/tumor_sample.COAD.cna.tsv
+--input_cna ~/pcgr-0.9.1/examples/tumor_sample.COAD.cna.tsv
 --tumor_purity 0.9
 --tumor_ploidy 2.0
 --include_trials
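
The README hunk above documents a new `--tmb_algorithm {all_coding,nonsyn}` option with default `all_coding`. As a rough illustration of how an option with that shape behaves, here is a minimal `argparse` sketch. It is not taken from `pcgr.py`; the function name `build_parser` is hypothetical, and only the option name, its two choices and its default come from the help text in the diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the relevant part of a pcgr.py-like parser."""
    parser = argparse.ArgumentParser(prog="pcgr.py")
    # Option name, choices and default mirror the help text shown in the diff;
    # everything else in this sketch is an assumption.
    parser.add_argument(
        "--tmb_algorithm",
        choices=["all_coding", "nonsyn"],
        default="all_coding",
        help="Method for calculation of TMB",
    )
    return parser


if __name__ == "__main__":
    # Omitting the flag falls back to the documented default.
    print(build_parser().parse_args([]).tmb_algorithm)
    # Passing the flag explicitly selects the non-synonymous-only mode.
    print(build_parser().parse_args(["--tmb_algorithm", "nonsyn"]).tmb_algorithm)
```

With a parser of this shape, any value other than the two listed choices is rejected at parse time, so downstream code only ever sees `all_coding` or `nonsyn`.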

docs/CHANGELOG.md

Lines changed: 16 additions & 1 deletion
@@ -1,6 +1,21 @@
 
 ## CHANGELOG
 
+#### 0.9.1 - November 30th 2020
+* Data updates: ClinVar, GWAS catalog, CIViC, CancerMine, dbNSFP, KEGG, ChEMBL/DGIdb, Disease Ontology, Experimental Factor Ontology
+
+##### Added
+* added possibility to configure algorithm for TMB calculation, optional argument `tmb_algorithm` - all coding variants (__all_coding__) or non-synonymous variants only (__nonsyn__)
+* R code subject to static analysis with [lintr](https://github.com/jimhester/lintr)
+* Improved Conda recipe (i.e. `meta.yaml`) with version pinning of all package dependencies
+
+##### Changed
+* Removed DisGeNET annotations from output (associations from Open Targets Platform serve same purpose)
+* Version pinning of software dependencies in Dockerfile:
+* All R packages necessary for PCGR is installed using the [renv framework](https://rstudio.github.io/renv/index.html), ensuring improved versioning and reproducibility
+* Other tools/utilities and Python libraries that have been version pinned:
+* bedtools, samtools, numpy, cython, scipy, cyvcf2, toml, pandas
+
 
 #### 0.9.0rc - September 24th 2020
 

@@ -10,7 +25,7 @@
 ##### Fixed
 * An extra comma was mistakenly present in the template for tier 2 variants, issue [#96](https://github.com/sigven/pcgr/issues/96)
 * Missing protein domain annotations for grch38, issue [#116](https://github.com/sigven/pcgr/issues/96)
-
+
 ##### Changed
 * All arguments to `pcgr.py` is now non-positional
 * Arguments to `pcgr.py` are divided into two groups: _required_ and _optional_
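
The 0.9.1 changelog entry in this file's diff describes two ways of counting variants for tumor mutational burden (TMB): all coding variants or non-synonymous variants only. The sketch below illustrates that distinction under stated assumptions. It is not PCGR's implementation: the function name `estimate_tmb`, the consequence label sets and the `target_size_mb` parameter are all hypothetical, and the labels are Sequence Ontology terms chosen for illustration. The only facts relied on are the two mode names from the changelog and the usual definition of TMB as mutations per megabase.

```python
from typing import Iterable

# Assumed consequence labels (Sequence Ontology terms), for illustration only;
# the categories PCGR actually counts may differ.
NONSYNONYMOUS = {
    "missense_variant",
    "stop_gained",
    "stop_lost",
    "start_lost",
    "frameshift_variant",
    "inframe_insertion",
    "inframe_deletion",
}
ALL_CODING = NONSYNONYMOUS | {"synonymous_variant"}


def estimate_tmb(
    consequences: Iterable[str],
    target_size_mb: float,
    algorithm: str = "all_coding",
) -> float:
    """Return mutations per megabase for the chosen counting mode."""
    if algorithm not in ("all_coding", "nonsyn"):
        raise ValueError(f"unknown tmb_algorithm: {algorithm!r}")
    if target_size_mb <= 0:
        raise ValueError("target_size_mb must be positive")
    counted = ALL_CODING if algorithm == "all_coding" else NONSYNONYMOUS
    # Count only variants whose consequence falls in the selected category.
    n_variants = sum(1 for c in consequences if c in counted)
    return n_variants / target_size_mb


if __name__ == "__main__":
    calls = ["missense_variant", "synonymous_variant", "intron_variant"]
    print(estimate_tmb(calls, target_size_mb=2.0, algorithm="all_coding"))
    print(estimate_tmb(calls, target_size_mb=2.0, algorithm="nonsyn"))
```

For the same variant set, the `nonsyn` mode can never give a higher estimate than `all_coding`, since it counts a subset of the same variants over the same target size.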

docs/CHANGELOG.rst

Lines changed: 47 additions & 11 deletions
@@ -1,6 +1,38 @@
 CHANGELOG
 ---------
 
+0.9.1 - November 30th 2020
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- Data updates: ClinVar, GWAS catalog, CIViC, CancerMine, dbNSFP, KEGG,
+ChEMBL/DGIdb, Disease Ontology, Experimental Factor Ontology
+
+Added
+'''''
+
+- added possibility to configure algorithm for TMB calculation,
+optional argument ``tmb_algorithm`` - all coding variants
+(**all_coding**) or non-synonymous variants only (**nonsyn**)
+- R code subject to static analysis with
+`lintr <https://github.com/jimhester/lintr>`__
+- Improved Conda recipe (i.e. ``meta.yaml``) with version pinning of
+all package dependencies
+
+Changed
+'''''''
+
+- Removed DisGeNET annotations from output (associations from Open
+Targets Platform serve same purpose)
+- Version pinning of software dependencies in Dockerfile:
+
+- All R packages necessary for PCGR is installed using the `renv
+framework <https://rstudio.github.io/renv/index.html>`__, ensuring
+improved versioning and reproducibility
+- Other tools/utilities and Python libraries that have been version
+pinned:
+
+- bedtools, samtools, numpy, cython, scipy, cyvcf2, toml, pandas
+
 0.9.0rc - September 24th 2020
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 

@@ -16,6 +48,8 @@ Fixed
 - Missing protein domain annotations for grch38, issue
 `#116 <https://github.com/sigven/pcgr/issues/96>`__
 
+.. _changed-1:
+
 Changed
 '''''''
 
@@ -56,6 +90,8 @@ Changed
 - Metadata - sample and sequencing assay
 - Report configuration
 
+.. _added-1:
+
 Added
 '''''
 
@@ -106,15 +142,15 @@ Fixed
 - More improved mapping between Ensembl transcripts and UniProt
 accessions (using also RefSeq accessions where available)
 
-.. _added-1:
+.. _added-2:
 
 Added
 '''''
 
 - Possibility to filter evidence items by RATING in interactive data
 tables
 
-.. _changed-1:
+.. _changed-2:
 
 Changed
 '''''''
@@ -151,7 +187,7 @@ Fixed
 - Bug in UpSetPlot for cases where filtering produce less than two
 intersecting sets
 
-.. _added-2:
+.. _added-3:
 
 Added
 '''''
@@ -169,7 +205,7 @@ Added
 0.8.1 - May 22nd 2019
 ^^^^^^^^^^^^^^^^^^^^^
 
-.. _added-3:
+.. _added-4:
 
 Added
 '''''
@@ -187,7 +223,7 @@ Fixed
 - Bug in value box for Tier 2 variants (new line carriage) `Issue
 #73 <https://github.com/sigven/pcgr/issues/73>`__
 
-.. _added-4:
+.. _added-5:
 
 Added
 '''''
@@ -310,7 +346,7 @@ Added
 - Rating of the ClinVar variant (0-4 stars) with respect to level of
 review
 
-.. _changed-2:
+.. _changed-3:
 
 Changed
 '''''''
@@ -364,7 +400,7 @@ Fixed
 - Removed ‘COSM’ prefix in COSMIC mutation links
 - Bug in retrieval of splice site predictions from dbscSNV
 
-.. _added-5:
+.. _added-6:
 
 Added
 '''''
@@ -403,7 +439,7 @@ Added
 
 - Upgraded VEP to v94
 
-.. _changed-3:
+.. _changed-4:
 
 Changed
 '''''''
@@ -449,7 +485,7 @@ Fixed
 - Bug in copy number annotation (missing protein-coding transcripts)
 - Updated MSI prediction (variable importance, performance measures)
 
-.. _added-6:
+.. _added-7:
 
 Added
 '''''
@@ -481,7 +517,7 @@ Fixed
 0.6.0 - April 25th 2018
 ^^^^^^^^^^^^^^^^^^^^^^^
 
-.. _added-7:
+.. _added-8:
 
 Added
 '''''
@@ -642,7 +678,7 @@ Removed
 https://github.com/mskcc/vcf2maf will be incorporated in the
 next release
 
-.. _changed-4:
+.. _changed-5:
 
 Changed
 '''''''
7.91 KB
Binary file not shown.

docs/_build/doctrees/about.doctree

-24 Bytes
Binary file not shown.
-2.01 KB
Binary file not shown.
1.34 KB
Binary file not shown.
1.88 KB
Binary file not shown.
393 Bytes
Binary file not shown.

docs/_build/html/.buildinfo

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: be8b6accd9ec6222abb85d34228511b5
+config: 5beb36e19b46cb37857c83ef6e4c61da
 tags: 645f666f9bcd5a90fca523b33c5a78b7
