|
99 | 99 | <li class="toctree-l2"><a class="reference internal" href="#cancer-gene-knowledge-bases">Cancer gene knowledge bases</a></li> |
100 | 100 | </ul> |
101 | 101 | </li> |
| 102 | +<li class="toctree-l1"><a class="reference internal" href="#notes-on-variant-annotation-datasets">Notes on variant annotation datasets</a><ul> |
| 103 | +<li class="toctree-l2"><a class="reference internal" href="#genome-mapping">Genome mapping</a></li> |
| 104 | +<li class="toctree-l2"><a class="reference internal" href="#other-data-quality-concerns">Other data quality concerns</a></li> |
| 105 | +</ul> |
| 106 | +</li> |
102 | 107 | <li class="toctree-l1"><a class="reference internal" href="output.html">Input & output</a></li> |
103 | 108 | </ul> |
104 | 109 |
|
@@ -223,6 +228,40 @@ <h2>Cancer gene knowledge bases<a class="headerlink" href="#cancer-gene-knowledg |
223 | 228 | (February 2017)</li> |
224 | 229 | </ul> |
225 | 230 | </div> |
| 231 | +</div> |
| 232 | +<div class="section" id="notes-on-variant-annotation-datasets"> |
| 233 | +<h1>Notes on variant annotation datasets<a class="headerlink" href="#notes-on-variant-annotation-datasets" title="Permalink to this headline">¶</a></h1> |
| 234 | +<div class="section" id="genome-mapping"> |
| 235 | +<h2>Genome mapping<a class="headerlink" href="#genome-mapping" title="Permalink to this headline">¶</a></h2> |
| 236 | +<p>A requirement for all variant annotation datasets used in PCGR is that |
| 237 | +they have been mapped unambiguously to the human genome (GRCh37). For |
| 238 | +most datasets this is already the case (e.g. dbSNP, COSMIC and |
| 239 | +ClinVar). However, a significant proportion of variants in the annotation |
| 240 | +datasets related to clinical interpretation, CIViC and CBMDB, are not |
| 241 | +mapped to the genome. Whenever possible, we have used |
| 242 | +<a class="reference external" href="http://bioinformatics.mdanderson.org/transvarweb/">TransVar</a> to |
| 243 | +identify the actual genomic variants (e.g. <em>g.chr7:140453136A>T</em>) that |
| 244 | +correspond to variants reported with other HGVS nomenclature (e.g. |
| 245 | +<em>p.V600E</em>).</p> |
| 246 | +</div> |
| 247 | +<div class="section" id="other-data-quality-concerns"> |
| 248 | +<h2>Other data quality concerns<a class="headerlink" href="#other-data-quality-concerns" title="Permalink to this headline">¶</a></h2> |
| 249 | +<p><strong>Clinical biomarkers</strong> Clinical biomarkers included in PCGR are limited to the following:</p> |
| 250 | +<ul> |
| 251 | +<li>Markers reported at the variant level (e.g. <strong>BRAF p.V600E</strong>)</li> |
| 252 | +<li>Markers reported at the codon level (e.g. <strong>KRAS p.G12</strong>)</li> |
| 253 | +<li>Markers reported at the exon level (e.g. <strong>KIT exon 11 mutation</strong>)</li> |
| 254 | +<li>Within CBMDB, only markers collected from FDA/NCCN guidelines, scientific literature and clinical trials are included (markers collected from conference abstracts are not included)</li> |
| 255 | +</ul> |
| 256 | +<p><strong>COSMIC variants</strong> The COSMIC dataset that is part of the PCGR |
| 257 | +annotation bundle is the subset of variants that satisfy the following criteria:</p> |
| 258 | +<ul> |
| 259 | +<li><strong>Mutation somatic status</strong> is either ‘<em>confirmed_somatic</em>’ or |
| 260 | +‘<em>reported_in_another_cancer_sample_as_somatic</em>’.</li> |
| 261 | +<li><strong>Site/histology</strong> must be known and the sample must come from a |
| 262 | +malignant tumor (i.e. not polyps/adenomas, which are also found in COSMIC).</li> |
| 263 | +</ul> |
| 264 | +</div> |
226 | 265 | </div> |
227 | 266 |
|
228 | 267 |
|
|
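The COSMIC subset criteria described in the added section can be illustrated with a small filter. This is a minimal sketch, not the code PCGR uses: the record layout, the field names (`somatic_status`, `site`, `histology`), the labels treated as "unknown", and the non-malignant histology labels are all assumptions made for illustration. Only the two accepted somatic status values come from the text above.

```python
# Accepted values for "Mutation somatic status", as listed in the documentation.
ACCEPTED_SOMATIC_STATUS = {
    "confirmed_somatic",
    "reported_in_another_cancer_sample_as_somatic",
}

# Assumption: labels that mean "site or histology not known".
UNKNOWN_LABELS = {"", "ns", "unknown"}

# Assumption: histology labels for non-malignant samples (polyps/adenomas).
NON_MALIGNANT_HISTOLOGY = {"polyp", "adenoma"}


def keep_cosmic_variant(record):
    """Return True if a (hypothetical) COSMIC record passes both criteria."""
    status = record.get("somatic_status", "").strip().lower()
    site = record.get("site", "").strip().lower()
    histology = record.get("histology", "").strip().lower()

    # Criterion 1: somatic status must be one of the two accepted values.
    if status not in ACCEPTED_SOMATIC_STATUS:
        return False
    # Criterion 2a: site and histology must both be known.
    if site in UNKNOWN_LABELS or histology in UNKNOWN_LABELS:
        return False
    # Criterion 2b: the sample must come from a malignant tumor.
    return histology not in NON_MALIGNANT_HISTOLOGY


# Hypothetical example records (not real COSMIC entries).
records = [
    {"id": "v1", "somatic_status": "confirmed_somatic",
     "site": "skin", "histology": "malignant_melanoma"},
    {"id": "v2", "somatic_status": "variant_of_unknown_origin",
     "site": "skin", "histology": "malignant_melanoma"},
    {"id": "v3", "somatic_status": "reported_in_another_cancer_sample_as_somatic",
     "site": "large_intestine", "histology": "adenoma"},
    {"id": "v4", "somatic_status": "confirmed_somatic",
     "site": "", "histology": "carcinoma"},
]

kept_ids = [r["id"] for r in records if keep_cosmic_variant(r)]
print(kept_ids)
```

Keeping the accepted and excluded labels in named sets makes the criteria easy to audit against the documentation when a new COSMIC release changes its vocabulary.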