Commit 0cd0892

Merge pull request #326 from naupaka/main
Add links to descriptions of VCF; resolves #281
2 parents 91742d6 + ecb7653 commit 0cd0892

1 file changed, +7 -0 lines changed


episodes/03-basics-factors-dataframes.Rmd

Lines changed: 7 additions & 0 deletions
@@ -187,6 +187,13 @@ a view of the data in a new tab.
 
 ![RStudio data frame view](fig/rstudio_dataframeview.png)
 
+The majority of the columns in the data frame correspond to standard fields found in a
+*Variant Call Format (VCF)* file, while others were added during our data processing. The VCF
+format is a standard format for storing variant calls (also known as Single Nucleotide Polymorphisms or SNPs),
+and you can read more about it, including a description of the fields we have here
+in [the VCF specification](https://samtools.github.io/hts-specs/VCFv4.2.pdf)
+or [on wikipedia](https://en.wikipedia.org/wiki/Variant_Call_Format).
+
 We can also quickly query the dimensions of the variable using `dim()`. You'll see that the first number `801` shows the number of rows, then `29` the number of columns
 
 ```{r, purl=FALSE}
