Commit 9307309

update of janno definition file
1 parent 1b403e9 commit 9307309

1 file changed, 37 additions and 36 deletions

janno_columns.tsv

Lines changed: 37 additions & 36 deletions
@@ -1,36 +1,37 @@
1-
janno_column_name description column_type choice_options mandatory range_lower range_upper
2-
Individual_ID String TRUE
3-
Collection_ID String TRUE
4-
Source_Tissue skeletal/tissue/source elements, multiple values separated by ; in case of merged libraries String list FALSE
5-
Country present-day political country String FALSE
6-
Location city or village nearby the site String FALSE
7-
Site site name String FALSE
8-
Latitude latitude with up to 5 places after the decimal point Float FALSE -90 90
9-
Longitude longitude with up to 5 places after the decimal point Float FALSE -180 180
10-
Date_C14_Labnr labnr of C14 date, multiple values in case of multiple dates String list FALSE
11-
Date_C14_Uncal_BP uncalibrated years BP (as in before 1950AD), as reported by C14 labs, multiple values separated by ; in the same order as Date_C14_Labnr in case of multiple dates Integer list FALSE 0 Inf
12-
Date_C14_Uncal_BP_Err standard deviation (1 sigma ±), as reported by C14 labs, multiple values separated by ; in the same order as Date_C14_Labnr in case of multiple dates Integer list FALSE 0 Inf
13-
Date_BC_AD_Median calibrated median age for C14 dates, or simple mid-points for archaeological intervals, 2000 for modern samples Integer FALSE -Inf 2050
14-
Date_BC_AD_Start lower (older) bound for the age, negative numbers for BC, positive numbers for AD, in case of C14 dates 95% interval post calibration, 2000 for modern samples Integer FALSE -Inf 2050
15-
Date_BC_AD_Stop upper (more recent) bound for the age, negative numbers for BC, positive numbers for AD, in case of C14 dates 95% interval post calibration, 2000 for modern samples Integer FALSE -Inf 2050
16-
Date_Type """C14"" if directly from the individual, ""contextual"" if based on archaeology or other C14 dates from the site, “modern” for present-day individuals" String choice C14;contextual;modern FALSE
17-
No_of_Libraries number of libraries Integer FALSE
18-
Data_Type String choice Shotgun;1240K;OtherCapture;ReferenceGenome FALSE
19-
Genotype_Ploidy String choice diploid;haploid FALSE
20-
Group_Name ideally Eisenmann rule + underscore flags, e.g. to annotate relatives or outliers or low coverage, multiple entries separated by ; to accommodate different labels String list FALSE
21-
Genetic_Sex """F"", ""M"" or ""U"" because eigenstrat and plink formats only support these three. Edge cases (XXY, XYY, X0) are undefined and should be grouped as F, M or U, with a note added" Char choice F;M;U FALSE
22-
Nr_autosomal_SNPs number of autosomal SNPs covered for 1240K capture or SG data pulldown Integer FALSE
23-
Coverage_1240K average X-fold coverage across 1240K SNP sites after quality filtering (internal data), NOT the % SNPs of 1.2M possible Float FALSE
24-
MT_Haplogroup mitochondrial haplogroup after phylotree.org as reported by Haplofind or Haplogrep String FALSE
25-
Y_Haplogroup Y-chromosome haplogroup reported as published, for internal data, please follow syntax with main branch + most terminal derived Y-SNP (e.g. R1b-P312) String FALSE
26-
Endogenous % endogenous DNA for the SG libraries, as estimated by EAGER for the best library (in percent), not on target and no quality filter Float FALSE 0 100
27-
UDG “mixed” in case multiple libraries with different UDG treatment were merged String choice minus;half;plus;mixed FALSE
28-
Library_Built “ds” for double stranded, “ss” for single stranded String choice ds;ss;other FALSE
29-
Damage % damage on 5' end for the main shotgun library used for sequencing and/or capture Float FALSE 0 100
30-
Xcontam if male for captured library Float FALSE 0 1
31-
Xcontam_stderr standard error of ANGSD X contamination estimate Float FALSE 0 Inf
32-
mtContam mitochondrial contamination rate as estimated by ContamMix and/or Schmutzi Float FALSE 0 1
33-
mtContam_stderr Standard error of ContamMix/Schmutzi estimate Float FALSE 0 Inf
34-
Primary_Contact Project lead or first author String FALSE
35-
Publication_Status """AuthorYearJournal"" or ""unpublished""" String FALSE
36-
Note wildcard comments. e.g. note down aneuploidies here String FALSE
1+
janno_column_name description column_type choice_options mandatory unique range_lower range_upper
2+
Individual_ID id as defined by the genetics laboratory, needs to be unique (e.g. I1234, BOT001), if multiple datasets exist for the same individual different IDs are required (e.g. loschbour_snpAD) String TRUE TRUE
3+
Collection_ID id as defined by the provider/owner of a sample (e.g. grave 40 skeleton 2) String FALSE FALSE
4+
Source_Tissue skeletal/tissue/source elements, multiple values separated by ; in case of merged libraries String list FALSE FALSE
5+
Country present-day political country String FALSE FALSE
6+
Location city or village nearby the site String FALSE FALSE
7+
Site site name String FALSE FALSE
8+
Latitude latitude with up to 5 places after the decimal point Float FALSE FALSE -90 90
9+
Longitude longitude with up to 5 places after the decimal point Float FALSE FALSE -180 180
10+
Date_C14_Labnr labnr of C14 date, multiple values in case of multiple dates String list FALSE FALSE
11+
Date_C14_Uncal_BP uncalibrated years BP (as in before 1950AD), as reported by C14 labs, multiple values separated by ; in the same order as Date_C14_Labnr in case of multiple dates Integer list FALSE FALSE 0 Inf
12+
Date_C14_Uncal_BP_Err standard deviation (1 sigma ±), as reported by C14 labs, multiple values separated by ; in the same order as Date_C14_Labnr in case of multiple dates Integer list FALSE FALSE 0 Inf
13+
Date_BC_AD_Median calibrated median age for C14 dates, or simple mid-points for archaeological intervals, 2000 for modern samples Integer FALSE FALSE -Inf 2050
14+
Date_BC_AD_Start lower (older) bound for the age, negative numbers for BC, positive numbers for AD, in case of C14 dates 95% interval post calibration, 2000 for modern samples Integer FALSE FALSE -Inf 2050
15+
Date_BC_AD_Stop upper (more recent) bound for the age, negative numbers for BC, positive numbers for AD, in case of C14 dates 95% interval post calibration, 2000 for modern samples Integer FALSE FALSE -Inf 2050
16+
Date_Type """C14"" if directly from the individual, ""contextual"" if based on archaeology or other C14 dates from the site, “modern” for present-day individuals" String choice C14;contextual;modern FALSE FALSE
17+
No_of_Libraries number of libraries Integer FALSE FALSE
18+
Data_Type specifics of data generation method String choice Shotgun;1240K;OtherCapture;ReferenceGenome FALSE FALSE
19+
Genotype_Ploidy ploidy of the genotypes String choice diploid;haploid FALSE FALSE
20+
Group_Name ideally Eisenmann rule + underscore flags, e.g. to annotate relatives or outliers or low coverage, multiple entries separated by ; to accommodate different labels, in case of multiple entries the first one must equal the group name in the .fam file String list TRUE FALSE
21+
Genetic_Sex """F"", ""M"" or ""U"" because eigenstrat and plink formats only support these three. Edge cases (XXY, XYY, X0) are undefined and should be grouped as F, M or U, with a note added" Char choice F;M;U TRUE FALSE
22+
Nr_autosomal_SNPs number of autosomal SNPs covered for 1240K capture or SG data pulldown Integer FALSE FALSE
23+
Coverage_1240K average X-fold coverage across 1240K SNP sites after quality filtering (internal data), NOT the % SNPs of 1.2M possible Float FALSE FALSE
24+
MT_Haplogroup mitochondrial haplogroup after phylotree.org as reported by Haplofind or Haplogrep String FALSE FALSE
25+
Y_Haplogroup Y-chromosome haplogroup reported as published, for internal data, please follow syntax with main branch + most terminal derived Y-SNP (e.g. R1b-P312) String FALSE FALSE
26+
Endogenous % endogenous DNA for the SG libraries, as estimated by EAGER for the best library (in percent), not on target and no quality filter Float FALSE FALSE 0 100
27+
UDG “mixed” in case multiple libraries with different UDG treatment were merged String choice minus;half;plus;mixed FALSE FALSE
28+
Library_Built “ds” for double stranded, “ss” for single stranded String choice ds;ss;other FALSE FALSE
29+
Damage % damage on 5' end for the main shotgun library used for sequencing and/or capture Float FALSE FALSE 0 100
30+
Xcontam if male for captured library Float FALSE FALSE 0 1
31+
Xcontam_stderr standard error of ANGSD X contamination estimate Float FALSE FALSE 0 Inf
32+
mtContam mitochondrial contamination rate as estimated by ContamMix and/or Schmutzi Float FALSE FALSE 0 1
33+
mtContam_stderr Standard error of ContamMix/Schmutzi estimate Float FALSE FALSE 0 Inf
34+
Primary_Contact Project lead or first author String FALSE FALSE
35+
Publication_Status "bibtex key (e.g. ""@AuthorYearJournal"") or ""unpublished""" String FALSE FALSE
36+
Note wildcard comments. e.g. note down aneuploidies here String FALSE FALSE
37+
Keywords Arbitrary tags separated by ; (e.g. for custom sorting purposes) String list FALSE FALSE
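
The `mandatory`, `unique`, `choice_options`, `range_lower` and `range_upper` columns of the new definition file lend themselves to mechanical checks on a janno table. The following is a minimal Python sketch of such checks, not the validation used by any existing tool. It assumes the definition file is tab-separated with the new header, that list-type values are separated by `;` as the descriptions state, and that empty cells or `n/a` mark missing values (the `n/a` marker is an assumption). The function names `load_definition` and `validate`, the shortened descriptions, and the example rows are hypothetical.

```python
import csv
import io

# Excerpt of the new definition file; descriptions are shortened placeholders.
DEFINITION_TSV = (
    "janno_column_name\tdescription\tcolumn_type\tchoice_options\t"
    "mandatory\tunique\trange_lower\trange_upper\n"
    "Individual_ID\tid\tString\t\tTRUE\tTRUE\t\t\n"
    "Group_Name\tgroup\tString list\t\tTRUE\tFALSE\t\t\n"
    "Genetic_Sex\tsex\tChar choice\tF;M;U\tTRUE\tFALSE\t\t\n"
    "Latitude\tlatitude\tFloat\t\tFALSE\tFALSE\t-90\t90\n"
)

# Hypothetical janno rows used only to exercise the checks.
EXAMPLE_ROWS = [
    {"Individual_ID": "I1234", "Group_Name": "GroupA", "Genetic_Sex": "F", "Latitude": "48.5"},
    {"Individual_ID": "I1234", "Group_Name": "GroupA", "Genetic_Sex": "X", "Latitude": "95"},
    {"Individual_ID": "BOT001", "Genetic_Sex": "M"},
]


def load_definition(text):
    """Parse the definition TSV into a dict keyed by janno column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["janno_column_name"]: row for row in reader}


def validate(rows, definition, missing="n/a"):
    """Return a list of human-readable violations of the definition."""
    errors = []
    # One set of already-seen values per column flagged as unique.
    seen = {name: set() for name, d in definition.items() if d["unique"] == "TRUE"}
    for i, row in enumerate(rows, start=1):
        for name, d in definition.items():
            value = row.get(name, missing)
            if value in ("", missing):
                if d["mandatory"] == "TRUE":
                    errors.append(f"row {i}: {name} is mandatory")
                continue
            if name in seen:
                if value in seen[name]:
                    errors.append(f"row {i}: {name} '{value}' is not unique")
                seen[name].add(value)
            choices = d["choice_options"].split(";") if d["choice_options"] else []
            if choices and value not in choices:
                errors.append(f"row {i}: {name} '{value}' not in {choices}")
            if d["range_lower"] and d["range_upper"]:
                # List-type columns hold several values separated by ';'.
                parts = value.split(";") if d["column_type"].endswith("list") else [value]
                low, high = float(d["range_lower"]), float(d["range_upper"])
                for part in parts:
                    try:
                        in_range = low <= float(part) <= high
                    except ValueError:
                        in_range = False
                    if not in_range:
                        errors.append(f"row {i}: {name} '{part}' outside [{low}, {high}]")
    return errors


if __name__ == "__main__":
    for message in validate(EXAMPLE_ROWS, load_definition(DEFINITION_TSV)):
        print(message)
```

Because `float("Inf")` and `float("-Inf")` parse in Python, the `Inf` and `-Inf` bounds used in the definition file need no special handling in this sketch.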
