Our team has been using PHGv2 to build databases, and we're currently working with whole-genome sequencing (WGS) data to impute, resequence, and call rare alleles. At this point, we have the imputed FASTA and two main files:
A filtered VCF file that calls variants against the imputed FASTA:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT default
chr1H 675 . T C 21.8 PASS . GT:GQ:DP:AD:VAF:PL 1/1:17:9:0,8:0.888889:21,18,0
chr1H 687 . C G 23 PASS . GT:GQ:DP:AD:VAF:PL 1/1:18:8:0,8:1:22,19,0
chr1H 757 . C A 25.3 PASS . GT:GQ:DP:AD:VAF:PL 1/1:17:7:0,7:1:25,17,0
...
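For anyone who wants to inspect records like these programmatically, here is a minimal Python sketch. It assumes the single-sample, ten-column layout shown above; `parse_vcf_record` and `alt_fraction` are hypothetical helper names, not part of PHGv2 or any VCF library.

```python
def parse_vcf_record(line):
    """Split one VCF data line (single sample) into a dict.

    Uses whitespace splitting so it works on the excerpt above as well as on
    tab-delimited files, provided no field contains spaces.
    """
    fields = line.split()
    chrom, pos, _id, ref, alt, qual, flt, _info, fmt, sample = fields[:10]
    # Pair each FORMAT key (GT, GQ, DP, ...) with the sample's value.
    sample_values = dict(zip(fmt.split(":"), sample.split(":")))
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "qual": float(qual),
        "filter": flt,
        "sample": sample_values,
    }


def alt_fraction(record):
    """Fraction of AD-counted reads that support the ALT allele."""
    ref_depth, alt_depth = (int(x) for x in record["sample"]["AD"].split(","))
    total = ref_depth + alt_depth
    return alt_depth / total if total else 0.0


rec = parse_vcf_record(
    "chr1H 687 . C G 23 PASS . GT:GQ:DP:AD:VAF:PL 1/1:18:8:0,8:1:22,19,0"
)
print(rec["pos"], rec["sample"]["GT"], alt_fraction(rec))
```

For real files, an established parser such as pysam or cyvcf2 is probably a safer choice than hand-splitting lines.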
And a VCF file, generated after running composite-to-haplotype-coords, that's based on pangenome ranges:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT imputed_sample_name
8bf782a4db9b969d6dba6d5857992c11 675 . T C 21.80 PASS . GT:GQ:DP:AD:VAF:PL 1/1:17:9:0,8:0.888889:21,18,0
...
ab2e1a850d5df054d3ce046a645cf487 3295 . G A 54 PASS . GT:GQ:DP:AD:VAF:PL 1/1:50:34:0,34:1:53,52,0
...
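As a rough illustration of how the second file could be summarized, the sketch below counts variant records per value in the first column, which in the excerpt above appears to be a hash-like haplotype ID. That interpretation, and the helper name `count_variants_per_haplotype`, are assumptions made for this example only.

```python
from collections import Counter


def count_variants_per_haplotype(lines):
    """Count VCF data records per first-column value (assumed haplotype ID)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines, header lines, and the "..." elision markers.
        if not line or line.startswith("#") or line == "...":
            continue
        counts[line.split()[0]] += 1
    return counts


example_lines = [
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT imputed_sample_name",
    "8bf782a4db9b969d6dba6d5857992c11 675 . T C 21.80 PASS . "
    "GT:GQ:DP:AD:VAF:PL 1/1:17:9:0,8:0.888889:21,18,0",
    "...",
    "ab2e1a850d5df054d3ce046a645cf487 3295 . G A 54 PASS . "
    "GT:GQ:DP:AD:VAF:PL 1/1:50:34:0,34:1:53,52,0",
]
counts = count_variants_per_haplotype(example_lines)
for hap_id, n in counts.items():
    print(hap_id, n)
```

A per-ID tally like this could help show which pangenome ranges carry the most new variation in the WGS sample.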
From this point:
(Sorry if I am misunderstanding concepts)
The imputed genome is the most probable reconstruction from the WGS data using the pangenome "pieces" (ranges).
We are trying to determine if it is possible to use this data to 1) improve the imputed FASTA and 2) enrich our PHG database with this new genetic diversity.
We believe that the imputation and resequencing utilities have significant potential, but we are a bit lost on how to proceed. It would be great if we could somehow enrich the database of high-quality assemblies with landrace WGS data.
Please let us know if you need more information or if anything is unclear. We really appreciate your time and help.