Hi,
I am trying to replicate the Giraffe SV paper results for the GIAB 0.6 sample by genotyping SVs from Illumina paired-end reads. Instead of using the toil-vg scripts, I am following the steps in the Snakefile for the Giraffe paper (https://github.com/vgteam/vg_snakemake/blob/master/Snakefile).
Input reference: hs37d5.fa.gz and its corresponding index
Input VCF: HG002_SVs_Tier1_v0.6.vcf.gz and its index file
After running the "vg construct" step without pre-processing the input VCF, I get the following error in the log files:
"stderr": "Restricting to 1 from 1 to end\n building graph for 1 [ ] 0.0%\rwarning:[vg::Constructor] Lowercase characters found in 1; coercing to uppercase.\nwarning:[vg::Constructor] Unsupported IUPAC ambiguity codes found in 1; coercing to N.\nerror:[vg::Constructor] unacceptable characters found in 1.\nerror[VPKG::load_one]: Correct input type not found in standard input while loading handlegraph::MutablePathMutableHandleGraph\n",
->> I figured out the initial warnings, but the error at the end recurs even if I re-index the input files.
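One way to narrow down where the "unacceptable characters" come from is to scan the REF and ALT columns of the input VCF for anything outside A, C, G, T and N. The sketch below is only a diagnostic under that assumption: the log does not say whether the offending characters are in the VCF or in the FASTA, and the helper names (`open_vcf`, `find_bad_alleles`) are made up for illustration. Symbolic alleles such as `<DEL>`, missing alleles (`.`) and `*` are skipped rather than judged.

```python
import gzip
import sys

# Bases assumed to be acceptable in explicit REF/ALT sequences (an assumption,
# not taken from the vg source code).
VALID_BASES = set("ACGTN")


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def find_bad_alleles(lines):
    """Return (chrom, pos, column, allele) for alleles with non-ACGTN characters."""
    bad = []
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        alleles = [("REF", ref)] + [("ALT", a) for a in alt.split(",")]
        for column, allele in alleles:
            # Symbolic, missing and overlapping-deletion alleles are not sequences.
            if allele in (".", "*") or allele.startswith("<"):
                continue
            # Compare case-insensitively, since lowercase is only a warning in the log.
            if not set(allele.upper()) <= VALID_BASES:
                bad.append((chrom, int(pos), column, allele))
    return bad


if __name__ == "__main__" and len(sys.argv) > 1:
    with open_vcf(sys.argv[1]) as handle:
        for chrom, pos, column, allele in find_bad_alleles(handle):
            # Truncate long SV alleles so the report stays readable.
            print(chrom, pos, column, allele[:50], sep="\t")
```

Running it as `python check_alleles.py HG002_SVs_Tier1_v0.6.vcf.gz` (the script name is arbitrary) would list any such records. An empty report would suggest looking at the reference sequence instead.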
->> Hickey et al. 2019 uses the GIAB 0.5 samples, as shown here (https://github.com/vgteam/sv-genotyping-paper/tree/master/human/giab),
but when I apply the same pre-processing step for preparing the input SV catalog file to GIAB 0.6, I get an empty VCF containing only headers. The steps I ran are similar to the ones below:
wget -nc ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/NIST_UnionSVs_12122017/svanalyzer_union_171212_v0.5.0_annotated.vcf.gz
wget ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/NIST_UnionSVs_12122017/svanalyzer_union_171212_v0.5.0_annotated.vcf.gz.tbi
vcfkeepinfo svanalyzer_union_171212_v0.5.0_annotated.vcf.gz NA | vcffixup - | bgzip > giab-0.5.vcf.gz
tabix -f -p vcf giab-0.5.vcf.gz
->> What I am trying to understand is how to prepare the input SV catalog file for GIAB 0.6 so that it can be used in the "vg construct - ..." step without triggering the above errors.