Most called variants have very low quality score #936

Open
@Jiayi-Wang-Joey

Dear DeepVariant team,

Thanks for your great work. I am running DeepVariant on PacBio MAS-Seq scRNA-seq data (at the pseudobulk level).
This is my command:

singularity exec --bind /usr/lib/locale/ deepvariant-1.8.0.simg /opt/deepvariant/bin/run_deepvariant \
             --model_type MASSEQ \
             --ref {input.ref} \
             --reads {input.bam} \
             --output_vcf {output} \
             --num_shards {params.threads} \
             --intermediate_results_dir /home/jiayiwang/tmp/{wildcards.sample} \
         > {log} 2>&1

For example, for one sample, after applying some filters (coverage, dbSNP, etc.) I had 214241 variants. Filtering these further with QUAL >= 10 left only 28 variants, and keeping only PASS variants left only 355. Other samples have similarly low passing rates. When I apply the same filters to the results from Clair3-RNA, 150830 variants remain. I therefore suspect that the very small number of high-quality (or PASS) variants from DeepVariant indicates a problem.
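For anyone who wants to check the same kind of counts on their own output, here is a minimal sketch of tallying records by QUAL threshold and FILTER value. It is plain Python with no dependencies, and it is only an illustration: the function name `count_variants` and the example path are hypothetical, and this is not necessarily how the numbers above were produced.

```python
import gzip
from collections import Counter


def count_variants(vcf_path, min_qual=10.0):
    """Tally VCF records: total, QUAL >= min_qual, and counts per FILTER value.

    Hypothetical helper for illustration; reads plain or gzip-compressed VCF.
    """
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    total = 0
    qual_ok = 0
    by_filter = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 7:
                continue  # not a complete VCF record
            total += 1
            qual = fields[5]  # column 6 is QUAL; "." means missing
            if qual != "." and float(qual) >= min_qual:
                qual_ok += 1
            by_filter[fields[6]] += 1  # column 7 is FILTER
    return {"total": total, "qual_ge_min": qual_ok, "by_filter": dict(by_filter)}
```

Calling `count_variants("sample.vcf.gz")` (path is a placeholder) returns a dictionary whose `by_filter` entry shows how many records carry each FILTER value, which makes it easy to see what the non-PASS records are labelled as.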

(Could this be because I didn't run splitNC and flagCorrection on my BAMs? I tried to run them, but they seemed to take ages, which is why I decided to try without them.)

Do you have any idea what might be causing this?

Thanks in advance!

Kind regards,
Jiayi
