[bug] Uncaught exceptions lead to unclear job failures #107

@abought

Summary

Certain kinds of QC errors are not handled in the code, and lead to mysterious job failures that a user cannot diagnose or fix. This is one of our most frequent helpdesk inquiries.

Actual behavior

Non-admin users are not allowed to see the full job logs tab, so they cannot inspect the stack trace to read the error description. The information they are actually shown is rather opaque.
[Screenshot: Screen Shot 2023-06-23 at 3 29 57 PM]

[Screenshot: Screen Shot 2023-06-23 at 3 43 05 PM]

Common scenarios

  • htsjdk does not support writing VCF4.3 (see the exception below), so files in this format fail during QC.

Task 'Calculating QC Statistics' failed.
Exception:java.lang.IllegalArgumentException: Writing VCF version VCF4_3 is not implemented
at htsjdk.variant.variantcontext.writer.VCFWriter.rejectVCFV43Headers(VCFWriter.java:275)
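One way to catch this case before htsjdk is involved would be to look at the `##fileformat` line that opens every VCF file. The following is a minimal sketch, not the project's code: the class name, method names, and message wording are all hypothetical, and it assumes the caller has already read the first line of the (decompressed) input.

```java
// Hypothetical pre-check; none of these names exist in htsjdk or the server code.
final class VcfVersionCheck {

    private static final String PREFIX = "##fileformat=";

    /**
     * Returns the version declared on the first header line (e.g. "VCFv4.3"),
     * or null if the line is missing or is not a fileformat line.
     */
    static String declaredVersion(String firstHeaderLine) {
        if (firstHeaderLine == null || !firstHeaderLine.startsWith(PREFIX)) {
            return null;
        }
        return firstHeaderLine.substring(PREFIX.length()).trim();
    }

    /**
     * Returns a user-facing explanation when the declared version is one
     * the pipeline cannot handle, or null when no problem was detected.
     * The message text is placeholder wording.
     */
    static String describeProblem(String version) {
        if ("VCFv4.3".equals(version)) {
            return "This file declares VCF version 4.3, which is not supported. "
                    + "Please convert it to an earlier VCF version (for example 4.2) and resubmit.";
        }
        return null;
    }
}
```

A QC step could call `describeProblem(declaredVersion(firstLine))` and, if the result is non-null, write it to the user-visible part of the job report instead of letting the exception propagate.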

  • It appears that certain VCF fields are required. This isn't captured in the data preparation docs, and some users have triggered an error they cannot see.

Task 'Calculating QC Statistics' failed.
Exception:java.io.IOException: /mnt/jobs/job-20230623-145718-031/input/files/split.chr1.vcf.gz: Line 7812: No GT field found in FORMAT column.
at genepi.io.text.AbstractLineReader.next(AbstractLineReader.java:46)
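The GT requirement could also be checked explicitly, so that the user gets a targeted message rather than a hidden IOException. This is only a sketch under assumptions: the class and method are hypothetical, and it relies on the standard VCF layout in which FORMAT is the ninth tab-separated column of a data line.

```java
// Hypothetical helper; not part of genepi-io or the server code.
final class GtFieldCheck {

    // FORMAT is the 9th column (index 8): CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
    private static final int FORMAT_COLUMN = 8;

    /** Returns true if the data line has a FORMAT column that lists a GT key. */
    static boolean hasGenotypeField(String dataLine) {
        // limit -1 keeps trailing empty columns so the column count stays accurate
        String[] columns = dataLine.split("\t", -1);
        if (columns.length <= FORMAT_COLUMN) {
            return false; // sites-only line, no FORMAT column at all
        }
        for (String key : columns[FORMAT_COLUMN].split(":")) {
            if (key.equals("GT")) {
                return true;
            }
        }
        return false;
    }
}
```

When `hasGenotypeField` returns false, the QC step could report the file name and line number to the user, much as the existing exception message already does internally.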

  • In the newest Minimac 4.1.x series, Minimac has been changed to stop the job if too many allele swaps are detected. QC does not check for this, and the error appears only in Minimac stdout, which is not captured by the admin-level or user-level job logs.

Imputing chrREDACTED:x-y ...
Loading target haplotypes ...
Loading target haplotypes took 0 seconds
Loading reference haplotypes ...
Loading reference haplotypes took 22 seconds
Typed sites to imputed sites ratio: 0 (0/redacted)
Error: not enough target variants are available to impute this chunk. The --min-ratio, --chunk, or --region options may need to be altered.
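If the tool's stdout were captured, its error lines could be pulled out and surfaced in the job report. The sketch below assumes the output has already been collected into a list of lines, and it assumes that fatal messages begin with `Error:`, which is inferred only from the sample above and not from Minimac documentation. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; how stdout is captured is outside the scope of this sketch.
final class ToolOutputScanner {

    /** Returns every output line that starts with "Error:" (after trimming whitespace). */
    static List<String> findErrorLines(List<String> outputLines) {
        List<String> errors = new ArrayList<>();
        for (String line : outputLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Error:")) {
                errors.add(trimmed);
            }
        }
        return errors;
    }
}
```

Any lines returned here could be copied into both the admin-level and user-level logs, so that the failure reason is no longer visible only in stdout.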

Expected behavior

  • Document required VCF fields, such as GT, in the data preparation docs, if appropriate.
  • Handle the two exception cases noted above, and provide helpful messages that appear in the part of the job report visible to regular users.
  • Provide a fallback message for any other unhandled error types, indicating that a user should reach out to the helpdesk.
  • Consider adding some sort of logging event for unhandled error cases that stop the QC flow, so that developers can identify future edge cases that might be confusing.
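The handling, fallback, and logging points above could be combined in a single translation step. The following is a rough sketch rather than a proposed implementation: `QcErrorTranslator`, its method, and all message wording are hypothetical, and matching on exception message text is an assumption based on the two stack traces quoted earlier (it would break if those messages change upstream).

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical translator from internal exceptions to user-visible messages.
final class QcErrorTranslator {

    private static final Logger LOG = Logger.getLogger(QcErrorTranslator.class.getName());

    // Placeholder wording for errors that have no specific handling yet.
    static final String FALLBACK =
            "An unexpected error occurred during quality control. "
            + "Please contact the helpdesk and include your job ID.";

    static String toUserMessage(Throwable error) {
        String detail = error.getMessage() == null ? "" : error.getMessage();

        // Case 1: htsjdk rejects VCF 4.3 headers.
        if (error instanceof IllegalArgumentException && detail.contains("VCF4_3")) {
            return "Your file uses VCF version 4.3, which is not supported.";
        }
        // Case 2: a data line has no GT entry in its FORMAT column.
        if (error instanceof IOException && detail.contains("No GT field found")) {
            return "Your file is missing the required GT field in the FORMAT column.";
        }
        // Anything else: log it so developers can spot new edge cases,
        // and show the user a generic pointer to the helpdesk.
        LOG.log(Level.WARNING, "Unhandled QC error", error);
        return FALLBACK;
    }
}
```

The QC task would catch `Throwable` at its outermost level and pass it to `toUserMessage`, writing the returned text to the user-visible report while the full stack trace stays in the admin log.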
