Reproducibility of ZymoR10.3 reads classification #21

@anaome

Dear Jens-Uwe,

I am trying to reproduce the ONT read classification you present in your paper.

I downloaded the Zymo reads dataset (https://nanopore.s3.climb.ac.uk/mock/Zymo-GridION-EVEN-3Peaks-R103-merged.fq.gz) and the RefSeq index you provide.

I filtered the read dataset to keep only the 426213 reads present in the supplementary file (ZymoR103-groundTruth.binning).
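For reference, the filtering step can be sketched roughly as follows. This is a minimal illustration, not the exact script I used: it assumes the binning file is CAMI-style (header lines starting with `@`, read ID in the first tab-separated column) and that the FASTQ uses plain 4-line records. The function names and the `filter_file` helper are hypothetical.

```python
import gzip
from itertools import islice


def read_ids_from_binning(lines):
    """Collect read IDs from the first tab-separated column of a binning file.

    Assumption: '@' header lines, '#' comments and blank lines carry no IDs.
    """
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("@", "#")):
            continue
        ids.add(line.split("\t")[0])
    return ids


def filter_fastq(lines, keep_ids):
    """Yield 4-line FASTQ records whose read ID is in keep_ids.

    Assumption: every record spans exactly 4 lines and the read ID is the
    first whitespace-delimited token after the leading '@'.
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            break
        fields = record[0][1:].split()
        if fields and fields[0] in keep_ids:
            yield record


def filter_file(binning_path, fastq_gz_path, out_path):
    """Hypothetical convenience wrapper: write the kept records to out_path."""
    with open(binning_path) as handle:
        keep_ids = read_ids_from_binning(handle)
    kept = 0
    with gzip.open(fastq_gz_path, "rt") as reads, open(out_path, "w") as out:
        for record in filter_fastq(reads, keep_ids):
            out.writelines(record)
            kept += 1
    return kept
```

Calling `filter_file("ZymoR103-groundTruth.binning", "Zymo-GridION-EVEN-3Peaks-R103-merged.fq.gz", "ZymoR103-groundTruth.reads.fq")` should return the number of reads written, which can be checked against the 426213 reads expected.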

I ran the commands described in the supplemental methods (with version 0.1.3):

taxor search --index-file refseq-abfv-k22-s12.hixf --query-file ZymoR103-groundTruth.reads.fq --output-file zymo_refseq_mapped.search.txt --error-rate 0.15
taxor profile --search-file zymo_refseq_mapped.search.txt --cami-report-file zymo_refseq_mapped.report --seq-abundance-file zymo_refseq_mapped.abundance --binning-file zymo_refseq_mapped.binning --sample-id zymo_mapped

And this is the top of the abundance file produced:

@SAMPLEID:zymo_mapped
@Version:0.10.0
@Ranks:superkingdom|phylum|class|order|family|genus|species
@@TaXiD RANK TAXPATH TAXPATHSN PERCENTAGE
unclassified no rank - - 34.8245
2 superkingdom 2 Bacteria 62.8096
2759 superkingdom 2759 Eukaryota 1.18561
1224 phylum 2|1224 Bacteria|Pseudomonadota 38.0304
1239 phylum 2|1239 Bacteria|Bacillota 24.7792
4890 phylum 2759|4890 Eukaryota|Ascomycota 0.800521

Compared to the file ZymoR103-groundTruth.abundance:

@SAMPLEID:ZymoR10.3
@Version:0.10.0
@Ranks:superkingdom|phylum|class|order|family|genus|species
@@TaXiD RANK TAXPATH TAXPATHSN PERCENTAGE
unclassified no rank - - 8.22434
2 superkingdom 2 Bacteria 88.6767
2759 superkingdom 2759 Eukaryota 2.25609
1224 phylum 2|1224 Bacteria|Pseudomonadota 38.9176
1239 phylum 2|1239 Bacteria|Bacillota 49.7591
4890 phylum 2759|4890 Eukaryota|Ascomycota 1.51353
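To put the two profiles side by side, a small comparison along these lines may help. It is only a sketch: it assumes the first whitespace-delimited token of each data line is the taxon ID and the last token is the percentage, and it skips lines starting with `@`. The function names are hypothetical, and the rows are copied from the two excerpts above.

```python
def parse_profile(lines):
    """Map taxon ID (first token) to percentage (last token) for each data line."""
    percentages = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("@"):
            continue  # skip blank lines and '@' header lines
        fields = line.split()
        percentages[fields[0]] = float(fields[-1])
    return percentages


def compare_profiles(mine, truth):
    """Return (taxid, mine, truth, mine - truth) for every taxon in either profile."""
    rows = []
    for taxid in sorted(set(mine) | set(truth)):
        a = mine.get(taxid, 0.0)
        b = truth.get(taxid, 0.0)
        rows.append((taxid, a, b, a - b))
    return rows


MINE = [
    "unclassified no rank - - 34.8245",
    "2 superkingdom 2 Bacteria 62.8096",
    "2759 superkingdom 2759 Eukaryota 1.18561",
    "1224 phylum 2|1224 Bacteria|Pseudomonadota 38.0304",
    "1239 phylum 2|1239 Bacteria|Bacillota 24.7792",
    "4890 phylum 2759|4890 Eukaryota|Ascomycota 0.800521",
]
TRUTH = [
    "unclassified no rank - - 8.22434",
    "2 superkingdom 2 Bacteria 88.6767",
    "2759 superkingdom 2759 Eukaryota 2.25609",
    "1224 phylum 2|1224 Bacteria|Pseudomonadota 38.9176",
    "1239 phylum 2|1239 Bacteria|Bacillota 49.7591",
    "4890 phylum 2759|4890 Eukaryota|Ascomycota 1.51353",
]

for taxid, a, b, diff in compare_profiles(parse_profile(MINE), parse_profile(TRUTH)):
    print(f"{taxid}\t{a:.4f}\t{b:.4f}\t{diff:+.4f}")
```

On the rows quoted here, the unclassified fraction is about 26.6 percentage points higher in my run, Bacillota is about 25.0 points lower, and Pseudomonadota differs by less than 1 point.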

What could explain the much higher rate of unclassified reads in my attempt to repeat your analysis?

Best regards!
