Anomalous results in using RGI bwt for metagenomics #319

@AlePetroni

Input features:
DNA metagenomic samples sequenced with Illumina (2x150) and pre-processed with BBDuk, yielding a median of 262 Mreads (counting both R1 and R2).
RGI features: RGI v6.0.5 (conda); CARD database v4.0.1, with and without Wildcard (+Wcard, -Wcard, respectively).

Commands used:
rgi bwt -n 24 -1 sample1_R1.fastq.gz -2 sample1_R2.fastq.gz --output_file sample1_output --local
rgi bwt -n 24 -1 sample1_R1.fastq.gz -2 sample1_R2.fastq.gz --output_file sample1_output --local --include_wildcard

Outputs:
a) WARNINGS -Wcard: about 1,300, all with the text " has few mapped reads to make consensus sequence skipping: 'ARO ID' ".
b) WARNINGS +Wcard: about 14,400, almost all with the text " has few mapped reads to make consensus sequence skipping: 'ARO ID' "; a few with the text Exception: '2219'.
c) File *_output.overall_mapping_stats.txt:
- Proper pairs: about 27 Mreads (-Wcard), 107 Mreads (+Wcard), i.e., far higher than the number of mapped reads --> anomalous result!
- Mapped reads: about 226,000 reads (-Wcard), 445,000 reads (+Wcard).
- Both pairs mapped: about 214,000 reads (-Wcard), 424,000 reads (+Wcard).
- Singletons: about 12,000 reads (-Wcard), 21,000 reads (+Wcard).
d) Files *_output.gene_mapping_data.txt and *_output.allele_mapping_data.txt: the values of Completely Mapped Reads (206,000 and 300,000 for -Wcard and +Wcard, respectively) and Mapped Reads with Flanking Sequence (0 for both runs) differ from those recorded in the overall statistics --> anomalous result!
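
To make the inconsistency in (c) explicit, here is a minimal Python sketch of the sanity check I applied to the figures above. It assumes the overall statistics follow samtools-flagstat-like semantics, where properly paired reads are a subset of reads whose mate is also mapped; that is my assumption, not something I have confirmed from the RGI source. The dictionary keys and the `check` function are names I made up for illustration, and the numbers are the approximate values reported above.

```python
# Approximate figures copied from the overall mapping stats reported above.
stats = {
    "-Wcard": {"proper_pairs": 27_000_000, "mapped": 226_000,
               "both_mapped": 214_000, "singletons": 12_000},
    "+Wcard": {"proper_pairs": 107_000_000, "mapped": 445_000,
               "both_mapped": 424_000, "singletons": 21_000},
}


def check(run):
    """Return a list of inconsistencies found in one run's statistics.

    Assumes flagstat-like semantics (hypothetical for this file):
    mapped = both_mapped + singletons, and proper_pairs <= both_mapped.
    """
    problems = []
    if run["both_mapped"] + run["singletons"] != run["mapped"]:
        problems.append("both_mapped + singletons != mapped")
    if run["proper_pairs"] > run["both_mapped"]:
        problems.append("proper_pairs > both_mapped")
    return problems


results = {name: check(run) for name, run in stats.items()}
for name, problems in results.items():
    print(name, problems or "consistent")
```

In both runs, Both pairs mapped plus Singletons adds up to Mapped reads, so only the Proper pairs value looks out of line with the rest of the file.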

Similar anomalous results were also obtained by using another sample of the same sequence set.

Any idea why these anomalous results were obtained?
