Open
Description
Hello,
I see in the Supplemental Methods that an error rate of 0.15 was used for all reads (nanopore and PacBio).
Can you suggest a good starting value of --error-rate for recently generated, highly accurate PacBio HiFi reads?
I know my data are highly accurate, so I initially set --error-rate to 0.01. However, across my 14 samples a very high fraction of reads came out as unclassified/no rank (79-94% per sample). I used the refseq-abfv-k22-s12.hixf index. Increasing --error-rate to 0.05 and then to 0.15 still leaves a fairly high percentage of unclassified reads in each sample:
| Sample | --error-rate 0.01 | --error-rate 0.05 | --error-rate 0.15 |
| --- | --- | --- | --- |
| 1 | 86.1 | 76.2 | 66.3 |
| 2 | 79.4 | 65.1 | 49.2 |
| 3 | 92.6 | 84.0 | 71.5 |
| 4 | 92.7 | 83.2 | 69.5 |
| 5 | 87.4 | 75.5 | 63.2 |
| 6 | 93.4 | 82.2 | 67.3 |
| 7 | 91.1 | 80.7 | 67.4 |
| 8 | 93.4 | 76.1 | 57.5 |
| 9 | 91.9 | 84.1 | 74.1 |
| 10 | 88.0 | 77.8 | 64.7 |
| 11 | 81.1 | 63.6 | 46.1 |
| 12 | 84.9 | 74.0 | 61.4 |
| 13 | 94.4 | 87.0 | 75.7 |
| 14 | 89.7 | 80.0 | 68.7 |
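
In case a summary is easier to read than the per-sample numbers, here is a minimal Python sketch that computes the minimum, mean, and maximum percentage of unclassified reads for each error-rate setting. The values are copied from the table above; the names `unclassified`, `error_rates`, and `summary` are just illustrative and not part of any tool.

```python
from statistics import mean

# Percent unclassified reads per sample, copied from the table above.
# Tuple order matches error_rates: --error-rate 0.01, 0.05, 0.15.
error_rates = (0.01, 0.05, 0.15)
unclassified = {
    1: (86.1, 76.2, 66.3),
    2: (79.4, 65.1, 49.2),
    3: (92.6, 84.0, 71.5),
    4: (92.7, 83.2, 69.5),
    5: (87.4, 75.5, 63.2),
    6: (93.4, 82.2, 67.3),
    7: (91.1, 80.7, 67.4),
    8: (93.4, 76.1, 57.5),
    9: (91.9, 84.1, 74.1),
    10: (88.0, 77.8, 64.7),
    11: (81.1, 63.6, 46.1),
    12: (84.9, 74.0, 61.4),
    13: (94.4, 87.0, 75.7),
    14: (89.7, 80.0, 68.7),
}

# Collect (min, mean, max) of the unclassified percentage for each setting.
summary = {}
for i, rate in enumerate(error_rates):
    column = [row[i] for row in unclassified.values()]
    summary[rate] = (min(column), mean(column), max(column))
    low, avg, high = summary[rate]
    print(f"--error-rate {rate}: min {low:.1f}%, mean {avg:.1f}%, max {high:.1f}%")
```

On these numbers the mean unclassified percentage falls from roughly 89% at 0.01 to roughly 64% at 0.15, so even at the most permissive setting well over half of the reads remain unclassified on average.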
I will download the GTDB Release 220 index and try that next, but in the meantime I would appreciate any suggestions for increasing the percentage of classified reads.
Thank you.