Suggested error-rate setting for PacBio HiFi reads

Hello-

I see in the [Supplemental Methods](https://genome.cshlp.org/content/suppl/2024/07/09/gr.278623.123.DC1/Supplemental_Methods.pdf) that an error-rate of 0.15 was used for all reads (nanopore and PacBio).  

Can you suggest a good starting error-rate for recently generated, highly accurate PacBio HiFi reads?

I know my data are highly accurate, so I initially set error-rate to 0.01. However, across my 14 samples a very high share of reads came back as unclassified/no rank (79-94% per sample). I used the refseq-abfv-k22-s12.hixf index. Increasing error-rate to 0.05 and then to 0.15 lowered the unclassified rate, but it is still fairly high:


```
% unclassified reads
sample | --error-rate 0.01 | --error-rate 0.05 | --error-rate 0.15
1 | 86.1 | 76.2 | 66.3
2 | 79.4 | 65.1 | 49.2
3 | 92.6 | 84.0 | 71.5
4 | 92.7 | 83.2 | 69.5
5 | 87.4 | 75.5 | 63.2
6 | 93.4 | 82.2 | 67.3
7 | 91.1 | 80.7 | 67.4
8 | 93.4 | 76.1 | 57.5
9 | 91.9 | 84.1 | 74.1
10 | 88.0 | 77.8 | 64.7
11 | 81.1 | 63.6 | 46.1
12 | 84.9 | 74.0 | 61.4
13 | 94.4 | 87.0 | 75.7
14 | 89.7 | 80.0 | 68.7
```
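In case it helps to compare the three settings at a glance, here is a minimal Python sketch that summarizes the table above. The numbers are copied from the table; the dictionary and the `summarize` helper are just names I made up for illustration and are not part of the classifier.

```python
from statistics import mean

# Percent unclassified per sample, copied from the table above.
# Keys are the --error-rate values; list order is sample 1..14.
unclassified = {
    0.01: [86.1, 79.4, 92.6, 92.7, 87.4, 93.4, 91.1, 93.4, 91.9, 88.0, 81.1, 84.9, 94.4, 89.7],
    0.05: [76.2, 65.1, 84.0, 83.2, 75.5, 82.2, 80.7, 76.1, 84.1, 77.8, 63.6, 74.0, 87.0, 80.0],
    0.15: [66.3, 49.2, 71.5, 69.5, 63.2, 67.3, 67.4, 57.5, 74.1, 64.7, 46.1, 61.4, 75.7, 68.7],
}


def summarize(values):
    """Return (min, mean, max) of a list of percentages."""
    return min(values), mean(values), max(values)


for rate, values in unclassified.items():
    low, avg, high = summarize(values)
    print(f"--error-rate {rate}: min {low:.1f}%, mean {avg:.1f}%, max {high:.1f}% unclassified")
```

Even at the most permissive setting, most samples still have well over half of their reads unclassified.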



I will download the GTDB Release 220 index and try that, but I thought I would also ask for suggestions on how to increase the percentage of classified reads.

Thank you.
