Description
Is there a simple explanation of what the seed_depth parameter means? It seems quite important, yet I am not grasping its significance despite having read the manuscript a couple of times.
I am interested in trying NextDenovo on some bacterial genomes sequenced with Nanopore. My sequencing depth is low (20-50x), but a preliminary run on a single genome at 20x depth gives me 5 contigs for a 5.5 Mbp genome. This is in line with what I get from Canu and Flye with the same data (also 5 contigs from each of these assemblers).
I am specifically interested in NextDenovo's read correction, but its assembly would be a plus. I understand it may not have been created for this purpose (i.e., bacterial genomes).
Its performance so far seems adequate, but I do not fully understand how it works or what the various parameters mean, most importantly seed_depth.
- If I understand correctly, seed_depth is the target sequencing depth used to select the longest reads for error correction? So with 45X, reads are selected from the input dataset in order of decreasing length, starting with the longest, until a total of 45X sequencing depth is reached. These are then used in error correction?
- Where does seed_cutoff come in? Is this the read length below which reads will not be selected in the above step, regardless of the depth reached so far?
- Lastly, --blacklist is used to somehow keep more reads in the corrected read set, yes? I saw in another issue that I should set this if I also want to assemble the corrected reads with another assembler?
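To make my reading of seed_depth and seed_cutoff concrete, here is a minimal Python sketch of how I imagine the two interacting. It is only my interpretation and not NextDenovo's actual code; `select_seeds`, `read_lengths`, and `genome_size` are hypothetical names I made up for illustration.

```python
from typing import List


def select_seeds(
    read_lengths: List[int],
    genome_size: int,
    seed_depth: float,
    seed_cutoff: int = 0,
) -> List[int]:
    """Pick seed reads, longest first, under my assumed interpretation.

    Assumption: reads are taken in order of decreasing length until their
    summed length reaches seed_depth * genome_size, and any read shorter
    than seed_cutoff is never taken, whatever depth has been reached.
    """
    target_bases = seed_depth * genome_size
    seeds: List[int] = []
    total_bases = 0
    for length in sorted(read_lengths, reverse=True):
        if length < seed_cutoff:
            break  # every remaining read is shorter still
        if total_bases >= target_bases:
            break  # requested depth already reached
        seeds.append(length)
        total_bases += length
    return seeds


# Toy numbers: a 100 bp "genome" at seed_depth 2 needs 200 bases of seeds.
toy_reads = [10, 90, 60, 80, 70]
print(select_seeds(toy_reads, genome_size=100, seed_depth=2))
print(select_seeds(toy_reads, genome_size=100, seed_depth=2, seed_cutoff=75))
```

In this toy case the first call keeps the three longest reads, because 90 + 80 is still below 200 bases, while the second call stops after two reads because the third falls below the cutoff. Is that the intended behaviour?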
Thank you kindly,
Conrad