Recently, I've been running modkit pileup with default settings on a metagenomic PromethION run (130 Gb).
The assembled metagenome consists of about 400 Mbp of contigs, ranging in size from a few thousand bp to 5 Mbp, with coverage ranging from 1X to 1000X or more.
The modkit pileup step is the bottleneck in my pipeline: it requires very high memory (peak about 350 GB) and very long compute times (e.g. >12 h on 24 cores).
The documentation states that --interval-size, --sampling-interval-size, and --chunk-size can be modified to improve parallelism.
What would be the best settings for my use case?
Thanks!
Bram