Description
Hello, I was wondering whether IsoQuant could allow passing arguments to the indexers for both minimap2 and STARlong. We have been trying to run some wheat samples with IsoQuant, but they fail during indexing because the large wheat genome needs special parameters. Alternatively, IsoQuant could check whether the genome is unusually large and supply the parameters itself. On disk, wheat genomes are roughly 16-17 GB.
We were still able to process the data by running the aligners outside of IsoQuant and then feeding in the BAMs, but it would be nice if IsoQuant handled these failures gracefully.
STARlong failed hard, as shown below. minimap2 actually ran, but produced BAMs with no headers, which is the symptom described in Heng's FAQ linked below.
minimap2: see item 3: https://github.com/lh3/minimap2/blob/master/FAQ.md#3-the-output-sam-doesnt-have-a-header
When running minimap2 outside of IsoQuant for this particular wheat assembly, we were able to build the index in a single batch held entirely in RAM by using the recommended -I flag: -I16G is large enough to hold the whole 14.5 Gb genome.
Indexing peaked at 93 GB of RAM.
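To illustrate the "check whether the genome is unusually large" idea for minimap2, here is a minimal Python sketch. It is not IsoQuant code: the function names are hypothetical, and it assumes the FASTA size on disk is a usable upper bound on the number of bases (headers and newlines inflate it slightly). It only uses the minimap2 options mentioned above (-I) plus -d for writing the index.

```python
import math
import os
from typing import List


def batch_flag_for_size(size_bytes: int) -> str:
    """Return a minimap2 -I value large enough for a genome of this size.

    Assumption: file size in bytes is an upper bound on base count,
    so rounding up to whole gigabases keeps the index in one batch.
    """
    gigabases = max(1, math.ceil(size_bytes / 1e9))
    return f"-I{gigabases}G"


def minimap2_index_command(fasta_path: str, index_path: str) -> List[str]:
    """Build (but do not run) a minimap2 indexing command (hypothetical helper)."""
    flag = batch_flag_for_size(os.path.getsize(fasta_path))
    return ["minimap2", flag, "-d", index_path, fasta_path]
```

For a 16.5 GB FASTA this picks -I17G, slightly more generous than the -I16G we used by hand. The memory cost (93 GB peak in our case) would still need to be available on the machine.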
STARlong: wants --limitGenomeGenerateRAM to be increased
STARlong --runMode genomeGenerate --runThreadN 16 --genomeDir wheat_k15_idx --genomeFastaFiles wheat.fasta
STAR version: 2.7.11b compiled: 2025-02-13T20:44:13+00:00 :/STAR-2.7.11b/source
Feb 13 20:59:42 ..... started STAR run
Feb 13 20:59:42 ... starting to generate Genome files
EXITING because of FATAL PARAMETER ERROR: limitGenomeGenerateRAM=31000000000is too small for your genome
SOLUTION: please specify --limitGenomeGenerateRAM not less than 38818982837 and make that much RAM available
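One possible way to handle this failure automatically is to read the value STAR suggests in its own error message and retry with it. The following is a rough Python sketch under that assumption; the helper names are made up for illustration and are not part of IsoQuant or STAR, and the regular expression relies on the wording of the SOLUTION line shown in the log above.

```python
import re
from typing import List, Optional

# Matches the hint STAR prints in the SOLUTION line of the log above.
_RAM_HINT = re.compile(r"--limitGenomeGenerateRAM not less than (\d+)")


def required_genome_ram(star_output: str) -> Optional[int]:
    """Extract the RAM limit (in bytes) that STAR asks for, if present."""
    match = _RAM_HINT.search(star_output)
    return int(match.group(1)) if match else None


def star_index_command(fasta: str, genome_dir: str, threads: int,
                       ram_bytes: int) -> List[str]:
    """Build (but do not run) a STARlong genomeGenerate command with a raised limit."""
    return [
        "STARlong", "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--limitGenomeGenerateRAM", str(ram_bytes),
    ]
```

A wrapper could run the original command, call required_genome_ram on the captured output, and rerun with star_index_command if a value comes back. It should also confirm that much RAM is actually available before retrying.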
Is large-genome handling an enhancement you might consider incorporating, or alternatively an advanced option to pass parameters through to the aligners?
Cheers