Description of feature
Add an opt-in parameter that drops unmapped reads from downstream alignment outputs to reduce storage and I/O overhead during polishing and consensus refinement.
See the original proposal in #280.
Current behavior
Alignment outputs currently retain both mapped and unmapped reads. This preserves compatibility with existing runs, but it increases storage usage substantially for large datasets.
Proposal
- Add a new pipeline parameter, for example --remove_unmapped, defaulting to false.
- When enabled, filter downstream alignment outputs to mapped reads only, for example with samtools view -b -F 4, and index the filtered files.
- Keep the current behavior as the default for backwards compatibility.
- Update tests and documentation accordingly.
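
As a rough illustration of the proposal above, the following Python sketch builds the samtools commands that an opt-in filter step might run. The function names, the file names, and the idea of returning an empty command list when the option is off are assumptions made for illustration only; the actual pipeline would wire this into its own process definitions. The sketch relies on the SAM FLAG bit 0x4 (read unmapped), which is what -F 4 excludes.

```python
import shlex

# SAM FLAG bit 0x4 marks a read as unmapped; `samtools view -F 4` excludes such reads.
UNMAPPED_FLAG = 0x4


def is_unmapped(flag: int) -> bool:
    """Return True if the SAM FLAG value has the 'read unmapped' bit set."""
    return bool(flag & UNMAPPED_FLAG)


def build_filter_commands(input_bam: str, output_bam: str, remove_unmapped: bool = False):
    """Hypothetical helper: return the commands for the opt-in filter step.

    With remove_unmapped=False (the proposed default) no commands are
    returned, so the existing outputs are left untouched.
    """
    if not remove_unmapped:
        return []
    return [
        # Keep only mapped reads and write them as BAM.
        ["samtools", "view", "-b", "-F", str(UNMAPPED_FLAG), "-o", output_bam, input_bam],
        # Index the filtered file so downstream tools can use random access.
        ["samtools", "index", output_bam],
    ]


if __name__ == "__main__":
    # Illustrative file names only.
    for cmd in build_filter_commands("sample.bam", "sample.mapped.bam", remove_unmapped=True):
        print(shlex.join(cmd))
```

Returning no commands by default mirrors the backwards-compatibility requirement: a run that does not set the parameter produces the same outputs as before.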