Optionally remove unmapped reads from downstream alignment outputs #282

@Joon-Klaps

Description of feature

Add an opt-in parameter that drops unmapped reads from downstream alignment outputs to reduce storage and I/O overhead during polishing and consensus refinement.

See the original proposal in #280.

Current behavior

Alignment outputs currently retain both mapped and unmapped reads. This preserves compatibility with existing runs, but it increases storage usage substantially for large datasets.

Proposal

  • Add a new pipeline parameter, for example --remove_unmapped, defaulting to false.
  • When enabled, filter downstream alignment outputs to mapped reads only, for example with samtools view -b -F 4, and index the filtered files.
  • Keep the current behavior as the default for backwards compatibility.
  • Update tests and documentation accordingly.
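
To make the intended filtering concrete, here is a minimal pure-Python sketch of what `samtools view -F 4` selects: records whose FLAG has the unmapped bit (0x4) set are dropped, while header lines and all other records pass through. The function name `drop_unmapped` and its `remove_unmapped` argument are hypothetical and only mirror the proposed parameter; the pipeline itself would presumably call samtools on BAM files rather than parse SAM text like this.

```python
UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def drop_unmapped(sam_lines, remove_unmapped=False):
    """Yield SAM lines, skipping unmapped records when remove_unmapped is True.

    Hypothetical helper for illustration; the default (False) keeps every
    record, matching the backwards-compatible behavior proposed above.
    """
    for line in sam_lines:
        # Header lines start with "@" and are always kept.
        if line.startswith("@"):
            yield line
            continue
        if remove_unmapped:
            # FLAG is the second tab-separated field of an alignment record.
            flag = int(line.split("\t")[1])
            if flag & UNMAPPED:
                continue
        yield line


# Tiny made-up SAM example: r1 and r3 are mapped, r2 is unmapped (FLAG 4).
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:ref\tLN:100",
    "r1\t0\tref\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tref\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
]

kept = list(drop_unmapped(example, remove_unmapped=True))
unchanged = list(drop_unmapped(example))
```

With the flag enabled, `kept` contains the two header lines plus `r1` and `r3`; with the default, `unchanged` is identical to the input.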

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request)
