New pipeline: nf-core/hapmal #134

### Pipeline title/name

hapmal

### Keywords

Malaria, genomics, haplotype calling, Nextflow 

### What is it about?

hapmal unifies the processing of short-read (Illumina) and long-read (Oxford Nanopore Technologies, ONT) data in a single, reproducible, containerized workflow for compositional profiling of multiclonal P. falciparum samples.

### Please provide a schematic diagram of the proposed pipeline

<img width="1672" height="941" alt="Image" src="https://github.com/user-attachments/assets/2308b934-0eb2-48d9-bd07-17158208fc84" />

### What would a minimal first release of this pipeline include?

- Quality control of raw FASTQ files (FastQC)
- Host read removal (BBDuk)
- Alignment to the P. falciparum reference genome (Minimap2, which handles both short and long reads)
- Variant calling optimized for Plasmodium (Clair3 for long reads and basic Snippy support for short reads)
- Basic haplotype profiling and compositional analysis of multiclonal infections (PHARE-inspired module)
- Generation of a simple summary report (Nextflow and MultiQC-style output)
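
To make the compositional-analysis step more concrete, here is a minimal Python sketch of what estimating haplotype proportions in a multiclonal sample could look like. It is not PHARE's algorithm and not the pipeline's implementation (no code exists yet). The function name `haplotype_proportions`, the `min_fraction` threshold, and the input format (one allele string per read) are all assumptions made for illustration only.

```python
from collections import Counter


def haplotype_proportions(read_haplotypes, min_fraction=0.01):
    """Estimate the relative abundance of each haplotype in one sample.

    read_haplotypes: iterable of strings, one per read, giving the alleles
        observed at the chosen variant sites (hypothetical input format).
    min_fraction: hypothetical noise threshold; haplotypes supported by a
        smaller share of reads are dropped.
    """
    counts = Counter(read_haplotypes)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Drop rare haplotypes, which are more likely sequencing errors than clones.
    kept = {h: n for h, n in counts.items() if n / total >= min_fraction}
    kept_total = sum(kept.values())
    # Renormalise so the retained proportions sum to 1.
    return {h: n / kept_total for h, n in kept.items()}


# Toy example: two dominant haplotypes plus one singleton treated as noise.
reads = ["ACT"] * 60 + ["GCT"] * 39 + ["ACA"] * 1
props = haplotype_proportions(reads, min_fraction=0.05)
```

A real module would also need to handle reads that do not span every variant site and to model platform-specific error rates, which differ considerably between Illumina and ONT data.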

### I confirm my proposed pipeline will follow nf-core guidelines. Most importantly, my pipeline will:

- [x] be built with Nextflow.
- [x] pass nf-core lint tests and use standardized parameters.
- [x] be community-owned and developed within the nf-core organization.
- [x] open source under the MIT license with proper credits and acknowledgments.
- [x] have a descriptive, all lowercase, and without punctuation name.
- [x] use the nf-core pipeline template and predominantly use official nf-core modules.
- [x] focus on a specific data/analysis type with appropriate scope.
- [x] have properly maintained documentation.
- [x] be bundled using versioned Docker/Singularity containers.

### Why do we need a new pipeline?

Existing nf-core pipelines such as pathogensurveillance provide excellent general frameworks for pathogen identification and variant detection. However, they do not specifically address the unique biological and technical challenges of malaria genomic surveillance.
HapMal will fill this gap as the first nf-core pipeline dedicated to Plasmodium species (primarily P. falciparum). It combines short-read (Illumina) and long-read (ONT) data in a single unified workflow for compositional profiling of multiclonal P. falciparum samples.

### Who would be interested?

- Malaria genomic surveillance researchers and laboratories in malaria-endemic countries
- Public health agencies and national malaria control programmes
- Bioinformaticians and computational biologists working on malaria
- Field-based teams and teams in low-resource settings who need portable, offline-capable, containerized workflows

### What has been done so far

We have not yet started the development of the HapMal pipeline code.
The project is currently at the concept and planning stage. We have carefully reviewed the existing literature on malaria genomic surveillance, identified key gaps in hybrid short- and long-read analysis, and designed the overall architecture of the pipeline with a strong emphasis on compositional profiling of multiclonal P. falciparum infections.
We have assembled a small core team of researchers and bioinformaticians with basic to intermediate Nextflow experience. The team includes Nextflow ambassadors who are familiar with nf-core best practices, containerization (Docker/Singularity), and workflow development. This gives us a solid foundation for following nf-core standards from the very beginning.

### URL to existing work (if applicable)

_No response_

### Are there any similar existing nf-core pipelines?

_No response_
