Pipeline title/name
hapmal
Keywords
Malaria, genomics, haplotype calling, Nextflow
What is it about?
It unifies the processing of short-read (Illumina) and long-read (Oxford Nanopore Technologies, ONT) data in a single, reproducible, containerized workflow for compositional profiling of multiclonal P. falciparum samples.
Please provide a schematic diagram of the proposed pipeline
What would a minimal first release of this pipeline include?
Quality control of raw FASTQ files (FastQC)
Host read removal (BBDuk)
Alignment to the P. falciparum reference genome (Minimap2, which supports both short and long reads)
Variant calling optimized for Plasmodium (Clair3 for long reads and basic Snippy support for short reads)
Basic haplotype profiling and compositional analysis of multiclonal infections (PHARE-inspired module)
Generation of a simple summary report (Nextflow and MultiQC-style output)
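As a rough illustration of how the steps above might branch by sequencing platform, the following Python sketch maps a sample to its tool sequence. The `Sample` class, the `plan_steps` function, and the platform labels are hypothetical names introduced here, not existing HapMal code. The Minimap2 presets `sr` and `map-ont` are that tool's standard presets for short reads and Nanopore reads, but whether the pipeline will use them is an assumption.

```python
from dataclasses import dataclass

# Minimap2 presets: "sr" for short reads, "map-ont" for Nanopore reads.
# Using these particular presets in HapMal is an assumption.
MINIMAP2_PRESET = {"illumina": "sr", "ont": "map-ont"}

# Variant callers named in the proposal for each platform.
VARIANT_CALLER = {"illumina": "Snippy", "ont": "Clair3"}


@dataclass(frozen=True)
class Sample:
    """One sequenced sample (hypothetical samplesheet row)."""
    sample_id: str
    platform: str  # "illumina" or "ont" (hypothetical labels)


def plan_steps(sample: Sample) -> list[str]:
    """Return the ordered tool steps for a sample, branching on platform."""
    platform = sample.platform.lower()
    if platform not in MINIMAP2_PRESET:
        raise ValueError(f"Unsupported platform: {sample.platform!r}")
    return [
        "FastQC",                                    # raw read quality control
        "BBDuk",                                     # host read removal
        f"Minimap2 -x {MINIMAP2_PRESET[platform]}",  # alignment
        VARIANT_CALLER[platform],                    # variant calling
        "haplotype profiling",                       # PHARE-inspired module
        "summary report",                            # MultiQC-style output
    ]


if __name__ == "__main__":
    for s in (Sample("sampleA", "illumina"), Sample("sampleB", "ont")):
        print(s.sample_id, "->", ", ".join(plan_steps(s)))
```

Only the alignment preset and the variant caller differ between platforms in this sketch; the remaining steps are shared, which matches the single unified workflow the proposal describes.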
I confirm my proposed pipeline will follow nf-core guidelines. Most importantly, my pipeline will:
Why do we need a new pipeline?
Existing nf-core pipelines such as pathogensurveillance provide excellent general frameworks for pathogen identification and variant detection. However, they do not specifically address the unique biological and technical challenges of malaria genomic surveillance.
HapMal will fill a critical gap by delivering the first nf-core pipeline dedicated to Plasmodium species (primarily P. falciparum), with a strong focus on hybrid short-read (Illumina) and long-read (ONT) data in a single unified workflow for compositional profiling of multiclonal P. falciparum samples.
Who would be interested?
Malaria genomic surveillance researchers and laboratories in malaria-endemic countries
Public health agencies and national malaria control programmes
Bioinformaticians and computational biologists working on malaria
Field-based teams and teams in low-resource settings who need portable, offline-capable, containerized workflows
What has been done so far
We have not yet started the development of the HapMal pipeline code.
The project is currently at the concept and planning stage. We have carefully reviewed the existing literature on malaria genomic surveillance, identified key gaps in hybrid short- and long-read analysis, and designed the overall architecture of the pipeline with a strong emphasis on compositional profiling of multiclonal P. falciparum infections.
We have assembled a small core team of researchers and bioinformaticians with basic to intermediate Nextflow experience. The team includes Nextflow ambassadors who are familiar with nf-core best practices, containerization (Docker/Singularity), and workflow development. This gives us a solid foundation for following nf-core standards from the very beginning.
URL to existing work (if applicable)
No response
Are there any similar existing nf-core pipelines?
No response