This project is part of the Google Summer of Code 2024 program. It provides a Nextflow-based pipeline for processing DNA sequences and generating BAM files. The pipeline supports multiple mapping tools and various configurations to suit different research needs.
- Supports multiple mapping programs (e.g., BWA, SMALT, SSAHA)
- Handles paired-end and single-end reads
- Performs quality filtering and duplicate marking
- Generates pseudosequences
- Supports indel calling and variant detection
- Configurable parameters for advanced usage
The pipeline requires Nextflow. Refer to the Nextflow installation guide for detailed instructions, or install it with:
curl -s https://get.nextflow.io | bash
Additionally, you need to install Docker or Singularity. Follow the respective installation guides:
Docker Installation Guide
Singularity Installation Guide
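Before cloning, it can help to confirm that the prerequisites above are on your PATH. The following is a minimal sketch, not part of the pipeline itself: the helper names `have` and `check_prereqs` are hypothetical, and it assumes the tools are installed under their usual executable names (`nextflow`, `docker`, `singularity`).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named executable is on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: reports Nextflow and a container runtime,
# returning non-zero if either is missing.
check_prereqs() {
  missing=0
  if have nextflow; then
    echo "nextflow: found"
  else
    echo "nextflow: missing"
    missing=1
  fi
  # Either Docker or Singularity is sufficient.
  if have docker || have singularity; then
    echo "container runtime: found"
  else
    echo "container runtime: missing (install Docker or Singularity)"
    missing=1
  fi
  return "$missing"
}

check_prereqs || echo "Install the missing tools before running the pipeline."
```

The check only verifies that the executables exist; it does not confirm that the Docker daemon is running or that the installed versions are compatible.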
Clone the repository and navigate to the project directory:
git clone
cd multiple-mappings-to-bam
Pass the parameters on the command line when running the pipeline:
nextflow run main.nf --ref <absolute/path/to/ref/file> --program BWA --read_dir <path/to/directory/containing/reads>
The pipeline can also be run without cloning this repository, using the following syntax (this executes the latest pipeline version):
nextflow run sanger-pathogens/Multiple-mappings-to-bam --ref <absolute/path/to/ref/file> --program BWA --read_dir <path/to/directory/containing/reads>
The outputs are written to the directory specified by the outdir parameter (default: results/).
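To write results somewhere other than the default, the outdir parameter can be set on the command line. The sketch below only assembles and prints the command rather than executing it. It assumes that outdir is passed as `--outdir`, following Nextflow's usual convention for pipeline parameters. The function name `build_command` and the example paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the nextflow command for a given
# reference file ($1), reads directory ($2) and output directory ($3).
# Assumption: the outdir parameter is passed as --outdir.
build_command() {
  printf '%s\n' "nextflow run main.nf --ref $1 --program BWA --read_dir $2 --outdir $3"
}

# Example paths are placeholders; substitute your own.
build_command /data/ref/reference.fasta /data/reads my_results
```

Copy the printed command into your terminal to run it once the paths point at real files.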