TORCH-Consortium
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 2 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 24 additions & 24 deletions
diff --git a/conf/apptainer.config b/conf/apptainer.config
Lines changed: 65 additions & 0 deletions
diff --git a/conf/docker.config b/conf/docker.config
Lines changed: 12 additions & 2 deletions
diff --git a/conf/singularity.config b/conf/singularity.config
Lines changed: 12 additions & 0 deletions
diff --git a/default_params.config b/default_params.config
Lines changed: 54 additions & 5 deletions
@@ -1,4 +1,6 @@
 # CHANGELOG FOR THE MAGMA PIPELINE VERSIONS
+<!-- https://keepachangelog.com/en/1.1.0/ -->
+
 
 ## v2.0.0
 
 
@@ -33,7 +33,7 @@ The `java` version should NOT be an `internal jdk` release! You can check the re
 Notice the `LTS` next to `OpenJDK` line.
 
 
-```bash 
+```bash
 
 $ java -version
 openjdk version "17.0.7" 2023-04-18 LTS
@@ -90,7 +90,7 @@ S0002,/full_path_to_directory_of_fastq_files/S0002_01_R1.fastq.gz,full_path_to_d
 S0003,/full_path_to_directory_of_fastq_files/S0003_01_R1.fastq.gz,
 ```
 
-If you have the metadata from sequencing instrument, you can specify further information in the samplesheet 
+If you have the metadata from sequencing instrument, you can specify further information in the samplesheet
 
 ```csv
 Study,Sample,Library,Attempt,R1,R2,Flowcell,Lane,Index Sequence
@@ -156,15 +156,15 @@ Which could be provided to the pipeline using `-params-file` parameter as shown
 
 ```console
 nextflow run 'https://github.com/TORCH-Consortium/MAGMA' \
-		 -profile conda_local, server \ 
-		 -r v1.1.1 \
-		 -params-file  my_parameters_1.yml
+         -profile conda_local, server \
+         -r v1.1.1 \
+         -params-file  my_parameters_1.yml
 
 ```
 
 # Analysis
 
-## Running MAGMA using Nextflow Tower 
+## Running MAGMA using Nextflow Tower
 
 You can also use Seqera Platform (aka Nextflow Tower) to run the pipeline on any of the supported cloud platforms and monitoring the pipeline execution.
 
@@ -181,11 +181,11 @@ You can run the pipeline using Conda, Mamba or Micromamba package managers to in
 You can find out the location of conda environments using `conda env list`. [Here's](https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf) a useful cheatsheet for conda operations.
 
 
-You can use the `conda` based setup for the pipeline for running MAGMA 
+You can use the `conda` based setup for the pipeline for running MAGMA
 - On a local linux machine(e.g. your laptop or a university server)
-- On an HPC cluster (e.g. SLURM, PBS) in case you don't have access to container systems like Singularity, Podman or Docker 
+- On an HPC cluster (e.g. SLURM, PBS) in case you don't have access to container systems like Singularity, Podman or Docker
 
-All the requisite softwares have been provided as a `conda` recipe (i.e. `yml` files) 
+All the requisite softwares have been provided as a `conda` recipe (i.e. `yml` files)
 - [magma-env-1.yml](./conda_envs/magma-env-1.yml)
 - [magma-env-2.yml](./conda_envs/magma-env-2.yml)
 
@@ -208,7 +208,7 @@ $ conda env create -n magma-env-2 --file magma-env-2.yml
 
 Once the environments are created, you can make use of the pipeline parameter `conda_envs_location` to inform the pipeline of the names and location of the conda envs.
 
-Next, you need to load the WHO Resistance Catalog within `tb-profiler`; basically the [instructions](https://github.com/TORCH-Consortium/MAGMA/blob/master/conda_envs/setup_conda_envs.sh#L20-L23), which are used to build the necessary containers. 
+Next, you need to load the WHO Resistance Catalog within `tb-profiler`; basically the [instructions](https://github.com/TORCH-Consortium/MAGMA/blob/master/conda_envs/setup_conda_envs.sh#L20-L23), which are used to build the necessary containers.
 
 1. Download [magma_resistance_db_who_v1.zip](https://github.com/TORCH-Consortium/MAGMA/files/14559680/resistance_db_who_v1.zip)  and unzip it
 
@@ -250,7 +250,7 @@ We provide [two docker containers](https://github.com/orgs/TORCH-Consortium/pack
 
 > 🚧 **Container build script**: The script used to build these containers is provided [here](./containers/build.sh).
 
-Although, you don't need to pull the containers manually, but should you need to, you could use the following commands to pull the pre-built and provided containers 
+Although, you don't need to pull the containers manually, but should you need to, you could use the following commands to pull the pre-built and provided containers
 
 ```console
 docker pull ghcr.io/torch-consortium/magma/magma-container-1:1.1.1
@@ -262,13 +262,13 @@ docker pull ghcr.io/torch-consortium/magma/magma-container-2:1.1.1
 > :memo: **Have singularity or podman instead?**: <br>
 If you do have access to Singularity or Podman, then owing to their compatibility with Docker, you can still use the provided docker containers.
 
-Here's the command which should be used 
+Here's the command which should be used
 
 ```console
 nextflow run 'https://github.com/torch-consortium/magma' \
-		 -params-file my_parameters_2.yml \
-		 -profile docker,pbs \
-		 -r v1.1.1 
+         -params-file my_parameters_2.yml \
+         -profile docker,pbs \
+         -r v1.1.1
 ```
 
 > :bulb: **Hint**: <br>
@@ -307,7 +307,7 @@ errors. Including these is optional, if unknown or irrelevant,
 just fill in with a '1' as shown in example_MAGMA_samplesheet.csv)
 ```
 
-## (Optional) GVCF datasets 
+## (Optional) GVCF datasets
 
 We also provide some reference GVCF files which you could use for specific use-cases.
 
@@ -319,7 +319,7 @@ containing GVCF reference dataset for ~600 samples is provided for augmenting sm
 
 ```
 use_ref_gvcf = false
-ref_gvcf =  "/path/to/FILE.g.vcf.gz" 
+ref_gvcf =  "/path/to/FILE.g.vcf.gz"
 ref_gvcf_tbi =  "/path/to/FILE.g.vcf.gz.tbi"
 ```
 
@@ -335,7 +335,7 @@ Tim Huepink and Lennert Verboven created an in-depth tutorial of the features of
 
 We have also included a presentation (in PDF format) of the logic and workflow of the MAGMA pipeline as well as posters that have been presented at conferences. Please refer the [docs](./docs) folder.
 
-# Interpretation 
+# Interpretation
 
 The results directory produced by MAGMA is as follows:
 
@@ -347,7 +347,7 @@ The results directory produced by MAGMA is as follows:
 └── vcf_files
 ```
 
-## QC Statistics Directory 
+## QC Statistics Directory
 
 In this directory you will find files related to the quality control carried out by the MAGMA pipeline. The structure is as follows:
 
@@ -412,7 +412,7 @@ MAGMA also notes the presence of all variants in in tier 1 and tier 2 drug resis
 
 - **Phylogeny**
 
-Contains the outputs of the IQTree phylogenetic tree construction. 
+Contains the outputs of the IQTree phylogenetic tree construction.
 
 > :memo: By default we recommend that you use the **ExDRIncComplex** files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the *Mtb* genome
 
@@ -422,7 +422,7 @@ Contains the SNP distance tables.
 
 > :memo: By default we recommend that you use the **ExDRIncComplex** files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the *Mtb* genome
 
-## `vcf_files` Directory 
+## `vcf_files` Directory
 
 ```bash
 /path/to/results_dir/vcf_files
@@ -463,7 +463,7 @@ Contains the SNP distance tables.
 
 > Unfiltered structural variants detected by the MAGMA pipeline
 
-## Libraries Directory 
+## Libraries Directory
 
 > Contains files related to FASTQ validation and FASTQC analysis
 
@@ -472,9 +472,9 @@ Contains the SNP distance tables.
 > Contains vcf files for major|minor|structural variants for each individual samples
 
 
-# Citations 
+# Citations
 
-The MAGMA paper has been published here: https://doi.org/10.1371/journal.pcbi.1011648 
+The MAGMA paper has been published here: https://doi.org/10.1371/journal.pcbi.1011648
 
 The XBS variant calling core was published here: https://doi.org/10.1099%2Fmgen.0.000689
 
 
@@ -0,0 +1,65 @@
+/*
+ * Copyright (c) 2021-2024 MAGMA pipeline authors, see https://doi.org/10.1371/journal.pcbi.1011648
+ *
+ * This file is part of MAGMA pipeline, see https://github.com/TORCH-Consortium/MAGMA
+ *
+ * For quick overview of GPL-3 license, please refer
+ * https://www.tldrlegal.com/license/gnu-general-public-license-v3-gpl-3
+ *
+ * - You MUST keep this license with original authors in your copy
+ * - You MUST acknowledge the original source of this software
+ * - You MUST state significant changes made to the original software
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program . If not, see <http://www.gnu.org/licenses/>.
+ */
+process {
+
+
+    withName:
+    '.*SPOTYPING.*' {
+        container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
+    }
+
+    withName:
+    '.*RDANALYZER.*' {
+        container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
+    }
+
+
+    withName:
+    '.*TBPROFILER.*' {
+        container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0--1"
+    }
+
+    withName:
+    'NTMPROFILER.*' {
+        container = "ghcr.io/torch-consortium/magma/biocontainer-ntmprofiler:0.4.0"
+    }
+
+    withName:
+    'ISMAPPER.*|GATK.*|LOFREQ.*|DELLY.*|MULTIQC.*|FASTQC.*|UTILS.*|FASTQ.*|SAMPLESHEET.*' {
+        container = "ghcr.io/torch-consortium/magma/magma-container-1:2.0.0"
+    }
+
+    withName:
+    'BWA.*|IQTREE.*|SNPDISTS.*|SNPSITES.*|BCFTOOLS.*|BGZIP.*|SAMTOOLS.*|SNPEFF.*|CLUSTERPICKER.*' {
+        container = "ghcr.io/torch-consortium/magma/magma-container-2:1.1.1"
+    }
+
+}
+
+
+apptainer {
+    enabled = true
+}
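
The config above mixes two selector styles: `'.*TBPROFILER.*'` has a leading `.*`, while `'NTMPROFILER.*'` does not. If a `withName` pattern has to match the entire process name, only the first style would also cover names that carry a prefix. The sketch below illustrates that difference with Python's `re.fullmatch`. The whole-name matching rule and the prefixed process name are assumptions made for illustration, not something this change states.

```python
import re

# Selector styles taken from the config above.
ANCHORED_PATTERN = r"TBPROFILER.*"      # no leading wildcard
WILDCARD_PATTERN = r".*TBPROFILER.*"    # leading wildcard

# Process names used for illustration.
# The second, prefixed name is hypothetical.
names = [
    "TBPROFILER_VCF_PROFILE__LOFREQ",
    "MAGMA:CALL_WF:TBPROFILER_VCF_PROFILE__LOFREQ",
]


def selected(pattern: str, name: str) -> bool:
    """Return True when the pattern matches the whole name (assumed semantics)."""
    return re.fullmatch(pattern, name) is not None


for name in names:
    print(name, selected(ANCHORED_PATTERN, name), selected(WILDCARD_PATTERN, name))
```

Under that assumption, the pattern without the leading `.*` selects only the unprefixed name, while the wildcard form selects both.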
@@ -26,8 +26,18 @@
 process {
 
     withName:
-    'TBPROFILER.*' {
-        container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0"
+    '.*SPOTYPING.*' {
+        container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
+    }
+
+    withName:
+    '.*RDANALYZER.*' {
+        container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
+    }
+
+    withName:
+    '.*TBPROFILER.*' {
+        container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0--1"
     }
 
     withName:
 
@@ -25,6 +25,18 @@
  */
 process {
 
+
+    withName:
+    '.*SPOTYPING.*' {
+        container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
+    }
+
+    withName:
+    '.*RDANALYZER.*' {
+        container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
+    }
+
+
     withName:
     'TBPROFILER.*' {
         container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0"
 
@@ -18,6 +18,7 @@
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+
  * GNU General Public License for more details.
  *
  * You should have received a copy of the GNU General Public License
@@ -101,6 +102,24 @@ skip_phylogeny_and_clustering = false //OR true
 skip_complex_regions = false //OR true
 
 
+
+// Enable execution of MAGMA's tbprofiler container (with who+ database) on
+// FASTQ files
+
+skip_ntmprofiler = false // OR true
+
+skip_tbprofiler_fastq = true // OR false
+
+skip_spotyping = true
+
+// Flags for experimental features
+
+//NOTE: NOT working yet
+skip_rdanalyzer = true
+ref_fasta_rdanalyzer = "${projectDir}/resources/rdanalyzer/RDs30.fasta"
+
+
+
 //NOTE: PICK ONE of the following parameters related to IQTREE.
 iqtree_standard_bootstrap= false
 iqtree_fast_ml_only= false
@@ -186,7 +205,7 @@ fastq_validator_path = "fastq_validator.sh"
 
 
 //NOTE:Control the global publishing behavior, which is used as default in case there is no process specific config provided
-save_mode = 'symlink'
+save_mode = 'symlink' // 'copy'
 should_publish = true
 
 //NOTE: If enabled, the BAM results from HaplotypeCaller processes would be published
@@ -371,7 +390,7 @@ DELLY_CALL {
 }
 
 NTMPROFILER_PROFILE {
-    results_dir = "${params.outdir}/non-tuberculous_mycobacteria/per_sample/"
+    results_dir = "${params.outdir}/analyses/non-tuberculous_mycobacteria/per_sample/"
 }
 
 
@@ -443,7 +462,7 @@ UTILS_MERGE_COHORT_STATS {
 //-----------------------
 
 NTMPROFILER_COLLATE {
-    results_dir = "${params.outdir}/non-tuberculous_mycobacteria/cohort"
+    results_dir = "${params.outdir}/analyses/non-tuberculous_mycobacteria/cohort"
 
     prefix = "ntmprofiler.collate"
 }
@@ -461,7 +480,7 @@ GATK_GENOTYPE_GVCFS {
 
     arguments = " -G StandardAnnotation -G AS_StandardAnnotation --sample-ploidy 1 "
 
-    should_publish = false
+    should_publish = true
 }
 
 
@@ -470,7 +489,7 @@ SNPEFF {
 
     arguments = " -nostats -ud 100 Mycobacterium_tuberculosis_h37rv "
 
-    should_publish = false
+    should_publish = true
 }
 
 
@@ -678,6 +697,36 @@ TBPROFILER_COLLATE__COHORT {
     prefix = "major_variants"
 }
 
+
+TBPROFILER_FASTQ_PROFILE {
+    results_dir = "${params.outdir}/analyses/others/per_sample/tbprofiler_fastq/"
+    arguments = "--csv"
+    should_publish = false
+}
+
+TBPROFILER_FASTQ_COLLATE {
+    results_dir = "${params.outdir}/analyses/drug_resistance/tbprofiler_fastq/"
+    prefix = "fastq"
+}
+
+
+SPOTYPING {
+    results_dir = "${params.outdir}/analyses/spotyping/results_excel"
+    arguments = "" // Or "--noQuery"
+}
+
+UTILS_CAT_SPOTYPING {
+    results_dir = "${params.outdir}/analyses/spotyping/"
+    arguments = ""
+}
+
+
+RDANALYZER {
+    results_dir = "${params.outdir}/analyses/others/per_sample/rdanalyzer/"
+    arguments = ""
+}
+
+
 TBPROFILER_VCF_PROFILE__LOFREQ {
     results_dir = "${params.outdir}/analyses/drug_resistance/minor_variants_lofreq/"
     arguments = " --depth 0,0 --af 0,0 --strand 0 --sv_depth 0,0 --sv_af 0,0 --sv_len 100000,50000 "