Commit ee4fece

Merge branch 'hotfix/fastq-tbprofiler-and-spotyping'
2 parents 013614f + 7925d4d commit ee4fece

13 files changed, +972 -34 lines changed


CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -1,4 +1,6 @@
 # CHANGELOG FOR THE MAGMA PIPELINE VERSIONS
+<!-- https://keepachangelog.com/en/1.1.0/ -->
+

 ## v2.0.0

README.md

Lines changed: 24 additions & 24 deletions
@@ -33,7 +33,7 @@ The `java` version should NOT be an `internal jdk` release! You can check the re
 Notice the `LTS` next to `OpenJDK` line.


-```bash
+```bash

 $ java -version
 openjdk version "17.0.7" 2023-04-18 LTS
@@ -90,7 +90,7 @@ S0002,/full_path_to_directory_of_fastq_files/S0002_01_R1.fastq.gz,full_path_to_d
 S0003,/full_path_to_directory_of_fastq_files/S0003_01_R1.fastq.gz,
 ```

-If you have the metadata from sequencing instrument, you can specify further information in the samplesheet
+If you have the metadata from sequencing instrument, you can specify further information in the samplesheet

 ```csv
 Study,Sample,Library,Attempt,R1,R2,Flowcell,Lane,Index Sequence
@@ -156,15 +156,15 @@ Which could be provided to the pipeline using `-params-file` parameter as shown

 ```console
 nextflow run 'https://github.com/TORCH-Consortium/MAGMA' \
--profile conda_local, server \
--r v1.1.1 \
--params-file my_parameters_1.yml
+-profile conda_local, server \
+-r v1.1.1 \
+-params-file my_parameters_1.yml

 ```

 # Analysis

-## Running MAGMA using Nextflow Tower
+## Running MAGMA using Nextflow Tower

 You can also use Seqera Platform (aka Nextflow Tower) to run the pipeline on any of the supported cloud platforms and monitoring the pipeline execution.

@@ -181,11 +181,11 @@ You can run the pipeline using Conda, Mamba or Micromamba package managers to in
 You can find out the location of conda environments using `conda env list`. [Here's](https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf) a useful cheatsheet for conda operations.


-You can use the `conda` based setup for the pipeline for running MAGMA
+You can use the `conda` based setup for the pipeline for running MAGMA
 - On a local linux machine(e.g. your laptop or a university server)
-- On an HPC cluster (e.g. SLURM, PBS) in case you don't have access to container systems like Singularity, Podman or Docker
+- On an HPC cluster (e.g. SLURM, PBS) in case you don't have access to container systems like Singularity, Podman or Docker

-All the requisite softwares have been provided as a `conda` recipe (i.e. `yml` files)
+All the requisite softwares have been provided as a `conda` recipe (i.e. `yml` files)
 - [magma-env-1.yml](./conda_envs/magma-env-1.yml)
 - [magma-env-2.yml](./conda_envs/magma-env-2.yml)

@@ -208,7 +208,7 @@ $ conda env create -n magma-env-2 --file magma-env-2.yml

 Once the environments are created, you can make use of the pipeline parameter `conda_envs_location` to inform the pipeline of the names and location of the conda envs.

-Next, you need to load the WHO Resistance Catalog within `tb-profiler`; basically the [instructions](https://github.com/TORCH-Consortium/MAGMA/blob/master/conda_envs/setup_conda_envs.sh#L20-L23), which are used to build the necessary containers.
+Next, you need to load the WHO Resistance Catalog within `tb-profiler`; basically the [instructions](https://github.com/TORCH-Consortium/MAGMA/blob/master/conda_envs/setup_conda_envs.sh#L20-L23), which are used to build the necessary containers.

 1. Download [magma_resistance_db_who_v1.zip](https://github.com/TORCH-Consortium/MAGMA/files/14559680/resistance_db_who_v1.zip) and unzip it

@@ -250,7 +250,7 @@ We provide [two docker containers](https://github.com/orgs/TORCH-Consortium/pack

 > 🚧 **Container build script**: The script used to build these containers is provided [here](./containers/build.sh).

-Although, you don't need to pull the containers manually, but should you need to, you could use the following commands to pull the pre-built and provided containers
+Although, you don't need to pull the containers manually, but should you need to, you could use the following commands to pull the pre-built and provided containers

 ```console
 docker pull ghcr.io/torch-consortium/magma/magma-container-1:1.1.1
@@ -262,13 +262,13 @@ docker pull ghcr.io/torch-consortium/magma/magma-container-2:1.1.1
 > :memo: **Have singularity or podman instead?**: <br>
 If you do have access to Singularity or Podman, then owing to their compatibility with Docker, you can still use the provided docker containers.

-Here's the command which should be used
+Here's the command which should be used

 ```console
 nextflow run 'https://github.com/torch-consortium/magma' \
--params-file my_parameters_2.yml \
--profile docker,pbs \
--r v1.1.1
+-params-file my_parameters_2.yml \
+-profile docker,pbs \
+-r v1.1.1
 ```

 > :bulb: **Hint**: <br>
@@ -307,7 +307,7 @@ errors. Including these is optional, if unknown or irrelevant,
 just fill in with a '1' as shown in example_MAGMA_samplesheet.csv)
 ```

-## (Optional) GVCF datasets
+## (Optional) GVCF datasets

 We also provide some reference GVCF files which you could use for specific use-cases.

@@ -319,7 +319,7 @@ containing GVCF reference dataset for ~600 samples is provided for augmenting sm

 ```
 use_ref_gvcf = false
-ref_gvcf = "/path/to/FILE.g.vcf.gz"
+ref_gvcf = "/path/to/FILE.g.vcf.gz"
 ref_gvcf_tbi = "/path/to/FILE.g.vcf.gz.tbi"
 ```

@@ -335,7 +335,7 @@ Tim Huepink and Lennert Verboven created an in-depth tutorial of the features of

 We have also included a presentation (in PDF format) of the logic and workflow of the MAGMA pipeline as well as posters that have been presented at conferences. Please refer the [docs](./docs) folder.

-# Interpretation
+# Interpretation

 The results directory produced by MAGMA is as follows:

@@ -347,7 +347,7 @@ The results directory produced by MAGMA is as follows:
 └── vcf_files
 ```

-## QC Statistics Directory
+## QC Statistics Directory

 In this directory you will find files related to the quality control carried out by the MAGMA pipeline. The structure is as follows:

@@ -412,7 +412,7 @@ MAGMA also notes the presence of all variants in in tier 1 and tier 2 drug resis

 - **Phylogeny**

-Contains the outputs of the IQTree phylogenetic tree construction.
+Contains the outputs of the IQTree phylogenetic tree construction.

 > :memo: By default we recommend that you use the **ExDRIncComplex** files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the *Mtb* genome

@@ -422,7 +422,7 @@ Contains the SNP distance tables.

 > :memo: By default we recommend that you use the **ExDRIncComplex** files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the *Mtb* genome

-## `vcf_files` Directory
+## `vcf_files` Directory

 ```bash
 /path/to/results_dir/vcf_files
@@ -463,7 +463,7 @@ Contains the SNP distance tables.

 > Unfiltered structural variants detected by the MAGMA pipeline

-## Libraries Directory
+## Libraries Directory

 > Contains files related to FASTQ validation and FASTQC analysis

@@ -472,9 +472,9 @@ Contains the SNP distance tables.
 > Contains vcf files for major|minor|structural variants for each individual samples


-# Citations
+# Citations

-The MAGMA paper has been published here: https://doi.org/10.1371/journal.pcbi.1011648
+The MAGMA paper has been published here: https://doi.org/10.1371/journal.pcbi.1011648

 The XBS variant calling core was published here: https://doi.org/10.1099%2Fmgen.0.000689

conf/apptainer.config

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+/*
+ * Copyright (c) 2021-2024 MAGMA pipeline authors, see https://doi.org/10.1371/journal.pcbi.1011648
+ *
+ * This file is part of MAGMA pipeline, see https://github.com/TORCH-Consortium/MAGMA
+ *
+ * For quick overview of GPL-3 license, please refer
+ * https://www.tldrlegal.com/license/gnu-general-public-license-v3-gpl-3
+ *
+ * - You MUST keep this license with original authors in your copy
+ * - You MUST acknowledge the original source of this software
+ * - You MUST state significant changes made to the original software
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program . If not, see <http://www.gnu.org/licenses/>.
+ */
+process {
+
+
+    withName:
+    '.*SPOTYPING.*' {
+        container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
+    }
+
+    withName:
+    '.*RDANALYZER.*' {
+        container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
+    }
+
+
+    withName:
+    '.*TBPROFILER.*' {
+        container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0--1"
+    }
+
+    withName:
+    'NTMPROFILER.*' {
+        container = "ghcr.io/torch-consortium/magma/biocontainer-ntmprofiler:0.4.0"
+    }
+
+    withName:
+    'ISMAPPER.*|GATK.*|LOFREQ.*|DELLY.*|MULTIQC.*|FASTQC.*|UTILS.*|FASTQ.*|SAMPLESHEET.*' {
+        container = "ghcr.io/torch-consortium/magma/magma-container-1:2.0.0"
+    }
+
+    withName:
+    'BWA.*|IQTREE.*|SNPDISTS.*|SNPSITES.*|BCFTOOLS.*|BGZIP.*|SAMTOOLS.*|SNPEFF.*|CLUSTERPICKER.*' {
+        container = "ghcr.io/torch-consortium/magma/magma-container-2:1.1.1"
+    }
+
+}
+
+
+apptainer {
+    enabled = true
+}

conf/docker.config

Lines changed: 12 additions & 2 deletions
@@ -26,8 +26,18 @@
 process {

     withName:
-    'TBPROFILER.*' {
-        container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0"
+    '.*SPOTYPING.*' {
+        container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
+    }
+
+    withName:
+    '.*RDANALYZER.*' {
+        container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
+    }
+
+    withName:
+    '.*TBPROFILER.*' {
+        container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0--1"
     }

     withName:
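
Note on the selector change in this hunk: the `TBPROFILER` selector is widened from `'TBPROFILER.*'` to `'.*TBPROFILER.*'`. The Python sketch below only illustrates what that widening changes. It assumes that a `withName` pattern has to match the entire process name (approximated here with `re.fullmatch`) and that a process may also be addressed by a workflow-prefixed name. The prefix `SOME_WF:` and the helper `selector_matches` are hypothetical; the name `TBPROFILER_FASTQ_PROFILE` is borrowed from the `default_params.config` hunk of this commit.

```python
import re

# Selector patterns copied from the removed and added lines of the hunk.
OLD_SELECTOR = r"TBPROFILER.*"
NEW_SELECTOR = r".*TBPROFILER.*"


def selector_matches(pattern: str, process_name: str) -> bool:
    """Return True when the pattern matches the whole process name.

    Assumption: whole-string matching, mirrored here with re.fullmatch.
    """
    return re.fullmatch(pattern, process_name) is not None


# Hypothetical names: a bare process name, and the same process as it might
# be named when it is called from inside a named workflow.
names = ["TBPROFILER_FASTQ_PROFILE", "SOME_WF:TBPROFILER_FASTQ_PROFILE"]

for name in names:
    print(name, selector_matches(OLD_SELECTOR, name), selector_matches(NEW_SELECTOR, name))
```

Under these assumptions the old pattern stops matching as soon as anything precedes `TBPROFILER` in the name, while the new pattern still matches.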

conf/singularity.config

Lines changed: 12 additions & 0 deletions
@@ -25,6 +25,18 @@
  */
 process {

+
+    withName:
+    '.*SPOTYPING.*' {
+        container = "quay.io/biocontainers/spotyping:2.1--hdfd78af_4"
+    }
+
+    withName:
+    '.*RDANALYZER.*' {
+        container = "quay.io/biocontainers/rd-analyzer:1.01--hdfd78af_0"
+    }
+
+
     withName:
     'TBPROFILER.*' {
         container = "ghcr.io/torch-consortium/magma/biocontainer-tbprofiler:6.3.0"

default_params.config

Lines changed: 54 additions & 5 deletions
@@ -18,6 +18,7 @@
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+
  * GNU General Public License for more details.
  *
  * You should have received a copy of the GNU General Public License
@@ -101,6 +102,24 @@ skip_phylogeny_and_clustering = false //OR true
 skip_complex_regions = false //OR true


+
+// Enable execution of MAGMA's tbprofiler container (with who+ database) on
+// FASTQ files
+
+skip_ntmprofiler = false // OR true
+
+skip_tbprofiler_fastq = true // OR false
+
+skip_spotyping = true
+
+// Flags for experimental features
+
+//NOTE: NOT working yet
+skip_rdanalyzer = true
+ref_fasta_rdanalyzer = "${projectDir}/resources/rdanalyzer/RDs30.fasta"
+
+
+
 //NOTE: PICK ONE of the following parameters related to IQTREE.
 iqtree_standard_bootstrap= false
 iqtree_fast_ml_only= false
@@ -186,7 +205,7 @@ fastq_validator_path = "fastq_validator.sh"


 //NOTE:Control the global publishing behavior, which is used as default in case there is no process specific config provided
-save_mode = 'symlink'
+save_mode = 'symlink' // 'copy'
 should_publish = true

 //NOTE: If enabled, the BAM results from HaplotypeCaller processes would be published
@@ -371,7 +390,7 @@ DELLY_CALL {
 }

 NTMPROFILER_PROFILE {
-    results_dir = "${params.outdir}/non-tuberculous_mycobacteria/per_sample/"
+    results_dir = "${params.outdir}/analyses/non-tuberculous_mycobacteria/per_sample/"
 }


@@ -443,7 +462,7 @@ UTILS_MERGE_COHORT_STATS {
 //-----------------------

 NTMPROFILER_COLLATE {
-    results_dir = "${params.outdir}/non-tuberculous_mycobacteria/cohort"
+    results_dir = "${params.outdir}/analyses/non-tuberculous_mycobacteria/cohort"

     prefix = "ntmprofiler.collate"
 }
@@ -461,7 +480,7 @@ GATK_GENOTYPE_GVCFS {

     arguments = " -G StandardAnnotation -G AS_StandardAnnotation --sample-ploidy 1 "

-    should_publish = false
+    should_publish = true
 }


@@ -470,7 +489,7 @@ SNPEFF {

     arguments = " -nostats -ud 100 Mycobacterium_tuberculosis_h37rv "

-    should_publish = false
+    should_publish = true
 }


@@ -678,6 +697,36 @@ TBPROFILER_COLLATE__COHORT {
     prefix = "major_variants"
 }

+
+TBPROFILER_FASTQ_PROFILE {
+    results_dir = "${params.outdir}/analyses/others/per_sample/tbprofiler_fastq/"
+    arguments = "--csv"
+    should_publish = false
+}
+
+TBPROFILER_FASTQ_COLLATE {
+    results_dir = "${params.outdir}/analyses/drug_resistance/tbprofiler_fastq/"
+    prefix = "fastq"
+}
+
+
+SPOTYPING {
+    results_dir = "${params.outdir}/analyses/spotyping/results_excel"
+    arguments = "" // Or "--noQuery"
+}
+
+UTILS_CAT_SPOTYPING {
+    results_dir = "${params.outdir}/analyses/spotyping/"
+    arguments = ""
+}
+
+
+RDANALYZER {
+    results_dir = "${params.outdir}/analyses/others/per_sample/rdanalyzer/"
+    arguments = ""
+}
+
+
 TBPROFILER_VCF_PROFILE__LOFREQ {
     results_dir = "${params.outdir}/analyses/drug_resistance/minor_variants_lofreq/"
     arguments = " --depth 0,0 --af 0,0 --strand 0 --sv_depth 0,0 --sv_af 0,0 --sv_len 100000,50000 "
0 commit comments

Comments
 (0)