Closed
38 commits
456f943
Draft of Minimac4 adding
gichas Jul 7, 2025
35494eb
Draft Minimac4
gichas Jul 8, 2025
85bcf02
Draft of the adding of minimac4
gichas Jul 9, 2025
ff25e84
Draft MINIMAC4
gichas Jul 10, 2025
8fc74aa
Adding Minimac4 to phaseimpute
gichas Jul 15, 2025
9b4310d
Removing useless files
gichas Jul 15, 2025
72b512b
Update files
gichas Jul 15, 2025
e14a3b6
Delete main_dev.nf
gichas Jul 15, 2025
41a5f9f
Address review comments
gichas Jul 16, 2025
a99b706
Update of README.md
gichas Jul 16, 2025
33818f2
Merge branch 'minimac4' of github.com:gichas/phaseimpute into minimac4
gichas Jul 16, 2025
683b90b
Update CHANGELOG.md
gichas Jul 17, 2025
292c225
Update the Minimac4 subworkflow logic
gichas Jul 17, 2025
bed0e3d
Quick fix
gichas Jul 17, 2025
3904358
Merge remote-tracking branch 'upstream/dev' into minimac4
gichas Jul 18, 2025
0287a3f
Update subworkflows/local/utils_nfcore_phaseimpute_pipeline/main.nf
LouisLeNezet Jul 18, 2025
6c9832e
Fix channel combination and update tests
gichas Jul 20, 2025
b8ca221
Fix last changes
gichas Jul 20, 2025
913c810
Fix indentation
gichas Jul 20, 2025
5021c01
Removing extra space in CHANGELOG.md
gichas Jul 21, 2025
0ecffd2
Update usage.md
gichas Jul 21, 2025
ce90cf0
Allow minimac without posfile input
Jul 22, 2025
8466356
Update workflows/phaseimpute/main.nf
gichas Jul 22, 2025
397b988
Update minimac4 tests and channel handling
gichas Jul 22, 2025
82cb194
Resolve merge conflict
gichas Jul 22, 2025
fcd8c7f
Fix Minimac4
gichas Jul 22, 2025
23de5d3
Fix Minimac4 tests
gichas Jul 22, 2025
014bdb1
Removing extra spaces
gichas Jul 22, 2025
57e35a1
Remove extra indentation
Jul 22, 2025
54d1714
Update CHANGELOG.md
LouisLeNezet Jul 25, 2025
551a368
Update subworkflows/local/utils_nfcore_phaseimpute_pipeline/main.nf
LouisLeNezet Jul 25, 2025
9501cb7
local changes, has to be erased
gichas Jul 25, 2025
c73b22d
Update map file
gichas Jul 28, 2025
cb7383c
Update gunzip module
gichas Jul 28, 2025
f7c57f5
Merge branch 'minimac4' of github.com:gichas/phaseimpute into minimac4
gichas Jul 28, 2025
24afeb8
Fix workflow
gichas Jul 28, 2025
cc03b83
Fix workflow
gichas Jul 28, 2025
a12a08d
Update subworkflows/local/utils_nfcore_phaseimpute_pipeline/main.nf
LouisLeNezet Aug 1, 2025
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#181](https://github.com/nf-core/phaseimpute/pull/181) - Add nf-co2footprint plugin to the config file.
- [#184](https://github.com/nf-core/phaseimpute/pull/184) - Add support for `.csi` indexes for `.bam` files.
- [#188](https://github.com/nf-core/phaseimpute/pull/188) - Add documentation for all subworkflows.
- [#204](https://github.com/nf-core/phaseimpute/pull/204) - Add MINIMAC4 support for genotype imputation.

### `Changed`

@@ -44,6 +45,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
| `r-stitch` | 1.6.10 | 1.7.3 |
| `shapeit5` | 1.0.0 | 5.1.1 |
| `vcflib` | 1.0.3 | 1.0.14 |
| `minimac4` | | 4.1.6 |

### `Contributors`

[Gaspard Ichas](https://github.com/gichas)

## v1.0.0 - Black Labrador [2024-12-09]

2 changes: 1 addition & 1 deletion README.md
@@ -41,7 +41,7 @@ The whole pipeline consists of five main steps, each of which can be run separat
- **Position Extraction** for targeted imputation sites.

4. **Imputation (`--impute`)**: This is the primary step, where genotypes in the target dataset are imputed using the prepared reference panel. The main steps are:
- **Imputation** of the target dataset using tools like [Glimpse1](https://odelaneau.github.io/GLIMPSE/glimpse1/index.html), [Glimpse2](https://odelaneau.github.io/GLIMPSE/), [Stitch](https://github.com/rwdavies/stitch), or [Quilt](https://github.com/rwdavies/QUILT).
- **Imputation** of the target dataset using tools like [Glimpse1](https://odelaneau.github.io/GLIMPSE/glimpse1/index.html), [Glimpse2](https://odelaneau.github.io/GLIMPSE/), [Stitch](https://github.com/rwdavies/stitch), [Quilt](https://github.com/rwdavies/QUILT) or [Minimac4](https://github.com/statgen/Minimac4).
- **Ligation** of imputed chunks to produce a final VCF file per sample, with all chromosomes unified.

5. **Validation (`--validate`)**: Assesses imputation accuracy by comparing the imputed dataset to a truth dataset. This step leverages the [Glimpse2](https://odelaneau.github.io/GLIMPSE/) concordance process to summarize differences between two VCF files.
50 changes: 50 additions & 0 deletions conf/steps/imputation_minimac4.config
@@ -0,0 +1,50 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Config file for defining DSL2 per module options and publishing paths
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Available keys to override module options:
ext.args = Additional arguments appended to command in module.
ext.args2 = Second set of arguments appended to command in module (multi-tool modules).
ext.args3 = Third set of arguments appended to command in module (multi-tool modules).
ext.prefix = File name prefix for output files.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

process {
    withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:VCF_IMPUTE_MINIMAC4:.*' {
        publishDir = [enabled: false]
        tag = { "${meta.id} ${meta.chr}" }
    }

    withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:VCF_IMPUTE_MINIMAC4:MINIMAC4_COMPRESSREF' {
        ext.args = ''
        ext.prefix = { "${meta.id}.${meta.chr}.minimac4" }
        publishDir = [enabled: false]
        tag = { "${meta.id} ${meta.chr}" }
    }

    withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:VCF_IMPUTE_MINIMAC4:MINIMAC4_IMPUTE' {
        ext.args = { "--output-format vcf.gz" }
        ext.prefix = { "${meta.id}.${meta.chr}.minimac4" }
        tag = { "${meta.id} ${meta.chr}" }
    }

    withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:VCF_IMPUTE_MINIMAC4:BCFTOOLS_INDEX' {
        ext.args = ''
        publishDir = [enabled: false]
        tag = { "${meta.id} ${meta.chr}" }
    }

    withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:CONCAT_MINIMAC4:.*' {
        publishDir = [
            path: { "${params.outdir}/imputation/minimac4/concat" },
            mode: params.publish_dir_mode,
            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
        ]
    }

    withName: 'NFCORE_PHASEIMPUTE:PHASEIMPUTE:CONCAT_MINIMAC4:BCFTOOLS_CONCAT' {
        ext.args = ["--ligate", "--output-type z", "--write-index=tbi"].join(' ')
        ext.prefix = { "${meta.id}.minimac4" }
    }
}
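
As a rough illustration of the file naming these settings imply, the sketch below prints (without running) a `bcftools concat` call for one sample. The `.vcf.gz` suffix, the `--output` flag, the sample ID `sampleA`, and the helper names `chunk_name` and `concat_cmd` are assumptions made for illustration; the real command is assembled by the nf-core module and may order its arguments differently.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Per-chromosome output name implied by ext.prefix "<id>.<chr>.minimac4".
# The ".vcf.gz" suffix is an assumption based on "--output-format vcf.gz".
chunk_name() {
    echo "$1.$2.minimac4.vcf.gz"
}

# Print (do not execute) a concat command for one sample across chromosomes.
# The flags mirror ext.args of BCFTOOLS_CONCAT in the config above.
concat_cmd() {
    local id="$1"; shift
    local -a inputs=()
    local chr
    for chr in "$@"; do
        inputs+=("$(chunk_name "$id" "$chr")")
    done
    echo "bcftools concat --ligate --output-type z --write-index=tbi --output ${id}.minimac4.vcf.gz ${inputs[*]}"
}

# Hypothetical sample ID and chromosomes.
concat_cmd sampleA chr21 chr22
```

Printing the command instead of running it keeps the sketch usable on machines without `bcftools` installed.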
46 changes: 46 additions & 0 deletions conf/test_minimac4.config
@@ -0,0 +1,46 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/phaseimpute -profile test_minimac4,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

process {
    resourceLimits = [
        cpus: 4,
        memory: '4.GB',
        time: '1.h'
    ]
}

params {
    config_profile_name = 'Test profile'
    config_profile_description = 'Minimal test dataset to check pipeline function with MINIMAC4'

    // Input data
    input = "${projectDir}/tests/csv/sample_vcf.csv"

    // Genome references
    fasta = params.pipelines_testdata_base_path + "hum_data/reference_genome/GRCh38.s.fa.gz"
    panel = "${projectDir}/tests/csv/panel.csv"

    // Region file
    input_region = "${projectDir}/tests/csv/region.csv"

    // Map file
    map = "${projectDir}/tests/csv/map.csv"

    // Position file
    posfile = "${projectDir}/tests/csv/posfile.csv"

    // Pipeline steps
    steps = "impute"

    // Impute tools
    tools = "minimac4"
}
2 changes: 1 addition & 1 deletion conf/test_validate.config
@@ -30,7 +30,7 @@ params {
// Genome references
fasta = params.pipelines_testdata_base_path + "hum_data/reference_genome/GRCh38.s.fa.gz"
posfile = "${projectDir}/tests/csv/posfile_vcf_index.csv"
map = "${projectDir}/tests/csv/map.csv"
map = "${projectDir}/tests/csv/map_glimpse.csv"

// Pipeline steps
steps = "validate"
27 changes: 24 additions & 3 deletions docs/usage.md
@@ -294,7 +294,7 @@ For starting from the imputation steps, the required flags are:
- `--steps impute`
- `--input input.csv`: The samplesheet containing the input sample files in `bam`, `cram` or `vcf`, `bcf` format.
- `--genome` or `--fasta`: The reference genome of the samples.
- `--tools [glimpse1, quilt, stitch]`: A selection of one or more of the available imputation tools. Each imputation tool has their own set of specific flags and input files. These required files are produced by `--steps panelprep` and used as input in:
- `--tools [glimpse1, glimpse2, quilt, stitch, minimac4]`: A selection of one or more of the available imputation tools. Each imputation tool has its own set of specific flags and input files. These required files are produced by `--steps panelprep` and used as input in:
- `--chunks chunks.csv`: A samplesheet containing chunks per chromosome. These are produced by `--steps panelprep` using `GLIMPSE1`.
- `--posfile posfile.csv`: A samplesheet containing a `.legend.gz` file with the list of positions to genotype per chromosome. These are required by some tools (QUILT, STITCH, GLIMPSE1). It can also contain the `hap.gz` files (required by QUILT). The posfile can be generated with `--steps panelprep`.
- `--panel panel.csv`: A samplesheet containing the post-processed reference panel VCF (required by GLIMPSE1, GLIMPSE2). These files can be obtained with `--steps panelprep`.
@@ -307,6 +307,7 @@ For starting from the imputation steps, the required flags are:
| `GLIMPSE2` | ✅ | ✅ ¹ | ✅ | ✅ | ✅ | ❌ |
| `QUILT` | ✅ | ✅ ² | ✅ | ❌ | ✅ | ✅ ⁴ |
| `STITCH` | ✅ | ✅ ² | ✅ | ❌ | ❌ | ✅ ³ |
| `MINIMAC4` | ✅ | ✅ ¹ | ✅ | ✅ | ❌ | ❌ |

> ¹ Alignment files as well as variant calling format (i.e. BAM, CRAM, VCF or BCF)
> ² Alignment files only (i.e. BAM or CRAM)
@@ -332,12 +333,12 @@ When the number of samples exceeds the batch size, the pipeline will split the s

To summarize:

- If you have Variant Calling Format (VCF) files, join them into a single file and choose either GLIMPSE1 or GLIMPSE2.
- If you have Variant Calling Format (VCF) files, join them into a single file and choose either GLIMPSE1, GLIMPSE2 or MINIMAC4.
- If you have alignment files (e.g., BAM or CRAM), all tools are available, and processing will occur in `batch_size`:
- GLIMPSE1 and STITCH may induce batch effects, so all samples need to be imputed together.
- GLIMPSE2 and QUILT can process samples in separate batches.
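
One possible way to join per-sample VCF files into a single multi-sample file is `bcftools merge`. The sketch below assumes that "join" means a multi-sample merge, that the inputs are bgzip-compressed and indexed, and that the file names and the `merge_vcfs` helper are hypothetical. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Merge per-sample VCFs into one multi-sample file and index the result.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
merge_vcfs() {
    local out="$1"; shift
    local -a merge_cmd=(bcftools merge --output-type z --output "$out" "$@")
    local -a index_cmd=(bcftools index --tbi "$out")
    if [[ "${DRY_RUN:-1}" == "1" ]]; then
        echo "${merge_cmd[*]}"
        echo "${index_cmd[*]}"
    else
        "${merge_cmd[@]}"
        "${index_cmd[@]}"
    fi
}

# Hypothetical input and output file names.
merge_vcfs all_samples.vcf.gz sample1.vcf.gz sample2.vcf.gz
```

The merged file can then be listed as a single row in the input samplesheet.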

## Imputation tools `--steps impute --tools [glimpse1, glimpse2, quilt, stitch]`
## Imputation tools `--steps impute --tools [glimpse1, glimpse2, quilt, stitch, minimac4]`

You can choose different software to perform the imputation. The following sections give typical commands for running the pipeline with each tool. Multiple tools can be selected by separating them with a comma (e.g. `--tools glimpse1,quilt`).

@@ -477,6 +478,26 @@ nextflow run nf-core/phaseimpute \

Make sure the CSV file with the input panel is the output from `--steps panelprep` or has been previously prepared.

### MINIMAC4

[MINIMAC4](https://github.com/statgen/Minimac4) is a low-memory, computationally efficient implementation of the MaCH algorithm for genotype imputation. It is designed to work on phased haplotypes and can handle very large reference panels.

```bash
nextflow run nf-core/phaseimpute \
    --input samplesheet.csv \
    --panel samplesheet_reference.csv \
    --steps impute \
    --tools minimac4 \
    --outdir results \
    --genome GRCh37 \
    -profile docker \
    --posfile posfile.csv
```

The CSV file provided in `--panel` must be prepared with `--steps panelprep` and must contain four columns: `panel`, `chr`, `vcf` and `index`.

MINIMAC4 works only with variant calling format files (VCF or BCF) as input.
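
For orientation, a panel samplesheet with these four columns might look like the sketch below. Only the column names come from the text above; the panel name, chromosomes, file paths and the `.csi` index extension are hypothetical placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a hypothetical panel samplesheet; only the header reflects the documented columns.
cat > samplesheet_reference.csv <<'EOF'
panel,chr,vcf,index
1000GP,chr21,/path/to/panel.chr21.vcf.gz,/path/to/panel.chr21.vcf.gz.csi
1000GP,chr22,/path/to/panel.chr22.vcf.gz,/path/to/panel.chr22.vcf.gz.csi
EOF

# Sanity check: the header must list exactly the four expected columns.
expected="panel,chr,vcf,index"
header="$(head -n 1 samplesheet_reference.csv)"
if [[ "$header" != "$expected" ]]; then
    echo "Unexpected header: $header" >&2
    exit 1
fi
echo "Header OK: $header"
```

In practice this file is produced by `--steps panelprep`, so writing it by hand should only be needed when the panel was prepared outside the pipeline.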

## Start with validation `--steps validate`

<img src="images/metro/Validate.png" alt="concordance_metro" width="600"/>
10 changes: 10 additions & 0 deletions modules.json
@@ -114,6 +114,16 @@
"git_sha": "41dfa3f7c0ffabb96a6a813fe321c6d1cc5b6e46",
"installed_by": ["modules"]
},
"minimac4/compressref": {
"branch": "master",
"git_sha": "41dfa3f7c0ffabb96a6a813fe321c6d1cc5b6e46",
"installed_by": ["modules"]
},
"minimac4/impute": {
"branch": "master",
"git_sha": "41dfa3f7c0ffabb96a6a813fe321c6d1cc5b6e46",
"installed_by": ["modules"]
},
"multiqc": {
"branch": "master",
"git_sha": "41dfa3f7c0ffabb96a6a813fe321c6d1cc5b6e46",
7 changes: 7 additions & 0 deletions modules/nf-core/minimac4/compressref/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

47 changes: 47 additions & 0 deletions modules/nf-core/minimac4/compressref/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

62 changes: 62 additions & 0 deletions modules/nf-core/minimac4/compressref/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

65 changes: 65 additions & 0 deletions modules/nf-core/minimac4/compressref/tests/main.nf.test

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
