Merged
12 changes: 11 additions & 1 deletion CHANGELOG.md
@@ -3,14 +3,17 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [v2.2.0dev](https://github.com/nf-core/pairgenomealign/releases/tag/2.2.0)
## [v2.2.0](https://github.com/nf-core/pairgenomealign/releases/tag/2.2.0) "Chagara ponzu" - [May 29th 2025]

### `Added`

- Support for export to BAM and CRAM formats ([#31](https://github.com/nf-core/pairgenomealign/issues/31)) ([#43](https://github.com/nf-core/pairgenomealign/issues/43)).
- SAM/BAM/CRAM alignments files are sorted and their header features all sequences of the _target_ genome.
- Report ungapped percent identity ([#46](https://github.com/nf-core/pairgenomealign/issues/46)).
- Update full-size test genomes to feature more T2T assemblies ([#59](https://github.com/nf-core/pairgenomealign/issues/59)).
- Use a single mulled container for LAST, Samtools and open-fonts, to save ~280 Mb of downloads ([#58](https://github.com/nf-core/pairgenomealign/issues/58)).
- Allow export to multiple formats (comma-separated list) ([#42](https://github.com/nf-core/pairgenomealign/issues/42)).
- Allow skipping of the assembly QC with `--skip_assembly_qc` ([#53](https://github.com/nf-core/pairgenomealign/issues/53)).

### `Dependencies`

@@ -20,12 +23,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
| `SAMTOOLS_DICT` | | 1.21 |
| `SAMTOOLS_FAIDX` | | 1.21 |

### `Parameters`

| Old parameter | New parameter |
| ------------- | -------------------- |
| | `--skip_assembly_qc` |

### `Fixed`

- Remove noisy tag in the `MULTIQC_ASSEMBLYSCAN_PLOT_DATA` local module ([#64](https://github.com/nf-core/pairgenomealign/issues/64)).
- Restore BED format support ([#56](https://github.com/nf-core/pairgenomealign/issues/56)).
- Document the `multiqc_train.txt` and `multiqc_last_o2o.txt` aggregating alignment statistics ([#52](https://github.com/nf-core/pairgenomealign/issues/52)).
- Point the test configs samplesheets to `nf-core/test-datasets` in order to run the AWS full tests ([#62](https://github.com/nf-core/pairgenomealign/issues/62)).
- Update metro map, in white background ([#71](https://github.com/nf-core/pairgenomealign/issues/71)).

## [v2.1.0](https://github.com/nf-core/pairgenomealign/releases/tag/2.1.0) "Goya champuru" - [May 16th 2025]

4 changes: 2 additions & 2 deletions assets/multiqc_config.yml
@@ -1,7 +1,7 @@
report_comment: >
This report has been generated by the <a href="https://github.com/nf-core/pairgenomealign/tree/dev"
This report has been generated by the <a href="https://github.com/nf-core/pairgenomealign/releases/tag/2.2.0"
target="_blank">nf-core/pairgenomealign</a> analysis pipeline. For information about
how to interpret these results, please see the <a href="https://nf-co.re/pairgenomealign/dev/docs/output"
how to interpret these results, please see the <a href="https://nf-co.re/pairgenomealign/2.2.0/docs/output"
target="_blank">documentation</a>.
report_section_order:
"nf-core-pairgenomealign-methods-description":
9 changes: 9 additions & 0 deletions conf/modules.config
@@ -147,4 +147,13 @@ process {
]
}

// Use a single mulled container for all LAST and SAMTOOLS modules
// instead of the overlapping last, samtools, last_samtools and last_open-fonts images
// to save ~280 Mb of container download.
withName:'ALIGNMENT_.*' {
container = { "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/06/06beccfa4d48e5daf30dd8cee4f7e06fd51594963db0d5087ab695365b79903b/data' :
'community.wave.seqera.io/library/last_samtools_open-fonts:176a6ab0c8171057'}" }
}

}
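The container closure above picks one of two image references depending on the container engine. A minimal Python sketch of that decision follows, purely as an illustration: the function name `select_container` and its arguments are hypothetical stand-ins for `workflow.containerEngine` and `task.ext.singularity_pull_docker_container`, and only the two image references are taken from the diff.

```python
# Hypothetical Python mirror of the ternary in conf/modules.config.
# Only the two image references below come from the diff itself.
SINGULARITY_IMAGE = (
    "https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/06/"
    "06beccfa4d48e5daf30dd8cee4f7e06fd51594963db0d5087ab695365b79903b/data"
)
DOCKER_IMAGE = (
    "community.wave.seqera.io/library/last_samtools_open-fonts:176a6ab0c8171057"
)


def select_container(container_engine, singularity_pull_docker_container=False):
    """Return the image reference the config closure would resolve to.

    The direct HTTPS download is used only when the engine is Singularity
    and the user has not asked Singularity to pull the Docker image.
    """
    if container_engine == "singularity" and not singularity_pull_docker_container:
        return SINGULARITY_IMAGE
    return DOCKER_IMAGE


if __name__ == "__main__":
    for engine, pull_docker in [("singularity", False), ("singularity", True), ("docker", False)]:
        print(engine, pull_docker, "->", select_container(engine, pull_docker))
```

Every other engine, and Singularity with the Docker-pull option set, falls through to the Wave registry reference.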
Binary file modified docs/images/pairgenomealign-tubemap.png

For me, as someone with no bioinformatics background, this is a bit confusing.
It is unclear whether these .maf files are inputs or outputs.
Adding module names might also improve the alignment with the short pipeline description in the README file.
Not a blocker though, as this is likely due to my limited knowledge of bioinformatics.

909 changes: 386 additions & 523 deletions docs/images/pairgenomealign-tubemap.svg
2 changes: 1 addition & 1 deletion docs/usage.md
@@ -186,7 +186,7 @@ To change the resource requests, please see the [max resources](https://nf-co.re

In some cases, you may wish to change the container or conda environment used by a pipeline steps for a particular tool. By default, nf-core pipelines use containers and software from the [biocontainers](https://biocontainers.pro/) or [bioconda](https://bioconda.github.io/) projects. However, in some cases the pipeline specified version maybe out of date.

To use a different container from the default container or conda environment specified in a pipeline, please see the [updating tool versions](https://nf-co.re/docs/usage/configuration#updating-tool-versions) section of the nf-core website.
To use a different container from the default container or conda environment specified in a pipeline, please see the [updating tool versions](https://nf-co.re/docs/usage/configuration#updating-tool-versions) section of the nf-core website. Also please note that this pipeline uses a [Wave container](https://seqera.io/containers/) combining LAST, Samtools, and open fonts, to save the user from downloading multiple overlapping images (LAST alone, Samtools alone, LAST plus Samtools, and LAST plus open fonts). Changing this container is the simplest way to update LAST if you wish so.

### Custom Tool Arguments

4 changes: 3 additions & 1 deletion nextflow.config
@@ -62,6 +62,8 @@ params {
seed = 'YASS'
softmask = 'tantan'

// Misc options
skip_assembly_qc = false
targetName = 'target'
m2m = false

@@ -274,7 +276,7 @@ manifest {
mainScript = 'main.nf'
defaultBranch = 'master'
nextflowVersion = '!>=24.10.1'
version = '2.2.0dev'
version = '2.2.0'
doi = ''
}

28 changes: 10 additions & 18 deletions nextflow_schema.json
@@ -36,6 +36,13 @@
"help_text": "By default the _target_ genome is named `target` and this name is concatenated with the sample IDs using `___` as a separator to construct alignment file names. Use this option to provide a more informative name for the target genome.",
"description": "Target genome name."
},
"skip_assembly_qc": {
"type": "boolean",
"default": "false",
"help_text": "If the assembly QC was already done before this pipeline is started, you can skip it with this option.",
"description": "Skip assembly QC.",
"fa_icon": "fas fa-forward"
},
"outdir": {
"type": "string",
"format": "directory-path",
@@ -90,25 +97,10 @@
"export_aln_to": {
"type": "string",
"default": "no_export",
"description": "Convert output to a different format than MAF.",
"enum": [
"no_export",
"axt",
"bam",
"bed",
"blast",
"blasttab",
"blasttab+",
"chain",
"cram",
"gff",
"html",
"psl",
"sam",
"tab"
],
"description": "Convert the final _one-to-one_ alignment to a different format than MAF.",
"pattern": "^((no_export|axt|bam|bed|blast|blasttab|blasttab+|chain|cram|gff|html|psl|sam|tab)?,?)*(?<!,)$",
"fa_icon": "fas fa-file-export",
"help_text": "Output extra files for the final _one-to-one_ alignment results in AXT, GFF or SAM format. This is useful for downstream tools that do not parse MAF. The files are always compressed with `gzip`."
"help_text": "Multiple formats separated with commas. Supported formats are `axt`, `bam`, `bed`, `blast`, `blasttab`, `blasttab+`, `chain`, `cram`, `gff`, `html`, `psl`, `sam` and `tab`. This is useful for downstream tools that do not parse MAF. The files in text format are always compressed with `gzip`."
},
"m2m": {
"type": "boolean",
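The schema change above replaces a fixed `enum` with a `pattern` so that several formats can be given as a comma-separated list. As a rough illustration of what such a value is meant to contain, here is a minimal Python sketch that splits the value and checks each entry against the format names listed in the help text. The helper `parse_export_formats` is hypothetical and is not part of the pipeline. It is also stricter than the schema regular expression: it compares literal names, which sidesteps the fact that an unescaped `+` acts as a quantifier in most regex dialects, so the `blasttab+` alternative in a pattern may not match the literal string as intended.

```python
# Format names copied from the help text of `export_aln_to`.
SUPPORTED_FORMATS = frozenset({
    "axt", "bam", "bed", "blast", "blasttab", "blasttab+", "chain",
    "cram", "gff", "html", "psl", "sam", "tab",
})


def parse_export_formats(value):
    """Split a comma-separated `export_aln_to` value into a list of formats.

    Hypothetical helper: returns an empty list for the default `no_export`
    and raises ValueError for empty or unsupported entries.
    """
    if value == "no_export":
        return []
    formats = value.split(",")
    unknown = [fmt for fmt in formats if fmt not in SUPPORTED_FORMATS]
    if unknown:
        raise ValueError(f"Unsupported export format(s): {unknown!r}")
    return formats


if __name__ == "__main__":
    print(parse_export_formats("bam,cram"))
    print(parse_export_formats("no_export"))
```

Comparing whole entries also rejects a trailing comma or two names run together without a separator, which a permissive pattern can let through.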
20 changes: 10 additions & 10 deletions ro-crate-metadata.json

Large diffs are not rendered by default.

24 changes: 12 additions & 12 deletions workflows/pairgenomealign.nf
@@ -44,15 +44,16 @@ workflow PAIRGENOMEALIGN {
ch_samplesheet
)

// Extract statistics on contig length and GC content
// Allow to skip statistics on contig length and GC content
//
ASSEMBLYSCAN (
ch_samplesheet
)
// Parse assembly-scan's JSON for MultiQC
MULTIQC_ASSEMBLYSCAN_PLOT_DATA (
ASSEMBLYSCAN.out.json.collect{it[1]}
)
if (! params.skip_assembly_qc ) {
ASSEMBLYSCAN ( ch_samplesheet )
ch_versions = ch_versions.mix(ASSEMBLYSCAN.out.versions)
MULTIQC_ASSEMBLYSCAN_PLOT_DATA (
ASSEMBLYSCAN.out.json.collect{it[1]}
)
ch_multiqc_files = ch_multiqc_files.mix(MULTIQC_ASSEMBLYSCAN_PLOT_DATA.out.tsv)
}

// Prefix query ids with target genome name before producing alignment files
//
@@ -90,7 +91,8 @@ workflow PAIRGENOMEALIGN {
ch_targetgenome_gzi = [[],[]]
ch_targetgenome_dic = [[],[]]

if (params.export_aln_to.contains('cram') | params.export_aln_to.contains('bam')) {
export_formats = params.export_aln_to.tokenize(',')
if (export_formats.contains('cram') | export_formats.contains('bam')) {
FASTA_BGZIP_INDEX_DICT_SAMTOOLS( ch_targetgenome )
ch_targetgenome_faz = FASTA_BGZIP_INDEX_DICT_SAMTOOLS.out.fasta_gz
ch_targetgenome_fai = FASTA_BGZIP_INDEX_DICT_SAMTOOLS.out.fai
@@ -101,7 +103,7 @@ workflow PAIRGENOMEALIGN {

if (!(params.export_aln_to == "no_export")) {
ALIGNMENT_EXP(
pairalign_out.o2o. map {it + params.export_aln_to},
pairalign_out.o2o.combine(Channel.fromList(export_formats)),
ch_targetgenome_faz,
ch_targetgenome_fai,
ch_targetgenome_gzi,
@@ -114,7 +116,6 @@ workflow PAIRGENOMEALIGN {

ch_versions = ch_versions
.mix( CUTN_TARGET.out.versions)
.mix(ASSEMBLYSCAN.out.versions)
.mix( pairalign_out.versions)

softwareVersionsToYAML(ch_versions)
@@ -151,7 +152,6 @@ workflow PAIRGENOMEALIGN {

ch_multiqc_files = ch_multiqc_files
.mix(ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml'))
.mix(MULTIQC_ASSEMBLYSCAN_PLOT_DATA.out.tsv)
.mix(pairalign_out.multiqc)
.mix(ch_collated_versions)
.mix(
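The export changes in this workflow file do two things: the parameter is split with `tokenize(',')` so that membership tests act on whole format names, and each one-to-one alignment is paired with every requested format via `combine`. The Python sketch below imitates that behaviour for illustration only. The function names are hypothetical, and the `(sample_id, maf_path)` tuple shape is an assumption, since the actual structure of `pairalign_out.o2o` is not shown in the diff.

```python
from itertools import product


def tokenize(value, sep=","):
    """Imitate Groovy's String.tokenize: split on `sep`, dropping empty tokens."""
    return [token for token in value.split(sep) if token]


def needs_target_index(export_formats):
    """BAM and CRAM export require the bgzipped, indexed target genome."""
    return "cram" in export_formats or "bam" in export_formats


def combine(alignments, export_formats):
    """Pair every alignment tuple with every format (Cartesian product).

    Each result is the alignment tuple with the format appended, which is how
    Nextflow's `combine` flattens its output. Unlike a real channel, the order
    here is deterministic.
    """
    return [(*aln, fmt) for aln, fmt in product(alignments, export_formats)]


if __name__ == "__main__":
    # Assumed element shape: (sample_id, maf_path); both values are made up.
    o2o = [("sampleA", "target___sampleA.o2o.maf.gz")]
    formats = tokenize("bam,cram")
    print(needs_target_index(formats))
    for item in combine(o2o, formats):
        print(item)
```

With one alignment and two formats, the export step receives two inputs, one per format, where the previous `map {it + params.export_aln_to}` produced a single input carrying the whole parameter string.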