Release v2.3.0 #117
Open
charles-plessy wants to merge 42 commits into
Closes #97. To speed up alignment, both strands of the target genome are indexed. This doubles memory usage and may produce output files containing `-/+` alignments, which are not supported by some downstream pipelines. To disable this behavior, use the `--strand forward` option.
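For downstream pipelines that cannot handle `-/+` alignments, it may help to check whether an output file contains any. The following is a minimal sketch, not part of the pipeline: it assumes the alignments are in MAF format and that the first `s` line of each block is the target. The function name `count_minus_target_blocks` is hypothetical.

```python
def count_minus_target_blocks(maf_lines):
    """Count MAF blocks whose first 's' line is on the '-' strand.

    Assumption: the first 's' line of each block describes the target genome.
    """
    count = 0
    first_seq_seen = False
    for line in maf_lines:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "a":
            # A new alignment block starts; its first 's' line comes next.
            first_seq_seen = False
        elif tokens[0] == "s" and not first_seq_seen:
            first_seq_seen = True
            # MAF 's' line fields: s name start size strand srcSize sequence
            if tokens[4] == "-":
                count += 1
    return count


# Made-up example data, for illustration only.
example = """##maf version=1
a score=100
s target.chr1 10 5 + 1000 ACGTA
s query.ctg1 20 5 + 500 ACGTA

a score=90
s target.chr1 900 4 - 1000 GGCC
s query.ctg2 3 4 + 400 GGCC
"""

print(count_minus_target_blocks(example.splitlines()))
```

A non-zero count suggests re-running with `--strand forward` if the downstream tool rejects such blocks.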
Adds a new option `--multiqc_thumb` that defines a pixel size for alignment thumbnails to be displayed in the MultiQC report. Defaults to zero for no plots. Closes #93
The option `-w` is not available on Macintosh. Thanks @piplus2 for catching this issue.
Optional alignment thumbnails in the MultiQC report.
Allow single strand indexing.
Closes #112. This is inspired by nf-co.re/demultiplex, which also allows bypassing `--input` and providing single files directly.
Thanks @piplus2 for the suggestion.
Add a `--query` option for when there is only one query
The merged CRAM file is neither a pangenome nor a multiple sequence alignment, but I find it very useful. Temporary CRAM files are produced but not exported. Their headers indicate only the name of the query genome in the read group fields. The files are merged into a single CRAM file, where each read group represents one genome. Each target-query alignment is a one-to-one relationship, so a base in the target is aligned at most once to each query. Care is taken to ensure that the path to the reference genome is relative to the current directory. The multi-query CRAM file is output in the same directory as its index and the bgzipped genome, which is indexed too. Thus the multi-query CRAM file can be loaded and visualised in IGV. The coverage plot shows how many query genomes align to the target at a given location. The expanded track view lets you visualise all the sequence differences. You can stabilise the order of the genomes, but IGV enforces alphanumeric sorting. You can work around this limitation by prefixing the sample IDs with numbers in the sample sheet. Custom scripts can be (and have been) written to slice pieces of the multi-query CRAM file and turn these pieces into real MSAs…
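The numeric-prefix workaround for IGV's alphanumeric sorting could be scripted roughly as follows. This is only a sketch under assumptions: the sample sheet is taken to be a CSV with columns named `sample` and `fasta`, and the helper `prefix_sample_ids` is hypothetical. Prefixes are zero-padded so that alphanumeric order matches the row order.

```python
import csv
import io


def prefix_sample_ids(samplesheet_text, id_column="sample"):
    """Prefix each sample ID with a zero-padded rank reflecting its row order.

    Assumption: the sample sheet is a CSV whose ID column is named `sample`.
    """
    reader = csv.DictReader(io.StringIO(samplesheet_text))
    rows = list(reader)
    # Pad to at least two digits so that "02" sorts before "10".
    width = max(2, len(str(len(rows))))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for rank, row in enumerate(rows, start=1):
        row[id_column] = f"{rank:0{width}d}_{row[id_column]}"
        writer.writerow(row)
    return out.getvalue()


# Made-up sample sheet, listed in the order wanted in IGV.
sheet = "sample,fasta\nzebra,genomes/zebra.fa\napple,genomes/apple.fa\n"
print(prefix_sample_ids(sheet), end="")
```

With this input, `zebra` becomes `01_zebra` and `apple` becomes `02_apple`, so the intended order survives alphanumeric sorting.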
Will change to CRAM 3.1 in pairgenomealign 3.0.0.
Co-authored-by: Joon Klaps <joon.klaps@kuleuven.be>
…which I submitted recently based on the local version.
New `--multi_cram` option to produce a multi-query CRAM file combining all the alignments
…n GFF format. Closes #70
Co-authored-by: Mateus de Oliveira Lopes <lopes3137@gmail.com>
Prepare 2.3.0
Hi @muffato, as you have an interest in genomics and CRAM files, I was wondering if you would be interested in reviewing this PR, where I use the new …
Modules will be updated to new versions in release 3.0.0, together with strict syntax conversion.

v2.3.0 "Umi budou" - [June 3rd 2026]

Added

- `--multi_cram` option to produce a multi-query CRAM file combining all the alignments (#60).
- `--multiqc_thumbs` option to produce alignment thumbnails in the MultiQC report (#93).
- `--strand` option to index only one strand of the genome, which reduces memory usage at the expense of speed, and suppresses `-/+` alignments (#97).
- `--query` and `--queryName` convenience options to skip samplesheet creation when there is only one query genome to align (#112).
- … `##sequence-region` fields (#70).

Fixed

- … `FASTA_BGZIP_INDEX_DICT_SAMTOOLS` subworkflow that we just contributed.

Parameters

- `--multi_cram`
- `--multiqc_thumbs`
- `--query`
- `--queryName`
- `--strand`

Dependencies

- `SAMTOOLS_BGZIP`
- `SAMTOOLS_DICT`
- `SAMTOOLS_FAIDX`
- `SAMTOOLS_MERGE`
- `HTSLIB_BGZIPTABIX`

PR checklist

- … (`nf-core pipelines lint`).
- … (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- … (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- … `docs/usage.md` is updated.
- … `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).