Add homopolish for nanopore-only assembly #229
base: dev
Changes from all commits
9463a5d
dc34b3f
193c981
24ead8e
3d1cbf8
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| name: homopolish | ||
| channels: | ||
| - conda-forge | ||
| - bioconda | ||
| - defaults | ||
| dependencies: | ||
| - bioconda::homopolish=0.4.1 | ||
| - conda-forge::more-itertools=9.1.0 |
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,35 @@ | ||||||||||||||||||
| process HOMOPOLISH { | ||||||||||||||||||
| tag "$meta.id" | ||||||||||||||||||
| label 'process_high' | ||||||||||||||||||
d4straub marked this conversation as resolved.
|
||||||||||||||||||
|
|
||||||||||||||||||
| conda "${moduleDir}/environment.yml" | ||||||||||||||||||
| container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? | ||||||||||||||||||
| 'https://depot.galaxyproject.org/singularity/homopolish:0.4.1--pyhdfd78af_1' : | ||||||||||||||||||
| 'biocontainers/homopolish:0.4.1--pyhdfd78af_0' }" | ||||||||||||||||||
|
|
||||||||||||||||||
| input: | ||||||||||||||||||
| tuple val(meta), path(medaka_genome) | ||||||||||||||||||
| tuple val(meta_gunzip), path(bacteria_sketch) | ||||||||||||||||||
|
|
||||||||||||||||||
|
Collaborator
Suggested change
|
||||||||||||||||||
| output: | ||||||||||||||||||
| tuple val(meta), path('*_genome_homopolished.fasta') , emit: assembly | ||||||||||||||||||
| path "versions.yml" , emit: versions | ||||||||||||||||||
|
|
||||||||||||||||||
| when: | ||||||||||||||||||
| task.ext.when == null || task.ext.when | ||||||||||||||||||
|
|
||||||||||||||||||
| script: | ||||||||||||||||||
| def prefix = task.ext.prefix ?: "${meta.id}" | ||||||||||||||||||
|
Contributor
We can follow the nf-core structure here to get both the prefix and potential args. Additionally, you'll need to add the $args variable to the homopolish command.
Author
I think I copied that from another module, but I don't think I'm using it.
Collaborator
If |
||||||||||||||||||
| """ | ||||||||||||||||||
| homopolish polish \ | ||||||||||||||||||
| -a $medaka_genome \ | ||||||||||||||||||
| -s $bacteria_sketch \ | ||||||||||||||||||
| -m $params.homopolish_model \ | ||||||||||||||||||
|
Contributor
It's okay, but to make the script easier to read, we can pass params.homopolish_model to this process as an input channel.
Comment on lines +24 to +27
Collaborator
Suggested change
double slashes to keep formatting in the |
||||||||||||||||||
| -o . | ||||||||||||||||||
| cat <<-END_VERSIONS > versions.yml | ||||||||||||||||||
|
Collaborator
Suggested change
I like an empty line here for clarity |
||||||||||||||||||
| "${task.process}": | ||||||||||||||||||
| homopolish: \$( homopolish --version 2>&1 | sed 's/Homopolish VERSION: *//g' ) | ||||||||||||||||||
| END_VERSIONS | ||||||||||||||||||
| """ | ||||||||||||||||||
| } | ||||||||||||||||||
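
Taken together, the review comments on this module (an nf-core style `args` variable, the model passed as an input rather than read from `params` inside the script, escaped backslashes, and a blank line before the versions heredoc) could look roughly like the sketch below. This is only an illustration under assumptions, not the change that was merged: the `model` input name and `meta_sketch` are hypothetical, the `conda`/`container` directives are left out because they would stay as in the diff, and `prefix` is omitted because the script does not use it.

```groovy
process HOMOPOLISH {
    tag "$meta.id"
    label 'process_high'

    // conda / container directives unchanged from the diff above

    input:
    tuple val(meta), path(medaka_genome)
    tuple val(meta_sketch), path(bacteria_sketch)
    val model   // hypothetical input, e.g. params.homopolish_model

    output:
    tuple val(meta), path('*_genome_homopolished.fasta'), emit: assembly
    path "versions.yml"                                 , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    // Extra command-line options can be supplied through task.ext.args
    def args = task.ext.args ?: ''
    """
    homopolish polish \\
        -a $medaka_genome \\
        -s $bacteria_sketch \\
        -m $model \\
        -o . \\
        $args

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        homopolish: \$( homopolish --version 2>&1 | sed 's/Homopolish VERSION: *//g' )
    END_VERSIONS
    """
}
```

With this shape the call site would become `HOMOPOLISH ( ch_assembly, ch_sketch, params.homopolish_model )`, where `ch_sketch` stands for whatever channel carries the unzipped sketch.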
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| name: homopolish_sketch_preparation | ||
| channels: | ||
| - conda-forge | ||
| - bioconda | ||
| - defaults | ||
| dependencies: | ||
| - conda-forge::sed=4.7 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| process HOMOPOLISH_SKETCH_PREPARATION { | ||
| label 'process_low' | ||
|
|
||
| conda "${moduleDir}/environment.yml" | ||
| container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? | ||
| 'https://depot.galaxyproject.org/singularity/curl:7.80.0' : | ||
| 'biocontainers/curl:7.80.0' }" | ||
|
|
||
| input: | ||
| val(meta) | ||
| path(url) | ||
|
|
||
| output: | ||
| tuple val(meta), path("bacteria.msh.gz"), emit: sketch | ||
Gilbaja marked this conversation as resolved.
|
||
| path "versions.yml" , emit: versions | ||
|
|
||
| script: | ||
| """ | ||
| curl $params.homopolish_bacteria_sketch_url | ||
| cat <<-END_VERSIONS > versions.yml | ||
| "${task.process}": | ||
| Homopolish_Sketch Bacteria: $params.homopolish_bacteria_last | ||
| END_VERSIONS | ||
| """ | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -33,7 +33,11 @@ params { | |
| dragonflye_args = '' | ||
|
|
||
| // Assembly polishing | ||
| polish_method = 'medaka' // Allowed: ['medaka', 'nanopolish'] | ||
| polish_method = 'medaka' // Allowed: ['medaka', 'nanopolish', 'medaka_homopolish'] | ||
| homopolish_bacteria_sketch_url = 'https://bioinfo.cs.ccu.edu.tw/bioinfo/downloads/Homopolish_Sketch/bacteria.msh.gz' | ||
|
Contributor
It would be great to allow users to define a local sketch database via the CLI. Let's say:
Contributor
On second thought, we can replace |
||
| homopolish_bacteria_last = '2024-08-16' // From: https://bioinfo.cs.ccu.edu.tw/bioinfo/download.html | ||
| homopolish_model = 'R9.4.pkl' // Allowed: ['R9.4.pkl', 'R10.3.pkl', 'pb.pkl'] | ||
| homopolish_reload_sketch = false | ||
|
|
||
| // Annotation | ||
| annotation_tool = 'prokka' // Allowed: ['prokka', 'bakta','dfast'] | ||
|
|
||
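
One way to read the request for a user-supplied sketch is a single parameter that accepts either a local path or a URL. The fragment below is a sketch under that assumption; the parameter name `homopolish_bacteria_sketch` is hypothetical and does not appear in the PR.

```groovy
params {
    // Hypothetical parameter: a local path or a URL to bacteria.msh.gz.
    // The default keeps the download location used in this PR.
    homopolish_bacteria_sketch = 'https://bioinfo.cs.ccu.edu.tw/bioinfo/downloads/Homopolish_Sketch/bacteria.msh.gz'
}
```

A user could then override it on the command line with `--homopolish_bacteria_sketch /path/to/bacteria.msh.gz`.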
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -8,12 +8,14 @@ | |||||
| // | ||||||
| // MODULE: Local to the pipeline | ||||||
| // | ||||||
| include { PYCOQC } from '../modules/local/pycoqc' | ||||||
| include { NANOPOLISH } from '../modules/local/nanopolish' | ||||||
| include { MEDAKA } from '../modules/local/medaka' | ||||||
| include { KRAKEN2_DB_PREPARATION } from '../modules/local/kraken2/db_preparation' | ||||||
| include { DFAST } from '../modules/local/dfast' | ||||||
| include { CUSTOM_MULTIQC } from '../modules/local/custom/multiqc' | ||||||
| include { PYCOQC } from '../modules/local/pycoqc' | ||||||
| include { NANOPOLISH } from '../modules/local/nanopolish' | ||||||
| include { MEDAKA } from '../modules/local/medaka' | ||||||
| include { KRAKEN2_DB_PREPARATION } from '../modules/local/kraken2/db_preparation' | ||||||
| include { DFAST } from '../modules/local/dfast' | ||||||
| include { CUSTOM_MULTIQC } from '../modules/local/custom/multiqc' | ||||||
| include { HOMOPOLISH_SKETCH_PREPARATION } from '../modules/local/homopolish/sketch_preparation' | ||||||
| include { HOMOPOLISH } from '../modules/local/homopolish' | ||||||
|
|
||||||
| // | ||||||
| // MODULE: Installed directly from nf-core/modules | ||||||
|
|
@@ -37,6 +39,7 @@ include { KRAKEN2_KRAKEN2 as KRAKEN2_LONG } from '../modules/nf-core/krake | |||||
| include { QUAST } from '../modules/nf-core/quast' | ||||||
| include { QUAST as QUAST_BYREFSEQID } from '../modules/nf-core/quast' | ||||||
| include { GUNZIP } from '../modules/nf-core/gunzip' | ||||||
| include { GUNZIP as GUNZIP_HOMOPOLISH } from '../modules/nf-core/gunzip' | ||||||
| include { PROKKA } from '../modules/nf-core/prokka' | ||||||
|
|
||||||
| // | ||||||
|
|
@@ -304,13 +307,40 @@ workflow BACASS { | |||||
| .join( ch_assembly ) | ||||||
| .map { meta, sr, lr, fasta -> tuple(meta, lr, fasta) } | ||||||
| .set { ch_polish_long } // channel: [ val(meta), path(lr), path(fasta) ] | ||||||
| if (params.polish_method == 'medaka'){ | ||||||
| if (params.polish_method in ['medaka', 'medaka_homopolish'] ){ | ||||||
| // | ||||||
| // MODULE: Medaka, polishes assembly - should take either miniasm, canu, or unicycler consensus sequence | ||||||
| // | ||||||
| MEDAKA ( ch_polish_long ) | ||||||
| ch_assembly = MEDAKA.out.assembly | ||||||
| ch_versions = ch_versions.mix(MEDAKA.out.versions) | ||||||
| // If homopolish after medaka | ||||||
| if (params.polish_method == 'medaka_homopolish') { | ||||||
| // Check if sketch file already exists | ||||||
| sketch_path = "$baseDir/$params.outdir/Homopolish_sketch/bacteria.msh.gz" | ||||||
| sketch_file = new File(sketch_path) | ||||||
| // If sketch exists and not forced to reload, unzip sketch from outdir | ||||||
| if (sketch_file.exists() & !params.homopolish_reload_sketch) { | ||||||
| ch_sketch = tuple( | ||||||
| ch_assembly.collect{it[1]}, // meta from assembly channel | ||||||
| sketch_path | ||||||
| ) | ||||||
| GUNZIP_HOMOPOLISH( ch_sketch ) | ||||||
| } else { | ||||||
| // MODULE: Download bacteria sketch | ||||||
| HOMOPOLISH_SKETCH_PREPARATION( | ||||||
|
Contributor
I think this process can be removed, since the Nextflow engine can download and stage a file automatically when a URL is provided via params. |
||||||
| ch_assembly.collect{it[1]}, // meta | ||||||
| params.homopolish_bacteria_sketch_url | ||||||
| ) | ||||||
| ch_versions = ch_versions.mix(HOMOPOLISH_SKETCH_PREPARATION.out.versions) | ||||||
| // Unzip bacteria sketch | ||||||
| GUNZIP_HOMOPOLISH ( HOMOPOLISH_SKETCH_PREPARATION.out.sketch ) | ||||||
| } | ||||||
| // MODULE: Homopolish, polishes MEDAKA assembly | ||||||
| HOMOPOLISH ( ch_assembly, GUNZIP_HOMOPOLISH.out.gunzip ) | ||||||
|
Collaborator
Suggested change
|
||||||
| ch_assembly = HOMOPOLISH.out.assembly | ||||||
| ch_versions = ch_versions.mix(HOMOPOLISH.out.versions) | ||||||
| } | ||||||
| } else if (params.polish_method == 'nanopolish') { | ||||||
| // | ||||||
| // MODULE: Nanopolish, polishes assembly using FAST5 files | ||||||
|
|
||||||
We can follow nf-core convention here: modules/local//
Could you place the main.nf file (and its related files) within the homopolish/homopolish/ folder? The module would also need to be renamed to HOMOPOLISH_HOMOPOLISH.
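
As a rough illustration of the earlier comment that Nextflow can stage a remote file itself, the download process and the `new File(...)` existence check in the workflow could be replaced by something like the following. It is a sketch, not the merged code: the `id: 'bacteria_sketch'` meta map is a made-up placeholder, and it assumes the nf-core `GUNZIP` module takes `tuple val(meta), path(archive)` and emits `gunzip` and `versions`.

```groovy
if (params.polish_method == 'medaka_homopolish') {
    // file() accepts local paths and remote URLs alike; a remote file is
    // downloaded and staged into the task work directory by Nextflow.
    ch_sketch = Channel.value(
        [ [ id: 'bacteria_sketch' ], file(params.homopolish_bacteria_sketch_url, checkIfExists: true) ]
    )

    // Unzip the bacteria sketch
    GUNZIP_HOMOPOLISH ( ch_sketch )
    ch_versions = ch_versions.mix(GUNZIP_HOMOPOLISH.out.versions)

    // MODULE: Homopolish, polishes the MEDAKA assembly
    HOMOPOLISH ( ch_assembly, GUNZIP_HOMOPOLISH.out.gunzip )
    ch_assembly = HOMOPOLISH.out.assembly
    ch_versions = ch_versions.mix(HOMOPOLISH.out.versions)
}
```

Because `ch_sketch` is a value channel, the unzipped sketch is reused for every assembly in `ch_assembly`, and the same parameter can point to a local copy without any change to the workflow.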