-
Notifications
You must be signed in to change notification settings - Fork 64
Add homopolish for nanopore-only assembly #229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
Daniel-VM
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job, Check the review comments, please.
| task.ext.when == null || task.ext.when | ||
|
|
||
| script: | ||
| def prefix = task.ext.prefix ?: "${meta.id}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can follow here nf-core structure to get both prefix and potential args:
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
Additionally, you'll need to add the $args variable to the Homoplasy bash run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I copied that from another module with something else, but I don´t think I'm using it.
I don´t know the nf-core structure for that. Do you think we need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If args isnt used there is no need for the def args line imho.
| @@ -0,0 +1,35 @@ | |||
| process HOMOPOLISH { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can follow nf-core convention here: modules/local//
Could you place the main.nf file (and its related files) within homopolish/homopolish/ folder? And also the module would need to be renamed to HOMOPLISH_HOMOPILISH.
| homopolish polish \ | ||
| -a $medaka_genome \ | ||
| -s $bacteria_sketch \ | ||
| -m $params.homopolish_model \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's okay, but to make the script easier to read, we can use params.homopolish_model as an input channel for this process.
| // Assembly polishing | ||
| polish_method = 'medaka' // Allowed: ['medaka', 'nanopolish'] | ||
| polish_method = 'medaka' // Allowed: ['medaka', 'nanopolish', 'medaka_homopolish'] | ||
| homopolish_bacteria_sketch_url = 'https://bioinfo.cs.ccu.edu.tw/bioinfo/downloads/Homopolish_Sketch/bacteria.msh.gz' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to allow users to define a local sketch database via CLI . Lets say: --homopolish_sketchdb_path ?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, we can replace homopolish_bacteria_sketch_url with homopolish_sketchdb_path. The Nextflow engine can distinguish between a local path and a URL—if a URL is provided, it will fetch it automatically.
| GUNZIP_HOMOPOLISH( ch_sketch ) | ||
| } else { | ||
| // MODULE: Download bacteria sketch | ||
| HOMOPOLISH_SKETCH_PREPARATION( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this process can be removed, since the Nextflow engine can download and stage a file automatically when a URL is provided via params.
d4straub
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree a little work could be done as @Daniel-VM suggested.
Are you up to that @Gilbaja ?
| homopolish polish \ | ||
| -a $medaka_genome \ | ||
| -s $bacteria_sketch \ | ||
| -m $params.homopolish_model \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| homopolish polish \ | |
| -a $medaka_genome \ | |
| -s $bacteria_sketch \ | |
| -m $params.homopolish_model \ | |
| homopolish polish \\ | |
| -a $medaka_genome \\ | |
| -s $bacteria_sketch \\ | |
| -m $homopolish_model \\ |
double slashes to keep formatting in the .command.sh in the work folder.
also, -m $params.homopolish_model \ is bad practice, it should be solved with a val input
| input: | ||
| tuple val(meta), path(medaka_genome) | ||
| tuple val(meta_gunzip), path(bacteria_sketch) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| val homopolish_model | |
| -s $bacteria_sketch \ | ||
| -m $params.homopolish_model \ | ||
| -o . | ||
| cat <<-END_VERSIONS > versions.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| cat <<-END_VERSIONS > versions.yml | |
| cat <<-END_VERSIONS > versions.yml |
I like here an empty line for clarity
| GUNZIP_HOMOPOLISH ( HOMOPOLISH_SKETCH_PREPARATION.out.sketch ) | ||
| } | ||
| // MODULE: Homopolish, polishes MEDAKA assembly | ||
| HOMOPOLISH ( ch_assembly, GUNZIP_HOMOPOLISH.out.gunzip ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| HOMOPOLISH ( ch_assembly, GUNZIP_HOMOPOLISH.out.gunzip ) | |
| HOMOPOLISH ( ch_assembly, GUNZIP_HOMOPOLISH.out.gunzip, params.homopolish_model ) |
PR checklist
nf-core pipelines lint).nextflow run . -profile test,docker --outdir <OUTDIR>).nextflow run . -profile debug,test,docker --outdir <OUTDIR>).docs/usage.mdis updated.docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).Issue #66
Adds homopolish as a tool for polishing after polished by medaka.
Before executing homopolish it needs a bacteria sketch to be downloaded (3.3GB) and unzipped.
To do so, an additional module is created, and the workflow includes a condition for checking if it has been already downloaded in the output folder. It also includes a new parameter for forcing the download.
Test cmd
nextflow run . -profile test_long,conda --polish_method medaka_homopolish --outdir results
Pending tasks