nextflow format hello-nextflow/ #565

Draft
wants to merge 1 commit into
base: master
28 changes: 14 additions & 14 deletions hello-nextflow/hello-channels.nf
@@ -1,4 +1,15 @@
#!/usr/bin/env nextflow
/*
* Pipeline parameters
*/
params.greeting = 'Holà mundo!'

workflow {

// emit a greeting
sayHello(params.greeting)
}


/*
* Use echo to print 'Hello World!' to a file
@@ -8,24 +19,13 @@ process sayHello {
publishDir 'results', mode: 'copy'

input:
val greeting
val greeting

output:
path 'output.txt'
path 'output.txt'

script:
"""
- echo '$greeting' > output.txt
+ echo '${greeting}' > output.txt
Collaborator

Are the curly braces going to be mandatory? Why do we need them?

Member

They are optional in certain cases, such as simple variable names. The formatter doesn't try to be smart here; it always uses braces.

Member

As for whether we should put them everywhere, there is a trade-off between consistency and shortcuts.

Using curly braces everywhere requires a bit more typing but makes the code more consistent, and therefore easier to read.

Omitting the curly braces where possible saves a bit of typing but makes the code less consistent, so the reader has to work a bit harder to understand the different syntax forms.

So I think it comes down to how much time we spend reading vs. writing code. At least in software engineering, we spend far more time reading code, so I tend to favor consistency.

Member

I think it's easier to teach consistency as well. So I like this change.
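
For reference, here is a minimal Groovy sketch of where the braces are optional and where they are needed. It assumes standard Groovy string interpolation, which Nextflow strings follow; the names `greeting` and `sample` are illustrative and not taken from the training files.

```groovy
def greeting = 'Hello'
def sample = [id: 'S1']   // hypothetical map, for illustration only

// simple variable name: braces are optional
assert "$greeting world" == 'Hello world'
assert "${greeting} world" == 'Hello world'

// dotted property access also works without braces
assert "$sample.id" == 'S1'

// a method call needs braces
assert "${greeting.toUpperCase()}" == 'HELLO'

// braces are needed when the next character could be part of the name
assert "${greeting}_out" == 'Hello_out'
```

Always writing the braced form means a learner only has to remember one rule rather than the cases above.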

"""
}

/*
* Pipeline parameters
*/
params.greeting = 'Holà mundo!'

workflow {

// emit a greeting
sayHello(params.greeting)
}
21 changes: 11 additions & 10 deletions hello-nextflow/hello-config.nf
@@ -1,4 +1,10 @@
#!/usr/bin/env nextflow
// Include modules
Collaborator

Is the idea here that module includes should go before anything else? Is there a functional reason, or is it purely stylistic? Are there any interactions with respect to setting variable values?

Member

In principle, it's just a convention, as it shouldn't matter where you put the includes.

In practice, users sometimes put the params first so that they are propagated to the included modules as a side effect. We are trying to move away from that pattern anyway.

include { sayHello } from './modules/sayHello.nf'
include { convertToUpper } from './modules/convertToUpper.nf'
include { collectGreetings } from './modules/collectGreetings.nf'
include { cowpy } from './modules/cowpy.nf'


/*
* Pipeline parameters
@@ -7,18 +13,13 @@ params.greeting = 'greetings.csv'
params.batch = 'test-batch'
params.character = 'turkey'

// Include modules
include { sayHello } from './modules/sayHello.nf'
include { convertToUpper } from './modules/convertToUpper.nf'
include { collectGreetings } from './modules/collectGreetings.nf'
include { cowpy } from './modules/cowpy.nf'

workflow {

// create a channel for inputs from a CSV file
- greeting_ch = Channel.fromPath(params.greeting)
- .splitCsv()
- .map { line -> line[0] }
+ greeting_ch = Channel
Collaborator

I'm not sure I like having to put the .fromPath() bit on the next line. I would keep that as it was before, tbh. That way the first line does a full thing, and the next lines modify that thing. Having just greeting_ch = Channel on one line feels... interrupted.

Member

Honestly I agree. I would just need to add a special case in the formatter. Though I wonder if we would feel the same about any old foo.bar() or if it's just with channel factories?

Member Author

I think I would do this for any old foo.bar() as well. The first line should contain the first action. Whether it's Channel or foo, neither of these is an action, only a starting point.

Member

When I imagine something like this:

ch_samples
  .filter { sample -> sample.n_reads >= 1_000_000 }
  .map { sample -> sample.id }

I think I prefer both calls to be on their own line. I think I treat channel factories differently in my head because the Channel and the factory call are sort of a unit, but in general it seems like the "starting point" should be separate from any transformations performed on it

Member

The format command will now put Channel.fromPath() on a single line, as well as any other channel factory.
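
Assuming that behaviour, the channel definition in this file would come out roughly as follows. This is a sketch assembled from the lines already in this diff, not actual formatter output, and the four-space indentation is an assumption.

```groovy
workflow {
    // create a channel for inputs from a CSV file:
    // the channel factory call stays on the first line,
    // and each operator applied to it gets its own line
    greeting_ch = Channel.fromPath(params.greeting)
        .splitCsv()
        .map { line -> line[0] }
}
```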

+ .fromPath(params.greeting)
+ .splitCsv()
+ .map { line -> line[0] }

// emit a greeting
sayHello(greeting_ch)
@@ -30,7 +31,7 @@ workflow {
collectGreetings(convertToUpper.out.collect(), params.batch)

// emit a message about the size of the batch
- collectGreetings.out.count.view { "There were $it greetings in this batch" }
+ collectGreetings.out.count.view { "There were ${it} greetings in this batch" }
Collaborator

Are the extra curlies necessary here?

Collaborator

Also, should we be explicitly naming the variable like we do later with line -> line[0] etc.?

Member

See my point above about curly braces. As for the implicit `it`: yes, I would recommend as a best practice that you declare an explicit name such as line. Right now it's only a "paranoid" warning, but it will eventually become an error.

Member Author

I thought I got all of the `$it`s in #538; must have missed this one.
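
A sketch of what the explicit form could look like for this line. The parameter name `num_greetings` is a placeholder chosen for illustration, not a name from the training material, and the fragment belongs inside the workflow block.

```groovy
// implicit closure parameter (what this line currently uses)
collectGreetings.out.count.view { "There were ${it} greetings in this batch" }

// explicit, named closure parameter (hypothetical name)
collectGreetings.out.count.view { num_greetings -> "There were ${num_greetings} greetings in this batch" }
```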


// generate ASCII art of the greetings with cowpy
cowpy(collectGreetings.out.outfile, params.character)
19 changes: 10 additions & 9 deletions hello-nextflow/hello-containers.nf
@@ -1,22 +1,23 @@
#!/usr/bin/env nextflow
// Include modules
include { sayHello } from './modules/sayHello.nf'
include { convertToUpper } from './modules/convertToUpper.nf'
include { collectGreetings } from './modules/collectGreetings.nf'


/*
* Pipeline parameters
*/
params.greeting = 'greetings.csv'
params.batch = 'test-batch'

// Include modules
include { sayHello } from './modules/sayHello.nf'
include { convertToUpper } from './modules/convertToUpper.nf'
include { collectGreetings } from './modules/collectGreetings.nf'

workflow {

// create a channel for inputs from a CSV file
- greeting_ch = Channel.fromPath(params.greeting)
- .splitCsv()
- .map { line -> line[0] }
+ greeting_ch = Channel
+ .fromPath(params.greeting)
+ .splitCsv()
+ .map { line -> line[0] }

// emit a greeting
sayHello(greeting_ch)
@@ -28,5 +29,5 @@ workflow {
collectGreetings(convertToUpper.out.collect(), params.batch)

// emit a message about the size of the batch
- collectGreetings.out.count.view { "There were $it greetings in this batch" }
+ collectGreetings.out.count.view { "There were ${it} greetings in this batch" }
}
75 changes: 38 additions & 37 deletions hello-nextflow/hello-modules.nf
@@ -1,4 +1,31 @@
#!/usr/bin/env nextflow
/*
* Pipeline parameters
*/
params.greeting = 'greetings.csv'
params.batch = 'test-batch'

workflow {

// create a channel for inputs from a CSV file
greeting_ch = Channel
.fromPath(params.greeting)
.splitCsv()
.map { line -> line[0] }

// emit a greeting
sayHello(greeting_ch)

// convert the greeting to uppercase
convertToUpper(sayHello.out)

// collect all the greetings into one file
collectGreetings(convertToUpper.out.collect(), params.batch)

// emit a message about the size of the batch
collectGreetings.out.count.view { "There were ${it} greetings in this batch" }
}


/*
* Use echo to print 'Hello World!' to a file
@@ -8,14 +35,14 @@ process sayHello {
publishDir 'results', mode: 'copy'

input:
val greeting
val greeting

output:
path "${greeting}-output.txt"
path "${greeting}-output.txt"

script:
"""
- echo '$greeting' > '$greeting-output.txt'
+ echo '${greeting}' > '${greeting}-output.txt'
"""
}

@@ -27,14 +54,14 @@ process convertToUpper {
publishDir 'results', mode: 'copy'

input:
path input_file
path input_file

output:
path "UPPER-${input_file}"
path "UPPER-${input_file}"

script:
"""
- cat '$input_file' | tr '[a-z]' '[A-Z]' > 'UPPER-${input_file}'
+ cat '${input_file}' | tr '[a-z]' '[A-Z]' > 'UPPER-${input_file}'
"""
}

@@ -46,42 +73,16 @@ process collectGreetings {
publishDir 'results', mode: 'copy'

input:
path input_files
val batch_name
path input_files
val batch_name

output:
- path "COLLECTED-${batch_name}-output.txt" , emit: outfile
- val count_greetings , emit: count
+ path "COLLECTED-${batch_name}-output.txt", emit: outfile
+ val count_greetings, emit: count

script:
count_greetings = input_files.size()
count_greetings = input_files.size()
"""
cat ${input_files} > 'COLLECTED-${batch_name}-output.txt'
"""
}

/*
* Pipeline parameters
*/
params.greeting = 'greetings.csv'
params.batch = 'test-batch'

workflow {

// create a channel for inputs from a CSV file
greeting_ch = Channel.fromPath(params.greeting)
.splitCsv()
.map { line -> line[0] }

// emit a greeting
sayHello(greeting_ch)

// convert the greeting to uppercase
convertToUpper(sayHello.out)

// collect all the greetings into one file
collectGreetings(convertToUpper.out.collect(), params.batch)

// emit a message about the size of the batch
collectGreetings.out.count.view { "There were $it greetings in this batch" }
}
39 changes: 20 additions & 19 deletions hello-nextflow/hello-workflow.nf
@@ -1,4 +1,21 @@
#!/usr/bin/env nextflow
/*
* Pipeline parameters
*/
params.greeting = 'greetings.csv'

workflow {

// create a channel for inputs from a CSV file
greeting_ch = Channel
.fromPath(params.greeting)
.splitCsv()
.map { line -> line[0] }

// emit a greeting
sayHello(greeting_ch)
}


/*
* Use echo to print 'Hello World!' to a file
@@ -8,29 +25,13 @@ process sayHello {
publishDir 'results', mode: 'copy'

input:
val greeting
val greeting

output:
path "${greeting}-output.txt"
path "${greeting}-output.txt"

script:
"""
- echo '$greeting' > '$greeting-output.txt'
+ echo '${greeting}' > '${greeting}-output.txt'
"""
}

/*
* Pipeline parameters
*/
params.greeting = 'greetings.csv'

workflow {

// create a channel for inputs from a CSV file
greeting_ch = Channel.fromPath(params.greeting)
.splitCsv()
.map { line -> line[0] }

// emit a greeting
sayHello(greeting_ch)
}
15 changes: 7 additions & 8 deletions hello-nextflow/hello-world.nf
@@ -1,21 +1,20 @@
#!/usr/bin/env nextflow
workflow {

// emit a greeting
sayHello()
}


/*
* Use echo to print 'Hello World!' to a file
*/
process sayHello {

output:
path 'output.txt'
path 'output.txt'

script:
"""
echo 'Hello World!' > output.txt
"""
}

workflow {

// emit a greeting
sayHello()
}
14 changes: 7 additions & 7 deletions hello-nextflow/solutions/1-hello-world/hello-world-3.nf
@@ -1,4 +1,10 @@
#!/usr/bin/env nextflow
workflow {

// emit a greeting
sayHello()
}

Comment on lines +2 to +7
Collaborator

I dislike having workflow above processes. I use this order:

  1. include
  2. params
  3. functions
  4. processes
  5. workflows
  6. anonymous workflow

Member

I'm debating whether to add an option to control the order (e.g. "Formatting > Entry workflow position")

Or maybe just not try to re-order the definitions at all?

Member Author

I've always done processes first then workflows, but I'm coming around to it.

I don't mind making @adamrtalbot feel uncomfortable, auto-formatters have that effect on everyone 😅 Though maybe this goes beyond what most formatters do?

Collaborator

I'll survive if it gets reordered, but do we have a particular order that is preferred? It feels weird to be opinionated about something that isn't consequential (not that it's stopped me before).

Member Author

Isn't that what auto-formatters do? 😆 Basically no formatting is consequential. It's mostly for convention and consistency, so that folks get used to navigating pipeline code in the same way everywhere.

Any colour, as long as it's black.

Member

Most languages don't have this problem because they just have functions, no processes / workflows.

I do think that most people put their workflows at the bottom. This is also the convention that Paolo used when he introduced DSL2 in the docs, which could explain it.

I'm starting to think the formatter should just preserve the user's order for now

Member Author

Agreed, or at least make it opt-in and off by default for now.

Member Author

Maybe with includes as an exception? I do like having them at the top and alphabetically sorted 👀

Member

Yeah, feature flags / includes / params should go first no matter what

Member

The format command will no longer try to re-order your declarations unless you enable that behaviour, which should make the formatting less intrusive overall.


/*
* Use echo to print 'Hello World!' to a file
@@ -8,16 +14,10 @@ process sayHello {
publishDir 'results', mode: 'copy'

output:
path 'output.txt'
path 'output.txt'

script:
"""
echo 'Hello World!' > output.txt
"""
}

workflow {

// emit a greeting
sayHello()
}
28 changes: 14 additions & 14 deletions hello-nextflow/solutions/1-hello-world/hello-world-4.nf
@@ -1,4 +1,15 @@
#!/usr/bin/env nextflow
/*
* Pipeline parameters
*/
params.greeting = 'Holà mundo!'

workflow {

// emit a greeting
sayHello(params.greeting)
}


/*
* Use echo to print 'Hello World!' to a file
@@ -8,24 +19,13 @@ process sayHello {
publishDir 'results', mode: 'copy'

input:
val greeting
val greeting

output:
path 'output.txt'
path 'output.txt'

script:
"""
- echo '$greeting' > output.txt
+ echo '${greeting}' > output.txt
"""
}

/*
* Pipeline parameters
*/
params.greeting = 'Holà mundo!'

workflow {

// emit a greeting
sayHello(params.greeting)
}