New Feature - Flexible analysis start #55
base: dev
"errorMessage": "PacBio Index file for BAM subreads cannot contain spaces and must have extension '.bam.pbi' or being empty"
},
"reads": {
"start_from": {
I think we can find a better name for this field
I have been scratching my head over the name of this field.
I used entrypoint at first, but I didn't feel it was very clear for users.
'start_from' might not be the best name, but at least its meaning is very clear.
Do you have any suggestions?
I added a subworkflow to chunk the FASTA files (from the lima, isoseq refine and mapping start points) before the mapping steps.
Updated the CHUNKER so that it can be applied twice in the pipeline.
PR checklist
- Make sure your code lints (nf-core lint).
- Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
- Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
- Usage documentation in docs/usage.md is updated.
- Output documentation in docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).

Isoseq providers deliver sequences in many different formats depending on the pre-processing they apply (subreads, CCS, full-length isoseq). This is even more true with the new MAS-seq.
I had implemented the ability to deal with these formats through options. However, using them together with the option to skip isoseq processing and align made the samplesheet and the usage of the pipeline complex.
In this PR, I changed the way input sequences are injected into the pipeline. It is now possible to start the analysis from ccs, lima, isoseq refine or at the mapping step. The different input types can even be mixed in the same samplesheet.
This modification simplifies both the usage and the code.
It is no longer necessary to deal with separate entrypoints. The input files are injected at the right point in the main channel paths.
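
To illustrate the idea for reviewers, here is a minimal Python sketch of the routing logic (not the actual Nextflow code). It assumes each samplesheet row carries a `start_from` value; the stage labels `ccs`, `lima`, `refine` and `mapping`, the `sample` key and the function names are hypothetical and chosen only for this example. Each sample joins the main path at its own stage and then follows the same downstream steps as every other sample.

```python
from collections import defaultdict

# Hypothetical stage labels, listed in pipeline order. The real allowed
# values of `start_from` are an assumption here, not taken from the schema.
STAGES = ["ccs", "lima", "refine", "mapping"]


def route_samples(rows):
    """Group samplesheet rows by the stage at which they enter the pipeline."""
    entry = defaultdict(list)
    for row in rows:
        stage = row["start_from"]
        if stage not in STAGES:
            raise ValueError(f"unknown start_from value: {stage!r}")
        entry[stage].append(row["sample"])
    return entry


def run(rows):
    """Walk the stages in order, merging newly injected samples at each step."""
    entry = route_samples(rows)
    in_flight = []  # samples currently travelling down the main path
    trace = {}      # stage -> samples that this stage would process
    for stage in STAGES:
        # Inject the samples that start here alongside those from upstream.
        in_flight = in_flight + entry.get(stage, [])
        trace[stage] = list(in_flight)
    return trace


# A mixed samplesheet: three samples entering at three different stages.
rows = [
    {"sample": "A", "start_from": "ccs"},
    {"sample": "B", "start_from": "refine"},
    {"sample": "C", "start_from": "mapping"},
]
trace = run(rows)
for stage in STAGES:
    print(stage, trace[stage])
```

In this sketch, sample A goes through every stage, B joins at the refine step, and C only reaches the mapping step, which is how mixed inputs can share one samplesheet without separate entrypoints.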