
Conversation

@sguizard sguizard (Collaborator) commented Nov 1, 2025

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/isoseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Iso-Seq providers deliver sequences in many different formats depending on the pre-processing they apply (subreads, CCS, full-length Iso-Seq). This is even more true with the new MAS-Seq.
I had implemented the possibility to deal with these formats through options. However, their usage, combined with the option to skip Iso-Seq processing and alignment, made the samplesheet and the usage of the pipeline complex.

In this PR, I changed the way input sequences are injected into the pipeline. It is now possible to start the analysis from ccs, lima, isoseq refine, or at the mapping step. The different types of inputs can even be mixed in the samplesheet.
This modification simplifies the usage as well as the code.

It is no longer necessary to deal with the different entrypoints. The input files are injected into the main channel paths at the right moment.
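For illustration, a samplesheet mixing entry points could look like the sketch below. The column names (`sample`, `file`, `start_from`) and file names are hypothetical, not the pipeline's actual schema:

```csv
sample,file,start_from
sampleA,sampleA.subreads.bam,ccs
sampleB,sampleB.fl.bam,refine
sampleC,sampleC.reads.fasta,map
```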

github-actions bot commented Nov 1, 2025

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit a6bdafd

| ✅ 256 tests passed       |
| ❗   7 tests had warnings |
Details

❗ Test warnings:

  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in nextflow.config: Specify any additional parameters here
  • local_component_structure - set_value_channel.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - set_chunk_num_channel.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure

✅ Tests passed:

Run details

  • nf-core/tools version 3.4.1
  • Run at 2025-11-11 14:57:59

"errorMessage": "PacBio Index file for BAM subreads cannot contain spaces and must have extension '.bam.pbi' or being empty"
},
"reads": {
"start_from": {
A reviewer (Member) commented:
I think we can find a better name for this field

@sguizard sguizard (Collaborator, Author) replied Nov 3, 2025

I have been scratching my head over naming this.
I used entrypoint, but I didn't feel it was very clear for users.
'start_from' might not be the best, but at least its meaning is very clear.

Do you have suggestions?

@sguizard sguizard (Collaborator, Author) commented Nov 3, 2025

I added a subworkflow to chunk the fasta files (from the lima, isoseq refine, and mapping starts) before the mapping steps.
Unlike inputs starting from CCS, which are divided into chunks, these fasta files are massive and contain all sequences. To mitigate this problem, the CHUNKER subworkflow splits them into small chunks.
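Conceptually, the chunking step splits one large FASTA into fixed-size groups of records so mapping can run in parallel. A minimal Python sketch of that idea (function and parameter names are illustrative, not the pipeline's code, which would typically use Nextflow's built-in splitting):

```python
def chunk_fasta(fasta_text: str, records_per_chunk: int) -> list[str]:
    """Split FASTA text into chunks of at most `records_per_chunk` records."""
    records: list[str] = []
    current: list[str] = []
    for line in fasta_text.splitlines():
        # A '>' header starts a new record; flush the previous one first.
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        records.append("\n".join(current))
    # Group consecutive records into chunks of the requested size.
    return [
        "\n".join(records[i:i + records_per_chunk])
        for i in range(0, len(records), records_per_chunk)
    ]

fasta = ">r1\nACGT\n>r2\nTTTT\n>r3\nGGGG"
chunks = chunk_fasta(fasta, 2)
print(len(chunks))  # → 2
```

Each chunk can then be mapped independently, which mirrors how the CCS-based entry points are already parallelised.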

@sguizard sguizard (Collaborator, Author) commented

An update to the CHUNKER subworkflow so it is applied twice in the pipeline.

@sguizard sguizard added the WIP Work in progress label Nov 11, 2025