@LouisLeNezet
Collaborator

PR checklist

This PR adds genetic map support for the phasing and imputation tools, as well as automatic conversion to all formats.

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/phaseimpute branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@LouisLeNezet LouisLeNezet self-assigned this Nov 17, 2025
@LouisLeNezet LouisLeNezet linked an issue Nov 17, 2025 that may be closed by this pull request
@LouisLeNezet LouisLeNezet added the enhancement New feature or request label Nov 17, 2025
Collaborator

@atrigila atrigila left a comment

Since this is a pretty extensive change, I would benefit from reviewing it in incremental stages. I suggest you make PRs to modules first. There, you can submit the local subworkflows and add the tests that require maps. Once those are approved, it will be easier to focus on the broader picture here, with tests that target the whole pipeline.

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/gawk:5.3.0' :
'biocontainers/gawk:5.3.0' }"
Collaborator

If we are using GAWK, why don't we use the nf-core module?

Collaborator Author

The issue with using the nf-core module is that it would result in a more complex-looking module (script in the config, output channels split based on a regex) and therefore a module that is more complicated to maintain.
Writing it this way does add a new local module, but makes it easier to understand how it works.

docs/usage.md Outdated
`--map_sep "\t" --map_header true --map_col_names "chr,id,cm,pos"`

```csv title="chr21.map"
chr id centimorgan position
Collaborator

Your header here is `centimorgan position`, while in the description you say that it should be `pos` and `cm`.

Collaborator Author

@LouisLeNezet LouisLeNezet Nov 18, 2025

The header is in fact not parsed, although it can be present, so the column names written in the map file itself are not used.
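For illustration only, the behaviour described above can be sketched in Python (the local module in this PR uses gawk, not this code). The function name `parse_map` and the data values are hypothetical; the `sep`, `header` and `col_names` arguments are assumed to mirror `--map_sep`, `--map_header` and `--map_col_names`. The point is that columns are named by position from `col_names`, and a header row, if present, is simply skipped.

```python
import csv
import io


def parse_map(text, sep="\t", header=True, col_names=("chr", "id", "cm", "pos")):
    """Parse a genetic map, naming columns by position rather than by header.

    Hypothetical helper: it only illustrates the idea discussed in the thread.
    """
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    rows = [row for row in reader if row]  # drop blank lines
    if header:
        rows = rows[1:]  # the header row is discarded, never interpreted
    return [dict(zip(col_names, row)) for row in rows]


# Made-up example content; the header text deliberately differs from col_names.
example = (
    "chr\tid\tcentimorgan\tposition\n"
    "chr21\t.\t0.5\t16609287\n"
    "chr21\t.\t0.7\t16620000\n"
)
records = parse_map(example)
print(records[0])
```

Because the header text is never read, a file headed `centimorgan position` and one headed `cm pos` would produce the same records under this sketch.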

Comment on lines +145 to +147
- `--map_sep`: Field separator used in the map file (e.g. "\t", " ", ",")
- `--map_header`: Whether the file contains a header row (`true` or `false`)
- `--map_col_names`: Ordered list of column names in the file
Collaborator

I think these could be easily inferred in the module, so that we don't add new params for users (more params means more room for user error).
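As a rough idea of what such inference could look like, here is a minimal Python sketch. It is not part of the PR; `infer_map_layout` is a hypothetical name, and the heuristics (pick the separator that yields the most fields, and treat a first line with no numeric field as a header) are assumptions that would need checking against real map files.

```python
def _is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def infer_map_layout(text, candidates=("\t", ",", " ")):
    """Guess the field separator and whether the first line is a header.

    Hypothetical helper illustrating the suggestion; heuristics are assumptions.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("map file is empty")
    first = lines[0]
    # Choose the candidate separator that splits the first line into the most fields.
    sep = max(candidates, key=lambda s: len(first.split(s)))
    fields = first.split(sep)
    # A data row has numeric cM and position fields; a header row has none.
    has_header = not any(_is_number(field) for field in fields)
    return sep, has_header


with_header = "chr\tid\tcentimorgan\tposition\nchr21\t.\t0.5\t16609287\n"
without_header = "chr21,.,0.5,16609287\n"
print(infer_map_layout(with_header))
print(infer_map_layout(without_header))
```

Column order cannot be recovered this way when the header is absent, so an ordered column list would likely still be needed in that case.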

Collaborator

Shouldn't we have a test for all the other imputation tools that require a map? A test with and without a map.

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Map is not supported yet

2 participants