Fix shuffled NetMHCpan/NetMHCIIpan allele labels in merged/best_allele output (fixes #358) #359

Merged
jonasscheid merged 3 commits into nf-core:dev from jonasscheid:fix/netmhciipan-best-allele-shuffle
Jun 15, 2026

Conversation

@jonasscheid jonasscheid commented Jun 11, 2026

Collaborator

Description

Fixes #358.

MERGE_PREDICTIONS reconstructed which allele each NetMHC output column belonged to from sorted(meta.alleles). This is wrong whenever the pipeline's sort order diverges from the predictor's output column order: the rank/BA values (tied to their column position) stay correct, but the allele labels — and the downstream best_allele — get shuffled. Two real triggers:

  • NetMHCIIpan (class II): it writes allele columns sorted in its own allele-name format, which strips the : (DPB1*10:01 → DPB11001). Combined with variable-width first fields (10 vs 107), the : (ASCII 58, sorts after digits) flips the order relative to the pipeline's mhcgnomes sort.
  • NetMHCpan (class I): a stray space in a samplesheet allele (e.g. ...;C*07:01; C*07:04) makes ' C*07:04' sort first (space = ASCII 32), rotating the whole column→allele mapping by one.
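
Both triggers can be reproduced with plain string sorting. The snippet below is a minimal sketch, not the pipeline code: it uses Python's built-in `sorted()` as a stand-in for the pipeline's sort, and `predictor_style` is a hypothetical helper that only mimics the name stripping described above (`DPB1*10:01` → `DPB11001`).

```python
def predictor_style(allele: str) -> str:
    """Hypothetical stand-in for the predictor's name format: drop '*' and ':'."""
    return allele.replace("*", "").replace(":", "")


# Trigger 1: the ':' (ASCII 58) sorts after the digit '7' (ASCII 55),
# so the order depends on whether the colon is still present.
alleles = ["DPB1*10:01", "DPB1*107:01"]
pipeline_order = sorted(alleles)                        # sorted with the colon
predictor_order = sorted(alleles, key=predictor_style)  # sorted without it

print(pipeline_order)   # ['DPB1*107:01', 'DPB1*10:01']
print(predictor_order)  # ['DPB1*10:01', 'DPB1*107:01']

# Trigger 2: a stray space (ASCII 32) sorts before every letter,
# so the spaced allele jumps to the front.
samplesheet_field = "C*07:01; C*07:04"
raw_sorted = sorted(samplesheet_field.split(";"))
clean_sorted = sorted(a.strip() for a in samplesheet_field.split(";"))

print(raw_sorted)    # [' C*07:04', 'C*07:01']
print(clean_sorted)  # ['C*07:01', 'C*07:04']
```

In both cases the values in each output column are unaffected; only the label attached to each column changes.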

The fix reads the authoritative allele order from the xls header row (each predictor labels every allele block there) instead of assuming sorted(meta.alleles). This is robust to both triggers above and also fixes a latent misalignment when some requested alleles are unsupported by the predictor.
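
The idea can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the header is assumed to be a tab-separated row in which each allele name appears once, at the start of its column block, and `alleles_from_header` and `label_ranks` are hypothetical names. The header line and rank values are made up.

```python
def alleles_from_header(header_line: str) -> list[str]:
    """Return the non-empty cells of a tab-separated header row, in column order."""
    cells = header_line.rstrip("\n").split("\t")
    return [cell.strip() for cell in cells if cell.strip()]


def label_ranks(header_line: str, ranks: list[float]) -> dict[str, float]:
    """Pair each per-allele rank (in column order) with the allele named in the header."""
    alleles = alleles_from_header(header_line)
    if len(alleles) != len(ranks):
        raise ValueError(
            f"header names {len(alleles)} alleles but {len(ranks)} ranks were given"
        )
    return dict(zip(alleles, ranks))


# Made-up header row (assumed layout) and one rank per allele block.
header = "\t\t\tHLA-C*07:01\t\t\tHLA-C*07:04\t\t\n"
ranks = [0.5, 12.0]

# Labels taken from the header follow the predictor's own column order.
labelled = label_ranks(header, ranks)
best_allele = min(labelled, key=labelled.get)  # lower rank = stronger binder

# Old approach for comparison: labels reconstructed from the sorted request list.
requested = ["C*07:01", " C*07:04"]  # second entry carries a stray space
old_labelled = dict(zip(sorted(requested), ranks))
old_best_allele = min(old_labelled, key=old_labelled.get)

print(best_allele)      # HLA-C*07:01
print(old_best_allele)  # ' C*07:04' (wrong allele, printed without the quotes)
```

Because the labels come from the predictor's own output, an allele the predictor skipped simply never appears in the header, so the remaining columns keep their correct names.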

Verified end-to-end with real NetMHCIIpan 4.3 (divergent DPB1*10:01 / DPB1*107:01) and real NetMHCpan 4.2 (the reported spaced-allele case): every allele now maps to its correct rank and best_allele is correct.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/epitopeprediction branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

…he xls header (fixes nf-core#358)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@nf-core-bot

Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.2.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

…slabels alleles when a samplesheet allele has a stray space (nf-core#358)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jonasscheid jonasscheid changed the title from "Fix shuffled NetMHCIIpan allele labels in merged/best_allele output (fixes #358)" to "Fix shuffled NetMHCpan/NetMHCIIpan allele labels in merged/best_allele output (fixes #358)" on Jun 11, 2026
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jonasscheid jonasscheid merged commit 4da7703 into nf-core:dev Jun 15, 2026
9 checks passed
@jonasscheid jonasscheid deleted the fix/netmhciipan-best-allele-shuffle branch June 15, 2026 11:27