Skip to content

phosphoRS: faithful dynamic per-window peak-depth optimization (fidelity, #40)#42

Merged
timosachsenberg merged 3 commits into
fix/decoy-flr-scoring-bugsfrom
feature/phosphors-peak-depth-opt
May 31, 2026
Merged

phosphoRS: faithful dynamic per-window peak-depth optimization (fidelity, #40)#42
timosachsenberg merged 3 commits into
fix/decoy-flr-scoring-bugsfrom
feature/phosphors-peak-depth-opt

Conversation

@timosachsenberg
Copy link
Copy Markdown
Contributor

Summary

Faithful reimplementation of phosphoRS's dynamic per-window peak-depth optimization (Taus et al. 2011, pseudocode §9–§12) — the step §22 calls "the most important conceptual difference from a simple site-localization score." It was missing from the running code.

Fidelity PR (separate from the FLR work in #41 — based on that branch; retarget to main after #41 merges).

Background

The active scoring path scored isoforms against the plain filter_spectrum_for_phosphors output. The intended optimizer, _reduce_by_delta_selection, was dead code (disabled by d64e2dd "Use filtered spectrum directly") and buggy:

  • its depth-selection ratio pj/pj1 (maximized) was inverted — it minimized, rather than maximized, isoform separation;
  • its depth loop short-circuited after the first ratio;
  • it passed the experimental peak count as the binomial n (should be the theoretical-ion count).

So phosphoRS was not doing the discrimination-optimized peak selection that defines it.

What this adds

Per 100 m/z window, for each depth 1..8, score every isoform against the top-depth peaks (n = theoretical ions in window, k = matched, p = N·d/w) and pick the depth that maximizes rank1−rank2 (then rank1−rank3, rank1−rank4, then best absolute score; smaller depth on ties). When the window has no site-determining ions, maximize the best absolute score instead.

New, separately-tested helpers:

  • _window_has_site_determining_ions (§10)
  • _isoform_peptide_scores, _choose_window_depth (§9 & §11)
  • _reduce_by_peak_depth_optimization (§8–§12 orchestrator)

Gated by ENABLE_PEAK_DEPTH_OPTIMIZATION (default True). It reuses the consume-once _count_matched_ions matcher, so the binomial n/k are consistent with the rest of the scoring.

Effect (example data, q ≤ 0.01, 5% global FLR)

PhosphoRS sites @ 5% FLR decoy-localization rate runtime
before (filtered spectrum) 151 4.3% ~3.5 s
with optimizer 163 6.2% ~10.5 s

The discrimination-maximizing selection is symmetric (it sharpens whichever isoform genuinely matches best), so it both raises the yield (genuine sites rank above decoys with more resolution) and raises the measured decoy rate (faithful phosphoRS localizes to the decoy more often on these ambiguous spectra). The comparison conclusion is unchanged: PhosphoRS still recovers far fewer sites than AScore (163 vs 441) — its lower reliability here is genuine, not an artifact of the missing optimization. Cost: ~3× slower (per-isoform theoretical-spectrum generation; could be shared with the scoring loop in a follow-up).

Notes for the reviewer

Tests

178 pass (full suite) with the optimizer enabled. +4 unit tests: depth selection maximizes separation, site-determining-ion detection, no-SDI maximizes best score, tie/empty handling.

🤖 Generated with Claude Code

…40)

Implements the phosphoRS peak-depth optimization (Taus et al. 2011, pseudocode
sections 9-12) that the implementation was missing: per 100 m/z window, choose
the peak depth (1..8) that maximizes the rank1-rank2 separation between the best
and second-best isoform (rank3/rank4 tie-breakers; best-absolute-score when no
site-determining ions are present; smaller depth on ties).

This replaces _reduce_by_delta_selection, which was dead code AND buggy: its
depth-selection ratio (pj/pj1, maximized) was inverted (it minimized separation)
and it used the experimental peak count as the binomial n. Since commit d64e2dd
("Use filtered spectrum directly") the active path scored against the
unoptimized filtered spectrum.

New, separately-tested helpers:
- _window_has_site_determining_ions   (section 10)
- _isoform_peptide_scores / _choose_window_depth  (sections 9 & 11)
- _reduce_by_peak_depth_optimization   (sections 8-12 orchestrator)
Gated by ENABLE_PEAK_DEPTH_OPTIMIZATION (default True).

Effect on the example data (q<=0.01, 5% global FLR): PhosphoRS yield 151 -> 163,
decoy-localization rate 4.3% -> 6.2%; phosphoRS ~3x slower (per-isoform
theoretical-spectrum generation). The comparison conclusion is unchanged --
PhosphoRS still recovers far fewer sites than AScore (163 vs 441), confirming
its lower reliability on this data is genuine, not an artifact of the missing
optimization.

+4 unit tests (depth selection, site-determining detection, tie-break, empties).
Full suite: 178 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 31, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 10189306-3901-424e-a594-4c04f4c2817e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/phosphors-peak-depth-opt

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented May 31, 2026

Not up to standards ⛔

🔴 Issues 7 high · 2 medium · 21 minor

Alerts:
⚠ 30 issues (≤ 0 issues of at least minor severity)

Results:
30 new issues

Category Results
Documentation 21 minor
Security 7 high
Complexity 2 medium

View in Codacy

🟢 Metrics -56 complexity · -11 duplication

Metric Results
Complexity -56
Duplication -11

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

timosachsenberg and others added 2 commits May 31, 2026 15:40
…optimizer

Deletes the now-unused window-reduction code paths (none had any call site):
- _reduce_by_delta_selection  (buggy: inverted depth ratio, experimental peak
  count used as the binomial n) and _reduce_spectrum_by_windows;
- their exclusive helpers _site_determining_ions, _window_intensity_thresholds,
  _count_peaks_above_threshold, _get_intensity_thresholds;
- the now-orphaned MIN_DEPTH constant.

The faithful _reduce_by_peak_depth_optimization replaces them; _get_window_indexes
is kept (used by it). ~371 lines removed; behavior unchanged. Full suite: 178 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lity w (#40)

p = N*d/w (phosphoRS, Taus et al. 2011 section 13) where w is the FULL m/z range
of the MS/MS spectrum. The code used the m/z span of the *extracted* peak list
instead, which is narrower -> p too large -> scores slightly over-conservative.
Use the original spectrum's m/z range (N still = extracted peak count).

Before/after on the full dataset (data/1.mzML, 3697 PSMs): 81/1989 per-PSM
PhosphoRS_pep_score values change, but the 3-way decoy-AA FLR is unchanged
(PhosphoRS 163 sites @ 5% FLR, decoy rate 6.2%; LucXor 574 / AScore 441) -- the
deviation was real but immaterial to the comparison here. Kept for fidelity.

Full suite: 178 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@timosachsenberg timosachsenberg merged commit c01eb28 into fix/decoy-flr-scoring-bugs May 31, 2026
1 of 2 checks passed
@timosachsenberg
Copy link
Copy Markdown
Contributor Author

Fast-forwarded into fix/decoy-flr-scoring-bugs (PR #41), which now contains all three commits here (faithful peak-depth optimizer 499732e, dead-reducer removal 9594a96, full-range w fix c01eb28). All review/merge now happens on #41 against main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant