Separate annotation from scoring and restructure codebase #10

Merged
RalfG merged 7 commits into release/0.5 from feat/restructuring on Apr 13, 2026


RalfG commented Apr 11, 2026

Decouples spectrum annotation from feature computation so annotations can be performed once and reused by multiple feature generators. Also restructures the codebase into logical modules and adds support for all 6 primary ion series.


Added

  • FragmentAnnotation and AnnotatedMS2Spectrum types — peak-centric annotation representation using only plain Python types (no dependency types exposed)
  • annotate_ms2_spectra() function — standalone annotation step with configurable tolerance_value and tolerance_mode (ppm or Da)
  • score_ms2_spectra() function — computes scoring features from annotated spectra with explicit active_ion_series parameter
  • Support for all 6 primary ion series (a, b, c, x, y, z) across all fragmentation models; inactive series reported as NaN
  • ms2pip/ module — placeholder for future ms2pip-specific functionality
  • Python and Rust unit tests for annotation, scoring, and shared utilities
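
To illustrate the difference between the two tolerance_mode settings, here is a minimal sketch of a ppm versus Da match window. The helper name within_tolerance, its signature, and the exact mode strings are assumptions made for illustration; only the tolerance_value and tolerance_mode parameter names come from this PR.

```python
def within_tolerance(
    observed_mz: float,
    theoretical_mz: float,
    tolerance_value: float,
    tolerance_mode: str,
) -> bool:
    """Return True if the observed peak falls inside the match window.

    Hypothetical helper for illustration; not the package implementation.
    """
    delta = abs(observed_mz - theoretical_mz)
    if tolerance_mode == "ppm":
        # Relative window: its width scales with the theoretical m/z.
        return delta <= theoretical_mz * tolerance_value * 1e-6
    if tolerance_mode == "Da":
        # Absolute window: same width at every m/z.
        return delta <= tolerance_value
    raise ValueError(f"unknown tolerance_mode: {tolerance_mode!r}")
```

At m/z 500, a 20 ppm window is 0.01 Da wide on each side, so the same numeric tolerance_value means very different things in the two modes.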

Changed

  • Restructured codebase into types/, io/, scoring/, ms2pip/ modules
  • lib.rs now only contains module declarations and pymodule registration
  • Feature set is now fixed across all fragmentation models (no longer hardcoded to b/y)
  • matched_ions_pct denominator now uses all active series instead of hardcoded 2
  • Spectrum intensities kept as f32 throughout, cast to f64 only at output boundary
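
The effect of the matched_ions_pct denominator change can be sketched as follows. The formula below is an assumption (matched fragments divided by the number of theoretical fragments, with each active series contributing peptide_length - 1 fragments); the PR only states that the denominator now depends on the number of active series rather than a hardcoded 2.

```python
def matched_ions_pct(n_matched: int, peptide_length: int, n_active_series: int) -> float:
    """Percentage of theoretical fragments that were matched.

    Assumed definition for illustration; the package may count fragments differently.
    """
    # Each active series is assumed to contribute (peptide_length - 1) fragments.
    n_theoretical = n_active_series * (peptide_length - 1)
    if n_theoretical == 0:
        return float("nan")
    return 100.0 * n_matched / n_theoretical


# Same match count, different denominators:
old_style = matched_ions_pct(12, peptide_length=11, n_active_series=2)  # hardcoded b/y
new_style = matched_ions_pct(12, peptide_length=11, n_active_series=3)  # e.g. three active series
```

Under this assumed definition, a model with three active series no longer has its percentage inflated by a denominator that only accounts for two.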

Removed

  • ms2_features_from_ms2spectra() — replaced by the two-step annotate_ms2_spectra() -> score_ms2_spectra() flow
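
The motivation for the two-step flow (annotate once, reuse the result in several feature generators) can be sketched with stand-in functions. Everything below is hypothetical: annotate, count_matched, and explained_intensity are local stand-ins, not the package's annotate_ms2_spectra() or score_ms2_spectra(), whose signatures are not shown in this PR description.

```python
from typing import Optional

Peak = tuple[float, float]  # (m/z, intensity)
AnnotatedPeak = tuple[float, float, Optional[str]]  # (m/z, intensity, label or None)


def annotate(peaks: list[Peak], theoretical: dict[str, float], tol_da: float) -> list[AnnotatedPeak]:
    """Stand-in annotation step: label each peak with the closest fragment within tol_da."""
    annotated = []
    for mz, intensity in peaks:
        label, best = None, tol_da
        for name, frag_mz in theoretical.items():
            if abs(mz - frag_mz) <= best:
                label, best = name, abs(mz - frag_mz)
        annotated.append((mz, intensity, label))
    return annotated


def count_matched(annotated: list[AnnotatedPeak]) -> int:
    """First consumer: number of annotated peaks."""
    return sum(1 for _, _, label in annotated if label is not None)


def explained_intensity(annotated: list[AnnotatedPeak]) -> float:
    """Second consumer: fraction of total intensity carried by annotated peaks."""
    total = sum(intensity for _, intensity, _ in annotated)
    matched = sum(intensity for _, intensity, label in annotated if label is not None)
    return matched / total if total else 0.0


peaks = [(175.119, 100.0), (300.0, 50.0), (304.161, 50.0)]
theoretical = {"y1": 175.119, "y2": 304.161}

annotated = annotate(peaks, theoretical, tol_da=0.02)  # the expensive step runs once
n_matched = count_matched(annotated)                   # both consumers reuse the result
fraction = explained_intensity(annotated)
```

With a single monolithic function, each feature generator would have had to repeat the matching step on its own.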

RalfG added 7 commits April 10, 2026 14:46
Group files by responsibility: types/ for Python-facing data structures,
io/ for file parsing and format dispatch, scoring/ for feature computation,
ms2pip/ for future ms2pip-specific functionality. No logic changes.
Peak-centric annotation types that mirror rustyms output using only
plain Rust/Python types. AnnotatedMS2Spectrum carries the original
spectrum data alongside per-peak fragment annotations.
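
As a rough picture of what "peak-centric" means here, the sketch below keeps one annotation list per peak next to the original spectrum arrays. The class names carry a Sketch suffix on purpose: the field names and layout are assumptions, and the actual FragmentAnnotation and AnnotatedMS2Spectrum definitions may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FragmentAnnotationSketch:
    # Hypothetical fields; plain Python types only, no dependency types.
    ion_series: str  # one of "a", "b", "c", "x", "y", "z"
    ion_index: int
    charge: int


@dataclass
class AnnotatedSpectrumSketch:
    mz: list[float]
    intensity: list[float]
    # annotations[i] holds every fragment matched to peak i (empty if unmatched).
    annotations: list[list[FragmentAnnotationSketch]] = field(default_factory=list)

    def peaks_for_series(self, series: str) -> list[int]:
        """Indices of peaks that carry at least one annotation from the given series."""
        return [
            i
            for i, anns in enumerate(self.annotations)
            if any(a.ion_series == series for a in anns)
        ]


spectrum = AnnotatedSpectrumSketch(
    mz=[147.11, 200.00, 175.12],
    intensity=[10.0, 3.0, 25.0],
    annotations=[
        [FragmentAnnotationSketch("b", 1, 1)],
        [],  # unmatched peak is kept, with no annotations
        [FragmentAnnotationSketch("y", 1, 1)],
    ],
)
```

Because unmatched peaks stay in the structure, a downstream scorer can still compute totals over the whole spectrum without re-reading the input file.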
Extract annotation logic into standalone pyfunction that produces
AnnotatedMS2Spectrum output. Supports all 6 ion series (a/b/c/x/y/z)
and exposes tolerance_value + tolerance_mode parameters.
…ctra

Replace monolithic function with score_ms2_spectra() that consumes
AnnotatedMS2Spectrum. Fixed feature set for all 6 ion series with NaN
for inactive series. Extract shared math helpers to utils.rs.
Add 16 Python tests for annotate_ms2_spectra and score_ms2_spectra,
plus Rust unit tests in utils.rs. Fix empty spectrum panic, type
mismatches, clippy warnings, and PyO3 deprecation warnings.
…iciency

- Remove duplicate f32/f64 arrays in OwnedSpec, convert inline
- Cache parsed peptides to avoid double-parsing
- Extract empty_annotated helper for repeated return blocks
- Replace per-series HashMaps with fixed [_; 6] arrays
- Pre-compute feature name strings outside parallel loop
- Keep intensity accumulation in f32, cast to f64 at output boundary
- Use byte parsing in parse_ion_series_and_index to avoid heap allocs
- Fix stale comment in spectrum_prediction.rs, powf -> exp2
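
Two of the items above (byte parsing of ion labels and fixed six-slot arrays in place of per-series HashMaps) can be sketched together. This is a Python rendering of the logic only: the heap-allocation savings apply to the Rust code, and the label format (one series letter followed by a decimal index) and the function names are assumptions, not the actual parse_ion_series_and_index implementation.

```python
SERIES = b"abcxyz"  # fixed order; a letter's position doubles as its array slot


def parse_ion_label(label: bytes) -> tuple[int, int]:
    """Parse e.g. b"y12" into (series_slot, ion_index) by walking the raw bytes."""
    if len(label) < 2:
        raise ValueError(f"label too short: {label!r}")
    slot = SERIES.find(label[:1])
    if slot < 0:
        raise ValueError(f"unknown ion series in {label!r}")
    index = 0
    for byte in label[1:]:
        digit = byte - 48  # 48 is the ASCII code of "0"
        if not 0 <= digit <= 9:
            raise ValueError(f"non-digit in ion index of {label!r}")
        index = index * 10 + digit
    return slot, index


def intensity_per_series(matches: list[tuple[bytes, float]]) -> list[float]:
    """Sum matched intensity into a fixed six-slot list, one slot per series."""
    totals = [0.0] * len(SERIES)
    for label, intensity in matches:
        slot, _ = parse_ion_label(label)
        totals[slot] += intensity
    return totals


totals = intensity_per_series([(b"b2", 10.0), (b"y1", 5.0), (b"b3", 2.5)])
```

Indexing by slot keeps the per-series accumulators in one contiguous, fixed-size structure, which is the property the array-based Rust version relies on.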
score_ms2_spectra now takes explicit active_ion_series so the caller
specifies which series the fragmentation model produces, rather than
inferring from matched annotations.
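
A small sketch of why the explicit parameter matters: if the active series are inferred from the matches, a series that the model produces but that happened to match nothing is indistinguishable from a series the model never produces. The function name, the label format, and the per-series count feature are hypothetical; only the active_ion_series parameter name and the NaN convention come from this PR.

```python
import math

ALL_SERIES = ("a", "b", "c", "x", "y", "z")


def matched_counts(matched_labels: list[str], active_ion_series: set[str]) -> dict[str, float]:
    """Per-series match counts: NaN for inactive series, 0.0 for active but unmatched."""
    counts = {}
    for series in ALL_SERIES:
        if series in active_ion_series:
            counts[series] = float(
                sum(1 for label in matched_labels if label.startswith(series))
            )
        else:
            counts[series] = math.nan
    return counts


matched = ["b2", "b3"]  # the y series is active but nothing matched

explicit = matched_counts(matched, {"b", "y"})
inferred = matched_counts(matched, {label[0] for label in matched})
```

Here explicit["y"] is 0.0, a real observation about the spectrum, while inferred["y"] is NaN, which wrongly reports the series as not applicable.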
RalfG merged commit 929d840 into release/0.5 Apr 13, 2026
4 checks passed
RalfG added this to the 0.5.0 milestone Apr 13, 2026
RalfG deleted the feat/restructuring branch April 13, 2026 06:28