Bugfixes and improvements in Evaluation module #228
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##           master     #228      +/-   ##
==========================================
+ Coverage   92.46%   92.57%   +0.10%
==========================================
  Files          13       13
  Lines        2788     2814      +26
  Branches      379      383       +4
==========================================
+ Hits         2578     2605      +27
- Misses        152      153       +1
+ Partials       58       56       -2
```
Force-pushed from d4cb619 to 38e102e.
remrama left a comment:
Thank you for the cleanup! Looks great, and I'm happy to see this module get sanitized; it makes it much more approachable for future features. Nice bug catches as well 🫠
Oh and I figure the docstrings are being skipped for development purposes but will be unskipped prior to v0.8?
| # | Issue | Status | Notes |
|---|-------|--------|-------|
| 6 | **Log transformation** missing | ⚠️ Deferred | Planned: `log_transform` param + Euser et al. (2008) back-transform. Separate PR. |
| 7 | **Individual discrepancy heatmap** (`indDiscr.R`) | ❌ | No equivalent; data available via `get_sleep_stats()` |
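For context on the deferred log-transform item, here is a minimal standalone sketch of Euser et al. (2008)-style back-transformed limits of agreement. The function name `log_loa` and its shape are illustrative assumptions, not the planned YASA API:

```python
import numpy as np

def log_loa(ref, test):
    """Bland-Altman limits of agreement on log-transformed data,
    back-transformed to multiplicative (ratio) limits, in the spirit
    of Euser et al. (2008). Hypothetical helper, not YASA code."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    # Differences on the log scale are log-ratios of each pair
    diff = np.log(test) - np.log(ref)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    # Back-transform: exp() turns log-scale limits into ratio limits
    return np.exp(bias), np.exp(bias - half_width), np.exp(bias + half_width)
```

The back-transformed values read as multiplicative factors (e.g. a bias factor of 1.05 means the test device measures 5% higher on average), which is why the log transform is attractive for skewed sleep statistics.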
Separate PR that can go with ROC curves (for general plotting improvements to this module).
Oh, and RE: my separate …
Force-pushed from 38e102e to d4cb619.
All PR comments addressed!
Fantastic, thanks @raphaelvallat.
1. `evaluation.py` line ~341 — Removed the sentence "This corresponds to R's `metricsType="sum"` in the Menghini et al. 2021 pipeline." from the `pooled` parameter docstring.
2. `evaluation.py` — Renamed the `n_sleeps` property to `n_sessions` in `EpochByEpochAgreement` (making it consistent with `SleepStatsAgreement.n_sessions`), updated all 12 internal references.
3. `evaluation.py` line ~423 — Fixed the column order in the `get_agreement_bystage` docstring to match the actual output: `fbeta`, `npv`, `precision`, `recall`, `specificity`, `support` (alphabetical).
4. `evaluation.py` line ~431 — Changed `df.values.T` → `df.to_numpy().T` in the scorer function.
5. `evaluation.py` line ~1007 — Added `strict=True` to the `zip()` call for parametric LoA computation.
6. `evaluation.py` line ~1342 — Replaced the one-liner comment on `_unit()` with a multi-line docstring explaining that it handles all stats from `sleep_statistics()`.
7. `tests/test_evaluation.py` line 16 — Renamed `N_NIGHTS` → `N_SESSIONS` throughout the test file, and updated `test_n_sleeps` → `test_n_sessions` with `ebe.n_sleeps` → `ebe.n_sessions`.
Force-pushed from b39ccb3 to 5a84f9c.
Critical bugfixes and several improvements in Evaluation module.
I asked Claude to do a thorough comparison of the YASA Evaluation pipeline against the Menghini R pipeline.
@remrama please review the proposed bugfixes and the new `report` method that replaces `get_table()`. These are described in `evaluation_review.md`. I also renamed some of the column names/parameters (e.g. `"parm"` → `"param"`). Note that I've also started adding a validation of the main metrics against the Menghini pipeline using their sample dataset (not committed yet).
Once we have finalized the code changes in this PR, I'll open another PR with a dedicated tutorial comparing the Menghini pipeline versus YASA.
Update: I'm just seeing your recent changes in the evaluation branch..! Let me know how you'd like to move forward. We can merge your branch/PR first.