
WIP: Add GED transformer #13259


Draft · wants to merge 48 commits into main

Conversation

Genuster
Contributor

@Genuster Genuster commented May 22, 2025

What does this implement/fix?

Adds transformer for generalized eigenvalue decomposition (or approximate joint diagonalization) of covariance matrices.
It generalizes the Xdawn, CSP, SSD, and SPoC algorithms.

Additional information

Steps:

  • test that it outputs the same filters and patterns as the child classes by adding temporary assert_allclose calls to the code
  • add test coverage for _GEDTransformer and the core functions
  • add _validate_params to _XdawnTransformer
  • add feature to perform GED in the principal subspace for Xdawn and SPoC
  • add option for CSP and SSD to select restr_type and provide info for CSP
  • add an entry to MNE's implementation details
  • switch SSD and Xdawn pattern computation from np.linalg.pinv to MNE's pinv
  • change SSD's multiplication order for dimension reduction for consistency
  • fix SSD's filters_ shape inconsistency
  • move mne.preprocessing._XdawnTransformer to decoding and make it public
  • rename _XdawnTransformer method_params to cov_method_params for consistency
  • SSD performs spectral sorting of components each time .transform() is applied. This could be optimized by sorting filters_, evals_ and patterns_ already in fit, which would also suit the current GED design better
  • in SSD's .transform() when return_filtered=True, subsetting with self.picks_ is done twice, which looks like a bug
  • remove assert_allclose calls in code
  • clean up newly redundant code from the child classes
  • complete tests for child classes
  • perhaps SSD should use mne.cov.compute_whitener() instead of its own whitener implementation. It won't be identical, but conceptually it seems to do the same thing

Then it should be ready for merge!
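The core GED step the plan describes can be sketched roughly like this (a minimal illustration built on SciPy's generalized eigensolver; the helper name `ged_fit` is hypothetical and not MNE's actual API):

```python
import numpy as np
from scipy.linalg import eigh


def ged_fit(S, R):
    """Hypothetical sketch: generalized eigendecomposition of two covariances.

    Solves S v = lambda R v, sorts by descending eigenvalue, and returns
    spatial filters (rows) plus the corresponding patterns via pinv.
    """
    evals, evecs = eigh(S, R)            # generalized eigenproblem
    order = np.argsort(evals)[::-1]      # descending eigenvalues
    evals, evecs = evals[order], evecs[:, order]
    filters = evecs.T                    # rows are spatial filters
    patterns = np.linalg.pinv(filters)   # corresponding spatial patterns
    return evals, filters, patterns


# Toy covariance matrices from random data
rng = np.random.default_rng(42)
A = rng.standard_normal((5, 50))
B = rng.standard_normal((5, 50))
S = A @ A.T / 50
R = B @ B.T / 50 + np.eye(5) * 1e-6      # keep R positive definite
evals, filters, patterns = ged_fit(S, R)
```

Depending on the algorithm (CSP, SPoC, Xdawn, SSD), S and R would be different signal/reference covariances, which is what makes a single transformer able to generalize all four.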

@larsoner
Member

Already have a failure but fortunately it's just a tol issue I think:

mne/decoding/tests/test_csp.py:444: in test_spoc
    spoc.fit(X, y)
mne/decoding/csp.py:985: in fit
    np.testing.assert_allclose(old_filters, self.filters_)
E   AssertionError: 
E   Not equal to tolerance rtol=1e-07, atol=0
E   
E   Mismatched elements: 1 / 100 (1%)
E   Max absolute difference among violations: 9.04019248e-09
E   Max relative difference among violations: 1.11806536e-07
E    ACTUAL: array([[  2.037415,   1.424886,   2.718162,  -3.07798 ,  -3.862132,
E             1.412549,  -3.821452,   1.276637,   1.899782,  -2.389858],
E          [ 11.534231, -22.178034, -12.321628, -52.410096,  62.876084,...
E    DESIRED: array([[  2.037415,   1.424886,   2.718162,  -3.07798 ,  -3.862132,
E             1.412549,  -3.821452,   1.276637,   1.899782,  -2.389858],
E          [ 11.534231, -22.178034, -12.321628, -52.410096,  62.876084,...

I would just bump the rtol a bit here to 1e-6, and if you know the magnitudes are in the single/double digits then an atol=1e-7 would also be reasonable (could do both).
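For illustration, the suggested tolerances would be applied like this (toy values standing in for the real filter matrices):

```python
import numpy as np

old_filters = np.array([2.037415, 11.534231, -22.178034])
# simulate the ~1e-8 absolute cross-platform jitter seen in the CI failure
new_filters = old_filters + 1e-8

# rtol=1e-6 absorbs the ~1e-7 relative difference; atol=1e-7 additionally
# covers elements that are close to zero, where rtol alone is too strict
np.testing.assert_allclose(old_filters, new_filters, rtol=1e-6, atol=1e-7)
```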

@Genuster
Contributor Author

Genuster commented May 22, 2025

Thanks!
Interesting how it passed macos-13/mamba/3.12, but didn't pass macos-latest/mamba/3.12

It might be that the small difference between filters_ will propagate and increase in patterns_, so rtol/atol won't be of much help for patterns_. But let's see
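A toy illustration of why a tolerance bump may not rescue patterns_: the pseudoinverse used to compute patterns can amplify a tiny perturbation of the filters (all values here are synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
filters = rng.standard_normal((10, 10))
# simulate a ~1e-8 cross-platform difference in the fitted filters
perturbed = filters + 1e-8 * rng.standard_normal((10, 10))

patterns = np.linalg.pinv(filters)
patterns_pert = np.linalg.pinv(perturbed)

# the inverse can amplify the perturbation by roughly the condition number
amplification = np.max(np.abs(patterns - patterns_pert)) / 1e-8
```

How large the amplification gets depends on the conditioning of the fitted filters, which is why the effect can differ across platforms and datasets.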

@larsoner
Member

Different architectures, macos-13 is Intel x86_64 and macos-latest is ARM / M1. And Windows also failed, could be use of MKL there or something. I'm cautiously optimistic it's just floating point errors...

@Genuster
Copy link
Contributor Author

Genuster commented Jun 3, 2025

@larsoner, I think I covered tests for the core GEDTransformer cases. Could you check whether it's enough, so I can move on to the next step?

Member

@larsoner larsoner left a comment


Great that the assert statements are passing! Just a few comments below. Also, can you see if you can get closer to 100% coverage here?

https://app.codecov.io/gh/mne-tools/mne-python/pull/13259

Member

@larsoner larsoner left a comment


Just a couple more comments.

FYI I modified your top comment to have checkboxes (you can see how it's done if you go to edit it yourself) and a rough plan. Can you see if the plan is what you have in mind and update accordingly if needed? Then I can see where you (think you) are after your next push, and when you ask "okay to move on" I'll know what you want to do next 😄

@Genuster
Contributor Author

Genuster commented Jun 4, 2025

Thanks Eric!

Great that the assert statements are passing! Just a few comments below. Also, can you see if you can get closer to 100% coverage here?
https://app.codecov.io/gh/mne-tools/mne-python/pull/13259

That's a cool tool, I like it! Will do

FYI I modified your top comment to have checkboxes (you can see how it's done if you go to edit it yourself) and a rough plan. Can you see if the plan is what you have in mind and update accordingly if needed? Then I can see where you (think you) are after your next push, and when you ask "okay to move on" I'll know what you want to do next 😄

Alright :)

@Genuster
Contributor Author

Hi @larsoner! Is there anything else I should improve before moving on to the next steps?

Also, could you please confirm that I'm not confused with these two?

  • SSD performs spectral sorting of components each time .transform() is applied. This could be optimized by sorting filters_, evals_ and patterns_ already in fit and will suit current GED design better.
  • in SSD's .transform() when return_filtered=True, subsetting with self.picks_ is done two times - looks like bug.

@larsoner
Member

SSD performs spectral sorting of components each time .transform() is applied. This could be optimized by sorting filters_, evals_ and patterns_ already in fit and will suit current GED design better.

I think this should be okay. Is the sorting by variance explained or something? Before they're sorted, how are they ordered? If it's random, for example, it's fine to sort, but if it's not random and has some meaning then maybe we shouldn't store them sorted some other way...

in SSD's .transform() when return_filtered=True, subsetting with self.picks_ is done two times - looks like bug.

Yes it sounds like it probably is. Best thing you can do is prove to yourself (and us in review 😄 ) that it's a bug by pushing a tiny test that would fail on main but pass on this PR. Maybe by taking the testing dataset data which has MEG channels followed by EEG channels, and telling it to operate on just EEG channels or something? If it picks twice the second picking should fail because the indices will be out of range...
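A minimal sketch of the failure mode being described, using synthetic data rather than MNE objects: once the data have been subset to the picked channels, applying the original picks a second time indexes out of range.

```python
import numpy as np

# Toy data: 6 "MEG" channels followed by 4 "EEG" channels
X = np.random.default_rng(0).standard_normal((10, 100))
picks = np.array([6, 7, 8, 9])  # EEG channel indices in the full array

X_eeg = X[picks]  # first subsetting: shape (4, 100)

# Applying the same picks a second time fails, because indices 6..9
# are out of range for the already-subset 4-channel array
raised = False
try:
    X_eeg[picks]
except IndexError:
    raised = True
```

This is why a test that asks SSD to operate on a channel subset whose indices exceed the subset's length would fail on main if the double picking is real.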

@Genuster
Contributor Author

I think this should be okay. Is the sorting by variance explained or something? Before they're sorted, how are they ordered? If it's random, for example, it's fine to sort, but if it's not random and has some meaning then maybe we shouldn't store them sorted some other way...

They are stored sorted by descending eigenvalues by default. But given the sort_by_spectral_ratio parameter, I think it is expected that the filters will be stored according to that sorting when the parameter is True.
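A rough sketch of the proposed change, sorting the fitted attributes once in fit instead of on every transform() call (the attribute and variable names here are illustrative, not SSD's exact internals):

```python
import numpy as np

rng = np.random.default_rng(1)
n_comp, n_ch = 4, 6
# stand-ins for attributes set during fit
filters_ = rng.standard_normal((n_comp, n_ch))
patterns_ = rng.standard_normal((n_comp, n_ch))
evals_ = rng.random(n_comp)
# stand-in for the spectral ratios SSD computes
# when sort_by_spectral_ratio=True
spec_ratio = rng.random(n_comp)

# sort everything once, by descending spectral ratio, at fit time
order = np.argsort(spec_ratio)[::-1]
filters_, patterns_, evals_ = filters_[order], patterns_[order], evals_[order]
```

With the attributes stored pre-sorted, transform() can just apply filters_ directly, which also matches how the GED transformer stores its results.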

Best thing you can do is prove to yourself (and us in review 😄 ) that it's a bug by pushing a tiny test that would fail on main but pass on this PR.

I pushed the test here, but how do I show you that it fails on main?

Is there anything left to do before I can remove the asserts and clean up the classes?

@larsoner
Member

I pushed the test here, but how do I show you that it fails on main?

If this were true test-driven development (TDD) you could for example make this your first commit, show it failed on CIs, then proceed to fix it in subsequent commits. But in practice this isn't typically done and would be annoying to do here, so what I would suggest is that you take whatever tiny test segment should fail on main, copy it, switch over to the main branch, paste it back in, and make sure it fails. In principle I could do this as well but better for you to give it a shot and say "yep it failed" and I'll trust you to have done it properly 😄

Is there anything left to do before I can remove the asserts and clean up the classes?

Currently I see a bunch of red CIs. So I would say yes, get those green first. An implicit sub-step of all steps is "make sure CIs are still green" before moving onto the next step (unless of course one of your steps someday is something like, "TDD: add breaking test and see CIs red" or whatever, but it's not that way here). Then proceeding with the plan in the top comment looks good to me, the next step of which is to remove the asserts, then redundant code, etc.!

@Genuster
Contributor Author

Genuster commented Jun 26, 2025

If this were true test-driven development (TDD) you could for example make this your first commit, show it failed on CIs, then proceed to fix it in subsequent commits. But in practice this isn't typically done and would be annoying to do here, so what I would suggest is that you take whatever tiny test segment should fail on main, copy it, switch over to the main branch, paste it back in, and make sure it fails. In principle I could do this as well but better for you to give it a shot and say "yep it failed" and I'll trust you to have done it properly 😄

Of course, I checked that it fails before I even started implementing the fix. Thanks for the explanation!

Currently I see a bunch of red CIs. So I would say yes, get those green first. An implicit sub-step of all steps is "make sure CIs are still green" before moving onto the next step (unless of course one of your steps someday is something like, "TDD: add breaking test and see CIs red" or whatever, but it's not that way here). Then proceeding with the plan in the top comment looks good to me, the next step of which is to remove the asserts, then redundant code, etc.!

Yeah, I rushed a bit with a comment before checking that everything's green :) I handled the major problem, but the failure I get now doesn't make much sense to me. The issue seems to be order-related (which is fair given my last commits), but it doesn't appear on my Windows machine or on most of the CIs here. Is there a way to get a more elaborate traceback?
