
[draft] new reader for curry files, using curry-python-reader code #13176


Draft: wants to merge 6 commits into base: main

Conversation

dominikwelke
Contributor

@dominikwelke dominikwelke commented Mar 26, 2025

hi all
as discussed in #13033, here is a draft PR to use the official Curry reader code.

in a first step I just use the reader as a module (I only fixed formatting to force it past pre-commit).
it has some drawbacks (e.g. data is always loaded) and I did not implement all possible data yet (e.g. HPI, or epoched recordings), but in general it already works pretty well. I tested it with all their example data, and with one of my own recordings that didn't work in MNE before.

it would be great to get some feedback on how you want me to proceed with this @drammock @larsoner:

  • do we want to stick with the module approach, leave their code untouched, and work with the output (this would allow easier updating when they push changes)?
  • or should I merge the code more thoroughly, making it easier to maintain and clearer?

BACKGROUND:
the Curry data reader currently can't handle all/newer Curry files.
the plan is to port code from the official Curry reader into mne-python.

for permission see #12855

closes #12795 #13033 #12855

@@ -0,0 +1,633 @@
# Authors: The MNE-Python contributors.
# License: BSD-3-Clause
# Copyright the MNE-Python contributors.
Member
is this file the official reader? is this file copied from somewhere?

Contributor Author

@dominikwelke dominikwelke Mar 28, 2025

yes, it's copied from https://github.com/neuroscan/curry-python-reader

the false info you flagged was added by [autofix.ci]
further formatting changes were necessary to pacify the pre-commit hook

Contributor Author

@dominikwelke dominikwelke Mar 28, 2025

you discussed this topic with a Compumedics dev in #12855

they said they won't supply a PyPI or conda version, but we are free to use the code.
the GitHub repo has a BSD-3 license applied, but they don't include any note in the file itself

@drammock
Member

do we want to stick with the module approach, leave their code untouched and work with the output (would allow easier updating when they push changes), or should I merge the code more thoroughly, making it easier to maintain and clearer?

Given that @CurryKaiser refused our offer to help them package up their reader for PyPI / conda-forge, I see two remaining options:

  1. "vendor" their code. To make it slightly future-proof, we could write a script (in tools/ I guess) that fetches the latest code from their repo, auto-formats it to make it compatible with MNE's pre-commit requirements, and puts the (formatted but otherwise unmodified) code in mne/io/curry/_vendored.py. (This is basically a manual version of git submodule update because I don't think we should invoke git submodule for this use case.) We then adapt our code in mne/io/curry.py to be a wrapper around their code that basically just gets things into proper MNE container objects; and know that we might need to tweak our wrappers any time the vendored code is updated.

  2. Fully incorporate their code. Re-write their reader to align better with our codebase, in terms of variable names, idioms like _check_option or _validate_type, features like preload=False, etc.

Personally I lean toward option 2. I say this because if we're going to try to support curry files, at a minimum we need to be able to fix bugs when they arise, and ideally we should be willing/able to incorporate new features that have high demand from our users (preload=False is an obvious first example). But if we're fixing bugs, do we open PRs to upstream (with no guarantee of responsiveness), or tweak our "thin" wrapper to handle more and more corner cases? Neither option is appealing, so at that point it starts to seem easier to me to just maintain the entire reader ourselves.
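For option 1, the fetch-and-vendor script drammock sketches above could look roughly like this. Everything here is an assumption for illustration: the upstream file name (`curryreader.py`), branch, and destination path would need to match the real repo layout, and the auto-formatting step (e.g. running ruff) would follow separately.

```python
"""Hypothetical sketch of the tools/ vendoring script described above."""

from pathlib import Path
from urllib.request import urlopen


def raw_github_url(org, repo, branch, path):
    """Build the raw.githubusercontent.com URL for a file in a GitHub repo."""
    return f"https://raw.githubusercontent.com/{org}/{repo}/{branch}/{path}"


def vendor_curry_reader(dest="mne/io/curry/_vendored.py"):
    """Fetch the upstream reader and write it, unmodified, to the vendored path.

    Auto-formatting to satisfy MNE's pre-commit hooks would run after this,
    keeping the code otherwise untouched so upstream diffs stay reviewable.
    """
    # "curryreader.py" is an assumed file name, not confirmed from the source.
    url = raw_github_url("neuroscan", "curry-python-reader", "main", "curryreader.py")
    code = urlopen(url).read().decode("utf-8")
    Path(dest).parent.mkdir(parents=True, exist_ok=True)
    Path(dest).write_text(code)
```

This is basically the "manual `git submodule update`" drammock describes: one script, one vendored file, no submodule machinery.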

@agramfort
Member

agramfort commented Mar 28, 2025 via email

@drammock
Member

hum... my first reaction is to push a version 0.1 of their package on pypi and rely on this. Basically we maintain a fork and hope that the fork changes are accepted upstream... it feels less hacky and they also have a CI and some testing setup with test_data that I would not duplicate in mne-python...

that indeed is less hacky than my approach to vendoring. I'd be OK with that outcome, though curious what @larsoner will think.

@larsoner
Member

I'm fine with that idea but it would be good to get some blessing/permission from them to do this

@drammock
Member

@CurryKaiser

I'm fine with that idea but it would be good to get some blessing/permission from them to do this

xref to #12855 (comment) where I've asked for confirmation that Compumedics really doesn't want to be the packager and they're OK with us doing it.

@CurryKaiser

@CurryKaiser

I'm fine with that idea but it would be good to get some blessing/permission from them to do this

xref to #12855 (comment) where I've asked for confirmation that Compumedics really doesn't want to be the packager and they're OK with us doing it.

And nothing has changed, so all good from our side. Sorry we couldn't package it for you. And thank you for working on this!

@dominikwelke
Contributor Author

thanks @CurryKaiser !

ok, sounds like a plan. I can start working on this again soon, if you give the go-ahead @agramfort @drammock @larsoner

I guess the fork should live in the mne-tools org? I have the necessary rights to create it

@dominikwelke dominikwelke changed the title [draft] new reader for curry files, using curry-pyhon-reader code [draft] new reader for curry files, using curry-python-reader code Apr 1, 2025
@larsoner
Member

larsoner commented Apr 1, 2025

Yeah I think so

@drammock
Member

drammock commented Apr 1, 2025

Yeah I think so

I already made the fork

@agramfort
Member

agramfort commented Apr 1, 2025 via email

@drammock
Member

drammock commented Apr 2, 2025

xref to mne-tools/curry-python-reader#1

@dominikwelke
Contributor Author

I could use some guidance on 2 things:

  1. channel locations:
    Curry files come with channel locations, and for EEG it was straightforward to build a montage and apply it.
    but for MEG it seems I need to use other functions. any pointers would help!
    do I need to populate info["dig"] directly?

  2. HPI/CHPI data:
    some MEG files seem to come with these data. how do I store them in the raw object?
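The "straightforward" EEG montage step mentioned in point 1 can be sketched roughly as below. The unit conversion is an assumption (Curry positions arriving in millimeters, while MNE expects meters), and the channel names and coordinates are made up for illustration; the actual `make_dig_montage` call is shown commented out to keep the sketch self-contained.

```python
# Sketch: turn Curry-style channel locations into the ch_pos dict that
# mne.channels.make_dig_montage accepts. Assumed input: one (x, y, z) row
# per channel, in millimeters.
import numpy as np


def curry_positions_to_ch_pos(ch_names, positions_mm):
    """Map channel names to 3D positions in meters (MNE's convention)."""
    positions_m = np.asarray(positions_mm, dtype=float) / 1000.0
    return {name: pos for name, pos in zip(ch_names, positions_m)}


# Hypothetical example channels/coordinates:
ch_pos = curry_positions_to_ch_pos(
    ["Fp1", "Fp2"], [[-27.0, 83.0, -3.0], [27.0, 83.0, -3.0]]
)
# montage = mne.channels.make_dig_montage(ch_pos=ch_pos, coord_frame="head")
# raw.set_montage(montage)
```

For MEG the situation differs because coil positions and orientations live in `info["chs"][i]["loc"]` rather than in a montage, which is presumably why the montage approach stops working there.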

a few other things to discuss:

  • preload
    easiest would be to not offer preload=False and just load the data into memory.
    a single load_data() call would also be doable with the official reader, but a chunk reader not really (if we don't want to hack it, e.g. load all data and discard large parts). not sure I'm deep enough in the MNE codebase to know what the implications are (e.g. computations, plots etc. with unloaded data)
    what are your thoughts?

  • epoched files
    the reader code looks as if there could be files with epoched recordings, but there are none among their sample files. do any of you know more about this? otherwise I'll ask the Curry devs

@dominikwelke
Contributor Author

p.s. and could you remind me how to switch off the CIs when pushing these early commits?

@larsoner
Member

larsoner commented Apr 3, 2025

Push commits with [ci skip] in the commit message and the long/expensive CIs shouldn't run (a few quick ones still will, I think)

@larsoner
Member

larsoner commented Apr 3, 2025

... for the cHPI stuff it's probably easiest to load a Neuromag raw FIF file with cHPI info and look at how the info is stored for example in info["hpi_subsystem"]. You can also look at the Info docs, especially the notes. It's not complete but it will help.

For preload, since preload=False is in main it would be a functionality regression to remove it. Once you know how the data are stored on disk and how to read an arbitrary time slice from it, it's really not bad to do the work to make preload=False work. So if you can figure this part out in some function, I can help you fit it into the _read_segment_file code. Since the .py file is only a few hundred lines (a lot of which seems like plotting etc. that we won't use), I'm cautiously optimistic we can figure it out and make it work. And then the curry-python-reader code can really be for reading metadata, annotations, sensor locs, etc., plus knowing where to read the data from disk. We can probably even keep the existing _read_segment_file, it should work in theory...
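The core of what larsoner describes, reading an arbitrary time slice straight from disk so preload=False can work, can be sketched like this. The on-disk layout is an assumption (flat little-endian float32, sample-major, no header offset); real Curry files may differ, and MNE's actual `_read_segment_file` additionally applies calibration and writes into a preallocated buffer.

```python
# Sketch: read only samples [start, stop) from a flat binary data file,
# instead of loading everything. Assumed layout (not from the source):
# little-endian float32, shape (n_samples, n_channels), no header.
import numpy as np


def read_time_slice(fname, n_channels, start, stop, dtype="<f4"):
    """Return an (n_channels, stop - start) array read directly from disk."""
    itemsize = np.dtype(dtype).itemsize
    offset = start * n_channels * itemsize  # byte position of first sample
    count = (stop - start) * n_channels
    with open(fname, "rb") as fid:
        fid.seek(offset)
        block = np.fromfile(fid, dtype=dtype, count=count)
    # stored sample-major on disk; MNE wants channels as the first axis
    return block.reshape(stop - start, n_channels).T
```

With a function like this in hand, the vendored reader only needs to supply the metadata (file name, channel count, dtype, header offset) and `_read_segment_file` handles the chunked access.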

@dominikwelke
Contributor Author

dominikwelke commented Apr 14, 2025

ok, _read_segment_file does indeed work unchanged.
the reader should now be more or less functional

  • I'd still need some guidance on handling/storing the channel locations, esp. for MEG data
  • HPI data - looks like I got it wrong - there might not be cHPI data after all, only HPI marker locations provided in different formats depending on the system

@dominikwelke
Contributor Author

@CurryKaiser
thanks for the permission to use the code, also from my side!

in another place you said you might be able to provide us with test files - could we perhaps get a small one with epoched recordings in it (the format version shouldn't matter)?
your repository for the python reader contains some test files that the reader interprets as epoched, but they don't seem to really be epoched (perhaps the files were truncated for size)

@CurryKaiser

Could be that they were truncated, let me check.

@CurryKaiser

Ok, try these:
EpochedData

@dominikwelke
Contributor Author

thanks for the file @CurryKaiser
fyi, we have now packaged and published the curryreader on PyPI.
it can be installed via pip install curryreader

@dominikwelke
Contributor Author

@drammock @larsoner @agramfort
it is on PyPI but not on conda-forge - how is this case dealt with in MNE? should we also submit it to conda-forge?

currently pip install mne[full] fetches it, but conda env create --file environment.yml doesn't

related question:
which pip dependency level in pyproject.toml should this go to? I treated curryreader like the antio package (for ANT Neuro files), but this makes it an optional requirement (in mne[full]). I believe this means it won't be automatically installed when calling pip install mne?
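The dependency placement described above would look roughly like the fragment below in pyproject.toml. This is a sketch based only on the comment (curryreader grouped with antio as an optional extra); the group name and the rest of MNE's actual extras are assumptions.

```toml
# Hypothetical pyproject.toml fragment: curryreader as an optional extra,
# so `pip install mne[full]` pulls it in but plain `pip install mne` does not.
[project.optional-dependencies]
full = [
    "antio",
    "curryreader",
]
```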

@larsoner
Member

it is on PyPI but not on conda-forge - how is this case dealt with in MNE? should we also submit it to conda forge?

Yeah use grayskull, it's not too painful, see for example conda-forge/staged-recipes#28279

@dominikwelke
Contributor Author

Yeah use grayskull,

see conda-forge PR: conda-forge/staged-recipes#29754

Successfully merging this pull request may close these issues: MEG data problem consulting

5 participants