Refactor _handle_events_reading to allow extracting annotation information stand-alone #1389

Merged: 21 commits, Apr 16, 2025
Changes from 16 commits
Commits
c2d078a
refactoring to get events_file info standalone
matthiasdold Apr 3, 2025
61fd6bb
shortened docstring
matthiasdold Apr 3, 2025
c3f3317
added authors and whats_new
matthiasdold Apr 3, 2025
8428c75
remove _ for private naming convention of events_file_to_annotation_k…
matthiasdold Apr 3, 2025
f17ef79
corrected _ in events_file_to_annotation_kwargs
matthiasdold Apr 3, 2025
44a885e
added html-noplot directive to check local doc builds without plots
matthiasdold Apr 4, 2025
49fa312
back to private naming convention
matthiasdold Apr 4, 2025
2942010
removing func handles as modified are private functions
matthiasdold Apr 4, 2025
f98ac2b
added unit tests for events_file_to_annotation_kwargs
matthiasdold Apr 7, 2025
2dd91cc
added example to docstring
matthiasdold Apr 7, 2025
8868355
docstring for pytest function
matthiasdold Apr 7, 2025
4fdf132
updated whats_new and keeping the Makefile changes local
matthiasdold Apr 8, 2025
d7f92da
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 8, 2025
10f634c
exposing events_file_to_annotation_kwargs at mne_bids level
matthiasdold Apr 8, 2025
4e6b554
trying double backticks
matthiasdold Apr 9, 2025
02d26b2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 9, 2025
cce5558
Update doc/whats_new.rst
sappelhoff Apr 10, 2025
9c75050
added events_file_to_anntation_kwargs to doc/api.rst
matthiasdold Apr 10, 2025
b62ae70
fixing docstring
matthiasdold Apr 10, 2025
a4ce5e5
pytest adjustments
matthiasdold Apr 16, 2025
6559e55
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 16, 2025
4 changes: 4 additions & 0 deletions CITATION.cff
@@ -206,6 +206,10 @@ authors:
family-names: Gerçek
affiliation: 'University of Geneva, Department of Fundamental Neuroscience'
orcid: 'https://orcid.org/0000-0003-1063-6769'
- given-names: Matthias
family-names: Dold
affiliation: 'Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands'
orcid: 'https://orcid.org/0009-0003-1477-4912'
- given-names: Alexandre
family-names: Gramfort
affiliation: 'Université Paris-Saclay, Inria, CEA, Palaiseau, France'
1 change: 1 addition & 0 deletions doc/authors.rst
@@ -35,6 +35,7 @@
.. _Mara Wolter: https://github.com/marakw
.. _Marijn van Vliet: https://github.com/wmvanvliet
.. _Mathieu Scheltienne: https://github.com/mscheltienne
.. _Matthias Dold: https://github.com/matthiasdold
.. _Matt Sanderson: https://github.com/monkeyman192
.. _Maximilien Chaumon: https://github.com/dnacombo
.. _Moritz Gerster: http://moritz-gerster.com
2 changes: 2 additions & 0 deletions doc/whats_new.rst
@@ -20,6 +20,7 @@ The following authors contributed for the first time. Thank you so much! 🤩
* `Christian O'Reilly`_
* `Berk Gerçek`_
* `Arne Gottwald`_
* `Matthias Dold`_

The following authors had contributed before. Thank you for sticking around! 🤘

@@ -39,6 +40,7 @@ Detailed list of changes
- Empty-room matching now preferentially finds recordings in the subject directory tagged as `task-noise` before looking in the `sub-emptyroom` directories. This adds support for a part of the BIDS specification for ER recordings, by `Berk Gerçek`_ (:gh:`1364`)
- Path matching is now implemenented in a more efficient manner within :meth:`mne_bids.BIDSPath.match()` and :func:`mne_bids.find_matching_paths()`, by `Arne Gottwald` (:gh:`1355`)
- :func:`mne_bids.get_entity_vals()` has a new parameter ``include_match`` to prefilter item matching and ignore non-matched items from begin of directory scan, by `Arne Gottwald` (:gh:`1355`)
- Data from ``events.tsv`` can now be read into an OrderedDict using :func:`mne_bids.events_file_to_annotation_kwargs()`, by `Matthias Dold` (:gh:`1389`)


🧐 API and behavior changes
6 changes: 5 additions & 1 deletion mne_bids/__init__.py
@@ -22,7 +22,11 @@
get_bids_path_from_fname,
find_matching_paths,
)
-from mne_bids.read import get_head_mri_trans, read_raw_bids
+from mne_bids.read import (
+    get_head_mri_trans,
+    read_raw_bids,
+    events_file_to_annotation_kwargs,
+)
from mne_bids.utils import get_anonymization_daysback
from mne_bids.write import (
make_dataset_description,
80 changes: 77 additions & 3 deletions mne_bids/read.py
@@ -523,8 +523,70 @@ def _handle_info_reading(sidecar_fname, raw):
return raw


-def _handle_events_reading(events_fname, raw):
-    """Read associated events.tsv and convert valid events to annotations on Raw."""
+def events_file_to_annotation_kwargs(events_fname: str | Path) -> dict:
r"""
Read the `events.tsv` file and extract onset, duration, and description.

This function reads an events file in TSV format and extracts the onset,
duration, and description of events.

Parameters
----------
events_fname : str
The file path to the `events.tsv` file.

Returns
-------
dict
A dictionary containing the following keys:
- 'onset' : np.ndarray
The onset times of the events in seconds.
- 'duration' : np.ndarray
The durations of the events in seconds.
- 'description' : np.ndarray
The descriptions of the events.
- 'event_id' : dict
A dictionary mapping event descriptions to integer event IDs.

Notes
-----
The function handles the following cases:
- If the `trial_type` column is available, it uses it for event descriptions.

Review comment (Member): add blank line before bullet list starts

- If the `stim_type` column is available, it uses it for backward compatibility.
- If the `value` column is available, it uses it to create the `event_id`.
- If none of the above columns are available, it defaults to using 'n/a' for
descriptions and 1 for event IDs.

Examples
--------
>>> import pandas as pd
>>> from pathlib import Path
>>> import tempfile
>>>
>>> # Create a sample DataFrame
>>> data = {
... 'onset': [0.1, 0.2, 0.3],
... 'duration': [0.1, 0.1, 0.1],
... 'trial_type': ['event1', 'event2', 'event1'],
... 'value': [1, 2, 1],
... 'sample': [10, 20, 30]
... }
>>> df = pd.DataFrame(data)
>>>
>>> # Write the DataFrame to a temporary file
>>> temp_dir = tempfile.gettempdir()
>>> events_file = Path(temp_dir) / 'events.tsv'
>>> df.to_csv(events_file, sep='\t', index=False)
>>>
>>> # Read the events file using the function
>>> events_dict = events_file_to_annotation_kwargs(events_file)
>>> events_dict
{'onset': array([0.1, 0.2, 0.3]),
'duration': array([0.1, 0.1, 0.1]),
'description': array(['event1', 'event2', 'event1'], dtype='<U6'),
'event_id': {'event1': 1, 'event2': 2}}

"""
logger.info(f"Reading events from {events_fname}.")
events_dict = _from_tsv(events_fname)
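
Note on the docstring's Notes section: the column priority it lists (`trial_type`, then `stim_type`, then `value`, then a default of 'n/a') can be sketched in a few lines of plain Python. This is a simplified, hypothetical illustration only. `pick_description_column` and `describe_events` are made-up names that are not part of mne-bids, and the sketch ignores the `event_id` construction and the `n/a` filtering.

```python
def pick_description_column(columns):
    """Return the column to use for event descriptions, or None.

    Hypothetical helper: follows the priority listed in the docstring Notes
    (trial_type, then stim_type, then value). Not the mne-bids code itself.
    """
    for name in ("trial_type", "stim_type", "value"):
        if name in columns:
            return name
    return None


def describe_events(events):
    """Return one description string per event from a column-oriented dict."""
    column = pick_description_column(events)
    if column is None:
        # No usable column: fall back to 'n/a' for every event.
        return ["n/a"] * len(events["onset"])
    return [str(value) for value in events[column]]
```

Keeping the priority in a single tuple makes the fallback order easy to read and to change in one place.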

@@ -601,9 +663,21 @@ def _handle_events_reading(events_fname, raw):
[0 if du == "n/a" else du for du in events_dict["duration"]], dtype=float
)

+    return {"onset": ons, "duration": durs, "description": descrs, "event_id": event_id}
+
+
+def _handle_events_reading(events_fname, raw):
+    """Read associated events.tsv and convert valid events to annotations on Raw."""
+    annotations_info = events_file_to_annotation_kwargs(events_fname)
+    event_id = annotations_info["event_id"]
+
     # Add events as Annotations, but keep essential Annotations present in raw file
     annot_from_raw = raw.annotations.copy()
-    annot_from_events = mne.Annotations(onset=ons, duration=durs, description=descrs)
+    annot_from_events = mne.Annotations(
+        onset=annotations_info["onset"],
+        duration=annotations_info["duration"],
+        description=annotations_info["description"],
+    )
raw.set_annotations(annot_from_events)

annot_idx_to_keep = [
Expand Down
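
A possible usage note for the refactored return value: the dict carries `event_id` alongside the three entries that `_handle_events_reading` forwards to `mne.Annotations`, so the hunk above picks the keys explicitly instead of unpacking the whole dict. The sketch below shows that split in plain Python. `split_annotation_info` is a hypothetical helper name, and the sample values are invented for illustration.

```python
def split_annotation_info(info):
    """Separate the annotation keyword arguments from the event_id mapping.

    Hypothetical helper, not part of mne-bids: it only illustrates how the
    four-key dict returned by the new function could be taken apart.
    """
    kwargs = {key: info[key] for key in ("onset", "duration", "description")}
    return kwargs, info["event_id"]


# Invented sample data with the same four keys as the documented return value.
info = {
    "onset": [0.1, 0.2],
    "duration": [0.0, 0.0],
    "description": ["event1", "event2"],
    "event_id": {"event1": 1, "event2": 2},
}
kwargs, event_id = split_annotation_info(info)
```

With such a split, `kwargs` holds only onset, duration, and description, while `event_id` stays available separately for event handling.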
75 changes: 75 additions & 0 deletions mne_bids/tests/test_read.py
@@ -14,6 +14,7 @@

import mne
import numpy as np
import pandas as pd
import pytest
from mne.datasets import testing
from mne.io.constants import FIFF
@@ -32,6 +33,7 @@
_handle_events_reading,
_handle_scans_reading,
_read_raw,
events_file_to_annotation_kwargs,
get_head_mri_trans,
read_raw_bids,
)
@@ -1466,3 +1468,76 @@ def test_gsr_and_temp_reading():
raw = read_raw_bids(bids_path)
assert raw.get_channel_types(["GSR"]) == ["gsr"]
assert raw.get_channel_types(["Temperature"]) == ["temperature"]


def test_events_file_to_annotation_kwargs(tmp_path):
"""Test that events file is read correctly."""
bids_path = BIDSPath(
subject="01", session="eeg", task="rest", datatype="eeg", root=tiny_bids_root
)
events_fname = _find_matching_sidecar(bids_path, suffix="events", extension=".tsv")

# ---------------- plain read --------------------------------------------
df = pd.read_csv(events_fname, sep="\t")
ev_kwargs = events_file_to_annotation_kwargs(events_fname=events_fname)
assert (ev_kwargs["onset"] == df["onset"].values).all()
assert (ev_kwargs["duration"] == df["duration"].values).all()
assert (ev_kwargs["description"] == df["trial_type"].values).all()

# ---------------- filtering out n/a values ------------------------------
tmp_tsv_file = tmp_path / "events.tsv"
dext = pd.concat(
[df.copy().assign(onset=df.onset + i) for i in range(5)]
).reset_index(drop=True)

dext = dext.assign(
ix=range(len(dext)),
value=dext.trial_type.map({"start_experiment": 1, "show_stimulus": 2}),
duration=1.0,
)

# nan values for `_drop` must be string values, `_drop` is called on
# `onset`, `value` and `trial_type`. `duration` n/a should end up as float 0
for c in ["onset", "value", "trial_type", "duration"]:
dext[c] = dext[c].astype(str)

dext.loc[0, "onset"] = "n/a"
dext.loc[1, "duration"] = "n/a"
dext.loc[4, "trial_type"] = "n/a"
dext.loc[4, "value"] = (
"n/a" # to check that filtering is also applied when we drop the `trial_type`
)
dext.to_csv(tmp_tsv_file, sep="\t", index=False)

ev_kwargs_filtered = events_file_to_annotation_kwargs(events_fname=tmp_tsv_file)

dext_f = dext[
(dext["onset"] != "n/a")
& (dext["trial_type"] != "n/a")
& (dext["value"] != "n/a")
]

assert (ev_kwargs_filtered["onset"] == dext_f["onset"].astype(float).values).all()
assert (
ev_kwargs_filtered["duration"]
== dext_f["duration"].replace("n/a", "0.0").astype(float).values
).all()
assert (ev_kwargs_filtered["description"] == dext_f["trial_type"].values).all()
assert (
ev_kwargs_filtered["duration"][0] == 0.0
) # now idx=0, as first row is filtered out
Comment on lines +1521 to +1529

Member: idem. Could also consider pd.testing.assert_frame_equal, but that would require converting the dict to a DataFrame first.

matthiasdold (Contributor Author), Apr 16, 2025: As we are using a dict for the return type, I would just stick to comparing the iterables here. Implemented all other suggestions though.

# ---------------- default if missing trial_type ------------------------
tmp_tsv_file = tmp_path / "events.tsv"
dext.drop(columns="trial_type").to_csv(tmp_tsv_file, sep="\t", index=False)

ev_kwargs_default = events_file_to_annotation_kwargs(events_fname=tmp_tsv_file)
assert (ev_kwargs_default["onset"] == dext_f["onset"].astype(float).values).all()
assert (
ev_kwargs_default["duration"]
== dext_f["duration"].replace("n/a", "0.0").astype(float).values
).all()
assert (
np.sort(np.unique(ev_kwargs_default["description"]))
== np.sort(dext_f["value"].unique())
).all()
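
The expectations encoded in `dext_f` above can be summarised in a small stand-alone sketch: rows with `n/a` in `onset`, `trial_type`, or `value` are dropped, and an `n/a` duration becomes 0.0. It uses only the standard library and illustrates what the test checks, under those assumptions. It is not the mne-bids implementation, and `filter_event_rows` and the sample TSV text are hypothetical.

```python
import csv
import io


def filter_event_rows(tsv_text):
    """Return (onsets, durations) after dropping rows that contain 'n/a'.

    Hypothetical helper mirroring the test's expectations, not mne-bids code.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    # Only check the columns that are actually present in the file.
    checked = [c for c in ("onset", "trial_type", "value") if rows and c in rows[0]]
    kept = [r for r in rows if all(r[c] != "n/a" for c in checked)]
    onsets = [float(r["onset"]) for r in kept]
    # An 'n/a' duration is treated as an instantaneous event (0.0 seconds).
    durations = [0.0 if r["duration"] == "n/a" else float(r["duration"]) for r in kept]
    return onsets, durations


# Invented sample: row 1 lacks an onset, row 2 lacks a duration,
# row 3 lacks both trial_type and value, row 4 is complete.
sample = (
    "onset\tduration\ttrial_type\tvalue\n"
    "n/a\t1.0\tstart_experiment\t1\n"
    "1.0\tn/a\tshow_stimulus\t2\n"
    "2.0\t1.0\tn/a\tn/a\n"
    "3.0\t1.0\tstart_experiment\t1\n"
)
onsets, durations = filter_event_rows(sample)
```

As in the test, the row with the `n/a` duration survives the filter and ends up first, with a duration of 0.0.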