Commit 0ba63f2

Subject or session contains slashes (#570)
* Add s3fs installation to the package setup in CI workflow
* Add checks for session_id and subject_id to prevent slashes in DANDI paths
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Update importance level for session_id and subject_id checks to best practice violation
* Add checks for subject and session IDs to prevent slashes
* Enhance documentation for subject ID check with best practice reference
* Update CHANGELOG for upcoming v0.7.0 with new checks for subject_id and session_id slashes
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent b479b02 commit 0ba63f2

7 files changed: +119 -3 lines changed

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -1,5 +1,8 @@
 # v0.6.4 (Upcoming)
 
+### New Checks
+* Added checks to make sure subject_id and session_id do not contain slashes [#570](https://github.com/NeurodataWithoutBorders/nwbinspector/pull/570)
+
 ### Fixes
 * Fixed incorrect data orientation check for SpikeEventSeries [#592](https://github.com/NeurodataWithoutBorders/nwbinspector/issues/592)
 * Fix dimensionality check for SpikeEventSeries validation [#581](https://github.com/NeurodataWithoutBorders/nwbinspector/pull/581)

docs/best_practices/nwbfile_metadata.rst

Lines changed: 15 additions & 2 deletions
@@ -71,6 +71,8 @@ Check function: :py:meth:`~nwbinspector.checks._nwbfile_metadata.check_processin
 File Metadata
 -------------
 
+.. _best_practice_session_id:
+
 Session ID
 ~~~~~~~~~~
 
@@ -82,6 +84,12 @@ different processing outputs. In this case, the ``session_id`` should be the sam
 a standard structure for their own naming schemes so that sessions are unique within the lab and the IDs are easily
 human-readable.
 
+The ``session_id`` should not contain slash characters (``/``) as these can cause problems when constructing paths in
+the DANDI archive. If your session IDs normally include slash characters, consider replacing them with hyphens (``-``)
+or underscores (``_``).
+
+Check function: :py:meth:`~nwbinspector.checks._nwbfile_metadata.check_session_id_no_slashes`
+
 .. _best_practice_file_id:
 
 Identifier
@@ -174,7 +182,7 @@ Check function: :py:meth:`~nwbinspector.checks._nwbfile_metadata.check_subject_e
 
 
 
-.. _best_practice_subject_id_exists:
+.. _best_practice_subject_id:
 
 Subject ID
 ~~~~~~~~~~
@@ -185,7 +193,12 @@ not intended for DANDI upload, if the :ref:`nwb-schema:sec-Subject` is specified
 
 In the special case of *in vitro* studies where the 'subject' of scientific interest was not a tissue sample obtained from a living subject but was instead a purified protein, this will be annotated by prepending the keyphrase "protein" to the subject ID; *e.g*, "proteinCaMPARI3". In the case where the *in vitro* experiment is performed on an extracted or cultured biological sample, the other subject attributes (such as age and sex) should be specified as their values at the time the sample was collected.
 
-Check function: :py:meth:`~nwbinspector.checks._nwbfile_metadata.check_subject_id_exists`
+Similar to session IDs, the ``subject_id`` should not contain slash characters (``/``) as these can cause problems when
+constructing paths in the DANDI archive. If your subject IDs normally include slash characters, consider replacing them
+with hyphens (``-``) or underscores (``_``).
+
+Check functions: :py:meth:`~nwbinspector.checks._nwbfile_metadata.check_subject_id_exists` and
+:py:meth:`~nwbinspector.checks._nwbfile_metadata.check_subject_id_no_slashes`
 
 
 
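
The replacement suggested in the documentation added above (swap each slash for a hyphen or underscore) can be sketched as a small helper. This is a minimal illustration under stated assumptions, not part of nwbinspector or the DANDI tooling: the name `replace_slashes` and its `replacement` parameter are hypothetical.

```python
from typing import Optional


def replace_slashes(identifier: Optional[str], replacement: str = "-") -> Optional[str]:
    """Return `identifier` with every '/' swapped for `replacement`.

    Hypothetical helper for illustration only; not part of nwbinspector.
    """
    if "/" in replacement:
        raise ValueError("replacement must not itself contain a slash")
    if identifier is None:
        # Nothing to clean up when no ID was set.
        return None
    return identifier.replace("/", replacement)


print(replace_slashes("session/001"))                  # session-001
print(replace_slashes("mouse/12/a", replacement="_"))  # mouse_12_a
```

Applying such a cleanup before writing the file keeps the stored ID and any path derived from it consistent, rather than relying on later tooling to rewrite it.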

Lines changed: 2 additions & 0 deletions
@@ -1,10 +1,12 @@
 CRITICAL: # All the fields under CRITICAL will be required for dandi validate to pass
   - check_subject_exists
   - check_subject_id_exists
+  - check_subject_id_no_slashes
   - check_subject_sex
   - check_subject_species_exists
   - check_subject_species_form
   - check_subject_age
   - check_subject_proper_age_range
+  - check_session_id_no_slashes
 BEST_PRACTICE_VIOLATION:
   - check_data_orientation # not 100% accurate, so need to deelevate from CRITICAL to skip it in dandi validate

src/nwbinspector/checks/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -42,11 +42,13 @@
     check_institution,
     check_keywords,
     check_processing_module_name,
+    check_session_id_no_slashes,
     check_session_start_time_future_date,
     check_session_start_time_old_date,
     check_subject_age,
     check_subject_exists,
     check_subject_id_exists,
+    check_subject_id_no_slashes,
     check_subject_proper_age_range,
     check_subject_sex,
     check_subject_species_exists,
@@ -114,9 +116,11 @@
     "check_experimenter_exists",
     "check_experiment_description",
     "check_subject_id_exists",
+    "check_subject_id_no_slashes",
     "check_subject_species_exists",
     "check_subject_species_form",
     "check_subject_proper_age_range",
+    "check_session_id_no_slashes",
     "check_session_start_time_future_date",
     "check_processing_module_name",
     "check_session_start_time_old_date",

src/nwbinspector/checks/_nwbfile_metadata.py

Lines changed: 41 additions & 1 deletion
@@ -225,7 +225,11 @@ def check_subject_proper_age_range(subject: Subject) -> Optional[InspectorMessag
 
 @register_check(importance=Importance.BEST_PRACTICE_SUGGESTION, neurodata_type=Subject)
 def check_subject_id_exists(subject: Subject) -> Optional[InspectorMessage]:
-    """Check if subject_id is defined."""
+    """
+    Check if subject_id is defined.
+
+    Best Practice: :ref:`best_practice_subject_id`
+    """
     if subject.subject_id is None:
         return InspectorMessage(message="subject_id is missing.")
 
@@ -312,3 +316,39 @@ def check_processing_module_name(processing_module: ProcessingModule) -> Optiona
         )
 
     return None
+
+
+@register_check(importance=Importance.BEST_PRACTICE_VIOLATION, neurodata_type=NWBFile)
+def check_session_id_no_slashes(nwbfile: NWBFile) -> Optional[InspectorMessage]:
+    """
+    Check if session_id contains slash characters, which can cause problems when constructing paths in DANDI.
+
+    Best Practice: :ref:`best_practice_session_id`
+    """
+    if nwbfile.session_id and "/" in nwbfile.session_id:
+        return InspectorMessage(
+            message=(
+                f"The session_id '{nwbfile.session_id}' contains slash character(s) '/', which can cause problems "
+                f"when constructing paths in DANDI. Please replace slashes with another character (e.g., '-' or '_')."
+            )
+        )
+
+    return None
+
+
+@register_check(importance=Importance.BEST_PRACTICE_VIOLATION, neurodata_type=Subject)
+def check_subject_id_no_slashes(subject: Subject) -> Optional[InspectorMessage]:
+    """
+    Check if subject_id contains slash characters, which can cause problems when constructing paths in DANDI.
+
+    Best Practice: :ref:`best_practice_subject_id`
+    """
+    if subject.subject_id and "/" in subject.subject_id:
+        return InspectorMessage(
+            message=(
+                f"The subject_id '{subject.subject_id}' contains slash character(s) '/', which can cause problems "
+                f"when constructing paths in DANDI. Please replace slashes with another character (e.g., '-' or '_')."
+            )
+        )
+
+    return None
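
The guard shared by both new checks can be exercised without pynwb or nwbinspector installed. The sketch below mirrors only the condition `value and "/" in value`; `slash_message` is a hypothetical stand-in that returns a plain string instead of an `InspectorMessage`, and its shortened message text is an assumption for brevity.

```python
from typing import Optional


def slash_message(field_name: str, value: Optional[str]) -> Optional[str]:
    """Return a warning string if `value` contains '/', else None.

    Hypothetical stand-in mirroring the condition used in the new checks.
    """
    # Same truthiness guard as in the diff: None and "" short-circuit to "no issue".
    if value and "/" in value:
        return f"The {field_name} '{value}' contains slash character(s) '/'."
    return None


for candidate in (None, "", "subject001", "subject/001"):
    print(repr(candidate), "->", slash_message("subject_id", candidate))
```

Because of the truthiness guard, a missing or empty ID passes this check silently; for `subject_id`, absence is reported separately by `check_subject_id_exists`.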

tests/test_check_configuration.py

Lines changed: 2 additions & 0 deletions
@@ -99,11 +99,13 @@ def test_load_config(self):
             CRITICAL=[
                 "check_subject_exists",
                 "check_subject_id_exists",
+                "check_subject_id_no_slashes",
                 "check_subject_sex",
                 "check_subject_species_exists",
                 "check_subject_species_form",
                 "check_subject_age",
                 "check_subject_proper_age_range",
+                "check_session_id_no_slashes",
             ],
             BEST_PRACTICE_VIOLATION=[
                 "check_data_orientation",

tests/unit_tests/test_nwbfile_metadata.py

Lines changed: 52 additions & 0 deletions
@@ -13,11 +13,13 @@
     check_institution,
     check_keywords,
     check_processing_module_name,
+    check_session_id_no_slashes,
     check_session_start_time_future_date,
     check_session_start_time_old_date,
     check_subject_age,
     check_subject_exists,
     check_subject_id_exists,
+    check_subject_id_no_slashes,
     check_subject_proper_age_range,
     check_subject_sex,
     check_subject_species_exists,
@@ -565,3 +567,53 @@ def test_check_processing_module_name():
 def test_pass_check_processing_module_name():
     processing_module = ProcessingModule(name="ecephys", description="desc")
     assert check_processing_module_name(processing_module) is None
+
+
+def test_pass_check_session_id_no_slashes():
+    nwbfile = NWBFile(
+        session_description="",
+        identifier=str(uuid4()),
+        session_start_time=datetime.now().astimezone(),
+        session_id="session001",
+    )
+    assert check_session_id_no_slashes(nwbfile) is None
+
+
+def test_check_session_id_with_slashes():
+    nwbfile = NWBFile(
+        session_description="",
+        identifier=str(uuid4()),
+        session_start_time=datetime.now().astimezone(),
+        session_id="session/001",
+    )
+    assert check_session_id_no_slashes(nwbfile) == InspectorMessage(
+        message=(
+            "The session_id 'session/001' contains slash character(s) '/', which can cause problems "
+            "when constructing paths in DANDI. Please replace slashes with another character (e.g., '-' or '_')."
+        ),
+        importance=Importance.BEST_PRACTICE_VIOLATION,
+        check_function_name="check_session_id_no_slashes",
+        object_type="NWBFile",
+        object_name="root",
+        location="/",
+    )
+
+
+def test_pass_check_subject_id_no_slashes():
+    subject = Subject(subject_id="subject001")
+    assert check_subject_id_no_slashes(subject) is None
+
+
+def test_check_subject_id_with_slashes():
+    subject = Subject(subject_id="subject/001")
+    assert check_subject_id_no_slashes(subject) == InspectorMessage(
+        message=(
+            "The subject_id 'subject/001' contains slash character(s) '/', which can cause problems "
+            "when constructing paths in DANDI. Please replace slashes with another character (e.g., '-' or '_')."
+        ),
+        importance=Importance.BEST_PRACTICE_VIOLATION,
+        check_function_name="check_subject_id_no_slashes",
+        object_type="Subject",
+        object_name="subject",
+        location="/general/subject",
+    )
