MacOS Finder ._* hidden metadata files cause pybids to crash #1069
Open
Description
Example
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 37: invalid start byte
… while trying to decode JSON from file […]/._sub-PA069_ses-V1W1_task-poke_run-2_bold.json
Traceback (most recent call last):
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/bids/layout/index.py", line 303, in load_json
    return json.load(handle)
  File "/opt/conda/envs/sdcflows/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/opt/conda/envs/sdcflows/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 37: invalid start byte

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/sdcflows/bin/sdcflows", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/sdcflows/cli/main.py", line 39, in main
    parse_args(argv)
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/sdcflows/cli/parser.py", line 281, in parse_args
    config.from_dict(vars(opts))
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/sdcflows/config.py", line 589, in from_dict
    execution.load(settings)
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/sdcflows/config.py", line 249, in load
    cls.init()
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/sdcflows/config.py", line 476, in init
    cls._layout = BIDSLayout(
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/bids/layout/layout.py", line 177, in __init__
    _indexer(self)
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/bids/layout/index.py", line 154, in __call__
    self._index_metadata()
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/bids/layout/index.py", line 415, in _index_metadata
    file_md.update(pl())
  File "/opt/conda/envs/sdcflows/lib/python3.10/site-packages/bids/layout/index.py", line 305, in load_json
    raise OSError(
OSError: Error occurred while trying to decode JSON from file /ocean/projects/med220004p/shared/data_raw/vannucci/bids_raw/sub-PA069/ses-V1W1/func/._sub-PA069_ses-V1W1_task-poke_run-2_bold.json
Proposed Solution
I think these types of files (._* and .DS_Store) can be safely ignored.
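As a rough illustration of what "ignoring" could mean here, the sketch below matches the two file kinds by name. The pattern and the helper name `is_macos_metadata` are hypothetical, not part of PyBIDS.

```python
import re

# Hypothetical pattern (not from PyBIDS): matches a final path component that is
# either an AppleDouble sidecar ("._<name>") or a ".DS_Store" file.
MACOS_METADATA = re.compile(r"(^|[\\/])(\._[^\\/]*|\.DS_Store)$")


def is_macos_metadata(path: str) -> bool:
    """Return True if the last component of `path` is a macOS metadata file."""
    return MACOS_METADATA.search(path) is not None
```

If I read the PyBIDS docs correctly, BIDSLayoutIndexer takes an ignore argument that accepts compiled regular expressions matched against full paths, so a pattern like this might already work as a user-side workaround. I have not verified that against this dataset.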
Context
I hit this issue while analyzing someone else's data, which is read-only to me. I worked around it by creating a symlinked recreation of the data directory without the macOS hidden metadata files, but I don't think that should have been necessary.
bids-validator raised errors and warnings for the dataset, but none related to these hidden metadata files as far as I can tell:
1: [ERR] Files with such naming scheme are not part of BIDS specification. This error is most commonly caused by typos in file names that make them not BIDS compatible. Please consult the specification and make sure your files are named correctly. If this is not a file naming issue (for example when including files not yet covered by the BIDS specification) you should include a ".bidsignore" file in your dataset (see https://github.com/bids-standard/bids-validator#bidsignore for details). Please note that derived (processed) data should be placed in /derivatives folder and source data (such as DICOMS or behavioural logs in proprietary formats) should be placed in the /sourcedata folder. (code: 1 - NOT_INCLUDED)
    ./sub-PA028/ses-V2W2/files.txt
        Evidence: files.txt
    ./sub-PA070/ses-V1W1/anat/sub-PA070_ses-V2W2_acq-MPR_rec-vNavNorm_T1w.nii.gz
        Evidence: sub-PA070_ses-V2W2_acq-MPR_rec-vNavNorm_T1w.nii.gz
2: [ERR] 'IntendedFor' field needs to point to an existing file. (code: 37 - INTENDED_FOR)
3: [ERR] You have to define 'TaskName' for this file. (code: 50 - TASK_NAME_MUST_DEFINE)
4: [ERR] Session label in the filename doesn't match with the path of the file. File seems to be saved in incorrect session directory. (code: 65 - SESSION_LABEL_IN_FILENAME_DOESNOT_MATCH_DIRECTORY)
5: [ERR] _T1w.nii[.gz] files must have exactly three dimensions. (code: 95 - T1W_FILE_WITH_TOO_MANY_DIMENSIONS)
1: [WARN] Task scans should have a corresponding events.tsv file. If this is a resting state scan you can ignore this warning or rename the task to include the word "rest". (code: 25 - EVENTS_TSV_MISSING)
2: [WARN] Not all subjects contain the same files. Each subject should contain the same number of files with the same naming unless some files are known to be missing. (code: 38 - INCONSISTENT_SUBJECTS)
3: [WARN] Not all subjects/sessions/runs have the same scanning parameters. (code: 39 - INCONSISTENT_PARAMETERS)
4: [WARN] NIfTI file's header field for pixel dimension information empty or too short. (code: 42 - NIFTI_PIXDIM)
5: [WARN] There are files in the /stimuli directory that are not utilized in any _events.tsv file. (code: 77 - UNUSED_STIMULUS)
6: [WARN] Tabular file contains custom columns not described in a data dictionary (code: 82 - CUSTOM_COLUMN_WITHOUT_DESCRIPTION)
7: [WARN] The onset of the last event is after the total duration of the corresponding scan. This design is suspiciously long. (code: 85 - SUSPICIOUSLY_LONG_EVENT_DESIGN)
8: [WARN] Not all subjects contain the same sessions. (code: 97 - MISSING_SESSION)
9: [WARN] The recommended file /README is missing. See Section 03 (Modality agnostic files) of the BIDS specification. (code: 101 - README_FILE_MISSING)
10: [WARN] The Authors field of dataset_description.json should contain an array of fields - with one author per field. This was triggered because there are no authors, which will make DOI registration from dataset metadata impossible. (code: 113 - NO_AUTHORS)
I get the same errors and warnings in my workaround data directory but avoid the issue in PyBIDS.
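For anyone who needs the same workaround, here is a minimal sketch of the symlinked mirror described above. It is not the exact script I ran. The function name `mirror_without_macos_metadata` is made up for illustration, and it assumes the destination lives outside the source tree and that the source contains no symlinked subdirectories (os.walk does not descend into those by default).

```python
import os
from pathlib import Path


def mirror_without_macos_metadata(src: str, dst: str) -> int:
    """Recreate the directory tree of `src` under `dst`, symlinking every file
    except macOS metadata files ("._*" and ".DS_Store").

    Returns the number of symlinks created. Hypothetical helper for illustration.
    """
    src_root = Path(src).resolve()
    dst_root = Path(dst)
    linked = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = Path(dirpath).relative_to(src_root)
        # Real directories are created in the mirror; only files are symlinked.
        (dst_root / rel).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            if name.startswith("._") or name == ".DS_Store":
                continue  # skip Finder/AppleDouble metadata
            os.symlink(Path(dirpath) / name, dst_root / rel / name)
            linked += 1
    return linked
```

Pointing the BIDS tool at the `dst` directory then avoids the crash without touching the read-only original.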