
Commit 8ce28b8

fix(eeglab): trust .set EEG.nbchan over sidecar when they disagree
Found via ds003645 in the seed=44 audit: _channels.tsv lists 404 channels (full MEG sensor array + EEG + triggers + system channels) but the .set declares EEG.nbchan=75 (the subset that survived preprocessing/ICA). The .fdt size is 162030000 = 75 × 4 × 540100 bytes, which matches the .set exactly.

Previously the reader preferred the sidecar's count when present, producing "404 ch · ? Hz · ? s" in the pill and a "not a multiple of 404×4" error.

New priority order:
1. .set's EEG.nbchan + EEG.srate (always parsed; authoritative)
2. _channels.tsv + _eeg.json BIDS sidecar (fallback if .set unparseable)
3. Combined-source error if both fail

When the sidecar disagrees with the .set, warn loudly (it's almost always a BIDS data-curation hint: the sidecar wasn't updated after channel selection). Pass-through is unchanged for the inline-data path (no .fdt sibling).

Also surfaced during investigation:
- ds003570 .fdt is 97 MB truncated (file broken at OpenNeuro level)
- ds003751 .fdt is 99% truncated (file broken at OpenNeuro level)

These remain detected via clean "X is not a multiple of Y" errors; same category as ds002181 (.fdt completely missing).

Test updated to expect the new error wording. 937 unit tests, 0 fails.
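The size arithmetic in the message can be checked with a short sketch. It assumes only what the commit states: the .fdt is a headerless float32 blob, so its byte length should equal nChannels × 4 × nSamples. The helper name `fdtSampleCount` is hypothetical and does not appear in formats/eeglab.js.

```javascript
// Hypothetical helper (not part of formats/eeglab.js): given the byte length
// of a headerless float32 .fdt and a candidate channel count, return the
// number of samples per channel, or null when the size does not divide evenly.
function fdtSampleCount(byteLength, nChannels, bytesPerValue = 4) {
  const frameBytes = nChannels * bytesPerValue; // bytes for one time point
  if (!Number.isInteger(frameBytes) || frameBytes <= 0) return null;
  return byteLength % frameBytes === 0 ? byteLength / frameBytes : null;
}

// Figures from the commit message (ds003645):
console.log(fdtSampleCount(162030000, 75));  // 540100 (the .set's nbchan fits)
console.log(fdtSampleCount(162030000, 404)); // null (the sidecar's count does not)
```

A null result here corresponds to the "X is not a multiple of Y" class of error mentioned above; it cannot tell a wrong channel count apart from a truncated file.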
1 parent d19d647 commit 8ce28b8

2 files changed

Lines changed: 54 additions & 27 deletions


formats/eeglab.js

Lines changed: 53 additions & 26 deletions
@@ -159,21 +159,28 @@
 }
 
 // Split .set + .fdt path: the .fdt is a flat float32 blob with no
-// header. We prefer the BIDS sidecar (_channels.tsv + _eeg.json)
-// for nChannels and SamplingFrequency, but it's not always there.
+// header. We need nChannels + SamplingFrequency to interpret it.
 //
-// Real-world examples (EEGDash audit seed=44):
-// ds003645: .fdt + .set + _electrodes.tsv only — no _channels.tsv
-// ds003751: _eeg.json case-mismatch with .set basename — BIDS
-// inheritance walk fails to find the JSON
+// Source priority — the .set is the authority, not the BIDS sidecar:
+// - sidecar _channels.tsv may list ALL acquired channels (including
+// bad/dropped ones, MEG system channels, status, etc.) while the
+// .set stores only the channels that were actually written to
+// the .fdt (after preprocessing / ICA / channel selection).
+// - Real example (ds003645): _channels.tsv has 404 entries (full
+// MEG sensor array + EEG + triggers); .set says nbchan=75 (the
+// subset that was preprocessed). .fdt size 162030000 = 75 × 4 ×
+// 540100 → matches the .set, not the sidecar.
 //
-// Fallback: parse the .set itself to extract nbchan + srate. The
-// .set has these fields even when it's a split-file pair, because
-// EEGLAB writes the full EEG struct to .set and only the numeric
-// data array goes to .fdt. Mirrors MNE's behaviour in
-// mne/io/eeglab/eeglab.py::_check_load_mat.
-let nChannels = nChannelsFromSidecar;
-let fs = sidecarFsValid ? sidecarFs : null;
+// Strategy:
+// 1. Always try to parse the .set to get its authoritative nbchan
+// + srate (the .fdt-data writer wrote these in lockstep with
+// the .fdt's actual layout).
+// 2. If .set is unparseable, fall back to the BIDS sidecar values.
+// 3. Warn if sidecar and .set disagree (BIDS data-curation hint).
+let nChannels = null;
+let fs = null;
+let setParseFailed = false;
+let setParseError = null;
 if (nChannels == null || fs == null) {
 try {
 const setBuf = await HttpRange.fetchBuffer(meta.eeg_url);
@@ -199,26 +206,46 @@
 };
 const nbchanFromSet = scalarFrom('nbchan');
 const srateFromSet = scalarFrom('srate');
-if (nChannels == null && nbchanFromSet) nChannels = nbchanFromSet;
-if (fs == null && srateFromSet && srateFromSet > 0) fs = srateFromSet;
-if (nChannels != null && fs != null) {
+if (nbchanFromSet) nChannels = nbchanFromSet;
+if (srateFromSet && srateFromSet > 0) fs = srateFromSet;
+// Warn loudly when sidecar and .set disagree — almost always a
+// sign of post-acquisition channel selection / preprocessing
+// that wasn't reflected in the BIDS curation.
+if (nChannelsFromSidecar != null && nChannels != null &&
+nChannels !== nChannelsFromSidecar) {
 console.warn(
-`EEGLAB .set+.fdt: BIDS sidecar incomplete; using .set's own ` +
-`EEG.nbchan=${nChannels} and EEG.srate=${fs}.`,
+`EEGLAB .set+.fdt: sidecar _channels.tsv lists ` +
+`${nChannelsFromSidecar} channels but the .set declares ` +
+`EEG.nbchan=${nChannels}. Trusting the .set (it matches ` +
+`the .fdt's actual layout; the sidecar likely lists all ` +
+`acquired channels including dropped/system ones).`,
 );
 }
-} catch (e) {
-// .set parse failure is recoverable as long as sidecar gave us
-// what we need. Surface only if BOTH sources fail.
-if (nChannels == null || fs == null) {
-throw new Error(
-`EEGLAB .set+.fdt: need either _channels.tsv + _eeg.json BIDS ` +
-`sidecars OR a parseable .set with EEG.nbchan + EEG.srate. ` +
-`Set parse error: ${e.message}`,
+if (sidecarFsValid && fs != null && Math.abs(fs - sidecarFs) > 0.5) {
+console.warn(
+`EEGLAB .set+.fdt: sidecar SamplingFrequency=${sidecarFs} but ` +
+`EEG.srate=${fs} in the .set. Trusting the .set.`,
 );
 }
+} catch (e) {
+setParseFailed = true;
+setParseError = e;
 }
 }
+// Sidecar fallback: only if .set parse failed AND sidecar has values.
+if (nChannels == null && nChannelsFromSidecar != null) {
+nChannels = nChannelsFromSidecar;
+}
+if (fs == null && sidecarFsValid) {
+fs = sidecarFs;
+}
+if (!nChannels && setParseFailed) {
+throw new Error(
+`EEGLAB .set+.fdt: need either parseable .set with EEG.nbchan + EEG.srate ` +
+`OR _channels.tsv + _eeg.json BIDS sidecars. ` +
+`Set parse error: ${setParseError ? setParseError.message : 'unknown'}`,
+);
+}
 if (!nChannels) {
 throw new Error(
 'EEGLAB .fdt reader needs nChannels (either from _channels.tsv ' +
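
The priority order introduced by this hunk can be summarised as a standalone sketch. This is not the code in formats/eeglab.js: the function `resolveFdtLayout`, its argument shapes, and the sample values are assumptions made for illustration. A `fromSet` of `null` stands in for a .set that failed to parse.

```javascript
// Illustrative sketch only; names and shapes are hypothetical.
// fromSet:     { nbchan, srate } parsed from the .set, or null if parsing failed
// fromSidecar: { nChannels, fs } from _channels.tsv / _eeg.json, or null if absent
function resolveFdtLayout(fromSet, fromSidecar) {
  const warnings = [];
  let nChannels = null;
  let fs = null;

  // 1. The .set is authoritative whenever it yields usable values.
  if (fromSet) {
    if (fromSet.nbchan) nChannels = fromSet.nbchan;
    if (fromSet.srate && fromSet.srate > 0) fs = fromSet.srate;
  }

  if (fromSidecar) {
    // 3. Disagreement is reported, but the .set value is kept.
    if (fromSidecar.nChannels != null && nChannels != null &&
        nChannels !== fromSidecar.nChannels) {
      warnings.push(`sidecar lists ${fromSidecar.nChannels} channels, .set declares ${nChannels}`);
    }
    if (fromSidecar.fs != null && fs != null && Math.abs(fs - fromSidecar.fs) > 0.5) {
      warnings.push(`sidecar SamplingFrequency=${fromSidecar.fs}, .set srate=${fs}`);
    }
    // 2. The sidecar only fills values the .set did not provide.
    if (nChannels == null && fromSidecar.nChannels != null) nChannels = fromSidecar.nChannels;
    if (fs == null && fromSidecar.fs != null) fs = fromSidecar.fs;
  }

  if (!nChannels) {
    throw new Error('need either a parseable .set or BIDS sidecars');
  }
  return { nChannels, fs, warnings };
}

// Channel counts from ds003645; the sampling rate of 250 is a made-up value.
const resolved = resolveFdtLayout({ nbchan: 75, srate: 250 }, { nChannels: 404, fs: 250 });
console.log(resolved.nChannels, resolved.warnings.length); // 75 1
```

Returning the warnings instead of calling `console.warn` directly is a choice made here to keep the sketch easy to test; the commit itself logs them.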

tests/unit-eeglab-standalone.test.mjs

Lines changed: 1 addition & 1 deletion
@@ -244,6 +244,6 @@ test('open: split .set + .fdt path no longer requires _channels.tsv (parses .set
 };
 await assert.rejects(
 () => EEGLABReader.open(meta),
-/need either _channels\.tsv \+ _eeg\.json BIDS sidecars OR a parseable \.set/,
+/need either parseable \.set with EEG\.nbchan \+ EEG\.srate OR _channels\.tsv \+ _eeg\.json/,
 );
 });
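
As a quick sanity check on the wording change, the sketch below rebuilds the new error string from the template literals in the eeglab.js hunk and runs both regexes from this test against it. The helper `buildSetParseErrorMessage` and the sample parse error text are hypothetical; only the message pieces and the two patterns come from the diff.

```javascript
// Hypothetical helper that concatenates the message the same way the new code does.
function buildSetParseErrorMessage(setParseError) {
  return (
    `EEGLAB .set+.fdt: need either parseable .set with EEG.nbchan + EEG.srate ` +
    `OR _channels.tsv + _eeg.json BIDS sidecars. ` +
    `Set parse error: ${setParseError ? setParseError.message : 'unknown'}`
  );
}

// Patterns copied from the removed and added lines of the test.
const oldPattern = /need either _channels\.tsv \+ _eeg\.json BIDS sidecars OR a parseable \.set/;
const newPattern = /need either parseable \.set with EEG\.nbchan \+ EEG\.srate OR _channels\.tsv \+ _eeg\.json/;

const message = buildSetParseErrorMessage(new Error('sample parse failure'));
console.log(newPattern.test(message)); // true
console.log(oldPattern.test(message)); // false
```

The old pattern no longer matches because the two alternatives swapped places in the message, which is why the test expectation had to change along with the code.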
