Unclear IntegrityError when electrode IDs are not globally unique #1447

@pauladkisson

Is your feature request related to a problem? Please describe.

Spyglass requires that electrode IDs (specified by the name field of ShanksElectrode) be globally unique across all probes and shanks in an NWB file. However, many acquisition systems and probe interface models generate electrode IDs that are only locally unique - either unique within a single shank or unique within a single probe.

When users attempt to insert NWB files with locally unique electrode IDs, Spyglass raises a DataJoint IntegrityError with a foreign key constraint failure message that does not clearly explain the root cause:

datajoint.errors.IntegrityError: Cannot add or update a child row: a foreign key constraint fails 
(`common_ephys`.`_electrode`, CONSTRAINT `_electrode_ibfk_2` FOREIGN KEY (`probe_id`, `probe_shank`, 
`probe_electrode`) REFERENCES `common_device`.`probe__electrode` (`probe_id`, `probe_shank`,)

This creates several problems:

  1. Unclear error message leading to complex debugging: The IntegrityError doesn't clearly indicate that electrode IDs must be globally unique. Users must trace through the foreign key constraint error to understand that duplicate electrode IDs are the root cause, making it difficult to diagnose and fix the issue.

  2. Common use case not supported: Many probe systems naturally use locally unique numbering schemes (e.g., each shank has electrodes numbered 1-4).

  3. Undocumented requirement: The global uniqueness requirement is not clearly documented, so users following standard NWB practices may be surprised when insertion fails.

Describe the solution you'd like

Improve documentation and error handling to make the electrode ID uniqueness requirement clear:

  1. Add clear documentation that:

    • States that electrode IDs (ShanksElectrode name field) must be globally unique integers across all probes and shanks
    • Tells users to construct globally unique integer IDs even when their acquisition system uses locally unique numbering
    • Provides examples of schemes for creating globally unique integer IDs (e.g., using an offset per probe/shank combination, or sequential numbering across all electrodes)
  2. Improve error messages by checking for duplicate electrode IDs before database insertion and raising a clear, informative error such as:

    • "Duplicate electrode ID 'X' found across multiple shanks or probes. Electrode IDs must be globally unique integers. Consider using sequential numbering across all electrodes or applying an offset per probe/shank combination."
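
A pre-insertion check along these lines could look like the following. This is a minimal sketch, not Spyglass code: the function name `check_electrode_ids_unique`, the exception class `DuplicateElectrodeIDError`, and the input format (plain `(probe_id, shank_id, electrode_id)` tuples already extracted from the NWB file) are all assumptions made for illustration.

```python
from collections import defaultdict


class DuplicateElectrodeIDError(ValueError):
    """Hypothetical error type for illustration; not part of Spyglass."""


def check_electrode_ids_unique(electrodes):
    """Raise a descriptive error if any electrode ID appears more than once.

    `electrodes` is an iterable of (probe_id, shank_id, electrode_id) tuples.
    How these are read from the NWB file is outside the scope of this sketch.
    """
    # Record every (probe, shank) location in which each electrode ID occurs.
    locations = defaultdict(list)
    for probe_id, shank_id, electrode_id in electrodes:
        locations[electrode_id].append((probe_id, shank_id))

    duplicates = {eid: locs for eid, locs in locations.items() if len(locs) > 1}
    if duplicates:
        first = min(duplicates)  # report the smallest duplicated ID
        raise DuplicateElectrodeIDError(
            f"Duplicate electrode ID '{first}' found across multiple shanks or "
            f"probes (probe, shank): {duplicates[first]}. Electrode IDs must be "
            "globally unique integers. Consider using sequential numbering "
            "across all electrodes or applying an offset per probe/shank "
            "combination."
        )


# Same layout as the reproduction in this issue: IDs 1-2 reused on every shank.
repro = [(p, s, e) for p in (1, 2) for s in (1, 2) for e in (1, 2)]
try:
    check_electrode_ids_unique(repro)
except DuplicateElectrodeIDError as err:
    print(err)
```

Running the check before the database insert would surface the duplicate ID and its locations directly, instead of leaving the user to interpret the foreign key failure.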

Describe alternatives you've considered

The current workaround is to manually construct globally unique electrode IDs when creating the NWB file, for example by concatenating probe ID, shank ID, and local electrode ID. However, this workaround:

  • Requires users to be aware of Spyglass's specific requirements
  • Is not documented, leading to trial-and-error debugging
  • May require modifying data conversion pipelines that automatically generate electrode IDs from acquisition systems
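
The two ID schemes mentioned in this issue (sequential numbering and an offset per probe/shank combination) can be sketched as plain functions. The function names and the `max_shanks` / `max_electrodes` bounds are hypothetical and chosen only for illustration. An arithmetic offset is used here in place of string concatenation because it always yields an integer and avoids collisions such as probe 1 / shank 12 versus probe 11 / shank 2.

```python
def sequential_ids(layout):
    """Map each distinct (probe_id, shank_id, local_id) to a running integer."""
    return {key: global_id for global_id, key in enumerate(layout)}


def offset_id(probe_id, shank_id, local_id, max_shanks=100, max_electrodes=1000):
    """Combine probe, shank, and local electrode IDs into one integer.

    Assumes 0 <= shank_id < max_shanks and 0 <= local_id < max_electrodes;
    both bounds are illustrative defaults, not Spyglass limits.
    """
    if not 0 <= shank_id < max_shanks:
        raise ValueError(f"shank_id {shank_id} is outside [0, {max_shanks})")
    if not 0 <= local_id < max_electrodes:
        raise ValueError(f"local_id {local_id} is outside [0, {max_electrodes})")
    return (probe_id * max_shanks + shank_id) * max_electrodes + local_id


# Two probes, two shanks each, two electrodes per shank (locally numbered 1-2).
layout = [(p, s, e) for p in (1, 2) for s in (1, 2) for e in (1, 2)]

seq = sequential_ids(layout)
off = [offset_id(p, s, e) for p, s, e in layout]

print(seq[(2, 1, 1)])  # 4
print(off[0])          # 101001
```

Whichever scheme is chosen, the same global ID would presumably need to be used both for the `ShanksElectrode` name (as a string, as in the reproduction) and for the `probe_electrode` column of the electrodes table.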

Additional context

Note: Shank IDs only need to be locally unique within each probe, not globally unique. This asymmetry in requirements (shanks can have duplicate IDs across probes, but electrodes cannot) adds to the confusion.

Here is a minimal reproduction of the problem.

from pynwb.testing.mock.file import mock_NWBFile
from pynwb.testing.mock.ecephys import mock_ElectricalSeries
from ndx_franklab_novela import DataAcqDevice, Probe, Shank, ShanksElectrode, NwbElectrodeGroup
from pynwb import NWBHDF5IO
import numpy as np
from pathlib import Path


def main():
    nwbfile = mock_NWBFile(
        identifier="duplicate_electrode_id_bug_demo",
        session_description="Mock NWB file demonstrating duplicate electrode ID issue"
    )

    data_acq_device = DataAcqDevice(
        name="my_data_acq",
        system="my_system",
        amplifier="my_amplifier",
        adc_circuit="my_adc_circuit"
    )
    nwbfile.add_device(data_acq_device)

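    # Electrode names "1" and "2" are reused on every shank of every probe,
    # so they are only locally unique.
    probe1_shank1_electrodes = [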
        ShanksElectrode(name="1", rel_x=0.0, rel_y=0.0, rel_z=0.0),
        ShanksElectrode(name="2", rel_x=0.0, rel_y=10.0, rel_z=0.0),
    ]
    probe1_shank2_electrodes = [
        ShanksElectrode(name="1", rel_x=0.0, rel_y=0.0, rel_z=0.0),
        ShanksElectrode(name="2", rel_x=0.0, rel_y=10.0, rel_z=0.0),
    ]
    
    probe1 = Probe(
        name="probe1",
        id=1,
        probe_type="tetrode",
        units="um",
        probe_description="first probe",
        contact_side_numbering=False,
        contact_size=10.0,
        shanks=[
            Shank(name="1", shanks_electrodes=probe1_shank1_electrodes),
            Shank(name="2", shanks_electrodes=probe1_shank2_electrodes),
        ],
    )
    nwbfile.add_device(probe1)

    probe2_shank1_electrodes = [
        ShanksElectrode(name="1", rel_x=0.0, rel_y=0.0, rel_z=0.0),
        ShanksElectrode(name="2", rel_x=0.0, rel_y=10.0, rel_z=0.0),
    ]
    probe2_shank2_electrodes = [
        ShanksElectrode(name="1", rel_x=0.0, rel_y=0.0, rel_z=0.0),
        ShanksElectrode(name="2", rel_x=0.0, rel_y=10.0, rel_z=0.0),
    ]
    
    probe2 = Probe(
        name="probe2",
        id=2,
        probe_type="tetrode",
        units="um",
        probe_description="second probe",
        contact_side_numbering=False,
        contact_size=10.0,
        shanks=[
            Shank(name="1", shanks_electrodes=probe2_shank1_electrodes),
            Shank(name="2", shanks_electrodes=probe2_shank2_electrodes),
        ],
    )
    nwbfile.add_device(probe2)

    electrode_group1 = NwbElectrodeGroup(
        name="electrode_group1",
        description="electrode group for probe1",
        location="CA1",
        device=probe1,
        targeted_location="CA1",
        targeted_x=0.0,
        targeted_y=0.0,
        targeted_z=0.0,
        units="um",
    )
    nwbfile.add_electrode_group(electrode_group1)

    electrode_group2 = NwbElectrodeGroup(
        name="electrode_group2",
        description="electrode group for probe2",
        location="CA1",
        device=probe2,
        targeted_location="CA1",
        targeted_x=0.0,
        targeted_y=0.0,
        targeted_z=0.0,
        units="um",
    )
    nwbfile.add_electrode_group(electrode_group2)

    extra_cols = ["probe_shank", "probe_electrode", "bad_channel", "ref_elect_id"]
    for col in extra_cols:
        nwbfile.add_electrode_column(name=col, description=f"description for {col}")

    # probe_electrode restarts at 1 on each shank, matching the ShanksElectrode names above.
    electrode_counter = 0
    for probe_id, electrode_group in [(1, electrode_group1), (2, electrode_group2)]:
        for shank_id in [1, 2]:
            for electrode_id in [1, 2]:
                nwbfile.add_electrode(
                    location="CA1",
                    group=electrode_group,
                    probe_shank=shank_id,
                    probe_electrode=electrode_id,
                    bad_channel=False,
                    ref_elect_id=0,
                    x=0.0,
                    y=0.0,
                    z=0.0,
                )
                electrode_counter += 1

    electrodes = nwbfile.electrodes.create_region(
        name="electrodes",
        region=list(range(electrode_counter)),
        description="electrodes"
    )
    mock_ElectricalSeries(electrodes=electrodes, nwbfile=nwbfile, data=np.ones((10, 1)))

    nwbfile.create_processing_module(name="behavior", description="dummy behavior module")

    nwbfile_path = Path("/Volumes/T7/CatalystNeuro/Spyglass/raw/mock_duplicate_electrode_id_bug.nwb")
    if nwbfile_path.exists():
        nwbfile_path.unlink()
    nwbfile_path.parent.mkdir(parents=True, exist_ok=True)
    
    with NWBHDF5IO(nwbfile_path, "w") as io:
        io.write(nwbfile)

    # Load the DataJoint config before importing Spyglass so that it picks up
    # the connection settings.
    import datajoint as dj
    dj_local_conf_path = "/Users/pauladkisson/Documents/CatalystNeuro/Spyglass/spyglass/dj_local_conf.json"
    dj.config.load(dj_local_conf_path)
    import spyglass.common as sgc
    import spyglass.data_import as sgi
    from spyglass.utils.nwb_helper_fn import get_nwb_copy_filename

    # Delete any earlier entry for this file so the insertion starts clean.
    nwb_copy_file_name = get_nwb_copy_filename(nwbfile_path.name)
    (sgc.Nwbfile & {"nwb_file_name": nwb_copy_file_name}).delete()

    # Insert the session; this is the call that raises the IntegrityError.
    sgi.insert_sessions(str(nwbfile_path), rollback_on_fail=True, raise_err=True)


if __name__ == "__main__":
    main()

Error output:

When running this script, the following error occurs:

[17:22:49][INFO] Spyglass: Populating Electrode...
[17:22:49][ERROR] Spyglass: Uncaught exception
Traceback (most recent call last):
  File "/Users/pauladkisson/Documents/CatalystNeuro/DudchenkoConv/woodcode/duplicate_electrode_id_bug.py", line 144, in <module>
    main()
  File "/Users/pauladkisson/Documents/CatalystNeuro/DudchenkoConv/woodcode/duplicate_electrode_id_bug.py", line 140, in main
    sgi.insert_sessions(str(nwbfile_path), rollback_on_fail=True, raise_err=True)
  File "/Users/pauladkisson/Documents/CatalystNeuro/Spyglass/spyglass/src/spyglass/data_import/insert_sessions.py", line 77, in insert_sessions
    return populate_all_common(
  File "/Users/pauladkisson/Documents/CatalystNeuro/Spyglass/spyglass/src/spyglass/common/populate_all_common.py", line 175, in populate_all_common
    single_transaction_make(
  File "/Users/pauladkisson/Documents/CatalystNeuro/Spyglass/spyglass/src/spyglass/common/populate_all_common.py", line 107, in single_transaction_make
    raise err
  File "/Users/pauladkisson/Documents/CatalystNeuro/Spyglass/spyglass/src/spyglass/common/populate_all_common.py", line 104, in single_transaction_make
    table().make(pop_key)
  File "/Users/pauladkisson/Documents/CatalystNeuro/Spyglass/spyglass/src/spyglass/common/common_ephys.py", line 202, in make
    self.insert(
  File "/opt/anaconda3/envs/spyglass/lib/python3.10/site-packages/datajoint/table.py", line 453, in insert
    self.connection.query(
  File "/opt/anaconda3/envs/spyglass/lib/python3.10/site-packages/datajoint/connection.py", line 350, in query
    self._execute_query(cursor, query, args, suppress_warnings)
  File "/opt/anaconda3/envs/spyglass/lib/python3.10/site-packages/datajoint/connection.py", line 306, in _execute_query
    raise translate_query_error(err, query)
datajoint.errors.IntegrityError: Cannot add or update a child row: a foreign key constraint fails (`common_ephys`.`_electrode`, CONSTRAINT `_electrode_ibfk_2` FOREIGN KEY (`probe_id`, `probe_shank`, `probe_electrode`) REFERENCES `common_device`.`probe__electrode` (`probe_id`, `probe_shank`,)

The error message does not clearly indicate that the root cause is duplicate electrode IDs that violate the global uniqueness requirement. Users must debug the foreign key constraint failure to discover this requirement.

Labels

bug (Something isn't working)
