Description
Is your feature request related to a problem? Please describe.
Spyglass's video data import process has an unwanted dependency on task/epoch information, preventing users from inserting video data independently. This coupling creates several issues:
- Prevents modular development: Users cannot insert video data without first creating task/epoch information, even when the video data is unrelated to specific experimental tasks or when task information is still being developed.
- Makes debugging difficult: When troubleshooting video import issues, users must also maintain task/epoch data structures, complicating the debugging process and making it harder to isolate video-specific problems.
- Reduces flexibility: In many experimental scenarios, video recordings may exist independently of task paradigms (e.g., continuous monitoring, exploratory behavior sessions, or video collected during inter-trial intervals).
This tight coupling between video and task data is not required by the NWB standard and limits the flexibility of data processing pipelines.
Describe the solution you'd like
Minimum solution (must-have):
- Raise a clear warning when image series are detected in the NWB file that aren't matched with task information, alerting users that these video files will not be imported into the VideoFile table
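As a rough illustration of the minimum solution, the check could look something like the sketch below. This is not Spyglass code: the function name `warn_unmatched_image_series` is hypothetical, and the matching criterion (no processing module named "tasks") is a simplifying assumption based on the reproduction script in this issue. The object is duck-typed so the example runs without a database or an NWB file; a real implementation would presumably use `isinstance(obj, pynwb.image.ImageSeries)` on an actual `NWBFile`, whose `acquisition` and `processing` attributes are dict-like.

```python
import warnings
from types import SimpleNamespace


def warn_unmatched_image_series(nwbfile, tasks_module_name="tasks"):
    """Warn about image series that have no task information to pair with.

    Hypothetical helper, not part of Spyglass. Returns the names of the
    image series that triggered the warning (empty list if none).
    """
    # Assumption: anything in acquisition with an `external_file` attribute
    # is treated as a video-backed image series (duck-typed stand-in for an
    # isinstance check against pynwb.image.ImageSeries).
    video_names = [
        name
        for name, obj in nwbfile.acquisition.items()
        if hasattr(obj, "external_file")
    ]
    # Assumption: task information is considered present whenever a
    # processing module with the expected name exists.
    if not video_names or tasks_module_name in nwbfile.processing:
        return []
    warnings.warn(
        f"Image series {video_names} found, but no '{tasks_module_name}' "
        "processing module is present; these videos will not be inserted "
        "into VideoFile.",
        UserWarning,
        stacklevel=2,
    )
    return video_names


# Stand-in for an NWB file that has a video but no tasks module.
fake_nwbfile = SimpleNamespace(
    acquisition={
        "my_image_series": SimpleNamespace(external_file=["/path/to/video.h264"])
    },
    processing={"behavior": object()},
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    unmatched = warn_unmatched_image_series(fake_nwbfile)

print(unmatched)
print(len(caught))
```

Returning the unmatched names in addition to warning would let the import code log them or turn the warning into an error when `raise_err=True` is requested.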
Intermediate solution (highly recommended):
- Add dedicated documentation explaining that task information is currently required for video files to be imported
- Document the specific relationship between the tasks module and video import
- Provide clear examples of how to structure NWB files to ensure successful video import
Optimal solution (ideal):
- Fully decouple video and task data at the database schema level
- Allow video files to be inserted into Spyglass independently of task/epoch information
- Support optional linking of video data to task epochs when relevant, rather than requiring it
- Enable workflows where video data is processed independently before being associated with task information
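To make the "optional linking" idea concrete, here is a small, database-free sketch. Everything in it is hypothetical (`VideoRecord`, `link_to_epoch`, and the rule that a video belongs to the first epoch that fully contains it); it only illustrates a data model in which the epoch is an optional attribute of a video rather than a prerequisite for storing it. How this would map onto the actual Spyglass schema is left open.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class VideoRecord:
    """Hypothetical video entry whose epoch link is optional."""

    name: str
    start_time: float
    stop_time: float
    epoch: Optional[int] = None  # None means "not linked to any task epoch"


def link_to_epoch(
    video: VideoRecord, epochs: Dict[int, Tuple[float, float]]
) -> VideoRecord:
    """Return a copy linked to the first epoch that fully contains the video.

    If no epoch contains it, the video is returned unchanged (still valid,
    just unlinked).
    """
    for epoch_number, (start, stop) in epochs.items():
        if start <= video.start_time and video.stop_time <= stop:
            return replace(video, epoch=epoch_number)
    return video


# The video can exist before any task information is available...
video = VideoRecord(name="my_image_series", start_time=0.0, stop_time=10.0)
unlinked = link_to_epoch(video, epochs={})

# ...and be associated with an epoch later, once task data has been added.
linked = link_to_epoch(video, epochs={1: (0.0, 10.0)})

print(unlinked.epoch)
print(linked.epoch)
```

The point of the design is that the unlinked record is a legitimate end state, so processing can start on video data alone and the link can be filled in afterwards.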
Describe alternatives you've considered
The current workaround is to always create task/epoch information alongside video data, even when it's not semantically meaningful or when the task information is still being developed. This workaround:
- Requires creating placeholder or dummy task data
- Adds unnecessary complexity to data ingestion pipelines
- Obscures the actual experimental structure when tasks are added artificially
Additional context
Here is a minimal reproduction of the problem.
```python
from pynwb.testing.mock.file import mock_NWBFile
from pynwb import NWBHDF5IO
from pathlib import Path
from ndx_franklab_novela import CameraDevice
from pynwb.image import ImageSeries
from pynwb.core import DynamicTable
import numpy as np


def add_task(nwbfile):
    # Add a "tasks" processing module with a single task table and one task row.
    tasks_module = nwbfile.create_processing_module(name="tasks", description="tasks module")
    task_table = DynamicTable(name="task_table_1", description="my task table")
    task_table.add_column(name="task_name", description="Name of the task.")
    task_table.add_column(name="task_description", description="Description of the task.")
    task_table.add_column(name="camera_id", description="Camera ID.")
    task_table.add_column(name="task_epochs", description="Task epochs.")
    task_table.add_row(
        task_name="task1",
        task_description="task1 description",
        camera_id=[1],
        task_epochs=[1],
    )
    tasks_module.add(task_table)


def add_video(nwbfile):
    # Add a camera device and an ImageSeries that points to an external video file.
    camera_device = CameraDevice(
        name="camera_device 1",
        meters_per_pixel=1.0,
        model="my_model",
        lens="my_lens",
        camera_name="my_camera_name",
    )
    nwbfile.add_device(camera_device)

    video_file_path = "/path/to/video.h264"
    timestamps = np.linspace(0, 10, 300)
    image_series = ImageSeries(
        name="my_image_series",
        description="Video recording without associated task information",
        unit="n.a.",
        external_file=[video_file_path],
        format="external",
        timestamps=timestamps,
        device=camera_device,
    )
    nwbfile.add_acquisition(image_series)


def insert_session(nwbfile_path: Path):
    # Load the DataJoint config before importing Spyglass.
    import datajoint as dj

    dj_local_conf_path = "/Users/pauladkisson/Documents/CatalystNeuro/Spyglass/spyglass/dj_local_conf.json"
    dj.config.load(dj_local_conf_path)

    import spyglass.common as sgc
    import spyglass.data_import as sgi
    from spyglass.utils.nwb_helper_fn import get_nwb_copy_filename

    # Remove any previous entry for this file, then re-insert the session.
    nwb_copy_file_name = get_nwb_copy_filename(nwbfile_path.name)
    (sgc.Nwbfile & {"nwb_file_name": nwb_copy_file_name}).delete()
    sgi.insert_sessions(str(nwbfile_path), rollback_on_fail=True, raise_err=True)

    print(sgc.VideoFile())
    # If add_task is called,
    # *nwb_file_name *epoch    *video_file_nu camera_name    video_file_obj
    # +------------+ +-------+ +------------+ +------------+ +------------+
    # mock_video_tas 1         1              my_camera_name 1bdd667f-67a3-
    #  (Total: 1)
    # If add_task is NOT called,
    # *nwb_file_name *epoch    *video_file_nu camera_name    video_file_obj
    # +------------+ +-------+ +------------+ +------------+ +------------+
    #  (Total: 0)


def main():
    nwbfile = mock_NWBFile(
        identifier="video_task_coupling_bug_demo",
        session_description="Mock NWB file demonstrating Spyglass video/task coupling issue",
    )
    nwbfile.add_epoch(start_time=0.0, stop_time=10.0, tags=["01"])
    nwbfile.create_processing_module(
        name="behavior",
        description="Behavioral data including video",
    )
    add_task(nwbfile)  # Comment this line to test without task information
    add_video(nwbfile)

    nwbfile_path = Path("/Volumes/T7/CatalystNeuro/Spyglass/raw/mock_video_task_coupling.nwb")
    if nwbfile_path.exists():
        nwbfile_path.unlink()
    nwbfile_path.parent.mkdir(parents=True, exist_ok=True)
    with NWBHDF5IO(nwbfile_path, "w") as io:
        io.write(nwbfile)
    print(f"Mock NWB file written to {nwbfile_path}")

    insert_session(nwbfile_path=nwbfile_path)


if __name__ == "__main__":
    main()
```
Behavior:
The script demonstrates a silent failure when task information is not present. The video data import into the VideoFile table depends on the presence of task information:
With task information (add_task(nwbfile) called):
*nwb_file_name *epoch    *video_file_nu camera_name    video_file_obj
+------------+ +-------+ +------------+ +------------+ +------------+
mock_video_tas 1         1              my_camera_name 1bdd667f-67a3-
 (Total: 1)
Without task information (add_task(nwbfile) commented out):
*nwb_file_name *epoch    *video_file_nu camera_name    video_file_obj
+------------+ +-------+ +------------+ +------------+ +------------+
 (Total: 0)
Note that the import process completes without errors or warnings in both cases. When task information is absent, the video data is silently ignored, leading to incomplete data import. This silent failure makes debugging difficult and can lead to users unknowingly losing video data during the import process.