Description
Is your feature request related to a problem? Please describe.
Spyglass's VideoFile table assumes a one-to-one relationship between an ImageSeries and a video file, but the NWB standard allows a single ImageSeries to reference multiple external files (e.g., one video file per epoch) using the starting_frame parameter to indicate which frames belong to which file. When an ImageSeries contains multiple external files, Spyglass does not recognize the starting_frame parameter and fails to import any of the video files.
The technical issue is that Spyglass checks if 90% of the ImageSeries timestamps overlap with the epoch's valid times. However, when multiple files are present with starting_frame, Spyglass treats all timestamps as belonging to a single video file, causing the overlap check to fail for all files.
This creates several problems:
- Video files not imported: When multiple video files are present with starting_frame, none of the video files are imported because the 90% overlap check fails for all epochs. While warnings are logged, users may not immediately understand why their video data wasn't imported.
- Incompatibility with standard NWB practice: The starting_frame parameter is part of the NWB standard for indicating which frames belong to which external file. Spyglass does not recognize this parameter, making it incompatible with standard multi-file ImageSeries structures.
- Uninformative warnings: While Spyglass does log warnings that "No video found corresponding to file X, epoch Y", these warnings don't explain that the root cause is Spyglass's lack of support for multiple external files in a single ImageSeries. Users are left to guess why their valid NWB structure fails to import.
- Lack of documentation: The current limitation is not documented, so users following standard NWB practices may be surprised when their multi-file ImageSeries fail to import without clear guidance on the required workaround.
Describe the solution you'd like
Minimum solution (must-have):
- Add a specific warning when an ImageSeries contains multiple external files, such as:
- "Spyglass does not support multiple external files in a single ImageSeries. Please reorganize your data into one external file per ImageSeries."
- This warning should be raised in addition to the existing "No video found corresponding to file X, epoch Y" warnings
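A minimal sketch of such a check, assuming it would run wherever Spyglass reads the ImageSeries during import. The helper name `warn_if_multiple_external_files` is hypothetical and not part of Spyglass; it only needs the series name and its `external_file` list.

```python
import warnings


def warn_if_multiple_external_files(series_name, external_file):
    """Warn when an ImageSeries references more than one external file.

    Hypothetical helper, not part of Spyglass. Returns True if a warning
    was issued, False otherwise.
    """
    n_files = 0 if external_file is None else len(external_file)
    if n_files <= 1:
        return False
    warnings.warn(
        f"ImageSeries '{series_name}' references {n_files} external files. "
        "Spyglass does not support multiple external files in a single "
        "ImageSeries. Please reorganize your data into one external file "
        "per ImageSeries.",
        UserWarning,
        stacklevel=2,
    )
    return True


if __name__ == "__main__":
    # Example call with two files: this emits the warning once.
    warn_if_multiple_external_files(
        "my_image_series",
        ["/path/to/video_epoch1.h264", "/path/to/video_epoch2.h264"],
    )
```

In Spyglass itself the message would more likely go through the existing logger than through `warnings.warn`; the standard library is used here only to keep the sketch self-contained.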
Intermediate solution (highly recommended):
- Add documentation explaining the current one-to-one limitation between ImageSeries and VideoFile entries
- Document the recommended workaround: create separate ImageSeries objects (one per video file)
- Include examples showing how to structure NWB files for Spyglass compatibility
Optimal solution (ideal):
- Support multiple VideoFile entries per ImageSeries by recognizing the starting_frame parameter
- Create one database entry for each external file in the ImageSeries
- Properly associate each video file with its corresponding epoch(s) based on the starting_frame indices and timestamps
- Maintain backward compatibility with single-file ImageSeries while supporting the multi-file case
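One way the starting_frame-aware association could work is sketched below. This is not Spyglass code: the function names are hypothetical, epochs are assumed to be plain (start, stop) tuples, and the 90% threshold is reused from the existing check but applied per file rather than per ImageSeries.

```python
import numpy as np


def split_timestamps_by_file(timestamps, starting_frame):
    """Split ImageSeries timestamps into one array per external file."""
    timestamps = np.asarray(timestamps)
    # starting_frame[i] is the index of the first frame of file i, so the
    # boundaries between files are every entry after the first.
    return np.split(timestamps, list(starting_frame[1:]))


def match_files_to_epochs(timestamps, starting_frame, epochs, threshold=0.9):
    """Return {file_index: epoch_index} for files that mostly fall in an epoch.

    `epochs` is a list of (start, stop) tuples (an assumption of this sketch).
    """
    matches = {}
    per_file = split_timestamps_by_file(timestamps, starting_frame)
    for file_idx, file_ts in enumerate(per_file):
        if file_ts.size == 0:
            continue  # nothing to match for an empty segment
        for epoch_idx, (start, stop) in enumerate(epochs):
            # Fraction of this file's timestamps inside the epoch interval.
            inside = np.mean((file_ts >= start) & (file_ts <= stop))
            if inside >= threshold:
                matches[file_idx] = epoch_idx
                break
    return matches


# Same layout as the reproduction script: 900 frames, 3 files, 3 epochs.
timestamps = np.linspace(0, 30, 900)
epochs = [(0.0, 10.0), (10.0, 20.0), (20.0, 30.0)]
matches = match_files_to_epochs(timestamps, [0, 300, 600], epochs)
print(matches)  # {0: 0, 1: 1, 2: 2}
```

A single-file ImageSeries has `starting_frame=[0]`, so `np.split` returns one segment and the logic reduces to the current whole-series check, which is what backward compatibility would require.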
Describe alternatives you've considered
The current workaround is to create separate ImageSeries objects for each video file, even when they logically belong to the same recording session. While this workaround is functional, it:
- Deviates from the NWB standard practice where a single ImageSeries can span multiple files using starting_frame
- Requires users to be aware of this Spyglass-specific limitation, which is not currently documented
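The workaround can be sketched as a small helper that turns one multi-file description into per-file ImageSeries arguments. The helper name and the `_1`, `_2`, ... naming scheme are assumptions made for illustration; the keyword names mirror those used in the reproduction script.

```python
from pathlib import Path

import numpy as np


def per_file_image_series_kwargs(base_name, external_file, timestamps, starting_frame):
    """Build one set of ImageSeries keyword arguments per external file.

    Hypothetical helper: each returned dict describes a single-file
    ImageSeries covering only the frames that belong to that file.
    """
    timestamps = np.asarray(timestamps)
    # Append the total frame count so every file has an end boundary.
    bounds = list(starting_frame) + [len(timestamps)]
    kwargs_list = []
    for i, path in enumerate(external_file):
        kwargs_list.append(
            dict(
                name=f"{base_name}_{i + 1}",
                description=f"Video recording from {Path(path).name}",
                unit="n.a.",
                external_file=[path],
                format="external",
                timestamps=timestamps[bounds[i]:bounds[i + 1]],
                starting_frame=[0],  # each series now holds exactly one file
            )
        )
    return kwargs_list


video_files = [
    "/path/to/video_epoch1.h264",
    "/path/to/video_epoch2.h264",
    "/path/to/video_epoch3.h264",
]
series_kwargs = per_file_image_series_kwargs(
    "my_image_series", video_files, np.linspace(0, 30, 900), [0, 300, 600]
)
for kw in series_kwargs:
    print(kw["name"], kw["external_file"], len(kw["timestamps"]))
```

Each dict could then be passed to `ImageSeries(**kw, device=camera_device)` and added with `nwbfile.add_acquisition(...)`, as in the reproduction script.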
Additional context
Here is a minimal reproduction of the problem.
from pynwb.testing.mock.file import mock_NWBFile
from pynwb import NWBHDF5IO
from pathlib import Path
from ndx_franklab_novela import CameraDevice
from pynwb.image import ImageSeries
from pynwb.core import DynamicTable
import numpy as np

def add_task(nwbfile):
    tasks_module = nwbfile.create_processing_module(name="tasks", description="tasks module")
    for i in range(1, 4):
        task_table = DynamicTable(name=f"task_table_{i}", description=f"task table {i}")
        task_table.add_column(name="task_name", description="Name of the task.")
        task_table.add_column(name="task_description", description="Description of the task.")
        task_table.add_column(name="camera_id", description="Camera ID.")
        task_table.add_column(name="task_epochs", description="Task epochs.")
        task_table.add_row(
            task_name=f"task{i}",
            task_description=f"task{i} description",
            camera_id=[1],
            task_epochs=[i]
        )
        tasks_module.add(task_table)

def add_video_with_multiple_files(nwbfile):
    camera_device = CameraDevice(
        name="camera_device 1",
        meters_per_pixel=1.0,
        model="my_model",
        lens="my_lens",
        camera_name="my_camera_name",
    )
    nwbfile.add_device(camera_device)
    video_files = [
        "/path/to/video_epoch1.h264",
        "/path/to/video_epoch2.h264",
        "/path/to/video_epoch3.h264"
    ]
    timestamps = np.linspace(0, 30, 900)
    image_series = ImageSeries(
        name="my_image_series",
        description="Video recordings across multiple epochs",
        unit="n.a.",
        external_file=video_files,
        format="external",
        timestamps=timestamps,
        starting_frame=[0, 300, 600],
        device=camera_device,
    )
    nwbfile.add_acquisition(image_series)

def insert_session(nwbfile_path: Path):
    import datajoint as dj
    dj_local_conf_path = "/Users/pauladkisson/Documents/CatalystNeuro/Spyglass/spyglass/dj_local_conf.json"
    dj.config.load(dj_local_conf_path)
    import spyglass.common as sgc
    import spyglass.data_import as sgi
    from spyglass.utils.nwb_helper_fn import get_nwb_copy_filename
    nwb_copy_file_name = get_nwb_copy_filename(nwbfile_path.name)
    (sgc.Nwbfile & {"nwb_file_name": nwb_copy_file_name}).delete()
    sgi.insert_sessions(str(nwbfile_path), rollback_on_fail=True, raise_err=True)
    print(sgc.VideoFile())

def main():
    nwbfile = mock_NWBFile(
        identifier="multiple_video_files_bug_demo",
        session_description="Mock NWB file demonstrating Spyglass multiple video files issue"
    )
    nwbfile.add_epoch(start_time=0.0, stop_time=10.0, tags=["01"])
    nwbfile.add_epoch(start_time=10.0, stop_time=20.0, tags=["02"])
    nwbfile.add_epoch(start_time=20.0, stop_time=30.0, tags=["03"])
    nwbfile.create_processing_module(
        name="behavior",
        description="Behavioral data including video"
    )
    add_task(nwbfile)
    add_video_with_multiple_files(nwbfile)
    nwbfile_path = Path("/Volumes/T7/CatalystNeuro/Spyglass/raw/mock_multiple_video_files.nwb")
    if nwbfile_path.exists():
        nwbfile_path.unlink()
    nwbfile_path.parent.mkdir(parents=True, exist_ok=True)
    with NWBHDF5IO(nwbfile_path, "w") as io:
        io.write(nwbfile)
    insert_session(nwbfile_path=nwbfile_path)

if __name__ == "__main__":
    main()

Behavior:
When running this script with an ImageSeries containing 3 external video files (with starting_frame=[0, 300, 600] to indicate which frames belong to which file), Spyglass creates 0 VideoFile entries instead of the expected 3:
Expected behavior (3 VideoFile entries, one per external file):
*nwb_file_name *epoch *video_file_nu camera_name video_file_obj
+------------+ +-------+ +------------+ +------------+ +------------+
mock_multiple_ 1 1 my_camera_name <UUID>
mock_multiple_ 2 2 my_camera_name <UUID>
mock_multiple_ 3 3 my_camera_name <UUID>
(Total: 3)
Actual behavior (0 VideoFile entries):
*nwb_file_name *epoch *video_file_nu camera_name video_file_obj
+------------+ +-------+ +------------+ +------------+ +------------+
(Total: 0)
Why this happens:
Spyglass checks if at least 90% of the ImageSeries timestamps overlap with each epoch's interval. However, because Spyglass doesn't recognize the starting_frame parameter, it treats all 900 timestamps (spanning 0-30s across all 3 epochs) as belonging to a single video file.
For each epoch (which is only 10 seconds):
- Epoch 1 (0-10s): Only ~300 timestamps overlap (33% < 90% threshold) ❌
- Epoch 2 (10-20s): Only ~300 timestamps overlap (33% < 90% threshold) ❌
- Epoch 3 (20-30s): Only ~300 timestamps overlap (33% < 90% threshold) ❌
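The per-epoch arithmetic above can be reproduced with NumPy alone. This is a simplified stand-in for the check as described in this issue, not Spyglass's actual implementation; the timestamps and epoch bounds are taken from the reproduction script, and the variable names are illustrative.

```python
import numpy as np

# Same values as the reproduction script.
timestamps = np.linspace(0, 30, 900)
epochs = {"01": (0.0, 10.0), "02": (10.0, 20.0), "03": (20.0, 30.0)}
THRESHOLD = 0.9  # the 90% overlap requirement described above

results = {}
for tag, (start, stop) in epochs.items():
    # Count timestamps of the whole ImageSeries that fall inside this epoch.
    n_inside = int(np.sum((timestamps >= start) & (timestamps <= stop)))
    fraction = n_inside / len(timestamps)
    results[tag] = (n_inside, fraction, fraction >= THRESHOLD)
    print(f"epoch {tag}: {n_inside}/{len(timestamps)} = {fraction:.0%}, "
          f"passes: {fraction >= THRESHOLD}")
```

Each epoch contains exactly 300 of the 900 timestamps, so every epoch falls well short of the threshold.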
Since none of the epochs pass the 90% threshold, NO video files are imported at all. Spyglass does log warnings for each epoch:
[15:37:52][INFO] Spyglass: Populating VideoFile...
[15:37:52][INFO] Spyglass: No video found corresponding to file mock_multiple_video_files_.nwb, epoch 01
[15:37:52][INFO] Spyglass: No video found corresponding to file mock_multiple_video_files_.nwb, epoch 02
[15:37:52][INFO] Spyglass: No video found corresponding to file mock_multiple_video_files_.nwb, epoch 03
However, these warnings don't explain WHY no video was found - specifically, that the 90% timestamp overlap threshold wasn't met because the starting_frame parameter was ignored. Without understanding the root cause, users are left confused about why their valid NWB ImageSeries with video data fails to import.