Expected Behavior
When using labels.merge() to merge two slp files with ImageVideo backends, if there's intersecting annotations (same image has been labelled on both slp files). merge() should either combine, or delete one of them.
Actual Behavior
# assumes:
this: sio.Labels; other: sio.Labels;
this.merge(other)
Annotations (LabeledFrames) from the other sleap file that already exists in this would be pointing at other images in the deduplicated ImageVideo. This does not throw an error normally but corrupts the annotation.
In special cases where len(deduped_video) < the original index of those duplicated annotation, this would throw an error when accessing those images (e.g. in case where calling .create_training_split() on the merged slp files.
for labelled_frame in merged_labels:
image = labelled_frame.video[labelled_frame.frame_idx] # Would throw an out of range error.
See graph:
Steps to recreate
See graph above. I can also provide two small slp files I made to recreate the issue if that's
Screenshot
bad here is Duplication case 2 where an OOB is thrown.
Hypothesis
sleap_io.model.labesl.py:L1787-L1805: /sleap_io/model/labels.py
self.videos.append(deduped_video)
video_map[other_video] = deduped_video # <== Here other_video gets mapped to deduped video
# Build frame index mapping for remaining frames
if isinstance(
other_video.filename, list
) and isinstance(deduped_video.filename, list):
other_basenames = [
Path(f).name for f in other_video.filename
]
deduped_basenames = [
Path(f).name for f in deduped_video.filename
]
for old_idx, basename in enumerate(other_basenames):
if basename in deduped_basenames: # <== Here, if `basename` is not a part of deduped, it will not be added to `frame_idx_map`.
new_idx = deduped_basenames.index(basename)
frame_idx_map[(other_video, old_idx)] = (
deduped_video,
new_idx,
)
So that in L1860-L1879:
for frame_idx, other_frame in enumerate(other.labeled_frames):
if progress_callback:
progress_callback(
frame_idx,
total_frames,
f"Merging frame {frame_idx + 1}/{total_frames}",
)
# Check if frame index needs remapping (for deduplicated/merged videos)
if (other_frame.video, other_frame.frame_idx) in frame_idx_map: # <== It's not in frame_idx_map so it will go to else.
mapped_video, mapped_frame_idx = frame_idx_map[
(other_frame.video, other_frame.frame_idx)
]
else:
# Map video to self
mapped_video = video_map.get(other_frame.video, other_frame.video) # <== map to deduped_video
mapped_frame_idx = other_frame.frame_idx # <== but still maintains the original index
Suggested Fix:
Delete duplicate annotation or add a potential option for user to choose if they want to merge those intersecting annotation or just pick the best one?
Expected Behavior
When using
labels.merge()to merge two slp files withImageVideobackends, if there's intersecting annotations (same image has been labelled on both slp files).merge()should either combine, or delete one of them.Actual Behavior
Annotations (LabeledFrames) from the
othersleap file that already exists inthiswould be pointing at other images in the deduplicatedImageVideo. This does not throw an error normally but corrupts the annotation.In special cases where
len(deduped_video)< the original index of those duplicated annotation, this would throw an error when accessing those images (e.g. in case where calling.create_training_split()on the merged slp files.See graph:
Steps to recreate
See graph above. I can also provide two small slp files I made to recreate the issue if that's
Screenshot
badhere is Duplication case 2 where an OOB is thrown.Hypothesis
sleap_io.model.labesl.py:L1787-L1805: /sleap_io/model/labels.pySo that in
L1860-L1879:Suggested Fix:
Delete duplicate annotation or add a potential option for user to choose if they want to merge those intersecting annotation or just pick the best one?