Summary
Labels.merge() correctly dedupes skeletons structurally via SkeletonMatcher (default method="structure"), so the standard inference→merge flow used by sleap 1.5+ (sleap/gui/learning/runners.py:767) and similar consumers produces correctly-canonicalized projects.
However, Labels.update() (triggered by __attrs_post_init__ on direct Labels(...) construction, and by manual labeled_frames mutation) still dedupes by Python-object identity, which leaks content-equal-but-distinct Skeleton objects into labels.skeletons.
Where
sleap_io/model/labels.py:446-447, same pattern at :847-848 and :876-877:
for inst in lf:
if inst.skeleton not in self.skeletons:
self.skeletons.append(inst.skeleton)
Skeleton is @define(eq=False) (sleap_io/model/skeleton.py:96), so __eq__ falls back to object.__eq__ and in is identity. Two content-equal Skeletons both get appended even though Skeleton.matches() (skeleton.py:691) would identify them as structurally equivalent.
Reproduction
skel_a == skel_b -> False # identity __eq__
skel_a in [skel_b] -> False
skel_a.matches(skel_b) -> True
len(labels.skeletons) -> 2 # expected 1 after Labels.update()
Who's affected
- Direct
Labels(labeled_frames=[...]) construction (post-init → update())
- Manual
labels.labeled_frames.append(lf) followed by labels.update()
- Notebook/script users assembling inference outputs without
merge()
Not affected: the canonical Labels.merge() path used by sleap GUI for inference output ingestion (verified at sleap/gui/learning/runners.py:767-769).
Suggested fix
Mirror what merge() already does internally. In update() and the two parallel paths, use a matches()-based structural lookup and canonicalize inst.skeleton:
canonical = next((s for s in self.skeletons if s.matches(inst.skeleton)), None)
if canonical is None:
self.skeletons.append(inst.skeleton)
elif canonical is not inst.skeleton:
inst.skeleton = canonical # safe — Instance.points is positional
Reassignment is safe because Instance.points is a positional PointsArray ndarray (instance.py:413), not the legacy node-identity-keyed dict that blocked talmolab/sleap#2075.
Priority
Lower than it would have looked from talmolab/sleap#2025 (the user-visible bug there is the inference→merge path, since fixed by talmolab/sleap#2282). This is principled-correctness hygiene; the leak only surfaces for direct-construction edge cases.
Touch sites: labels.py:435-450, :833-850, :870-880. Related: a mirror edge in the sleap-io.js SLP loader is being filed separately.
Summary
Labels.merge()correctly dedupes skeletons structurally viaSkeletonMatcher(defaultmethod="structure"), so the standard inference→merge flow used by sleap 1.5+ (sleap/gui/learning/runners.py:767) and similar consumers produces correctly-canonicalized projects.However,
Labels.update()(triggered by__attrs_post_init__on directLabels(...)construction, and by manuallabeled_framesmutation) still dedupes by Python-object identity, which leaks content-equal-but-distinctSkeletonobjects intolabels.skeletons.Where
sleap_io/model/labels.py:446-447, same pattern at:847-848and:876-877:Skeletonis@define(eq=False)(sleap_io/model/skeleton.py:96), so__eq__falls back toobject.__eq__andinis identity. Two content-equal Skeletons both get appended even thoughSkeleton.matches()(skeleton.py:691) would identify them as structurally equivalent.Reproduction
Who's affected
Labels(labeled_frames=[...])construction (post-init →update())labels.labeled_frames.append(lf)followed bylabels.update()merge()Not affected: the canonical
Labels.merge()path used by sleap GUI for inference output ingestion (verified atsleap/gui/learning/runners.py:767-769).Suggested fix
Mirror what
merge()already does internally. Inupdate()and the two parallel paths, use amatches()-based structural lookup and canonicalizeinst.skeleton:Reassignment is safe because
Instance.pointsis a positionalPointsArrayndarray (instance.py:413), not the legacy node-identity-keyed dict that blocked talmolab/sleap#2075.Priority
Lower than it would have looked from talmolab/sleap#2025 (the user-visible bug there is the inference→merge path, since fixed by talmolab/sleap#2282). This is principled-correctness hygiene; the leak only surfaces for direct-construction edge cases.
Touch sites:
labels.py:435-450,:833-850,:870-880. Related: a mirror edge in the sleap-io.js SLP loader is being filed separately.