
Commit f8b681d

talmo and claude authored
Fix RGB/BGR color channel ordering in .pkg.slp embedded frames (#250)
* Bump SLP format version to 1.4 and add format history

  - Increment FORMAT_ID from 1.3 to 1.4
  - Document format version history in module docstring:
    - 1.0: Initial format
    - 1.1: Coordinate system change
    - 1.2: Added tracking_score field
    - 1.3: Explicit tracking_score handling
    - 1.4: Added channel_order for RGB/BGR tracking (this change)
  - Remove deprecated embed_video() function (cleanup)

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Add image plugin system for embedded frame encoding/decoding

  Introduce a dedicated plugin system for image encoding/decoding, separate
  from the video plugin system. This controls how embedded frames in .pkg.slp
  files are encoded and decoded.

  Changes:
  - Add _default_image_plugin global variable
  - Add normalize_image_plugin_name() for validation
  - Add set_default_image_plugin() and get_default_image_plugin()
  - Only supports "opencv" and "imageio" (no PyAV for images)
  - Cleaner separation from video plugins (opencv/FFMPEG/pyav)

  Rationale:
  - Image encoding/decoding has different requirements than video streaming
  - Only OpenCV and imageio support image encode/decode operations
  - Avoids special-casing PyAV→imageio mappings everywhere
  - Provides clear API for users to control embedded frame encoding

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Thread plugin parameter through embedding API

  Add plugin parameter to all embedding functions to allow users to control
  which image plugin (opencv or imageio) is used for encoding embedded frames.

  Changes:
  - Add plugin parameter to process_and_embed_frames()
    - Use get_default_image_plugin() if None
    - Auto-detect if no default (opencv preferred, then imageio)
    - Use plugin to control encoding instead of checking sys.modules
  - Thread plugin through call chain:
    - embed_frames()
    - embed_videos()
    - write_labels()
    - save_slp()
  - Update all docstrings to document plugin parameter
  - Update encoding logic to use plugin variable instead of sys.modules check

  Benefits:
  - Users can explicitly control encoding plugin via save_slp(plugin="opencv")
  - Global default via set_default_image_plugin() is respected
  - Maintains backwards compatibility (auto-detect if not specified)
  - Enables consistent encoding/decoding for color accuracy

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Export image plugin functions in public API

  Add get_default_image_plugin and set_default_image_plugin to the public
  sleap_io API for controlling embedded frame encoding.

  Changes:
  - Import functions from sleap_io.io.video_reading
  - Add to __all__ export list

  Usage:

      import sleap_io as sio
      sio.set_default_image_plugin("opencv")  # Use OpenCV for all embedding
      sio.save_slp(labels, "file.pkg.slp", embed="all")  # Uses OpenCV

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Add comprehensive tests for RGB/BGR channel ordering in embedded frames

  - Add small_robot_3_frame.mp4 test video fixture (3 frames, 22KB)
  - Add test_embed_channel_order_consistency: Parametrized test covering all 4
    encoder/decoder plugin combinations (opencv/imageio)
  - Add test_embed_channel_order_metadata: Verifies channel_order attribute
    stored correctly in HDF5
  - Add test_embed_backwards_compatibility_channel_order: Ensures legacy files
    (format < 1.4) default to BGR
  - Fix test_format_id_1_3_tracking_score to expect format_id 1.4 (was 1.3)

  Tests verify:
  ✓ Pixel-perfect frame matching across all plugin combinations
  ✓ Automatic RGB/BGR channel conversion works correctly
  ✓ HDF5 metadata stores correct channel_order ("RGB" or "BGR")
  ✓ Backwards compatibility with pre-1.4 files

  All 82 tests in test_slp.py pass.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Add image plugin support to ImageVideo with automatic RGB/BGR handling

  - Add plugin attribute to ImageVideo class with validator and default factory
  - Support both "opencv" and "imageio" plugins for reading images
  - Always output RGB regardless of plugin (auto-convert BGR->RGB for OpenCV)
  - Respect global _default_image_plugin setting from get_default_image_plugin()
  - Auto-detect plugin if none specified (opencv preferred, then imageio)

  Updates to ImageVideo._read_frame():
  - Use cv2.imread() when plugin="opencv", flip BGR->RGB for color images
  - Use iio.imread() when plugin="imageio" (RGB native)
  - No metadata storage needed (always assumes RGB output)

  Documentation:
  - Update VideoBackend.from_filename() docstring to document ImageVideo
    plugin parameter
  - Enhance ImageVideo class docstring with plugin information

  Tests:
  - Add test_image_video_plugin_consistency: Verify both plugins produce
    identical RGB output
  - Add test_image_video_default_plugin: Test global default plugin setting
  - Add test_image_video_explicit_plugin_overrides_default: Test explicit
    parameter overrides
  - Add test_image_video_plugin_with_grayscale: Test plugin works with
    grayscale images
  - All 34 tests in test_video_reading.py pass

  Provides consistent API across all video backends (ImageVideo, MediaVideo,
  HDF5Video).

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
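The commit message names a module-level `_default_image_plugin` plus setter, getter, and name validation, but their bodies live in `sleap_io/io/video_reading.py`, which is not shown in this diff. The following is only a minimal sketch of how such a registry could look. The function names `normalize_plugin_name`, `set_default_plugin`, and `get_default_plugin`, the case-insensitive matching, and the error message are all assumptions for illustration, not the library's actual code.

```python
from typing import Optional

# Hypothetical stand-ins for the state and helpers named in the commit
# message; the real versions live in sleap_io/io/video_reading.py.
_SUPPORTED_IMAGE_PLUGINS = ("opencv", "imageio")
_default_image_plugin: Optional[str] = None


def normalize_plugin_name(name: str) -> str:
    """Lower-case and validate a plugin name (assumed behavior)."""
    normalized = name.strip().lower()
    if normalized not in _SUPPORTED_IMAGE_PLUGINS:
        raise ValueError(
            f"Unsupported image plugin {name!r}; expected one of "
            f"{_SUPPORTED_IMAGE_PLUGINS}."
        )
    return normalized


def set_default_plugin(name: Optional[str]) -> None:
    """Set the process-wide default, or clear it by passing None."""
    global _default_image_plugin
    _default_image_plugin = None if name is None else normalize_plugin_name(name)


def get_default_plugin() -> Optional[str]:
    """Return the current default, or None if it was never set."""
    return _default_image_plugin
```

Keeping the image default separate from the video default means a name such as "pyav" can be rejected outright here instead of being remapped.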
1 parent b56fad2 commit f8b681d

8 files changed

Lines changed: 550 additions & 138 deletions


sleap_io/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -25,7 +25,9 @@
 )
 from sleap_io.io.video_reading import (
     VideoBackend,
+    get_default_image_plugin,
     get_default_video_plugin,
+    set_default_image_plugin,
     set_default_video_plugin,
 )
 from sleap_io.io.video_writing import VideoWriter
@@ -72,7 +74,9 @@
     "save_slp",
     "save_ultralytics",
     "save_video",
+    "get_default_image_plugin",
     "get_default_video_plugin",
+    "set_default_image_plugin",
     "set_default_video_plugin",
     "VideoBackend",
     "VideoWriter",

sleap_io/io/main.py

Lines changed: 6 additions & 0 deletions
@@ -55,6 +55,7 @@ def save_slp(
     embed: bool | str | list[tuple[Video, int]] | None = False,
     restore_original_videos: bool = True,
     verbose: bool = True,
+    plugin: Optional[str] = None,
 ):
     """Save a SLEAP dataset to a `.slp` file.

@@ -79,13 +80,18 @@ def save_slp(
             video files. If `False` and `embed=False`, keep references to source
             `.pkg.slp` files. Only applies when `embed=False`.
         verbose: If `True` (the default), display a progress bar when embedding frames.
+        plugin: Image plugin to use for encoding embedded frames. One of "opencv"
+            or "imageio". If None, uses the global default from
+            `get_default_image_plugin()`. If no global default is set, auto-detects
+            based on available packages (opencv preferred, then imageio).
     """
     return slp.write_labels(
         filename,
         labels,
         embed=embed,
         restore_original_videos=restore_original_videos,
         verbose=verbose,
+        plugin=plugin,
     )
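The `plugin` docstring describes a three-step precedence: an explicit argument, then the global default, then auto-detection. A minimal standalone sketch of that order follows. The function name `resolve_image_plugin` and the `global_default` and `loaded_modules` parameters are hypothetical and exist only so the logic can be exercised without sleap_io installed.

```python
import sys
from typing import Mapping, Optional


def resolve_image_plugin(
    plugin: Optional[str] = None,
    global_default: Optional[str] = None,
    loaded_modules: Mapping[str, object] = sys.modules,
) -> str:
    """Pick the encoding plugin: explicit > global default > auto-detect."""
    if plugin is not None:
        return plugin
    if global_default is not None:
        return global_default
    # Same check the diff uses: OpenCV wins if cv2 has already been imported.
    return "opencv" if "cv2" in loaded_modules else "imageio"
```

Note that `"cv2" in sys.modules` reports whether `cv2` has already been imported in the running process. That matches "available packages" only if some module imports it up front.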

sleap_io/io/slp.py

Lines changed: 52 additions & 129 deletions
@@ -1,4 +1,12 @@
-"""This module handles direct I/O operations for working with .slp files."""
+"""This module handles direct I/O operations for working with .slp files.
+
+Format version history:
+- 1.0: Initial format
+- 1.1: Changed coordinate system from top-left pixel at (0, 0) to center at (0, 0)
+- 1.2: Added tracking_score field to instances
+- 1.3: Added explicit handling for tracking_score
+- 1.4: Added channel_order attribute to embedded video datasets to track RGB vs BGR
+"""

 from __future__ import annotations
@@ -290,128 +298,6 @@ def video_to_dict(video: Video, labels_path: Optional[str] = None) -> dict:
     return result


-def embed_video(
-    labels_path: str,
-    video: Video,
-    group: str,
-    frame_inds: list[int],
-    image_format: str = "png",
-    fixed_length: bool = True,
-) -> Video:
-    """Embed frames of a video in a SLEAP labels file.
-
-    .. deprecated:: 1.0.0
-        This function is deprecated. Use `process_and_embed_frames` instead.
-
-    Args:
-        labels_path: A string path to the SLEAP labels file.
-        video: A `Video` object to embed in the labels file.
-        group: The name of the group to store the embedded video in. Image data will be
-            stored in a dataset named `{group}/video`. Frame indices will be stored
-            in a data set named `{group}/frame_numbers`.
-        frame_inds: A list of frame indices to embed.
-        image_format: The image format to use for embedding. Valid formats are "png"
-            (the default), "jpg" or "hdf5".
-        fixed_length: If `True` (the default), the embedded images will be padded to the
-            length of the largest image. If `False`, the images will be stored as
-            variable length, which is smaller but may not be supported by all readers.
-
-    Returns:
-        An embedded `Video` object.
-
-        If the video is already embedded, the original video will be returned. If not,
-        a new `Video` object will be created with the embedded data.
-    """
-    # Load the image data and optionally encode it.
-    imgs_data = []
-    for frame_idx in frame_inds:
-        frame = video[frame_idx]
-
-        if image_format == "hdf5":
-            img_data = frame
-        else:
-            if "cv2" in sys.modules:
-                img_data = np.squeeze(
-                    cv2.imencode("." + image_format, frame)[1]
-                ).astype("int8")
-            else:
-                if frame.shape[-1] == 1:
-                    frame = frame.squeeze(axis=-1)
-                img_data = np.frombuffer(
-                    iio.imwrite("<bytes>", frame, extension="." + image_format),
-                    dtype="int8",
-                )
-
-        imgs_data.append(img_data)
-
-    # Write the image data to the labels file.
-    with h5py.File(labels_path, "a") as f:
-        if image_format == "hdf5":
-            f.create_dataset(
-                f"{group}/video", data=imgs_data, compression="gzip", chunks=True
-            )
-        else:
-            if fixed_length:
-                img_bytes_len = 0
-                for img in imgs_data:
-                    img_bytes_len = max(img_bytes_len, len(img))
-                ds = f.create_dataset(
-                    f"{group}/video",
-                    shape=(len(imgs_data), img_bytes_len),
-                    dtype="int8",
-                    compression="gzip",
-                )
-                for i, img in enumerate(imgs_data):
-                    ds[i, : len(img)] = img
-            else:
-                ds = f.create_dataset(
-                    f"{group}/video",
-                    shape=(len(imgs_data),),
-                    dtype=h5py.special_dtype(vlen=np.dtype("int8")),
-                )
-                for i, img in enumerate(imgs_data):
-                    ds[i] = img
-
-        # Store metadata.
-        ds.attrs["format"] = image_format
-        video_shape = video.shape
-        (
-            ds.attrs["frames"],
-            ds.attrs["height"],
-            ds.attrs["width"],
-            ds.attrs["channels"],
-        ) = video_shape
-
-        # Store frame indices.
-        f.create_dataset(f"{group}/frame_numbers", data=frame_inds)
-
-        # Store source video.
-        if video.source_video is not None:
-            # If this is already an embedded dataset, retain the previous source video.
-            source_video = video.source_video
-        else:
-            source_video = video
-
-        # Create a new video object with the embedded data.
-        embedded_video = Video(
-            filename=labels_path,
-            backend=VideoBackend.from_filename(
-                labels_path,
-                dataset=f"{group}/video",
-                grayscale=video.grayscale,
-                keep_open=False,
-            ),
-            source_video=source_video,
-        )
-
-        grp = f.require_group(f"{group}/source_video")
-        grp.attrs["json"] = json.dumps(
-            video_to_dict(source_video, labels_path), separators=(",", ":")
-        )
-
-    return embedded_video
-
-
 def prepare_frames_to_embed(
     labels_path: str,
     labels: Labels,
@@ -467,6 +353,7 @@ def process_and_embed_frames(
     image_format: str = "png",
     fixed_length: bool = True,
     verbose: bool = True,
+    plugin: Optional[str] = None,
 ) -> dict[Video, Video]:
     """Process and embed frames into a SLEAP labels file.

@@ -484,10 +371,22 @@ def process_and_embed_frames(
             variable length, which is smaller but may not be supported by all readers.
         verbose: If `True` (the default), display a progress bar for the embedding
             process.
+        plugin: Image plugin to use for encoding. One of "opencv" or "imageio".
+            If None, uses the global default from `get_default_image_plugin()`.
+            If no global default is set, auto-detects based on available packages.

     Returns:
         A dictionary mapping original Video objects to their embedded versions.
     """
+    # Determine which plugin to use for encoding
+    from sleap_io.io.video_reading import get_default_image_plugin
+
+    if plugin is None:
+        plugin = get_default_image_plugin()
+        if plugin is None:
+            # Auto-detect: prefer opencv, fallback to imageio
+            plugin = "opencv" if "cv2" in sys.modules else "imageio"
+
     # Initialize a dictionary to store data by group
     data_by_group = {}
@@ -508,6 +407,7 @@ def process_and_embed_frames(
                 "video": video,  # All frames in a group are from the same video
                 "frame_inds": [],
                 "imgs_data": [],
+                "channel_order": None,  # Track channel order: "RGB" or "BGR"
             }

         # Load the frame
@@ -516,18 +416,25 @@ def process_and_embed_frames(
         # Encode the frame
         if image_format == "hdf5":
             img_data = frame
+            channel_order = "RGB"  # HDF5 format stores as-is (RGB)
         else:
-            if "cv2" in sys.modules:
+            if plugin == "opencv":
                 img_data = np.squeeze(
                     cv2.imencode("." + image_format, frame)[1]
                 ).astype("int8")
-            else:
+                channel_order = "BGR"  # OpenCV encodes in BGR
+            else:  # imageio
                 if frame.shape[-1] == 1:
                     frame = frame.squeeze(axis=-1)
                 img_data = np.frombuffer(
                     iio.imwrite("<bytes>", frame, extension="." + image_format),
                     dtype="int8",
                 )
+                channel_order = "RGB"  # imageio encodes in RGB
+
+        # Store channel order (should be consistent for all frames in a group)
+        if data_by_group[group]["channel_order"] is None:
+            data_by_group[group]["channel_order"] = channel_order

         # Store frame data in the appropriate group
         data_by_group[group]["imgs_data"].append(img_data)
@@ -570,6 +477,7 @@ def process_and_embed_frames(

             # Store metadata
             ds.attrs["format"] = image_format
+            ds.attrs["channel_order"] = data["channel_order"]
             video_shape = video.shape
             (
                 ds.attrs["frames"],
@@ -617,6 +525,7 @@ def embed_frames(
     embed: list[tuple[Video, int]],
     image_format: str = "png",
     verbose: bool = True,
+    plugin: Optional[str] = None,
 ):
     """Embed frames in a SLEAP labels file.

@@ -628,14 +537,20 @@ def embed_frames(
             (the default), "jpg" or "hdf5".
         verbose: If `True` (the default), display a progress bar for the embedding
             process.
+        plugin: Image plugin to use for encoding. One of "opencv" or "imageio".
+            If None, uses the global default from `get_default_image_plugin()`.

     Notes:
         This function will embed the frames in the labels file and update the `Videos`
         and `Labels` objects in place.
     """
     frames_metadata = prepare_frames_to_embed(labels_path, labels, embed)
     replaced_videos = process_and_embed_frames(
-        labels_path, frames_metadata, image_format=image_format, verbose=verbose
+        labels_path,
+        frames_metadata,
+        image_format=image_format,
+        verbose=verbose,
+        plugin=plugin,
     )

     if len(replaced_videos) > 0:
@@ -647,6 +562,7 @@ def embed_videos(
     labels: Labels,
     embed: bool | str | list[tuple[Video, int]],
     verbose: bool = True,
+    plugin: Optional[str] = None,
 ):
     """Embed videos in a SLEAP labels file.

@@ -664,6 +580,8 @@ def embed_videos(
             embedded.
         verbose: If `True` (the default), display a progress bar for the embedding
             process.
+        plugin: Image plugin to use for encoding. One of "opencv" or "imageio".
+            If None, uses the global default from `get_default_image_plugin()`.

     If `"source"` is specified, no images will be embedded and the source video
     will be restored if available.
@@ -689,7 +607,7 @@ def embed_videos(
     else:
         raise ValueError(f"Invalid value for embed: {embed}")

-    embed_frames(labels_path, labels, embed, verbose=verbose)
+    embed_frames(labels_path, labels, embed, verbose=verbose, plugin=plugin)


 def write_videos(
@@ -1054,7 +972,7 @@ def write_metadata(labels_path: str, labels: Labels):

     with h5py.File(labels_path, "a") as f:
         grp = f.require_group("metadata")
-        grp.attrs["format_id"] = 1.3
+        grp.attrs["format_id"] = 1.4
         grp.attrs["json"] = np.bytes_(json.dumps(md, separators=(",", ":")))
@@ -2019,6 +1937,7 @@ def write_labels(
     embed: bool | str | list[tuple[Video, int]] | None = None,
     restore_original_videos: bool = True,
     verbose: bool = True,
+    plugin: Optional[str] = None,
 ):
     """Write a SLEAP labels file.

@@ -2043,6 +1962,10 @@ def write_labels(
             video files. If `False` and `embed=False`, keep references to source
             `.pkg.slp` files. Only applies when `embed=False`.
         verbose: If `True` (the default), display a progress bar when embedding frames.
+        plugin: Image plugin to use for encoding embedded frames. One of "opencv"
+            or "imageio". If None, uses the global default from
+            `get_default_image_plugin()`. If no global default is set, auto-detects
+            based on available packages.
     """
     if Path(labels_path).exists():
         Path(labels_path).unlink()
@@ -2052,7 +1975,7 @@ def write_labels(
     original_videos = [v for v in labels.videos] if embed else None

     if embed:
-        embed_videos(labels_path, labels, embed, verbose=verbose)
+        embed_videos(labels_path, labels, embed, verbose=verbose, plugin=plugin)

     # Determine reference mode based on parameters
     if embed == "source" or (embed is False and restore_original_videos):
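This diff only covers the writing side: `channel_order` is recorded as a dataset attribute, and the commit message says files older than format 1.4 default to BGR. The reader is not shown, so the sketch below is a guess at how a decoder could use that attribute to always hand back RGB. The helpers `stored_channel_order`, `restore_rgb`, `simulate_encode`, and `simulate_decode` are hypothetical, and the encode/decode steps are simulated with NumPy channel flips instead of real `cv2` or `imageio` calls.

```python
from typing import Mapping

import numpy as np

# Channel convention each plugin uses natively (as stated in the diff comments).
NATIVE_ORDER = {"opencv": "BGR", "imageio": "RGB"}


def stored_channel_order(attrs: Mapping[str, str]) -> str:
    """Read channel_order, defaulting to BGR for pre-1.4 files without it."""
    return attrs.get("channel_order", "BGR")


def simulate_encode(frame_rgb: np.ndarray, plugin: str) -> np.ndarray:
    """Stand-in for image encoding: returns the pixels as stored in the file."""
    return frame_rgb if NATIVE_ORDER[plugin] == "RGB" else frame_rgb[..., ::-1]


def simulate_decode(file_pixels: np.ndarray, plugin: str) -> np.ndarray:
    """Stand-in for image decoding with the given plugin's convention."""
    return file_pixels if NATIVE_ORDER[plugin] == "RGB" else file_pixels[..., ::-1]


def restore_rgb(
    decoded: np.ndarray, decoder_plugin: str, attrs: Mapping[str, str]
) -> np.ndarray:
    """Flip channels only when encoder and decoder conventions disagree."""
    if stored_channel_order(attrs) != NATIVE_ORDER[decoder_plugin]:
        return decoded[..., ::-1]
    return decoded
```

Under these assumptions a flip is needed only when the stored order differs from the decoder's native order, which covers the four encoder/decoder combinations that the tests in the commit message describe.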
