Fix ffmpeg_reader: reshape order, frame-success return, and tuple check #1107

Open
ds17f wants to merge 1 commit into boltgolt:master from ds17f:fix/ffmpeg-reader-bugs

ds17f commented May 5, 2026

The ffmpeg recorder has three bugs that prevent it from supplying usable frames to the recognition pipeline. With these fixes, switching recording_plugin = ffmpeg actually works; without them, it silently breaks recognition.

The bugs

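1. Reshape swaps spatial dimensions

.reshape([-1, self.width, self.height, 3])

ffmpeg's rgb24 output is row-major: (height, width, 3). Reshaping with width as the row axis gives each frame the shape (width, height, 3) and wraps rows at the wrong stride, so every non-square frame comes out scrambled.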
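
To make the stride problem concrete, here is a minimal stdlib-only sketch. The 2x3 frame size and the `pixel` helper are made up for illustration and are not part of ffmpeg_reader.py; the indexing formula is the same row-major rule that a default (C-order) numpy reshape applies.

```python
# Hypothetical tiny frame: height=2, width=3, rgb24 (3 bytes per pixel),
# emitted row by row as ffmpeg does for rawvideo output.
HEIGHT, WIDTH = 2, 3


def pixel(buf, rows, cols, r, c):
    """Read pixel (r, c) from a flat rgb24 buffer viewed as rows x cols."""
    assert 0 <= r < rows and 0 <= c < cols
    start = (r * cols + c) * 3
    return tuple(buf[start:start + 3])


# Store each pixel's true (row, col) in its red and green bytes.
frame = bytes(v for r in range(HEIGHT) for c in range(WIDTH) for v in (r, c, 0))

# Correct view (height, width): pixel (1, 0) starts the second real row.
correct = pixel(frame, HEIGHT, WIDTH, 1, 0)

# Buggy view (width, height): "row 1" begins after only HEIGHT pixels,
# which is still inside the first real row.
buggy = pixel(frame, WIDTH, HEIGHT, 1, 0)

print(correct)  # (1, 0, 0)
print(buggy)    # (0, 2, 0)
```

The buggy view returns a pixel from the wrong row, which is why swapping the two arguments in the reshape call is enough to fix the frame layout.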

2. Successful reads return 0

read() returns (0, ...) on its success paths. Callers in compare.py and add.py treat the first value as a boolean success flag (matching cv2.VideoCapture.read()), so successful frames are always discarded.
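
A rough sketch of why a 0 flag discards good frames. `FakeReader` and `grab_frame` are hypothetical stand-ins; the real callers in compare.py and add.py are not reproduced here, only the `ret, frame = reader.read()` convention they share with cv2.VideoCapture.read().

```python
class FakeReader:
    """Hypothetical recorder whose read() returns (flag, frame)."""

    def __init__(self, flag):
        self.flag = flag

    def read(self):
        return self.flag, "frame-data"


def grab_frame(reader):
    """Caller pattern in the cv2.VideoCapture.read() style."""
    ret, frame = reader.read()
    if not ret:
        # A falsy flag means "no frame", so the frame is thrown away.
        return None
    return frame


dropped = grab_frame(FakeReader(0))    # old behaviour: flag 0 on success
kept = grab_frame(FakeReader(True))    # fixed behaviour: truthy flag

print(dropped)  # None
print(kept)     # frame-data
```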

3. if self.video == (): fails after the first record

Once self.video is a numpy array, == no longer yields a plain bool. NumPy attempts an element-wise comparison against the empty tuple, which cannot be broadcast against a frame array, so recent NumPy versions raise ValueError instead of returning False.

This is exactly the failure reported in #835:

File "...recorders/ffmpeg_reader.py", line 109, in read
    if self.video == ():
ValueError: operands could not be broadcast together with shapes (10,720,1280,3) (0,)

isinstance(self.video, tuple) matches the pre-record sentinel without depending on equality.
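
The difference between the two checks can be sketched without numpy. `ArrayLike` below is a hypothetical stand-in that only mimics the one relevant behaviour (`==` raising ValueError); the helper names are also invented for this example.

```python
class ArrayLike:
    """Hypothetical stand-in for a recorded frame array whose == raises."""

    def __eq__(self, other):
        raise ValueError("operands could not be broadcast together")


def needs_record_eq(video):
    # Old check: relies on == returning a plain bool.
    return video == ()


def needs_record_isinstance(video):
    # New check: only asks whether the sentinel tuple is still in place.
    return isinstance(video, tuple)


before_record = needs_record_isinstance(())          # sentinel still set
after_record = needs_record_isinstance(ArrayLike())  # frames recorded

try:
    needs_record_eq(ArrayLike())
    eq_failed = False
except ValueError:
    eq_failed = True

print(before_record, after_record, eq_failed)  # True False True
```

The type check never invokes the array's comparison operator, so it behaves the same before and after the first recording.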

Scope

Fixes #835.

The ffmpeg recorder had three bugs that prevented it from supplying
usable frames to the recognition pipeline:

1. `numpy.reshape([-1, self.width, self.height, 3])` swaps the spatial
   dimensions. ffmpeg's rgb24 output is row-major (height, width, 3);
   reshaping with width as the row axis wraps rows at the wrong stride
   and scrambles every non-square frame.

2. `read()` returned (0, ...) on the success paths. Callers in
   compare.py and add.py treat the first return value as a boolean
   success flag (matching cv2.VideoCapture.read()), so successful
   frames were always discarded.

3. `if self.video == ():` breaks once self.video is a numpy array:
   equality is evaluated element-wise against the empty tuple and
   raises ValueError instead of returning a bool. `isinstance(..., tuple)`
   matches the pre-record sentinel without depending on equality.

With these three fixes the ffmpeg recording_plugin produces usable
frames; without them, switching from opencv to ffmpeg silently breaks
recognition.
Development

Successfully merging this pull request may close these issues.

ValueError: operands could not be broadcast together with shapes (10,360,640,3) (0,)
