@andrewfb andrewfb commented Oct 2, 2025

This PR introduces a GStreamer-based Capture implementation for Linux, along with explicit enumeration and selection of video capture modes, so applications can request an exact resolution, pixel format, and codec. Mode support is implemented on all three backends (AVFoundation, DirectShow, GStreamer), together with enhanced device enumeration and hot-plug detection.

Key Changes

  • New Capture::Mode API - Explicit enumeration and selection of resolution, internal pixel format (RGB24, YUV420P, NV12, etc.), and internal codec (Uncompressed, JPEG, H264, HEVC)
  • GStreamer implementation for Linux - New v4l2src-based capture with automatic format negotiation and decoder pipeline selection for compressed formats
  • AVFoundation enhancements - Mode enumeration from device capabilities, hot-unplug detection via session runtime error notifications
  • DirectShow rewrite - Native COM implementation with ISampleGrabber, removes videoInput 3rd party library dependency
  • Updated samples - CaptureBasic adds ImGui-based Mode selection UI, CaptureTest adds stress testing with random device/mode switching

A note on GStreamer vs. v4l2: This implementation uses GStreamer rather than v4l2 directly, diverging from #2061 and the work in master...whg:CaptureLinux. While v4l2 is part of the Linux kernel and requires minimal dependencies, I believe GStreamer provides some advantages for video capture:

  • Compressed codec support - Modern cameras (especially high-resolution models) increasingly use compressed formats (MJPEG, H264, HEVC) to reduce USB bandwidth. GStreamer automatically handles decompression through its plugin architecture, while v4l2 requires manual decoder implementation
  • Format handling - GStreamer's abstraction layer provides robust format conversion and handles device-specific quirks
  • Future-proofing - GStreamer provides a path to PipeWire integration as Linux moves toward the newer capture API, while direct v4l2 code would require a significant rework
  • Consistency with video playback - Cinder already uses GStreamer for Movie playback on Linux, making capture implementation architecturally consistent

The tradeoff is an additional system dependency, but GStreamer is widely available on Linux distributions and already required for video playback functionality.

Tested on macOS, Windows 10, Windows 11, Ubuntu 24.04, and Ubuntu 25.10 beta using a Zed2i, an ELP 4K USB camera, an Insta360 2C camera, the Apple Studio Display camera (macOS only), an iPhone remote camera (macOS only), and an ASUS laptop camera (Win11 only).

Introduces Mode class to represent specific capture configurations with:
- Resolution (width/height)
- Frame rate
- Codec (Uncompressed, JPEG, H264, HEVC)
- Pixel format (RGB/YUV variants)

Adds Device::getModes() to query supported capture modes and new
Capture::create(device, mode) factory method for explicit mode selection.
Implements comparison operators and stream output for Mode class.
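
As a rough illustration of the kind of value type described above, here is a minimal, self-contained sketch. The `Mode` struct, its field names, the enums, and the `findLargestMode()` helper are all hypothetical stand-ins written for this example; they are not the declarations added by this PR, and the real `Capture::Mode` may differ in layout and naming.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for illustration; not the PR's actual declarations.
enum class Codec { Uncompressed, JPEG, H264, HEVC };
enum class PixelFormat { RGB24, YUV420P, NV12 };

struct Mode {
    int         width = 0;
    int         height = 0;
    float       frameRate = 0.0f;
    Codec       codec = Codec::Uncompressed;
    PixelFormat pixelFormat = PixelFormat::RGB24;
};

// One ordering key shared by == and < so the two can never disagree.
inline std::tuple<int, int, float, Codec, PixelFormat> key( const Mode &m )
{
    return std::make_tuple( m.width, m.height, m.frameRate, m.codec, m.pixelFormat );
}

inline bool operator==( const Mode &a, const Mode &b ) { return key( a ) == key( b ); }
inline bool operator!=( const Mode &a, const Mode &b ) { return !( a == b ); }
inline bool operator<( const Mode &a, const Mode &b ) { return key( a ) < key( b ); }

inline const char *toString( Codec c )
{
    switch( c ) {
        case Codec::Uncompressed: return "Uncompressed";
        case Codec::JPEG: return "JPEG";
        case Codec::H264: return "H264";
        case Codec::HEVC: return "HEVC";
    }
    return "Unknown";
}

inline const char *toString( PixelFormat f )
{
    switch( f ) {
        case PixelFormat::RGB24: return "RGB24";
        case PixelFormat::YUV420P: return "YUV420P";
        case PixelFormat::NV12: return "NV12";
    }
    return "Unknown";
}

// Stream output, e.g. for logging or for labels in a mode-selection UI.
inline std::ostream &operator<<( std::ostream &os, const Mode &m )
{
    return os << m.width << "x" << m.height << " @ " << m.frameRate << " fps, "
              << toString( m.codec ) << ", " << toString( m.pixelFormat );
}

inline std::string describe( const Mode &m )
{
    std::ostringstream ss;
    ss << m;
    return ss.str();
}

// Returns the highest-resolution mode using the requested codec, or nullptr if none matches.
inline const Mode *findLargestMode( const std::vector<Mode> &modes, Codec codec )
{
    const Mode *best = nullptr;
    for( const Mode &m : modes ) {
        if( m.codec != codec )
            continue;
        if( ! best || m.width * m.height > best->width * best->height )
            best = &m;
    }
    return best;
}
```

In an application, a list like this would come from `Device::getModes()`, and the chosen entry would then be passed to `Capture::create(device, mode)`.
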

Adds msw::ComPtr template class for managing COM interface lifetimes
without ATL dependency. Provides automatic AddRef/Release management
with copy semantics suitable for traditional COM reference counting.
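
To make the AddRef/Release pattern concrete, the following is a minimal sketch of a reference-counting wrapper with copy semantics. It is not the PR's `msw::ComPtr`: the `ComPtrSketch` template, the `RefCounted` interface (a stand-in for `IUnknown` so the code compiles off Windows), and the `Counted` test object are assumptions made for this example. For simplicity the test object starts at a reference count of zero and the wrapper adds the first reference, whereas real COM objects are normally handed out already holding one reference.

```cpp
#include <utility>

// Stand-in for IUnknown so the sketch compiles without Windows headers (assumption).
struct RefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual ~RefCounted() = default;
};

// Hypothetical ComPtr-like wrapper; not the PR's msw::ComPtr.
template <typename T>
class ComPtrSketch {
  public:
    ComPtrSketch() = default;

    // Shares ownership of a raw pointer by adding a reference.
    explicit ComPtrSketch( T *p ) : mPtr( p )
    {
        if( mPtr )
            mPtr->AddRef();
    }

    // Copying adds a reference; moving transfers the existing one.
    ComPtrSketch( const ComPtrSketch &other ) : mPtr( other.mPtr )
    {
        if( mPtr )
            mPtr->AddRef();
    }
    ComPtrSketch( ComPtrSketch &&other ) noexcept : mPtr( other.mPtr ) { other.mPtr = nullptr; }

    ~ComPtrSketch() { reset(); }

    // Copy-and-swap handles copy assignment, move assignment, and self-assignment.
    ComPtrSketch &operator=( ComPtrSketch other ) noexcept
    {
        std::swap( mPtr, other.mPtr );
        return *this;
    }

    // Drops this wrapper's reference, if it holds one.
    void reset()
    {
        if( mPtr ) {
            mPtr->Release();
            mPtr = nullptr;
        }
    }

    T *get() const { return mPtr; }
    T *operator->() const { return mPtr; }
    explicit operator bool() const { return mPtr != nullptr; }

  private:
    T *mPtr = nullptr;
};

// Test object that counts references and reports its own destruction.
struct Counted : RefCounted {
    explicit Counted( bool *destroyedFlag ) : destroyed( destroyedFlag ) {}

    unsigned long AddRef() override { return ++refs; }
    unsigned long Release() override
    {
        unsigned long remaining = --refs;
        if( remaining == 0 ) {
            *destroyed = true;
            delete this;
        }
        return remaining;
    }

    unsigned long refs = 0;
    bool *        destroyed;
};
```

The point of the design is that every copy of the wrapper accounts for exactly one reference, so the underlying object is released when the last copy goes out of scope, with no ATL smart pointer involved.
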

Updates Windows build configuration and fixes .clang-format for
compatibility with vc2019-included clang-format version
(SortIncludes: Never -> false).

Implements Device::getModes() to enumerate available capture formats
with resolution, frame rate, and pixel format information. Adds
mode-based initialization through initWithDevice:mode: for explicit
format control.

Adds session error notifications and device disconnection detection.
The isCapturing() method now checks session state to detect unplugged
devices, preventing crashes from attempting to access disconnected
hardware.

Implements CaptureImplGStreamer with full Mode support for Linux video
capture. Uses GStreamer's v4l2src for camera access with automatic format
negotiation and conversion pipelines. Supports both compressed (MJPEG,
H264, HEVC) and uncompressed (YUV/RGB) formats with automatic
decompression when needed.
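
For readers unfamiliar with how a capture mode might map onto a GStreamer pipeline, here is a small sketch that builds a `gst_parse_launch()`-style description string. The `buildPipeline()` function, its parameters, and the specific decoder choices (`jpegdec`, `avdec_h264`, `avdec_h265`) are assumptions for illustration only; the PR's CaptureImplGStreamer may assemble its pipeline differently, and the decoders named here depend on which GStreamer plugin packages are installed.

```cpp
#include <sstream>
#include <string>

// Hypothetical codec enum for this sketch only.
enum class Codec { Uncompressed, JPEG, H264, HEVC };

// Builds a pipeline description for one capture configuration.
// Element choices are illustrative assumptions, not the PR's actual pipeline.
inline std::string buildPipeline( const std::string &devicePath, Codec codec, int width, int height, int fps )
{
    std::ostringstream ss;
    ss << "v4l2src device=" << devicePath << " ! ";

    // Caps filter: ask the camera for a specific media type and size.
    switch( codec ) {
        case Codec::Uncompressed: ss << "video/x-raw"; break;
        case Codec::JPEG: ss << "image/jpeg"; break;
        case Codec::H264: ss << "video/x-h264"; break;
        case Codec::HEVC: ss << "video/x-h265"; break;
    }
    ss << ",width=" << width << ",height=" << height << ",framerate=" << fps << "/1";

    // Compressed streams need a decoder before conversion; raw video does not.
    switch( codec ) {
        case Codec::Uncompressed: break;
        case Codec::JPEG: ss << " ! jpegdec"; break;
        case Codec::H264: ss << " ! h264parse ! avdec_h264"; break;
        case Codec::HEVC: ss << " ! h265parse ! avdec_h265"; break;
    }

    // Convert whatever was decoded to RGB and hand frames to the application.
    ss << " ! videoconvert ! video/x-raw,format=RGB ! appsink name=sink";
    return ss.str();
}
```

A string like this can be tried from a terminal with `gst-launch-1.0` (swapping `appsink` for a display sink) to check what a given camera accepts. Replacing the explicit decoder elements with `decodebin` is another option when automatic decoder selection is preferred.
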

Enumerates device capabilities through V4L2 and provides Mode-based
initialization for explicit format control. Handles device disconnection
gracefully and provides detailed error reporting.

Updates build system with GStreamer dependencies and adds GStreamer to
the dependencies documentation table.

Replaces videoInput library with direct DirectShow implementation using
native COM interfaces. Provides full Mode support with automatic format
enumeration and negotiation.

Adds ImGui interface for runtime device and mode selection. Users can
choose capture devices from a dropdown and select specific modes
(resolution, framerate, format) or use automatic mode selection.

Displays current capture details including resolution, connection status,
frame rate, codec, pixel format, and color model. Demonstrates the new
Mode-based capture API with practical usage examples.

Enhances test application with automated stress testing that randomly
switches between devices and modes. Includes device refresh capability
to detect hot-plugged devices without restarting the app.
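
The random switching described above could be driven by a selection step along these lines. This is a sketch under stated assumptions, not the code in CaptureTest: `Selection`, `pickRandomSelection()`, and the idea of passing in a per-device mode count are all hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical result type: indices into the device list and that device's mode list.
struct Selection {
    std::size_t device;
    std::size_t mode;
};

// Picks a random device that reports at least one mode, then a random mode on it.
// modeCounts[i] is the number of modes device i reports.
// Returns false when no device has any modes, leaving *out untouched.
template <typename Rng>
bool pickRandomSelection( const std::vector<std::size_t> &modeCounts, Rng &rng, Selection *out )
{
    // Skip devices with an empty mode list so the mode index is always valid.
    std::vector<std::size_t> usable;
    for( std::size_t i = 0; i < modeCounts.size(); ++i ) {
        if( modeCounts[i] > 0 )
            usable.push_back( i );
    }
    if( usable.empty() )
        return false;

    std::uniform_int_distribution<std::size_t> deviceDist( 0, usable.size() - 1 );
    const std::size_t device = usable[deviceDist( rng )];

    std::uniform_int_distribution<std::size_t> modeDist( 0, modeCounts[device] - 1 );
    *out = Selection{ device, modeDist( rng ) };
    return true;
}
```

Seeding the generator (for example `std::mt19937 rng( 42 )`) makes a failing run reproducible on the same standard library. Rebuilding `modeCounts` after each device refresh keeps the indices valid when cameras are plugged in or removed.
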

Adds vc2019 project files and CMake configuration. Updates to use ImGui
for runtime control and monitoring of stress test parameters.
@andrewfb andrewfb merged commit bcc6bb3 into cinder:master Oct 6, 2025
8 checks passed