This document outlines the high-level design and architectural details of VLX VisionBridge.
VLX VisionBridge is a headless, high-performance Linux service written in Go. It functions as a remote, headless OBS Studio tailored for remote VMs. It aggregates multiple finite SRT/WebRTC/Media streams into a single composite live stream, which is broadcast simultaneously to multiple CDNs (YouTube, Twitch, VK).
The project is structured according to common Go conventions, primarily using the internal/ directory to encapsulate private logic:
- internal/models: Defines core structs such as Layer, Config, and DatabaseConfig.
- configs: Contains the configs/visionbridge.settings.template embedded directly into the binary via configs/assets.go to facilitate self-contained installations.
- internal/config: Handles parsing of the visionbridge.settings YAML file and implements configuration watching and diffing.
- internal/db: Manages the SQLite database connection pool (using github.com/mattn/go-sqlite3) and logging queries.
- internal/engine: The core FFmpeg command generator and process manager. It is further decoupled into:
  - source: Prepares input arguments, path sanitization, and input file parsing.
  - mixer: Constructs complex filtergraphs and manages dynamic ZMQ updates.
  - streamer: Builds multi-destination output pipelines using the tee muxer.
The service is fully configured via the visionbridge.settings file (default location: /opt/VLX_VisionBridge/etc/visionbridge.settings, but can be overridden by CONFIG_PATH or positional arguments).
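The lookup described above could be sketched as follows. This is a minimal illustration, not the project's code: the function name ResolveConfigPath is hypothetical, and the precedence (positional argument first, then CONFIG_PATH, then the default) is an assumption, since the document does not state which override wins.

```go
package main

// DefaultConfigPath is the default location stated in this document.
const DefaultConfigPath = "/opt/VLX_VisionBridge/etc/visionbridge.settings"

// ResolveConfigPath picks the settings file location.
// Assumed precedence (not confirmed by this document): first positional
// argument, then the CONFIG_PATH environment variable, then the default.
// args excludes the program name; getenv is typically os.Getenv.
func ResolveConfigPath(args []string, getenv func(string) string) string {
	if len(args) > 0 && args[0] != "" {
		return args[0]
	}
	if p := getenv("CONFIG_PATH"); p != "" {
		return p
	}
	return DefaultConfigPath
}
```

In a real entry point this would be called as ResolveConfigPath(os.Args[1:], os.Getenv); passing getenv in as a parameter keeps the function easy to test.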
The configuration is hot-reloadable. A Config Watcher uses fsnotify to monitor the settings file. When changes are detected, a diffing logic determines the required action:
- RequiresFilterUpdate: Changes to layout properties like X, Y, and Volume trigger a live filter update via ZMQ without dropping the stream.
- RequiresRestart: Changes to structural properties like Size, Active, or Destinations require a full FFmpeg process restart.
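The diffing rule above can be illustrated with a small sketch. The Layer struct here carries only the fields named in this document, and the Action type, DiffLayer, and Diff are hypothetical names; the real types in internal/models and internal/config may differ. Destinations are left out for brevity, and treating a change in layer count as structural is an assumption.

```go
package main

// Action is a hypothetical result of comparing two configurations.
// Values are ordered so that a larger value is the more disruptive action.
type Action int

const (
	NoChange Action = iota
	RequiresFilterUpdate
	RequiresRestart
)

// Layer holds only the fields named in this document.
type Layer struct {
	Active bool
	Size   int
	X, Y   int
	Volume float64
}

// DiffLayer classifies the change between the old and the current layer.
// Structural changes take priority over layout changes.
func DiffLayer(old, cur Layer) Action {
	if old.Active != cur.Active || old.Size != cur.Size {
		return RequiresRestart
	}
	if old.X != cur.X || old.Y != cur.Y || old.Volume != cur.Volume {
		return RequiresFilterUpdate
	}
	return NoChange
}

// Diff returns the most disruptive action needed across all layers.
// A change in the number of layers is assumed to be structural.
func Diff(old, cur []Layer) Action {
	if len(old) != len(cur) {
		return RequiresRestart
	}
	result := NoChange
	for i := range old {
		if a := DiffLayer(old[i], cur[i]); a > result {
			result = a
		}
	}
	return result
}
```

Returning the maximum action means a single reload that both moves one layer and resizes another results in one restart rather than a filter update followed by a restart.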
The mixer coordinates two distinct input pipelines conceptually similar to "Sources" within a "Scene":
Up to 10 independent objects managed directly via FFmpeg inputs. Customizable delay spacers (color or image-based) are dynamically generated for local playlist pipelines.
- State: Active | Inactive
- Input Type: local (folder of media), srt, rtmp (and rtmps), webrtc, rtsp (and rtsps), ipc_audio (raw PCM over UDS). For local, folder combinations are automatically parsed (video only, image + audio, image only, audio only).
- Media: Video+Audio | Video | Audio
- Transform: Size (scale width), X, Y (Position).
- Audio: Configurable Volume per layer.
An independently spawned Chromium process dynamically rendering up to 7 Z-layers (Z1 to Z7), captured via x11grab and xvfb-run by FFmpeg.
- Handles standard web URLs, as well as automatic HTML <video>/<img>/<audio> tag generation for local media.
- Background colors can be dynamically injected into the generated HTML.
- Applies absolute layout positioning through inline CSS to eliminate browser offsets.
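As an illustration of the tag generation and inline positioning listed above, a minimal sketch might look like this. The function name MediaTag, the extension lists, the chosen HTML attributes, and the fallback to an image tag are all assumptions made for the example; the project's generator may behave differently.

```go
package main

import (
	"html"
	"path/filepath"
	"strconv"
	"strings"
)

// MediaTag returns an HTML element for a local media file, positioned
// absolutely with inline CSS. The extension lists are illustrative only.
func MediaTag(src string, x, y, z int) string {
	safe := html.EscapeString(src) // avoid breaking out of the attribute
	style := "position:absolute;margin:0;left:" + strconv.Itoa(x) +
		"px;top:" + strconv.Itoa(y) + "px;z-index:" + strconv.Itoa(z)

	switch strings.ToLower(filepath.Ext(src)) {
	case ".mp4", ".webm":
		return `<video autoplay loop muted src="` + safe + `" style="` + style + `"></video>`
	case ".mp3", ".wav", ".ogg":
		// Audio has no visual box, so no positioning is emitted.
		return `<audio autoplay loop src="` + safe + `"></audio>`
	default:
		// Assumption: anything else is treated as an image.
		return `<img src="` + safe + `" style="` + style + `">`
	}
}
```

Because x, y, and z are integers, they cannot carry CSS or HTML syntax into the style attribute; only the file path needs escaping.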
The FFmpeg Mixer uses advanced filter_complex graphs to scale, position, and overlay inputs based on absolute integer-based pixel sizing and X/Y coordinates.
The base canvas size for the filtergraph is defined by cfg.Input.Resolution within the InputSettings. This establishes the drawing area for all overlays and layers. The final scaled output, which is sent to external destinations, is defined separately by cfg.Output.Resolution. Because the input resolution dictates the fundamental structure of the filtergraph and video buffers, any changes to the input resolution require a full FFmpeg restart, whereas changes to individual layer positions or sizes may not.
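To make the canvas-plus-overlays idea concrete, the sketch below assembles a filter_complex string from integer layer properties. It is an illustration under stated assumptions, not the project's generator: BuildVideoGraph, the Layer fields ID and Input, the black color source used as the canvas, and the label names (base, sN, vN) are all hypothetical. The overlay@ID form uses FFmpeg's filter instance naming so that each overlay can be addressed individually later.

```go
package main

import (
	"strconv"
	"strings"
)

// Layer carries the integer transform fields described in this document.
type Layer struct {
	ID    string // hypothetical identifier used in the overlay@ID instance name
	Input int    // hypothetical FFmpeg input index for this layer
	Size  int    // target width in pixels
	X, Y  int    // position on the canvas
}

// BuildVideoGraph draws each layer onto a canvas of width x height.
// IDs are assumed to contain only characters that are safe in a filtergraph.
func BuildVideoGraph(width, height int, layers []Layer) string {
	var b strings.Builder
	b.WriteString("color=c=black:s=")
	b.WriteString(strconv.Itoa(width))
	b.WriteByte('x')
	b.WriteString(strconv.Itoa(height))
	b.WriteString("[base]")

	prev := "base"
	for i, l := range layers {
		idx := strconv.Itoa(i)
		// Scale to the requested width; -2 keeps the aspect ratio and
		// rounds the height to an even number.
		b.WriteString(";[")
		b.WriteString(strconv.Itoa(l.Input))
		b.WriteString(":v]scale=")
		b.WriteString(strconv.Itoa(l.Size))
		b.WriteString(":-2[s")
		b.WriteString(idx)
		b.WriteString("]")
		// Overlay the scaled layer onto the result of the previous step.
		b.WriteString(";[")
		b.WriteString(prev)
		b.WriteString("][s")
		b.WriteString(idx)
		b.WriteString("]overlay@")
		b.WriteString(l.ID)
		b.WriteString("=x=")
		b.WriteString(strconv.Itoa(l.X))
		b.WriteString(":y=")
		b.WriteString(strconv.Itoa(l.Y))
		b.WriteString("[v")
		b.WriteString(idx)
		b.WriteString("]")
		prev = "v" + idx
	}
	return b.String()
}
```

The canvas dimensions appear once at the head of the graph and every later label depends on them, which is consistent with the statement that changing the input resolution needs a full restart.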
To eliminate local SRT network overhead and reduce latency for deployments running alongside VLX_ChatBridge, VisionBridge integrates a dedicated IPC connector:
- Audio Ingress (ipc_audio): Accepts raw PCM data (s16le, 48kHz, 2-channel) directly via a Unix Domain Socket (/tmp/vlx_audio.sock), injecting it seamlessly into the FFmpeg audio mixer.
- Control Ingress: A listener on the Unix control socket (/tmp/vlx_control.sock) handles incoming control messages. It parses set_input_state JSON IPC events, safely updates the in-memory config struct using Mutexes, and dynamically dispatches ZMQ commands directly into the FFmpeg filtergraph to adjust elements (like layers) without requiring Web browser overhead or killing the FFmpeg process.
- Auto-Fallback Concept: Users can hook runOnPublish/runOnUnpublish scripts in MediaMTX to inject JSON into VisionBridge's control socket. This allows creating an automatic "Be Right Back" screen or fallback sequence upon signal loss.
- Dynamic Updates via ZMQ: Live properties (like overlay@layerID coordinates and volume@layerID) are manipulated in real-time. ZMQ is a mandatory dependency, not an optional one: it provides the real-time filter communication that dynamic FFmpeg updates rely on. The mixer binds a zmq filter to tcp://127.0.0.1:5555 to receive string commands.
- Performance Optimizations: For performance-sensitive code paths in filter generation, strings.Builder and stack buffers are used over fmt.Sprintf to minimize memory allocations.
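The string commands and the allocation-conscious formatting mentioned above might be combined roughly as follows. This sketch only builds the command text in the "TARGET COMMAND ARG" form accepted by FFmpeg's zmq filter; sending it over a ZMQ socket is left out. The function names OverlayCommand and VolumeCommand, the 64-byte scratch size, and the two-decimal volume format are assumptions.

```go
package main

import "strconv"

// OverlayCommand formats a position update such as "overlay@layer1 x 100".
// axis is expected to be "x" or "y".
func OverlayCommand(layerID, axis string, value int) string {
	// Fixed-size scratch array; the compiler can usually keep it on the
	// stack, and append only moves to the heap if the text outgrows it.
	var scratch [64]byte
	buf := scratch[:0]
	buf = append(buf, "overlay@"...)
	buf = append(buf, layerID...)
	buf = append(buf, ' ')
	buf = append(buf, axis...)
	buf = append(buf, ' ')
	buf = strconv.AppendInt(buf, int64(value), 10)
	return string(buf)
}

// VolumeCommand formats a volume update such as "volume@layer1 volume 0.50".
func VolumeCommand(layerID string, volume float64) string {
	var scratch [64]byte
	buf := scratch[:0]
	buf = append(buf, "volume@"...)
	buf = append(buf, layerID...)
	buf = append(buf, " volume "...)
	buf = strconv.AppendFloat(buf, volume, 'f', 2, 64)
	return string(buf)
}
```

Compared with fmt.Sprintf, the strconv.Append functions skip format-string parsing and interface boxing, leaving the final string conversion as the main allocation.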
The output layer encodes the composite frames into H.264/AAC and pushes to a robust multi-destination pipeline:
- tee Muxer: Used for simultaneous output cloning.
- fifo Pseudo-muxer: Integrated to isolate output streams. In the event a single destination (e.g., an unstable RTMP endpoint) fails, fifo combined with :onfail=ignore prevents the entire FFmpeg process from crashing.
- Automatic codec enforcement prevents conversion failures when mixing different media types or dummy streams.
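A simplified sketch of how a tee output specification could be assembled is shown below. It covers only the per-destination onfail=ignore option and assumes every destination is an RTMP-style endpoint using the flv container; the fifo wrapping described above is deliberately omitted, and the function name TeeOutput is hypothetical.

```go
package main

import "strings"

// TeeOutput builds the string passed to FFmpeg after "-f tee".
// Each destination gets its own option block so that one failing
// endpoint is ignored instead of aborting the whole process.
// URLs are assumed not to contain the "|" separator.
func TeeOutput(urls []string) string {
	parts := make([]string, 0, len(urls))
	for _, u := range urls {
		parts = append(parts, "[f=flv:onfail=ignore]"+u)
	}
	return strings.Join(parts, "|")
}
```

For two destinations this yields a single argument of the form [f=flv:onfail=ignore]URL1|[f=flv:onfail=ignore]URL2, so the encoder runs once and its packets are cloned to each output.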
A robust ProcessManager governs the underlying FFmpeg subprocess:
- Idle Behavior: If the stream output is deactivated via configuration or IPC, the ProcessManager keeps FFmpeg fully dormant (consuming 0% CPU) until a [ZMQ_CONTROL] Target=stream Enabled=true command wakes it up.
- Health Monitor: Monitors CPU/RAM usage and stream stability, logging metrics to SQLite.
- Error Diagnostics: Maintains a tailBuffer of the last 4096 bytes of the process's standard error stream to pinpoint failures (identifying them as [input], [mixer], or [output] issues).
- RetryTracker: Uses a backoff strategy (5 quick retries, 2 slow retries, then dynamic disablement) for isolating failures in sources like Chromium overlays.
- Process reaping and signal listeners ensure no zombie processes remain after graceful or ungraceful shutdown.
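The tail buffer idea from the list above can be sketched as a small io.Writer that retains only the most recent bytes. This is an illustrative version; the type and method names are hypothetical and the project's tailBuffer may be implemented differently (for example as a ring buffer).

```go
package main

import "sync"

// TailBuffer keeps only the most recent max bytes written to it.
type TailBuffer struct {
	mu  sync.Mutex
	max int
	buf []byte
}

// NewTailBuffer creates a buffer that retains at most max bytes.
func NewTailBuffer(max int) *TailBuffer {
	return &TailBuffer{max: max}
}

// Write implements io.Writer, so the buffer can be assigned to the
// Stderr field of an exec.Cmd.
func (t *TailBuffer) Write(p []byte) (int, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.buf = append(t.buf, p...)
	if extra := len(t.buf) - t.max; extra > 0 {
		// Shift the newest max bytes to the front and drop the rest.
		t.buf = append(t.buf[:0], t.buf[extra:]...)
	}
	return len(p), nil
}

// String returns a copy of the retained tail.
func (t *TailBuffer) String() string {
	t.mu.Lock()
	defer t.mu.Unlock()
	return string(t.buf)
}
```

With NewTailBuffer(4096) attached to the FFmpeg subprocess, the retained text after a crash holds the final error lines, which is where FFmpeg normally reports the failing input, filter, or output.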
While visionbridge.settings triggers state changes, SQLite is the sole database backend used for persistence and telemetry.
- Location: Default path /opt/VLX_VisionBridge/var/visionbridge.db.
- Usage: Stores reusable source templates, layout presets, and broadcast logs (uptime, bitrate fluctuations, error states).
- Non-root execution constraints are explicitly enforced (except during initial installation).
- SanitizeInputPath prevents FFmpeg argument injection (e.g., ensuring paths starting with - are prefixed with ./).
- Strict integer typing for overlay properties (Size, X, Y) prevents filter injection vulnerabilities.
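The path rule mentioned above can be illustrated with a minimal sketch. Only the leading-dash case named in this document is handled; the real SanitizeInputPath may perform additional checks, and its signature is assumed here.

```go
package main

import "strings"

// SanitizeInputPath makes sure a user-supplied path cannot be parsed by
// FFmpeg as a command-line option. A path such as "-i" becomes "./-i",
// which refers to the same file relative to the working directory.
func SanitizeInputPath(p string) string {
	if strings.HasPrefix(p, "-") {
		return "./" + p
	}
	return p
}
```

Absolute paths and URLs such as srt://host:port pass through unchanged because they never begin with a dash.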