Commit 19c064d

perf(multimodal): single-pass ffmpeg frame extraction and Drop guard
* perf(multimodal): single-pass ffmpeg extraction, resolution/duration caps, and TempFile guard

Replace the per-frame ffmpeg fork in load_video with a single invocation that pipes all sampled frames as a concatenated PNG sequence to stdout (-f image2pipe -vcodec png). The PNG frame boundaries are located by scanning for the 12-byte IEND chunk terminator, allowing the stream to be split into individual frames without a full recursive PNG parser. Each decoded DynamicImage is pushed into the result vector before the next frame is decoded, so peak memory during extraction is bounded by approximately one frame at a time.

Before decoding, probe_video now reads width, height, and duration from ffprobe and enforces two configurable caps:

- MLXCEL_VIDEO_MAX_PIXELS (default 16 777 216 = 4096x4096): rejects source videos whose width x height exceeds the cap, returning VideoError::ResolutionTooLarge.
- MLXCEL_VIDEO_MAX_DURATION_SEC (default 600): rejects videos whose ffprobe-reported duration exceeds the cap, returning VideoError::DurationTooLong.

Add TempFile { path: PathBuf } with a Drop impl that calls fs::remove_file and logs a warning on failure. This provides a panic-safe RAII guard for callers (e.g. server/media.rs) that write HTTP-fetched or base64-inline video to a temp path. The single-pass implementation in video.rs itself writes no temp files, so no guard is needed there.

New tests cover: find_subsequence helper, TempFile drop (normal path, missing file, panic unwind), resolution cap rejection, duration cap rejection, and single-pass frame count and shape correctness. The #[ignore] bench_single_pass_768_frames test is provided for manual wall-time validation.

* fix(multimodal): fail closed on ffprobe missing fields and cap PNG frame size

HIGH: apply_probe_caps now defaults missing width/height to u32::MAX so the pixel cap trips immediately rather than silently treating absent dimensions as 0. Missing duration defaults to f64::INFINITY so the duration cap trips instead of treating the video as 0-second. saturating_mul replaces plain multiplication for the pixel product, preventing a u32::MAX * u32::MAX overflow that would wrap to a small value and bypass the cap. The parsing+enforcement logic is extracted from probe_video into apply_probe_caps so unit tests can exercise it without a real ffprobe invocation.

MEDIUM: split_png_stream now reads the MLXCEL_VIDEO_MAX_PNG_FRAME_BYTES env var (default 256 MiB) and returns VideoError::Extract with a clear message if the per-frame accumulation buffer exceeds the cap before an IEND marker is found.

LOW: load_video rustdoc gains an "# Async warning" block explaining that callers inside a Tokio runtime must wrap the call in spawn_blocking.

Five new unit tests: probe_video_missing_width_trips_resolution_cap, probe_video_missing_height_trips_resolution_cap, probe_video_missing_duration_trips_duration_cap, probe_video_both_dimensions_missing_saturates_not_overflows, split_png_stream_rejects_oversized_frame.
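
The IEND-based splitting described in the commit message can be illustrated with a minimal sketch. It is not the code from this commit: the name find_subsequence comes from the message, but its signature, the split_png_frames function, the String error type, and the in-memory slice input are assumptions made for illustration. It also assumes the 12-byte terminator never occurs by chance inside compressed image data.

```rust
/// The 12 bytes that end every PNG file: a zero-length IEND chunk
/// (4-byte length, 4-byte type, 4-byte CRC).
const IEND_CHUNK: [u8; 12] = [
    0x00, 0x00, 0x00, 0x00, // chunk length = 0
    0x49, 0x45, 0x4E, 0x44, // chunk type "IEND"
    0xAE, 0x42, 0x60, 0x82, // CRC-32 over the chunk type
];

/// Returns the index of the first occurrence of `needle` in `haystack`.
/// (Signature is an assumption; only the helper's name appears in the commit message.)
fn find_subsequence(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() || needle.len() > haystack.len() {
        return None;
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Hypothetical splitter: cuts a concatenated PNG stream into per-frame slices,
/// rejecting any frame longer than `max_frame_bytes`.
fn split_png_frames(stream: &[u8], max_frame_bytes: usize) -> Result<Vec<&[u8]>, String> {
    let mut frames = Vec::new();
    let mut rest = stream;
    while !rest.is_empty() {
        match find_subsequence(rest, &IEND_CHUNK) {
            Some(pos) => {
                // The frame ends right after the IEND chunk.
                let end = pos + IEND_CHUNK.len();
                if end > max_frame_bytes {
                    return Err(format!(
                        "PNG frame of {end} bytes exceeds cap of {max_frame_bytes} bytes"
                    ));
                }
                frames.push(&rest[..end]);
                rest = &rest[end..];
            }
            None => {
                return Err(format!(
                    "{} trailing bytes without an IEND terminator",
                    rest.len()
                ));
            }
        }
    }
    Ok(frames)
}
```

Each returned slice could then be handed to a PNG decoder one at a time. A streaming variant would apply the same size check to its accumulation buffer while reading from the ffmpeg pipe.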
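
The fail-closed behaviour described under HIGH can be sketched as follows. The names apply_probe_caps, VideoError::ResolutionTooLarge, and VideoError::DurationTooLong come from the commit message; the signature, the variant fields, the widening to u64, and passing the caps as parameters rather than reading the MLXCEL_* environment variables are assumptions made to keep the example self-contained.

```rust
#[derive(Debug, PartialEq)]
enum VideoError {
    // Field layouts are hypothetical; only the variant names appear in the commit message.
    ResolutionTooLarge { pixels: u64, max_pixels: u64 },
    DurationTooLong { seconds: f64, max_seconds: f64 },
}

/// Hypothetical signature: enforces the caps on already-parsed ffprobe fields.
/// A missing field is treated as "as large as possible", so the check fails closed.
fn apply_probe_caps(
    width: Option<u32>,
    height: Option<u32>,
    duration_sec: Option<f64>,
    max_pixels: u64,
    max_duration_sec: f64,
) -> Result<(), VideoError> {
    // Absent dimensions default to u32::MAX instead of 0.
    let w = u64::from(width.unwrap_or(u32::MAX));
    let h = u64::from(height.unwrap_or(u32::MAX));
    // saturating_mul clamps at u64::MAX rather than wrapping to a small value.
    let pixels = w.saturating_mul(h);
    if pixels > max_pixels {
        return Err(VideoError::ResolutionTooLarge { pixels, max_pixels });
    }

    // Absent duration defaults to infinity; NaN is rejected as well.
    let seconds = duration_sec.unwrap_or(f64::INFINITY);
    if seconds.is_nan() || seconds > max_duration_sec {
        return Err(VideoError::DurationTooLong {
            seconds,
            max_seconds: max_duration_sec,
        });
    }
    Ok(())
}
```

Because the function takes plain values, it can be exercised in unit tests without running ffprobe, which is the motivation the commit message gives for extracting it.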
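
The TempFile guard described in the commit message can be sketched in a few lines. The struct shape (TempFile { path: PathBuf }) and the call to fs::remove_file are taken from the message; the use of eprintln! as the warning sink is an assumption, since the message does not say which logging facility the project uses.

```rust
use std::fs;
use std::path::PathBuf;

/// RAII guard: removes the file at `path` when the value is dropped,
/// including during panic unwinding.
struct TempFile {
    path: PathBuf,
}

impl Drop for TempFile {
    fn drop(&mut self) {
        // Drop must never panic, so a failed removal is only reported.
        // eprintln! stands in for whatever logger the real code uses (assumption).
        if let Err(err) = fs::remove_file(&self.path) {
            eprintln!(
                "warning: failed to remove temp file {}: {err}",
                self.path.display()
            );
        }
    }
}
```

A caller would write the fetched bytes to the path, bind the guard to a local (`let _guard = TempFile { path };`), and rely on scope exit for cleanup. Binding to a named local matters: `let _ = TempFile { .. }` would drop the guard, and delete the file, immediately.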
1 parent 4ead757 commit 19c064d

2 files changed

Lines changed: 994 additions & 71 deletions
