ahmeddossamaa/ReVision-RT

ReVision-RT

Real-time, content-aware video retargeting in C++17. Reduces the width of a live camera feed or video while preserving visually important content using seam carving — removing connected low-energy pixel paths rather than cropping or scaling.

The energy map driving seam selection fuses four signals:

  • Sobel gradient — edges and texture
  • Depth Anything V2 (via ONNX Runtime) — foreground/background separation
  • U2Net (via ONNX Runtime) — salient object detection
  • Spectral residual saliency (OpenCV) — fast attention map
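
As a rough illustration of how four per-pixel signals can be combined into one energy map, the sketch below rescales each map to [0, 1] and takes a weighted sum. The helper names (`normalize`, `fuseEnergy`) and the fixed weights are assumptions for illustration only; the project's EnergyFusion stage adapts its weights per signal and does more than this.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rescale a map to [0, 1]; a constant map becomes all zeros.
std::vector<float> normalize(const std::vector<float>& m) {
    if (m.empty()) return {};
    const auto [lo, hi] = std::minmax_element(m.begin(), m.end());
    const float minV = *lo;
    const float range = *hi - *lo;
    std::vector<float> out(m.size(), 0.0f);
    if (range > 0.0f)
        for (std::size_t i = 0; i < m.size(); ++i) out[i] = (m[i] - minV) / range;
    return out;
}

// Hypothetical fusion: weighted sum of four normalized maps
// (gradient, depth, U2Net saliency, spectral residual), stored row-major.
std::vector<float> fuseEnergy(const std::array<std::vector<float>, 4>& maps,
                              const std::array<float, 4>& weights) {
    const std::size_t n = maps[0].size();
    std::vector<float> energy(n, 0.0f);
    for (std::size_t k = 0; k < maps.size(); ++k) {
        assert(maps[k].size() == n);  // all maps must share one resolution
        const std::vector<float> norm = normalize(maps[k]);
        for (std::size_t i = 0; i < n; ++i) energy[i] += weights[k] * norm[i];
    }
    return energy;
}
```

Normalizing first keeps a signal with a large numeric range (for example raw depth) from dominating the sum regardless of its weight.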

Motion is tracked per-pixel using optical flow with acceleration scoring. Pixels that moved recently resist removal and are queued for re-validation after a seam cut.
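
One way the motion signal could raise a pixel's resistance to removal is sketched below. The parameter names mirror keys in config.ini (`importance_alpha`, `importance_beta`, `motion_boost`, `movement_threshold`), but the blend formula, the acceleration measure, and the function names are assumptions, not the project's confirmed implementation.

```cpp
#include <cassert>
#include <cmath>

// Field names mirror config.ini; how the project combines them is an assumption.
struct ImportanceParams {
    float alpha = 0.7f;               // importance_alpha: gradient weight
    float beta = 0.3f;                // importance_beta: motion weight
    float motionBoost = 0.2f;         // motion_boost: bonus for recently moved pixels
    float movementThreshold = 10.0f;  // movement_threshold
};

// Hypothetical acceleration score: magnitude of the change in the flow vector
// between two consecutive frames.
float accelerationScore(float prevDx, float prevDy, float dx, float dy) {
    return std::hypot(dx - prevDx, dy - prevDy);
}

// Hypothetical blend: gradient and motion are expected in [0, 1].
float pixelImportance(float gradient, float motion, float acceleration,
                      const ImportanceParams& p) {
    assert(gradient >= 0.0f && gradient <= 1.0f);
    assert(motion >= 0.0f && motion <= 1.0f);
    float importance = p.alpha * gradient + p.beta * motion;
    if (acceleration > p.movementThreshold) importance += p.motionBoost;
    return importance;
}
```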


Prerequisites

| Requirement | Version |
|---|---|
| C++ compiler | C++17 (GCC 9+ or Clang 10+) |
| CMake | 3.16+ |
| OpenCV | 4.x with opencv_contrib (saliency module required) |
| ONNX Runtime | 1.16+ |

OpenCV with contrib (Ubuntu)

sudo apt install libopencv-dev libopencv-contrib-dev

ONNX Runtime

Download a pre-built release from the ONNX Runtime releases page and install to /usr/local:

tar -xzf onnxruntime-linux-x64-*.tgz
sudo cp -r onnxruntime-linux-x64-*/include/* /usr/local/include/
sudo cp -r onnxruntime-linux-x64-*/lib/*     /usr/local/lib/
sudo ldconfig

If you install to a different prefix, update ONNXRUNTIME_ROOT_DIR in CMakeLists.txt.


ONNX Models

Two models are required and must be placed at:

data/models/depth_anything_v2_vits.onnx
data/models/u2net.onnx

| Model | Source |
|---|---|
| Depth Anything V2 (ViT-S) | Depth-Anything-V2 repository; export to ONNX |
| U2Net | U-2-Net repository; download ONNX release |

seamCarvingApp, cameraTest, and videoTest use the Sobel + spectral residual path and do not require models at runtime. Models are only needed for depthTest, fusionTest, and batchProcessor.


Build

mkdir build && cd build
cmake ..
make -j$(nproc)

All binaries land in build/.


Usage

Main application — live camera

cd build
./seamCarvingApp

Opens your front camera and progressively removes seams until the configured target width is reached. Displays original and carved feeds side by side. Press q or ESC to quit.

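config.ini is loaded from the working directory at startup. Because the commands above run from build/, copy config.ini into build/ first, or launch the binary from the project root instead; otherwise the built-in defaults apply.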

Camera test

./cameraTest [width_ratio]
# width_ratio: fraction of original width to keep (0.1–1.0, default 0.7)

./cameraTest 0.5   # keep 50% of width
./cameraTest 0.8   # keep 80%
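
The relationship between width_ratio and the amount of carving can be made concrete with a little arithmetic. The helper below is a hypothetical sketch (its name and the rounding rule are assumptions): it converts a keep ratio into the number of vertical seams to remove.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: number of vertical seams to remove for a given keep ratio.
int seamsToRemove(int frameWidth, double widthRatio) {
    assert(frameWidth > 0);
    assert(widthRatio >= 0.1 && widthRatio <= 1.0);  // documented range
    const int targetWidth = static_cast<int>(std::lround(frameWidth * widthRatio));
    return frameWidth - targetWidth;
}
```

For a 640-pixel-wide frame, the default ratio of 0.7 gives a target width of 448 and therefore 192 seams.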

Video test

./videoTest [video_path] [width_ratio]
# Auto-searches ../data/ if no path given. Default ratio: 0.5

./videoTest                           # auto-find, 50% width
./videoTest 0.3                       # auto-find, keep 30%
./videoTest ../data/clip.mp4 0.6      # specific file, keep 60%
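
The examples above imply that a lone numeric argument is read as the ratio rather than as a path. A minimal sketch of that disambiguation follows; `VideoArgs`, `isNumber`, and `parseVideoArgs` are hypothetical names, and the actual parsing in videoTest may differ.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

struct VideoArgs {
    std::string path;    // empty means "auto-search"
    double ratio = 0.5;  // documented default
};

// True if the whole string parses as a number.
bool isNumber(const std::string& s) {
    if (s.empty()) return false;
    char* end = nullptr;
    std::strtod(s.c_str(), &end);
    return end == s.c_str() + s.size();
}

// Hypothetical parser; args excludes the program name.
VideoArgs parseVideoArgs(const std::vector<std::string>& args) {
    assert(args.size() <= 2);
    VideoArgs out;
    if (args.size() == 1 && isNumber(args[0])) {
        out.ratio = std::stod(args[0]);  // lone number: ratio only
    } else {
        if (args.size() >= 1) out.path = args[0];
        if (args.size() == 2) out.ratio = std::stod(args[1]);
    }
    return out;
}
```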

Batch processor

Processes a video offline and writes four output streams to output/:

| File | Contents |
|---|---|
| <name>_result.mp4 | Seam-carved video at target width |
| <name>_original.mp4 | Original frames unchanged |
| <name>_revalidated.mp4 | Per-frame pixels flagged for re-validation |
| <name>_energy.mp4 | Per-frame importance map (HOT colormap) |

./batchProcessor <video_path> [width_ratio]

./batchProcessor ../data/clip.mp4          # default 50%
./batchProcessor ../data/clip.mp4 0.75

Depth test (requires model)

Runs Depth Anything V2 on a single image, displays and saves the depth map.

./depthTest <image_path> [model_path]
# Default model path: ../data/models/depth_anything_v2_vits.onnx

./depthTest photo.jpg

Outputs: depth_grayscale.jpg, depth_colored.jpg

Fusion test (requires both models)

Runs the full energy fusion pipeline and displays a 2×3 grid of intermediate maps.

./fusionTest <image_path> [depth_model_path]

./fusionTest photo.jpg

Outputs: fusion_depth.jpg, fusion_u2net.jpg, fusion_gradient.jpg, fusion_result.jpg, fusion_grid.jpg


Configuration

config.ini is read by seamCarvingApp at startup (other test executables use compile-time defaults). The file is optional — if absent, defaults apply.

[frequencies]
display_fps=60.0
motion_fps=30.0
seam_carving_fps=10.0
optical_flow_fps=4.0
validator_fps=10.0

[algorithm]
decay_min=0.8
decay_max=0.95
decay_beta=2.0
movement_threshold=10.0
importance_alpha=0.7       # gradient weight in importance blend
importance_beta=0.3        # motion weight in importance blend
motion_boost=0.2           # extra importance given to recently-moved pixels

[performance]
cache_size_kb=512
batch_size=32
use_simd=true
thread_pool_size=4

[debug]
enable_logging=false
log_level=DEBUG            # DEBUG, INFO, WARN, ERROR
show_seam_paths=false
show_motion_vectors=false
show_importance_map=false
show_valid_pixels=false
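
For readers who want to load a file in this shape from their own tools, here is a minimal sketch of a parser for the subset of INI syntax shown above: section headers, key=value pairs, and `#` comments, including trailing ones. `IniData`, `trim`, `parseIni`, and `getFloat` are hypothetical names; the project's own loader is not shown in this README and may behave differently.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

using IniData = std::map<std::string, std::map<std::string, std::string>>;

// Strip leading and trailing whitespace.
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r\n");
    return s.substr(first, last - first + 1);
}

// Hypothetical parser: [section] headers, key=value pairs, '#' comments.
IniData parseIni(std::istream& in) {
    IniData data;
    std::string line, section;
    while (std::getline(in, line)) {
        line = trim(line.substr(0, line.find('#')));  // drop any comment
        if (line.empty()) continue;
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;  // skip malformed lines
        data[section][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return data;
}

// Look up a float, falling back to a default when the key is absent.
float getFloat(const IniData& d, const std::string& sec,
               const std::string& key, float def) {
    assert(!key.empty());
    const auto s = d.find(sec);
    if (s == d.end()) return def;
    const auto k = s->second.find(key);
    return k == s->second.end() ? def : std::stof(k->second);
}
```

Returning a default for missing keys matches the documented behavior that the file is optional.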

Architecture

Threading model

Five threads communicate through SharedData with mutex-protected access:

| Thread | Role |
|---|---|
| Display | Renders output windows at display_fps |
| Carver | Removes one seam per tick; transitions from INITIAL_CARVING to MAINTENANCE once target width is reached |
| Motion | Computes optical flow, tracks per-pixel acceleration, pushes high-motion pixels onto the re-validation queue |
| Retarget | Manages width targets and state transitions |
| Validator | Drains the re-validation queue |
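
The hand-off between the Motion and Validator threads can be pictured as a mutex-protected producer/consumer queue. The sketch below is a simplified stand-in, not the project's SharedData: `PixelPos`, `RevalidationQueue`, and `runMotionAndValidator` are hypothetical names, and only two of the five threads are modeled.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <thread>
#include <vector>

struct PixelPos { int row; int col; };

// Hypothetical stand-in for the re-validation queue inside SharedData.
class RevalidationQueue {
public:
    void push(PixelPos p) {
        {
            std::lock_guard<std::mutex> lock(m_);
            assert(!closed_);  // pushing after close() is a logic error
            q_.push_back(p);
        }
        cv_.notify_one();
    }
    // Blocks until an item is available or close() has been called.
    std::optional<PixelPos> pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return std::nullopt;
        PixelPos p = q_.front();
        q_.pop_front();
        return p;
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<PixelPos> q_;
    bool closed_ = false;
};

// Runs one producer (Motion role) and one consumer (Validator role);
// returns how many pixels the consumer drained.
int runMotionAndValidator(const std::vector<PixelPos>& moved) {
    RevalidationQueue queue;
    int drained = 0;
    std::thread validator([&] { while (queue.pop()) ++drained; });
    std::thread motion([&] {
        for (const PixelPos& p : moved) queue.push(p);
        queue.close();
    });
    motion.join();
    validator.join();
    return drained;
}
```

Waiting on a condition variable lets the consumer sleep while the queue is empty instead of polling under the lock.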

Energy pipeline

Input frame
    ├── Sobel gradient
    ├── Depth Anything V2 (ONNX) ──► EnergyFusion
    └── U2Net saliency (ONNX)           │
                                 Quality assessment
                                 Content-type classification
                                 Adaptive per-signal weighting
                                 4-level multi-scale pyramid fusion
                                 Post-processing
                                        │
                                   Energy map
                                        │
                            Forward-energy seam carving
                            (3-direction DP, bottom-up)
                                        │
                                  Seam removal

Pixel positions are tracked through a left-packed coordinate map, keeping original-to-carved-space mapping consistent across threads. Seam selection strategy is pluggable (BasicSeamSelectionStrategy / RandomSeamSelectionStrategy).
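
A left-packed coordinate map can be sketched as one array per row that records, for each carved-space column, which original column it came from. The class below is a hypothetical illustration (`CoordinateMap` and its methods are not taken from the project's sources): removing a seam erases one entry per row, and everything to its right shifts left.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical left-packed map: map_[r][c] is the original column of the pixel
// currently shown at carved column c in row r.
class CoordinateMap {
public:
    CoordinateMap(int rows, int cols) {
        assert(rows > 0 && cols > 0);
        map_.assign(rows, std::vector<int>(cols));
        for (auto& row : map_) std::iota(row.begin(), row.end(), 0);
    }
    // seam[r] is the carved-space column removed from row r.
    void removeSeam(const std::vector<int>& seam) {
        assert(seam.size() == map_.size());
        for (std::size_t r = 0; r < map_.size(); ++r) {
            assert(seam[r] >= 0 && seam[r] < static_cast<int>(map_[r].size()));
            map_[r].erase(map_[r].begin() + seam[r]);  // later entries shift left
        }
    }
    int originalColumn(int row, int carvedCol) const { return map_[row][carvedCol]; }
    int width() const { return static_cast<int>(map_[0].size()); }
private:
    std::vector<std::vector<int>> map_;
};
```

With such a map, a thread holding a carved-space coordinate can recover the original pixel position after any number of seam removals.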


File structure

├── CMakeLists.txt
├── config.ini
├── data/
│   ├── models/
│   │   ├── depth_anything_v2_vits.onnx   # not tracked in git
│   │   └── u2net.onnx                    # not tracked in git
│   └── *.mp4 / *.avi / ...
├── src/
│   ├── algorithms/
│   │   ├── depthEstimator.{hpp,cpp}
│   │   ├── u2NetEstimator.{hpp,cpp}
│   │   ├── saliencyEstimator.{hpp,cpp}
│   │   ├── sobelEstimator.{hpp,cpp}
│   │   ├── energyFusion.{hpp,cpp}
│   │   ├── motionFusion.{hpp,cpp}
│   │   ├── seamCarving.{hpp,cpp}
│   │   └── strategies/seamSelection/
│   ├── core/
│   │   ├── data.{hpp,cpp}
│   │   └── pixelCore.hpp
│   ├── output/
│   │   └── displayManager.{hpp,cpp}
│   ├── threads/
│   │   ├── carver.{hpp,cpp}
│   │   ├── display.{hpp,cpp}
│   │   ├── motion.{hpp,cpp}
│   │   ├── retarget.{hpp,cpp}
│   │   └── validator.{hpp,cpp}
│   └── utils/
└── tests/
    ├── cameraTest.cpp
    ├── videoTest.cpp
    ├── batchProcessor.cpp
    ├── depthTest.cpp
    ├── fusionTest.cpp
    └── visualizer.cpp

Troubleshooting

| Problem | Fix |
|---|---|
| No camera detected | Check permissions; verify ls /dev/video* |
| Video not found | videoTest searches ../data/ relative to build/; use an explicit path |
| onnxruntime not found at build | Verify /usr/local/include/onnxruntime_cxx_api.h exists and run sudo ldconfig |
| Model load failure at runtime | Confirm both .onnx files are under data/models/ in the project root |
| OpenCV saliency missing | Install libopencv-contrib-dev or rebuild OpenCV with opencv_contrib |
| Dropped frames / high CPU | Lower motion_fps and seam_carving_fps in config.ini |
| No debug output | Set enable_logging=true and log_level=DEBUG in config.ini; run from project root |
