- Setup project structure
- Add dependencies (OpenCV, nalgebra, bevy, serde)
- Read video frames from file
- Convert frames to grayscale
- Detect ORB features
- Create OrbDetector wrapper (modularized in src/feature/)
- Write basic tests for feature detection
- Match features between frames
- Create FeatureMatcher with BFMatcher
- Filter matches by distance
- Create visualizer example
- Extract matched point coordinates (Point2f)
- Create CameraIntrinsics struct
- Compute essential matrix from point pairs
- Recover pose (R, t) from essential matrix
- Convert OpenCV Mat to nalgebra types
- Build 4x4 transformation matrix
- Initialize global pose tracking
- Update global pose (compose transformations)
- Store trajectory points
- Calculate total distance traveled
- Save trajectory to JSON file
- Add unit tests for pose recovery
- Add example for full visual odometry pipeline with trajectory visualization
- Implement keyframe selection (translation, rotation, match quality criteria)
- Add 3D point triangulation/mapping
- Create MapPoint struct for 3D points
- Implement Triangulator for computing 3D points from 2D correspondences
- Add point cloud export (PLY and JSON formats)
- Create point cloud example with triangulation
- Add real-time 3D visualization with Rerun
- Map management - track points, deduplicate, prune outliers
- Point reobservation - match against existing map points
- Bundle adjustment - local BA for refining poses and points
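The pose-tracking steps above (build a 4x4 transform from R and t, compose it into a global pose, accumulate trajectory distance) can be sketched in plain Rust. This is a minimal, hypothetical stand-in for the project's nalgebra-based types, not the actual implementation:

```rust
/// Minimal 4x4 row-major matrix; a stand-in for nalgebra's Matrix4<f64>.
#[derive(Clone, Copy)]
struct Mat4([[f64; 4]; 4]);

impl Mat4 {
    fn identity() -> Self {
        let mut m = [[0.0; 4]; 4];
        for i in 0..4 { m[i][i] = 1.0; }
        Mat4(m)
    }

    /// Build the homogeneous transform [R | t; 0 0 0 1].
    fn from_rt(r: [[f64; 3]; 3], t: [f64; 3]) -> Self {
        let mut m = Mat4::identity().0;
        for i in 0..3 {
            for j in 0..3 { m[i][j] = r[i][j]; }
            m[i][3] = t[i];
        }
        Mat4(m)
    }

    fn mul(&self, o: &Mat4) -> Mat4 {
        let mut m = [[0.0; 4]; 4];
        for i in 0..4 {
            for j in 0..4 {
                for k in 0..4 { m[i][j] += self.0[i][k] * o.0[k][j]; }
            }
        }
        Mat4(m)
    }

    /// Camera position: the translation column of the global pose.
    fn translation(&self) -> [f64; 3] {
        [self.0[0][3], self.0[1][3], self.0[2][3]]
    }
}

/// Compose per-frame relative motions into a global pose and sum the
/// straight-line distance between consecutive camera positions.
fn total_distance(relative_motions: &[Mat4]) -> f64 {
    let mut pose = Mat4::identity();
    let mut prev = pose.translation();
    let mut dist = 0.0;
    for rel in relative_motions {
        pose = pose.mul(rel); // T_world <- T_world * T_rel
        let p = pose.translation();
        dist += (0..3).map(|i| (p[i] - prev[i]).powi(2)).sum::<f64>().sqrt();
        prev = p;
    }
    dist
}

fn main() {
    let eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]];
    // Two hypothetical 1 m forward steps (+z), no rotation.
    let steps = vec![Mat4::from_rt(eye, [0.0, 0.0, 1.0]); 2];
    println!("distance traveled: {}", total_distance(&steps)); // 2
}
```

Note that in monocular VO the unit of these translations is only defined up to scale, which is exactly the scale-drift limitation noted below.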
- Point cloud is sparse - only ~1000-2000 ORB features per frame
- This is VO, not SLAM - no loop closure or global optimization
- Scale drift - monocular VO has no absolute scale, and scale error accumulates over time
- The bundle adjustment `optimize` function needs real optimization work - dense LU is not the right solver here; porting a sparse approach from COLMAP could help
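The sparse map above is built by triangulating matched 2D features between keyframes (the `Triangulator`/`MapPoint` steps in the checklist). A minimal sketch of the midpoint method with hand-rolled vector math, not the project's actual OpenCV/nalgebra code; in practice a DLT/SVD formulation is more robust:

```rust
type V3 = [f64; 3];

fn dot(a: V3, b: V3) -> f64 { a[0] * b[0] + a[1] * b[1] + a[2] * b[2] }
fn sub(a: V3, b: V3) -> V3 { [a[0] - b[0], a[1] - b[1], a[2] - b[2]] }
fn add_scaled(a: V3, d: V3, s: f64) -> V3 {
    [a[0] + s * d[0], a[1] + s * d[1], a[2] + s * d[2]]
}

/// Midpoint triangulation: given two camera centers and the viewing rays
/// through a matched 2D point in each image (already rotated into world
/// coordinates), return the 3D point halfway between the closest points
/// on the two rays.
fn triangulate_midpoint(c1: V3, d1: V3, c2: V3, d2: V3) -> V3 {
    let w0 = sub(c1, c2);
    let (a, b, c) = (dot(d1, d1), dot(d1, d2), dot(d2, d2));
    let (d, e) = (dot(d1, w0), dot(d2, w0));
    let denom = a * c - b * b; // ~0 when rays are parallel (degenerate case)
    let s = (b * e - c * d) / denom;
    let t = (a * e - b * d) / denom;
    let p1 = add_scaled(c1, d1, s);
    let p2 = add_scaled(c2, d2, t);
    [(p1[0] + p2[0]) / 2.0, (p1[1] + p2[1]) / 2.0, (p1[2] + p2[2]) / 2.0]
}

fn main() {
    // Two cameras 1 m apart on the x axis, both looking down +z.
    // Rays are (u, v, 1) in normalized image coordinates, i.e. pixel
    // coordinates after applying the inverse intrinsics K^-1.
    let p = triangulate_midpoint([0.0; 3], [0.25, 0.0, 1.0],
                                 [1.0, 0.0, 0.0], [-0.25, 0.0, 1.0]);
    println!("triangulated point: {:?}", p); // [0.5, 0.0, 2.0]
}
```

The near-zero `denom` case is one of the degenerate situations (pure rotation, tiny baseline) flagged in the roadmap below.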
- Visual odometry (camera tracking)
- Sparse 3D reconstruction (triangulation)
- Real-time visualization with Rerun - see what's being mapped!
- Map management - track which points exist, deduplicate, prune outliers
- Point reobservation - match against existing map points, not just previous frame
- Local bundle adjustment - optimize sliding window of keyframes and points
- Monocular Depth Estimation - MonoDepth2 integration with tch-rs
- Local mapping - maintain sliding window of recent keyframes and points with BA integration
- Increase point density
- Use SIFT/SURF (more features than ORB)
- Semi-dense tracking (high gradient pixels, not just corners)
- Depth map estimation between keyframes
- Depth filtering - probabilistic depth estimation for each pixel
- Depth fusion - merge depth estimates from multiple views
- Place recognition - DBoW2/DBoW3 for detecting revisited locations
- Loop closure detection - geometric verification of loop candidates
- Pose graph optimization - correct drift when loop is detected
- Global bundle adjustment - optimize all poses and points together (expand current local BA)
- Relocalization - recover from tracking loss
- Map saving/loading - persist maps between runs
- Multi-threading - separate tracking, local mapping, loop closing threads
- IMU integration - use IMU for better tracking (VI-SLAM)
- Camera calibration module - estimate intrinsics from video
- Support stereo cameras (true scale from stereo baseline)
- Support RGB-D cameras (direct depth from sensor)
- Object detection integration
- Semantic SLAM (label 3D points with objects)
- Neural depth estimation (monocular depth networks)
- Handle degenerate cases (insufficient matches, pure rotation, etc.)
- Better error handling throughout
- More comprehensive tests
- Benchmark on KITTI dataset with ground truth comparison
- GPU acceleration (feature detection, matching)
- Add more camera presets
- Want to contribute? Please submit a pull request, or add a TODO here with a ticket.
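Several roadmap items (local mapping, sliding-window BA) hinge on the keyframe selection already in the pipeline: a frame is promoted when the camera has translated or rotated enough, or when match quality against the last keyframe degrades. A hedged sketch with hypothetical threshold values, not the project's actual criteria:

```rust
/// Hypothetical keyframe-selection gate based on the three criteria used
/// in the pipeline: translation, rotation, and match quality.
struct KeyframeCriteria {
    min_translation: f64, // metres, up to monocular scale
    min_rotation: f64,    // radians
    min_match_ratio: f64, // matches vs. last keyframe, as a fraction
}

impl KeyframeCriteria {
    /// Promote the current frame to a keyframe when any criterion fires.
    fn is_keyframe(&self, translation: f64, rotation: f64, match_ratio: f64) -> bool {
        translation > self.min_translation
            || rotation > self.min_rotation
            || match_ratio < self.min_match_ratio
    }
}

fn main() {
    let crit = KeyframeCriteria {
        min_translation: 0.3,
        min_rotation: 0.15,
        min_match_ratio: 0.5,
    };
    // Small motion, healthy tracking: not a keyframe.
    println!("{}", crit.is_keyframe(0.05, 0.01, 0.9)); // false
    // Match ratio collapsed: promote to keyframe before tracking is lost.
    println!("{}", crit.is_keyframe(0.05, 0.01, 0.4)); // true
}
```

Tuning these thresholds trades map density against BA cost: lower thresholds mean more keyframes and points, but a larger optimization window.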
Run the feature visualizer:

```sh
cargo run --example visualize_features /path/to/video.mp4
```

Run full visual odometry with trajectory:

```sh
# Use default KITTI intrinsics
cargo run --example visual_odometry /path/to/video.mp4

# Specify custom camera intrinsics
cargo run --example visual_odometry /path/to/video.mp4 -- --fx 500 --fy 500 --cx 320 --cy 240
```

Run point cloud generation with triangulation:

```sh
# With Rerun 3D viewer (shows map, trajectory, matches, video in real-time!)
cargo run --example point_cloud --features rerun /path/to/video.mp4 -- --rerun

# Or save to PLY file (default, no Rerun)
cargo run --example point_cloud /path/to/video.mp4 -- --save-ply

# With custom camera intrinsics
cargo run --example point_cloud --features rerun /path/to/video.mp4 -- --rerun --fx 718.856 --fy 718.856 --cx 607.1928 --cy 185.2157
```

Run the bundle adjustment demo:

```sh
cargo run --example bundle_adjustment
```

Run depth estimation:

```sh
# Single image
cargo run --example depth_estimation --features depth -- test.jpg --encoder weights/encoder.pt --decoder weights/depth.pt

# Video with Rerun visualization
cargo run --example depth_estimation --features depth,rerun -- test.mp4 --cuda --rerun

# Video with OpenCV (no Rerun)
cargo run --example depth_estimation --features depth -- test.mp4 --cuda --save
```

See docs/Deep-Learning.md for model installation and setup instructions.

Run tests:

```sh
cargo test
```

Run the main binary:

```sh
cargo run -- /path/to/video.mp4
```