VizFlyt2 (WIP)

A flexible perception, dynamics, and planning system for rendering photorealistic synthetic sensor data using Gaussian Splatting. VizFlyt2 provides mono and stereo camera rendering with support for custom perception modules, simple point-mass dynamics, trajectory planning, and vision-based obstacle avoidance.

🌟 Features

Perception

  • High-Fidelity Rendering: Photorealistic RGB and depth rendering using Gaussian Splatting (via Nerfstudio)
  • Stereo Camera Support: Configurable stereo baseline for binocular vision
  • Module Composition: Compose perception modules using the + operator to create complex pipelines
  • Decorator Factories: Create custom vision modules from simple functions
  • Multiple Modalities: Mono, stereo, optical flow, event cameras, and custom modules
  • Batch Processing: Efficient rendering of entire trajectories

Dynamics

  • Point-Mass Model: Simple dynamics with position, velocity, orientation, angular velocity
  • Two Control Modes: Velocity (kinematic) or acceleration (optional gravity)
  • 105 Lines of Code: No mass, inertia, forces, or torques - just the basics
  • Easy Integration: Couples with planning for trajectory execution
  • NED Frame Support: Uses North-East-Down coordinate system for aerospace applications

Planning

  • Trajectory Planning: Five primitives (line, circle, figure-8, spiral, waypoints)
  • Vision-Based Planning: Reactive obstacle avoidance using potential fields
  • Unified Interface: All planners implement compute_action(**kwargs) → velocity
  • Runtime Switching: Easy to switch between planning strategies
  • Extensible: Support for RL agents, hybrid planners, custom strategies
  • Depth Image Processing: Automatic free space detection and navigation
  • Simple API: Use primitives directly or via planner classes
  • Direct Integration: Works seamlessly with dynamics and perception modules
  • ~1040 Lines of Code: Lightweight, unified, and extensible architecture

πŸ“‹ Table of Contents

πŸš€ Installation

Prerequisites

  • Python 3.8+
  • CUDA-capable GPU (recommended)
  • Nerfstudio

Setup

Option 1: Install as Package (Recommended)

# Clone the repository
git clone https://github.com/pearwpi/VizFlyt2.git
cd VizFlyt2

# Install in editable mode with core dependencies
pip install -e .

# Optional: Install visualization dependencies
pip install -e ".[viz]"

# Optional: Install nerfstudio for perception (follow their guide)
pip install nerfstudio

Option 2: Manual Installation

# Clone the repository
git clone https://github.com/pearwpi/VizFlyt2.git
cd VizFlyt2

# Install dependencies
pip install numpy opencv-python torch transforms3d scipy matplotlib
pip install nerfstudio  # Optional, for perception

Prepare Gaussian Splatting Model (Optional, for Perception)

  • Train a splatfacto model using Nerfstudio
  • Note the path to your config.yml file
  • Prepare camera settings JSON file

🎯 Quick Start

Complete Integration (Perception + Planning + Dynamics)

from planning import PotentialFieldPlanner
from dynamics import PointMassDynamics
import numpy as np

# Setup planner (depth → velocity)
planner = PotentialFieldPlanner(step_size=2.0)

# Setup dynamics
dynamics = PointMassDynamics(
    initial_state={
        'position': np.array([0., 0., -50.]),
        'velocity': np.array([0., 0., 0.]),
        'orientation_rpy': np.array([0., 0., 0.])
    },
    control_mode='velocity'
)

# Main loop
for step in range(1000):
    # 1. Get depth from camera/renderer
    depth = get_depth_image()  # From SplatRenderer or camera
    
    # 2. Compute velocity command
    action = planner.compute_action(depth_image=depth)
    
    # 3. Execute with dynamics
    dynamics.step({'velocity': action['velocity']}, dt=0.1)
    planner.step()

See planning/example_integration.py for a complete working example.

Mono Camera Rendering

from perception.splat_render import SplatRenderer
import numpy as np

# Initialize renderer
renderer = SplatRenderer(
    config_path="path/to/config.yml",
    json_path="cam_settings.json"
)

# Define pose (NED coordinates)
position = np.array([0.0, 0.0, 0.0])  # x, y, z in meters
orientation_rpy = np.array([0.0, 0.0, 0.0])  # roll, pitch, yaw in radians

# Render
results = renderer.render(position, orientation_rpy)

# Access outputs
rgb_image = results['rgb']      # (H, W, 3) uint8 BGR
depth_map = results['depth']    # (H, W) float32

Stereo Camera Rendering

from perception.stereo_camera import StereoCamera
import numpy as np

# Initialize stereo camera with 6.5cm baseline
stereo = StereoCamera(
    config_path="path/to/config.yml",
    json_path="cam_settings.json",
    baseline=0.065  # meters
)

# Render stereo pair
results = stereo.render(position, orientation_rpy)

# Access outputs
rgb_left = results['rgb_left']       # (H, W, 3) uint8 BGR
rgb_right = results['rgb_right']     # (H, W, 3) uint8 BGR
depth_left = results['depth_left']   # (H, W) float32
depth_right = results['depth_right'] # (H, W) float32

Module Composition

Compose modules using the + operator:

from perception.splat_render import SplatRenderer
from perception.modules import rgb_vision_module_factory
import cv2

# Create renderer
renderer = SplatRenderer("config.yml", "cam_settings.json")

# Create custom module with decorator
@rgb_vision_module_factory
def edge_detector(rgb_image):
    gray = cv2.cvtColor(rgb_image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    return {'edges': edges}

# Compose with + operator
pipeline = renderer + edge_detector

# Render
results = pipeline.render(position, orientation_rpy)
# Access: results['rgb'], results['depth'], results['edges']

πŸ›©οΈ Dynamics

Super simple point-mass dynamics for trajectory generation:

import numpy as np
from dynamics import PointMassDynamics

# Velocity mode - direct control
dynamics = PointMassDynamics(
    initial_state={
        'position': np.array([0., 0., -50.]),
        'velocity': np.array([10., 0., 0.]),
        'orientation_rpy': np.array([0., 0., 0.])
    },
    control_mode='velocity'
)

# Simulation loop
dt = 0.01
for i in range(1000):
    dynamics.step({'velocity': np.array([10., 5., 0.])}, dt)
    position, orientation = dynamics.get_render_params()

Two modes:

  • velocity: Set velocity directly (kinematic, no physics)
  • acceleration: Set acceleration (optional gravity)
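
Acceleration mode follows the same pattern; a minimal sketch, assuming the control dict key is 'acceleration' (mirroring the 'velocity' key above) - see dynamics/point_mass.py for the actual interface:

dynamics = PointMassDynamics(
    initial_state={
        'position': np.array([0., 0., -50.]),
        'velocity': np.array([0., 0., 0.]),
        'orientation_rpy': np.array([0., 0., 0.])
    },
    control_mode='acceleration'
)

dt = 0.01
for i in range(1000):
    # Constant forward acceleration command; gravity handling is model-dependent
    dynamics.step({'acceleration': np.array([1., 0., 0.])}, dt)
    position, orientation = dynamics.get_render_params()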

See dynamics/README.md for examples.

πŸ›€οΈ Planning

πŸ›€οΈ Planning

All planners use a unified interface: compute_action(**kwargs) → {'velocity': ...}

Reactive Planning (Vision-Based)

from planning import PotentialFieldPlanner

planner = PotentialFieldPlanner(step_size=2.0, verbose=False)

for step in range(1000):
    depth = get_depth_image()
    action = planner.compute_action(depth_image=depth)
    
    dynamics.step({'velocity': action['velocity']}, dt=0.1)
    planner.step()

Trajectory Planning

from planning import TrajectoryPlanner
import numpy as np

planner = TrajectoryPlanner(dt=0.01)
planner.plan_circle(np.array([0., 0., -50.]), radius=20., duration=10.)

planner.reset()
while not planner.is_complete():
    action = planner.compute_action()
    dynamics.step({'velocity': action['velocity']}, dt=0.01)
    planner.step()

Primitives: line, circle, figure-8, spiral, waypoints
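
For example, a waypoint trajectory; the plan_waypoints name and signature below are assumptions by analogy with plan_circle, so check planning/primitives.py for the actual API:

import numpy as np
from planning import TrajectoryPlanner

# Hypothetical waypoint example - method name assumed by analogy with plan_circle
waypoints = np.array([
    [0., 0., -50.],
    [20., 0., -50.],
    [20., 20., -45.],
])
planner = TrajectoryPlanner(dt=0.01)
planner.plan_waypoints(waypoints, duration=15.)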

Custom Planners (e.g., RL Agents)

from planning import BasePlanner

class RLPlanner(BasePlanner):
    def compute_action(self, observation=None, **kwargs):
        action = self.model.predict(observation)
        return {'velocity': action[:3]}

See planning/README.md for details.

πŸ“š Examples

VizFlyt2 includes focused example scripts to help you get started:

1. basic_examples.py - Start Here

Simple examples for mono, stereo, and basic composition.

cd perception
python basic_examples.py

2. custom_module_example.py - Custom Modules

Learn TWO ways to create custom modules:

  • Decorator factories (@rgb_vision_module_factory) - Quick & simple
  • Subclassing (class MyModule(VisionModule)) - Full control

python custom_module_example.py

3. composition_examples.py - Advanced Pipelines

Complex multi-module compositions with interactive menu:

  • Noise + Optical Flow
  • Stereo + Event Cameras
  • Multi-stage pipelines (noise → blur → edges)

python composition_examples.py

4. stereo_composition_example.py - Stereo Compositions

How vision modules work with stereo cameras (they run independently on the left and right views).

python stereo_composition_example.py

See perception/EXAMPLES.md for a detailed guide.

πŸ—οΈ Module Architecture

VizFlyt2 uses a hierarchical module system with two types:

BaseModule

Renders RGB and depth directly from the Gaussian Splat scene (first-stage):

  • SplatRenderer: Single camera RGB + depth
  • StereoCamera: Stereo camera RGB + depth

VisionModule

Processes RGB/depth to produce derived outputs (second-stage):

  • EventCamera: Simulates DVS events from intensity changes
  • OpticalFlow: Computes motion between frames
  • SnowModule: Adds weather effects
  • Custom modules: Create via decorators or subclassing
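
A minimal composition sketch with the built-in modules; the import path (perception.modules) and the default constructors are assumptions here:

from perception.splat_render import SplatRenderer
from perception.modules import OpticalFlow, EventCamera  # import path assumed

renderer = SplatRenderer("config.yml", "cam_settings.json")
pipeline = renderer + OpticalFlow() + EventCamera()  # default constructors assumed
results = pipeline.render(position, orientation_rpy)
# Access: results['rgb'], results['depth'], plus the flow and event outputs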

🎨 Custom Modules

Quick Way: Decorator Factories

For simple, stateless processing:

from perception.modules import rgb_vision_module_factory, vision_module_factory
import cv2
import numpy as np

# RGB-only processing (automatically extracts rgb from kwargs)
@rgb_vision_module_factory
def brightness_boost(rgb_image):
    return {'rgb': cv2.convertScaleAbs(rgb_image, alpha=1.3, beta=20)}

# Generic processing (access all kwargs)
@vision_module_factory
def depth_overlay(position, orientation_rpy, **kwargs):
    rgb = kwargs.get('rgb')
    depth = kwargs.get('depth')
    if rgb is None or depth is None:
        return {}
    
    # Blend depth visualization with RGB
    depth_colored = cv2.applyColorMap(
        cv2.normalize(depth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8),
        cv2.COLORMAP_JET
    )
    blended = cv2.addWeighted(rgb, 0.7, depth_colored, 0.3, 0)
    return {'blended': blended}

# Use in composition
pipeline = renderer + brightness_boost + depth_overlay

Full Control: Subclassing

For modules with state or configuration:

from perception.modules import VisionModule
import cv2

class EdgeDetector(VisionModule):
    def __init__(self, low_threshold=50, high_threshold=150):
        self.low_threshold = low_threshold
        self.high_threshold = high_threshold
    
    def render(self, position, orientation_rpy, rgb=None, **kwargs):
        if rgb is None:
            return {}
        
        gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, self.low_threshold, self.high_threshold)
        
        return {'edges': edges, 'rgb': rgb}

# Use in composition
edge_detector = EdgeDetector(low_threshold=100, high_threshold=200)
pipeline = renderer + edge_detector

When to use each:

  • Decorators: Simple functions, quick prototyping, stateless processing
  • Subclassing: Need initialization parameters, stateful processing, complex logic

πŸ”— Module Composition

Compose modules using the + operator to create pipelines:

from perception.splat_render import SplatRenderer
from perception.stereo_camera import StereoCamera

# Example 1: Mono pipeline
renderer = SplatRenderer("config.yml", "cam.json")
pipeline = renderer + noise_module + blur_module + edge_detector

# Example 2: Stereo pipeline (vision modules run on both left and right)
stereo = StereoCamera("config.yml", "cam.json", baseline=0.065)
stereo_pipeline = stereo + snow_module + edge_detector

# Results have _left and _right suffixes for vision module outputs
results = stereo_pipeline.render(position, orientation_rpy)
# Access: results['rgb_left'], results['rgb_right']
#         results['edges_left'], results['edges_right']

# Example 3: Chain multiple vision modules
pipeline = renderer + module1 + module2 + module3

How it works:

  1. First module (BaseModule) renders RGB and depth
  2. Each VisionModule receives previous outputs via **kwargs
  3. Outputs flow through the pipeline automatically
  4. For stereo, vision modules run independently on left and right views
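
To make step 2 concrete, here is a minimal sketch of a second-stage module that reads the first stage's 'depth' output from **kwargs:

from perception.modules import vision_module_factory
import numpy as np

# Reads the 'depth' output produced earlier in the pipeline via **kwargs
@vision_module_factory
def depth_stats(position, orientation_rpy, **kwargs):
    depth = kwargs.get('depth')
    if depth is None:
        return {}
    return {'mean_depth': float(np.mean(depth))}

pipeline = renderer + depth_stats
results = pipeline.render(position, orientation_rpy)
# results now also contains 'mean_depth'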

πŸ“ File Structure

VizFlyt2/
β”œβ”€β”€ README.md
β”œβ”€β”€ QUICK_REFERENCE.md           # Quick reference
β”œβ”€β”€ COMPOSITION_GUIDE.md         # Detailed composition guide
β”œβ”€β”€ dynamics/                     # Dynamics models
β”‚   β”œβ”€β”€ README.md                # Dynamics documentation
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ base.py                  # Base dynamics class
β”‚   β”œβ”€β”€ point_mass.py            # Point-mass model (100 lines)
β”‚   β”œβ”€β”€ utils.py                 # Utilities
β”‚   └── example_simple.py        # Example simulation
β”œβ”€β”€ planning/                     # Trajectory planning
β”‚   β”œβ”€β”€ README.md                # Planning documentation
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ primitives.py            # Trajectory primitives
β”‚   β”œβ”€β”€ trajectory.py            # TrajectoryPlanner class
β”‚   β”œβ”€β”€ planner.py               # Vision-based obstacle avoidance
β”‚   β”œβ”€β”€ example_planning.py      # Trajectory examples
β”‚   └── example_vision_planning.py  # Vision-based examples
└── perception/
    β”œβ”€β”€ modules.py               # Base classes + decorator factories
    β”œβ”€β”€ stereo_camera.py         # Stereo camera implementation
    β”œβ”€β”€ splat_render.py          # Gaussian splatting renderer
    β”œβ”€β”€ utils.py                 # Utility functions
    β”œβ”€β”€ EXAMPLES.md              # Example guide
    β”œβ”€β”€ basic_examples.py        # Basic usage (start here!)
    β”œβ”€β”€ custom_module_example.py # Decorators + subclassing
    β”œβ”€β”€ composition_examples.py  # Advanced compositions
    └── stereo_composition_example.py  # Stereo-specific examples

βš™οΈ Configuration

Camera Settings JSON Format

{
  "camera": {
    "c2w_matrix": [[...], [...], [...], [...]],
    "fov_radians": 1.3089969389957472,
    "render_resolution": 1080
  }
}
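
A quick sanity check that loads the settings file and prints these fields; this uses plain json, no VizFlyt2-specific API:

import json

with open("cam_settings.json") as f:
    settings = json.load(f)

cam = settings["camera"]
print("FOV (rad):", cam["fov_radians"])
print("Resolution:", cam["render_resolution"])
print("c2w matrix rows:", len(cam["c2w_matrix"]))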

Trajectory File Format

Text file with one pose per line:

x y z roll pitch yaw
0.0 0.0 0.0 0.0 0.0 0.0
1.0 0.0 0.0 0.0 0.0 0.0
...

  • Position: meters (NED frame)
  • Orientation: radians (roll-pitch-yaw)
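
A minimal loader for this format (the file name trajectory.txt is hypothetical):

import numpy as np

# Each row is 'x y z roll pitch yaw'; add skiprows=1 if the file carries a header row
poses = np.loadtxt("trajectory.txt")
positions = poses[:, :3]         # meters, NED frame
orientations_rpy = poses[:, 3:]  # radians, roll-pitch-yaw

for position, orientation_rpy in zip(positions, orientations_rpy):
    results = renderer.render(position, orientation_rpy)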

🧭 Coordinate Systems

NED Frame (North-East-Down)

  • X: North (forward)
  • Y: East (right)
  • Z: Down

Body Frame

  • X: Forward
  • Y: Right
  • Z: Down

Orientation

  • Roll: Rotation about X-axis (right-wing down is positive)
  • Pitch: Rotation about Y-axis (nose up is positive)
  • Yaw: Rotation about Z-axis (nose right is positive)
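
As a cross-check of this convention, a body-to-NED rotation matrix can be built from roll-pitch-yaw with scipy (already a dependency); this is generic aerospace ZYX math, not a VizFlyt2 API:

import numpy as np
from scipy.spatial.transform import Rotation

roll, pitch, yaw = 0.0, 0.0, np.pi / 2   # radians

# Intrinsic Z-Y-X (yaw, then pitch, then roll) - the standard aerospace order
R_body_to_ned = Rotation.from_euler('ZYX', [yaw, pitch, roll]).as_matrix()

# With yaw = +90 deg, the body X (forward) axis points East in NED
print(R_body_to_ned @ np.array([1., 0., 0.]))   # -> [0, 1, 0]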

πŸ”§ Advanced Usage

Module Composition

🧭 Coordinate Systems

NED Frame (North-East-Down)

  • X: North (forward)
  • Y: East (right)
  • Z: Down

Orientation

  • Roll: Rotation about X-axis (right-wing down is positive)
  • Pitch: Rotation about Y-axis (nose up is positive)
  • Yaw: Rotation about Z-axis (nose right is positive)

πŸ“– Documentation

πŸ“Š Performance

Typical rendering times (RTX 3090):

  • Mono rendering: ~20-40ms per frame
  • Stereo rendering: ~40-80ms per frame

Tips for optimization:

  • Reduce resolution for faster rendering
  • Process trajectories in batches
  • Use simple vision modules when possible

πŸ› Troubleshooting

Issue: "Config file not found"

  • Ensure the config path points to your trained splatfacto model
  • Check that the path is correct relative to your working directory

Issue: Slow rendering

  • Verify GPU is being used (check CUDA availability)
  • Try reducing image resolution in camera settings JSON
  • Check that the model is in eval mode
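
A quick CUDA check from Python (torch is already a dependency):

import torch

print(torch.cuda.is_available())              # should print True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))      # name of the active GPU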

Issue: Incorrect camera orientation

  • Verify NED coordinate system is being used correctly
  • Check that roll-pitch-yaw convention matches (aerospace convention)
  • Test with position [0,0,0] and orientation [0,0,0] first
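
For instance, a minimal identity-pose render to isolate orientation issues (reusing the SplatRenderer from the Quick Start):

import numpy as np
import cv2

# Render from the origin with zero roll/pitch/yaw and save the frame for inspection
results = renderer.render(np.zeros(3), np.zeros(3))
cv2.imwrite("identity_pose.png", results['rgb'])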

🀝 Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

πŸ“„ License

[Add your license information here]

πŸ™ Acknowledgments

  • Built on Nerfstudio
  • Uses Gaussian Splatting for high-quality rendering
  • Developed by WPI PEAR Lab

πŸ“§ Contact

For questions or support, please open an issue on GitHub or contact the maintainers.

πŸ› Troubleshooting

Issue: "Config file not found"

  • Ensure the config path points to your trained splatfacto model
  • Check that the path is correct relative to your working directory

Issue: Slow rendering

  • Verify GPU is being used (check CUDA availability)
  • Try reducing image resolution
  • Check that the model is in eval mode

Issue: Incorrect camera orientation

  • Verify NED coordinate system is being used correctly
  • Check that roll-pitch-yaw convention matches (aerospace convention)

🀝 Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

πŸ“„ License

[Add your license information here]

πŸ™ Acknowledgments

  • Built on Nerfstudio
  • Uses Gaussian Splatting for high-quality rendering
  • Developed by WPI PEAR Lab

πŸ“§ Contact

For questions or support, please open an issue on GitHub or contact the maintainers.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages