A flexible perception, dynamics, and planning system for rendering photorealistic synthetic sensor data using Gaussian Splatting. VizFlyt2 provides mono and stereo camera rendering with support for custom perception modules, simple point-mass dynamics, trajectory planning, and vision-based obstacle avoidance.
- High-Fidelity Rendering: Photorealistic RGB and depth rendering using Gaussian Splatting (via Nerfstudio)
- Stereo Camera Support: Configurable stereo baseline for binocular vision
- Module Composition: Compose perception modules using the + operator to create complex pipelines
- Decorator Factories: Create custom vision modules from simple functions
- Multiple Modalities: Mono, stereo, optical flow, event cameras, and custom modules
- Batch Processing: Efficient rendering of entire trajectories
- Point-Mass Model: Simple dynamics with position, velocity, orientation, angular velocity
- Two Control Modes: Velocity (kinematic) or acceleration (optional gravity)
- 105 Lines of Code: No mass, inertia, forces, or torques - just the basics
- Easy Integration: Couples with planning for trajectory execution
- NED Frame Support: Uses North-East-Down coordinate system for aerospace applications
- Trajectory Planning: Five primitives (line, circle, figure-8, spiral, waypoints)
- Vision-Based Planning: Reactive obstacle avoidance using potential fields
- Unified Interface: All planners implement compute_action(**kwargs) → velocity
- Runtime Switching: Easy to switch between planning strategies
- Extensible: Support for RL agents, hybrid planners, custom strategies
- Depth Image Processing: Automatic free space detection and navigation
- Simple API: Use primitives directly or via planner classes
- Direct Integration: Works seamlessly with dynamics and perception modules
- ~1040 Lines of Code: Lightweight, unified, and extensible architecture
- Installation
- Quick Start
- Dynamics
- Planning
- Examples
- Module Composition
- Custom Modules
- File Structure
- Configuration
- Coordinate Systems
- Documentation
- Python 3.8+
- CUDA-capable GPU (recommended)
- Nerfstudio
# Clone the repository
git clone https://github.com/pearwpi/VizFlyt2.git
cd VizFlyt2
# Install in editable mode with core dependencies
pip install -e .
# Optional: Install visualization dependencies
pip install -e ".[viz]"
# Optional: Install nerfstudio for perception (follow their guide)
pip install nerfstudio

# Alternatively, clone the repository and install dependencies manually
git clone https://github.com/pearwpi/VizFlyt2.git
cd VizFlyt2
# Install dependencies
pip install numpy opencv-python torch transforms3d scipy matplotlib
pip install nerfstudio  # Optional, for perception

Before rendering, prepare a trained scene:
- Train a splatfacto model using Nerfstudio
- Note the path to your config.yml file
- Prepare a camera settings JSON file
from planning import PotentialFieldPlanner
from dynamics import PointMassDynamics
import numpy as np
# Setup planner (depth → velocity)
planner = PotentialFieldPlanner(step_size=2.0)
# Setup dynamics
dynamics = PointMassDynamics(
initial_state={
'position': np.array([0., 0., -50.]),
'velocity': np.array([0., 0., 0.]),
'orientation_rpy': np.array([0., 0., 0.])
},
control_mode='velocity'
)
# Main loop
for step in range(1000):
    # 1. Get depth from camera/renderer
    depth = get_depth_image()  # From SplatRenderer or camera
    # 2. Compute velocity command
    action = planner.compute_action(depth_image=depth)
    # 3. Execute with dynamics
    dynamics.step({'velocity': action['velocity']}, dt=0.1)
    planner.step()

See planning/example_integration.py for a complete working example.
from perception.splat_render import SplatRenderer
import numpy as np
# Initialize renderer
renderer = SplatRenderer(
config_path="path/to/config.yml",
json_path="cam_settings.json"
)
# Define pose (NED coordinates)
position = np.array([0.0, 0.0, 0.0]) # x, y, z in meters
orientation_rpy = np.array([0.0, 0.0, 0.0]) # roll, pitch, yaw in radians
# Render
results = renderer.render(position, orientation_rpy)
# Access outputs
rgb_image = results['rgb'] # (H, W, 3) uint8 BGR
depth_map = results['depth']  # (H, W) float32

from perception.stereo_camera import StereoCamera
import numpy as np
# Initialize stereo camera with 6.5cm baseline
stereo = StereoCamera(
config_path="path/to/config.yml",
json_path="cam_settings.json",
baseline=0.065 # meters
)
# Render stereo pair
results = stereo.render(position, orientation_rpy)
# Access outputs
rgb_left = results['rgb_left'] # (H, W, 3) uint8 BGR
rgb_right = results['rgb_right'] # (H, W, 3) uint8 BGR
depth_left = results['depth_left'] # (H, W) float32
depth_right = results['depth_right']  # (H, W) float32

Compose modules using the + operator:
from perception.splat_render import SplatRenderer
from perception.modules import rgb_vision_module_factory
import cv2
# Create renderer
renderer = SplatRenderer("config.yml", "cam_settings.json")
# Create custom module with decorator
@rgb_vision_module_factory
def edge_detector(rgb_image):
    gray = cv2.cvtColor(rgb_image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    return {'edges': edges}
# Compose with + operator
pipeline = renderer + edge_detector
# Render
results = pipeline.render(position, orientation_rpy)
# Access: results['rgb'], results['depth'], results['edges']

Super simple point-mass dynamics for trajectory generation:
import numpy as np
from dynamics import PointMassDynamics
# Velocity mode - direct control
dynamics = PointMassDynamics(
initial_state={
'position': np.array([0., 0., -50.]),
'velocity': np.array([10., 0., 0.]),
'orientation_rpy': np.array([0., 0., 0.])
},
control_mode='velocity'
)
# Simulation loop
dt = 0.01
for i in range(1000):
    dynamics.step({'velocity': np.array([10., 5., 0.])}, dt)
    position, orientation = dynamics.get_render_params()

Two modes:
- velocity: Set velocity directly (kinematic, no physics)
- acceleration: Set acceleration (optional gravity)
See dynamics/README.md for examples.
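For reference, here is a minimal acceleration-mode sketch. It assumes the control dict key is 'acceleration' (by analogy with the 'velocity' key above); check dynamics/README.md for the exact keys and how gravity is enabled.

import numpy as np
from dynamics import PointMassDynamics

# Acceleration mode: velocity is integrated from the commanded acceleration
# (the 'acceleration' key is an assumption mirroring the 'velocity' key above)
dynamics = PointMassDynamics(
    initial_state={
        'position': np.array([0., 0., -50.]),
        'velocity': np.array([0., 0., 0.]),
        'orientation_rpy': np.array([0., 0., 0.])
    },
    control_mode='acceleration'
)

dt = 0.01
for i in range(500):
    # Constant 1 m/s^2 forward (North) acceleration in the NED frame
    dynamics.step({'acceleration': np.array([1., 0., 0.])}, dt)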
All planners use a unified interface: compute_action(**kwargs) → {'velocity': ...}
from planning import PotentialFieldPlanner
planner = PotentialFieldPlanner(step_size=2.0, verbose=False)
for step in range(1000):
    depth = get_depth_image()
    action = planner.compute_action(depth_image=depth)
    dynamics.step({'velocity': action['velocity']}, dt=0.1)
    planner.step()

from planning import TrajectoryPlanner
import numpy as np
planner = TrajectoryPlanner(dt=0.01)
planner.plan_circle(np.array([0., 0., -50.]), radius=20., duration=10.)
planner.reset()
while not planner.is_complete():
    action = planner.compute_action()
    dynamics.step({'velocity': action['velocity']}, dt=0.01)
    planner.step()  # or advance manually: planner.current_index += 1

Primitives: line, circle, figure-8, spiral, waypoints
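Because every planner exposes the same compute_action(**kwargs) interface, strategies can be swapped at runtime. A minimal sketch reusing the planners shown above (get_depth_image is a placeholder for your depth source, and dynamics is set up as in the Quick Start):

import numpy as np
from planning import TrajectoryPlanner, PotentialFieldPlanner

planner = TrajectoryPlanner(dt=0.1)
planner.plan_circle(np.array([0., 0., -50.]), radius=20., duration=60.)
planner.reset()

for step in range(600):
    if step == 300:
        # Same interface, different strategy: switch to reactive avoidance
        planner = PotentialFieldPlanner(step_size=2.0)
    depth = get_depth_image()  # placeholder: depth from SplatRenderer/StereoCamera
    # Unused kwargs (e.g. depth_image for TrajectoryPlanner) are absorbed by **kwargs
    action = planner.compute_action(depth_image=depth)
    dynamics.step({'velocity': action['velocity']}, dt=0.1)
    planner.step()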
from planning import BasePlanner
class RLPlanner(BasePlanner):
    def compute_action(self, observation=None, **kwargs):
        action = self.model.predict(observation)
        return {'velocity': action[:3]}

See planning/README.md for details.
VizFlyt2 includes focused example scripts to help you get started:
Simple examples for mono, stereo, and basic composition.
cd perception
python basic_examples.py

Learn TWO ways to create custom modules:
- Decorator factories (@rgb_vision_module_factory) - Quick & simple
- Subclassing (class MyModule(VisionModule)) - Full control

python custom_module_example.py

Complex multi-module compositions with an interactive menu:
- Noise + Optical Flow
- Stereo + Event Cameras
- Multi-stage pipelines (noise → blur → edges)

python composition_examples.py

How vision modules work with stereo cameras (runs independently on left/right).

python stereo_composition_example.py

See perception/EXAMPLES.md for a detailed guide.
VizFlyt2 uses a hierarchical module system with two types:
Renders RGB and depth directly from the Gaussian Splat scene (first-stage):
- SplatRenderer: Single camera RGB + depth
- StereoCamera: Stereo camera RGB + depth
Processes RGB/depth to produce derived outputs (second-stage):
- EventCamera: Simulates DVS events from intensity changes
- OpticalFlow: Computes motion between frames
- SnowModule: Adds weather effects
- Custom modules: Create via decorators or subclassing
For simple, stateless processing:
from perception.modules import rgb_vision_module_factory, vision_module_factory
import cv2
import numpy as np

# RGB-only processing (automatically extracts rgb from kwargs)
@rgb_vision_module_factory
def brightness_boost(rgb_image):
    return {'rgb': cv2.convertScaleAbs(rgb_image, alpha=1.3, beta=20)}

# Generic processing (access all kwargs)
@vision_module_factory
def depth_overlay(position, orientation_rpy, **kwargs):
    rgb = kwargs.get('rgb')
    depth = kwargs.get('depth')
    if rgb is None or depth is None:
        return {}
    # Blend depth visualization with RGB
    depth_colored = cv2.applyColorMap(
        cv2.normalize(depth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8),
        cv2.COLORMAP_JET
    )
    blended = cv2.addWeighted(rgb, 0.7, depth_colored, 0.3, 0)
    return {'blended': blended}

# Use in composition
pipeline = renderer + brightness_boost + depth_overlay

For modules with state or configuration:
from perception.modules import VisionModule
import cv2
class EdgeDetector(VisionModule):
    def __init__(self, low_threshold=50, high_threshold=150):
        self.low_threshold = low_threshold
        self.high_threshold = high_threshold

    def render(self, position, orientation_rpy, rgb=None, **kwargs):
        if rgb is None:
            return {}
        gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, self.low_threshold, self.high_threshold)
        return {'edges': edges, 'rgb': rgb}

# Use in composition
edge_detector = EdgeDetector(low_threshold=100, high_threshold=200)
pipeline = renderer + edge_detector

When to use each:
- Decorators: Simple functions, quick prototyping, stateless processing
- Subclassing: Need initialization parameters, stateful processing, complex logic
Compose modules using the + operator to create pipelines:
from perception.splat_render import SplatRenderer
from perception.stereo_camera import StereoCamera
# Example 1: Mono pipeline
renderer = SplatRenderer("config.yml", "cam.json")
pipeline = renderer + noise_module + blur_module + edge_detector
# Example 2: Stereo pipeline (vision modules run on both left and right)
stereo = StereoCamera("config.yml", "cam.json", baseline=0.065)
stereo_pipeline = stereo + snow_module + edge_detector
# Results have _left and _right suffixes for vision module outputs
results = stereo_pipeline.render(position, orientation_rpy)
# Access: results['rgb_left'], results['rgb_right']
# results['edges_left'], results['edges_right']
# Example 3: Chain multiple vision modules
pipeline = renderer + module1 + module2 + module3

How it works:
- First module (BaseModule) renders RGB and depth
- Each VisionModule receives previous outputs via **kwargs
- Outputs flow through the pipeline automatically
- For stereo, vision modules run independently on left and right views
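Conceptually, composition builds a small pipeline like the sketch below (ComposedPipeline is a hypothetical name for illustration, and stereo _left/_right suffixing is omitted; the real classes live in perception/modules.py):

class ComposedPipeline:
    def __init__(self, base_module, vision_modules):
        self.base_module = base_module        # first-stage: SplatRenderer or StereoCamera
        self.vision_modules = vision_modules  # second-stage: list of VisionModule instances

    def render(self, position, orientation_rpy):
        # 1. The first-stage module renders RGB and depth from the splat scene
        results = self.base_module.render(position, orientation_rpy)
        # 2. Each second-stage module sees everything produced so far via **kwargs
        for module in self.vision_modules:
            outputs = module.render(position, orientation_rpy, **results)
            # 3. New outputs are merged so later modules (and the caller) can use them
            results.update(outputs)
        return results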
VizFlyt2/
├── README.md
├── QUICK_REFERENCE.md                # Quick reference
├── COMPOSITION_GUIDE.md              # Detailed composition guide
├── dynamics/                         # Dynamics models
│   ├── README.md                     # Dynamics documentation
│   ├── __init__.py
│   ├── base.py                       # Base dynamics class
│   ├── point_mass.py                 # Point-mass model (100 lines)
│   ├── utils.py                      # Utilities
│   └── example_simple.py             # Example simulation
├── planning/                         # Trajectory planning
│   ├── README.md                     # Planning documentation
│   ├── __init__.py
│   ├── primitives.py                 # Trajectory primitives
│   ├── trajectory.py                 # TrajectoryPlanner class
│   ├── planner.py                    # Vision-based obstacle avoidance
│   ├── example_planning.py           # Trajectory examples
│   └── example_vision_planning.py    # Vision-based examples
└── perception/
    ├── modules.py                    # Base classes + decorator factories
    ├── stereo_camera.py              # Stereo camera implementation
    ├── splat_render.py               # Gaussian splatting renderer
    ├── utils.py                      # Utility functions
    ├── EXAMPLES.md                   # Example guide
    ├── basic_examples.py             # Basic usage (start here!)
    ├── custom_module_example.py      # Decorators + subclassing
    ├── composition_examples.py       # Advanced compositions
    └── stereo_composition_example.py # Stereo-specific examples
{
  "camera": {
    "c2w_matrix": [[...], [...], [...], [...]],
    "fov_radians": 1.3089969389957472,
    "render_resolution": 1080
  }
}

Text file with one pose per line:
x y z roll pitch yaw
0.0 0.0 0.0 0.0 0.0 0.0
1.0 0.0 0.0 0.0 0.0 0.0
...
- Position: meters (NED frame)
- Orientation: radians (roll-pitch-yaw)
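For reference, a pose file in this format can be loaded with NumPy (a minimal sketch, not part of the library API; poses.txt is a placeholder path):

import numpy as np

# Each row: x y z roll pitch yaw (meters and radians, NED frame)
poses = np.loadtxt("poses.txt")  # add skiprows=1 if your file includes a header row
positions = poses[:, :3]          # (N, 3) positions in meters
orientations_rpy = poses[:, 3:6]  # (N, 3) roll-pitch-yaw in radians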
World frame (NED):
- X: North (forward)
- Y: East (right)
- Z: Down

Body frame:
- X: Forward
- Y: Right
- Z: Down

Attitude angles (roll-pitch-yaw):
- Roll: Rotation about X-axis (right-wing down is positive)
- Pitch: Rotation about Y-axis (nose up is positive)
- Yaw: Rotation about Z-axis (nose right is positive)
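As a sketch, roll-pitch-yaw in this convention can be converted to a body-to-NED rotation matrix with SciPy, assuming the standard aerospace intrinsic Z-Y-X order (yaw, then pitch, then roll); check perception/utils.py for the order the renderer actually uses:

import numpy as np
from scipy.spatial.transform import Rotation

roll, pitch, yaw = 0.0, 0.1, np.pi / 2  # radians
# Intrinsic Z-Y-X: rotate by yaw about Z, then pitch about Y, then roll about X
R_body_to_ned = Rotation.from_euler('ZYX', [yaw, pitch, roll]).as_matrix()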
- QUICK_REFERENCE.md - Quick reference and code snippets
- COMPOSITION_GUIDE.md - Detailed guide to module composition
- perception/EXAMPLES.md - Guide to example scripts
- dynamics/README.md - Dynamics models documentation
- planning/README.md - Trajectory planning documentation
Typical rendering times (RTX 3090):
- Mono rendering: ~20-40ms per frame
- Stereo rendering: ~40-80ms per frame
Tips for optimization:
- Reduce resolution for faster rendering
- Process trajectories in batches
- Use simple vision modules when possible
Issue: "Config file not found"
- Ensure the config path points to your trained splatfacto model
- Check that the path is correct relative to your working directory
Issue: Slow rendering
- Verify GPU is being used (check CUDA availability)
- Try reducing image resolution in camera settings JSON
- Check that the model is in eval mode
Issue: Incorrect camera orientation
- Verify NED coordinate system is being used correctly
- Check that roll-pitch-yaw convention matches (aerospace convention)
- Test with position [0,0,0] and orientation [0,0,0] first
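A quick sanity check along these lines (using the SplatRenderer API shown earlier; paths are placeholders):

import numpy as np
from perception.splat_render import SplatRenderer

renderer = SplatRenderer("path/to/config.yml", "cam_settings.json")
results = renderer.render(np.zeros(3), np.zeros(3))  # origin, level attitude
print(results['rgb'].shape, results['depth'].shape)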
Contributions are welcome! Please feel free to submit issues or pull requests.
[Add your license information here]
- Built on Nerfstudio
- Uses Gaussian Splatting for high-quality rendering
- Developed by WPI PEAR Lab
For questions or support, please open an issue on GitHub or contact the maintainers.