A ROS 2 package for running inference with trained diffusion policy models in a simulated environment. This package enables real-time robot control using visuomotor diffusion policies for cube stacking tasks with the myCobot 280 robot.
The package provides a complete inference pipeline:
- **Model Inference Node** (`model_inference_node`):
  - Loads trained diffusion policy models with ResNet34 backbone
  - Subscribes to 424x240 camera images from simulation
  - Runs real-time inference at 10 Hz
  - Publishes joint commands to robot controllers
  - Supports GPU acceleration for fast inference
- **Launch Files**:
  - `simulation_inference.launch.py`: Complete inference setup with Gazebo
  - `model_inference_only.launch.py`: Inference node only
- **Utility Scripts**:
  - `setup_venv.sh`: Automated virtual environment setup
  - `run_simulation_inference.sh`: Convenience script for launching inference
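For orientation, here is a minimal sketch of the node's image-in, joint-commands-out loop. Topic names match the defaults documented below; the real node additionally handles model loading, preprocessing, and diffusion sampling, so treat this only as a structural outline:

```python
# Sketch of the inference node's I/O structure (not the actual implementation):
# subscribe to camera images, run the policy on a timer, publish joint commands.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from std_msgs.msg import Float64MultiArray
from cv_bridge import CvBridge


class InferenceSketch(Node):
    def __init__(self):
        super().__init__('model_inference_node')
        self.bridge = CvBridge()
        self.latest_image = None
        self.create_subscription(
            Image, '/camera_head/color/image_raw', self.on_image, 10)
        self.cmd_pub = self.create_publisher(
            Float64MultiArray, '/joint_group_position_controller/commands', 10)
        self.create_timer(1.0 / 10.0, self.on_timer)  # 10 Hz inference rate

    def on_image(self, msg):
        # Cache the newest frame; inference runs on the timer, not per image.
        self.latest_image = self.bridge.imgmsg_to_cv2(msg, 'rgb8')

    def on_timer(self):
        if self.latest_image is None:
            return
        # action = run_diffusion_policy(self.latest_image)  # model call goes here
        action = [0.0] * 7  # placeholder: 6 joints + 1 gripper
        self.cmd_pub.publish(Float64MultiArray(data=action))


def main():
    rclpy.init()
    rclpy.spin(InferenceSketch())
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```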
- **Real-time Performance**: 10 Hz inference rate for smooth control
- **ResNet34 Vision**: Pre-trained backbone for robust visual processing
- **Diffusion Sampling**: DDPM-based action generation
- **GPU Acceleration**: CUDA support for fast inference
- ROS 2 Jazzy Jalisco
- Gazebo Garden
- Python 3.10+ with PyTorch
- NVIDIA GPU with CUDA support (recommended)
- Trained diffusion policy model checkpoint
- **Checkpoint Format**: PyTorch `.pth` file with `model_state_dict` and `args`
- **Architecture**: ResNet34 + MLP (7-dimensional output)
- **Input Resolution**: 424x240 RGB images
- **Training Data**: Joint angles stored in degrees (rather than radians) for better numerical sensitivity
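As an illustration, the image preprocessing such a model typically expects looks like the sketch below. This assumes standard torchvision-style ImageNet normalization; the transform actually used in training (defined in `dataset.py` under `~/ros2_ws/DP`) is authoritative:

```python
# Hypothetical preprocessing sketch: convert a 424x240 RGB frame into the
# normalized tensor a ResNet34 backbone expects.
import numpy as np
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.ToTensor(),  # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

frame = np.zeros((240, 424, 3), dtype=np.uint8)  # stand-in for a camera image
batch = preprocess(frame).unsqueeze(0)           # shape: (1, 3, 240, 424)
```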
The package is included in the main workspace. Ensure it is built:

```bash
cd ~/ros2_ws
colcon build --packages-select diffusion_policy_inference
source install/setup.bash
```

Set up a dedicated virtual environment for the diffusion policy:
```bash
# Create virtual environment
python3 -m venv ~/.venvs/diffusion_policy
source ~/.venvs/diffusion_policy/bin/activate

# Install dependencies
cd ~/ros2_ws/DP
pip install -r requirements.txt
```

Or use the provided setup script:

```bash
cd ~/ros2_ws/src/diffusion_policy_inference
./setup_venv.sh
```

Ensure you have a trained model checkpoint:
```bash
# Check that the model exists
ls -la ~/ros2_ws/DP/checkpoints/model_best-2.pth

# Verify the model directory structure
ls -la ~/ros2_ws/DP/
# Should contain: model.py, train.py, dataset.py
```

The primary and recommended way to run inference is the `run_realtime_inference.sh` script located in the root of your `ros2_ws` workspace:

```bash
cd ~/ros2_ws
./run_realtime_inference.sh --checkpoint_path ./DP/checkpoints/model_best.pth  # Or your desired checkpoint
```

This comprehensive script handles the entire process:
- **Environment Setup**: Configures Gazebo for performance and ROS 2 logging.
- **Simulation Launch**: Starts Gazebo with the `cube_stacking_world.world` and the myCobot robot, applying the necessary friction and real-time settings.
- **Camera Initialization**: Sets the Gazebo camera to the correct viewpoint for the policy.
- **Inference Execution**:
  - Activates the appropriate Python virtual environment (e.g., `~/.venvs/diffusion_policy`).
  - Executes the `inference_realtime.py` script (located in `~/ros2_ws/DP/`), which loads the diffusion policy model and performs inference.
- **Parameterization**: Offers numerous command-line arguments to customize the model checkpoint, inference rate, robot behavior, visualization, and more. Use `./run_realtime_inference.sh --help` to see all available options.
- **Process Management**: Reliably starts and stops Gazebo and the Python inference process, including cleanup on exit.
This script is the sole supported method for running real-time inference with the diffusion policy model.
The following methods are older ways to launch parts of the inference system and are provided for informational or debugging purposes only. For standard operation, use run_realtime_inference.sh.
```bash
cd ~/ros2_ws
source ~/.venvs/diffusion_policy/bin/activate
./src/diffusion_policy_inference/run_simulation_inference.sh \
    --checkpoint ~/ros2_ws/DP/checkpoints/model_best-2.pth \
    --model-dir ~/ros2_ws/DP \
    --rate 10.0 \
    --log-level info
```

Parameters:
- `--checkpoint`: Path to the trained model checkpoint (required)
- `--model-dir`: Directory containing `model.py` and `train.py` (required)
- `--rate`: Inference rate in Hz (default: 10.0)
- `--log-level`: ROS log level (default: info)
```bash
source ~/.venvs/diffusion_policy/bin/activate
ros2 launch diffusion_policy_inference simulation_inference.launch.py \
    checkpoint_path:=~/ros2_ws/DP/checkpoints/model_best-2.pth \
    model_dir:=~/ros2_ws/DP \
    inference_rate:=10.0 \
    log_level:=info
```

```bash
# If you used setup_venv.sh
~/ros2_ws/run_inference_with_venv.sh \
    --checkpoint ~/ros2_ws/DP/checkpoints/model_best-2.pth \
    --model-dir ~/ros2_ws/DP
```

The inference node expects a PyTorch checkpoint file containing:
```python
checkpoint = {
    'model_state_dict': model.state_dict(),  # Trained model weights
    'args': {                                # Training configuration
        'state_dim': 7,            # 6 joints + 1 gripper
        'hidden_dim': 256,         # MLP hidden dimension
        'num_mlp_layers': 4,       # MLP depth
        'image_feature_dim': 512,  # ResNet34 output dim
        'timesteps': 1000,         # Diffusion timesteps
        'beta_start': 1e-4,        # Beta schedule start
        'beta_end': 0.02,          # Beta schedule end
        # ... other training parameters
    }
}
```
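A loading sketch consistent with this format (hypothetical; the node's actual loading code may pass additional constructor arguments from `args`):

```python
# Hypothetical loading sketch matching the checkpoint format above.
import os
import torch
from model import DiffusionPolicyModel  # provided by model_dir

ckpt = torch.load(os.path.expanduser('~/ros2_ws/DP/checkpoints/model_best-2.pth'),
                  map_location='cpu')
args = ckpt['args']
model = DiffusionPolicyModel(state_dim=args['state_dim'])  # other args as needed
model.load_state_dict(ckpt['model_state_dict'])
model.eval()
```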
The `model_dir` must contain:

```
~/ros2_ws/DP/
├── model.py              # DiffusionPolicyModel class
├── train.py              # Diffusion sampling functions
├── dataset.py            # Data loading utilities
└── checkpoints/
    └── model_best-2.pth  # Trained checkpoint
```

Required functions in `train.py`:
- `linear_beta_schedule()`: Beta schedule generation
- `extract()`: Tensor extraction utility
- `p_sample()`: Single diffusion step
- `p_sample_loop()`: Complete sampling loop
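These follow the standard DDPM formulation. A minimal sketch of what such utilities typically look like is below; the implementations in `train.py` are authoritative, and the model call signature here is an assumption:

```python
# Standard DDPM utilities, sketched for reference (see train.py for the real ones).
import torch

def linear_beta_schedule(timesteps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise schedule beta_1..beta_T."""
    return torch.linspace(beta_start, beta_end, timesteps)

def extract(a, t, x_shape):
    """Gather schedule values at timesteps t, reshaped to broadcast over x."""
    out = a.gather(-1, t)
    return out.reshape(t.shape[0], *((1,) * (len(x_shape) - 1)))

@torch.no_grad()
def p_sample(model, x, t, cond, betas):
    """One reverse-diffusion step: predict noise, take a posterior-mean step.
    Real code would precompute the alpha products instead of redoing them here."""
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    beta_t = extract(betas, t, x.shape)
    sqrt_recip_alpha_t = extract(torch.rsqrt(alphas), t, x.shape)
    sqrt_one_minus_acp_t = extract(torch.sqrt(1.0 - alphas_cumprod), t, x.shape)
    pred_noise = model(x, t, cond)  # hypothetical signature: (action, timestep, image features)
    mean = sqrt_recip_alpha_t * (x - beta_t * pred_noise / sqrt_one_minus_acp_t)
    # No noise is added at the final step (all t in the batch are equal here).
    noise = torch.randn_like(x) if t[0] > 0 else torch.zeros_like(x)
    return mean + torch.sqrt(beta_t) * noise

@torch.no_grad()
def p_sample_loop(model, shape, cond, betas):
    """Full sampling loop: start from Gaussian noise, denoise T -> 0."""
    x = torch.randn(shape)
    for i in reversed(range(len(betas))):
        t = torch.full((shape[0],), i, dtype=torch.long)
        x = p_sample(model, x, t, cond, betas)
    return x
```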
Core Parameters:
- `checkpoint_path` (string): Path to the model checkpoint file
- `model_dir` (string): Directory containing `model.py` and `train.py`
- `inference_rate` (double): Inference frequency in Hz (default: 10.0)
- `device` (string): Computation device, "cuda" or "cpu" (auto-detected)

Topic Configuration:
- `image_topic` (string): Camera image input topic (default: "/camera_head/color/image_raw")
- `joint_states_topic` (string): Joint states input topic (default: "/joint_states")
- `joint_command_topic` (string): Joint command output topic (default: "/joint_group_position_controller/commands")

Robot Configuration:
- `joint_names` (string[]): Robot joint names for command publishing
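These parameters can also be set from a Python launch file. A minimal sketch follows (parameter names as listed above; the path is a placeholder and the package's own launch files are authoritative):

```python
# Hypothetical launch snippet setting the node's parameters.
from launch import LaunchDescription
from launch_ros.actions import Node


def generate_launch_description():
    return LaunchDescription([
        Node(
            package='diffusion_policy_inference',
            executable='model_inference_node',
            parameters=[{
                # Use an absolute path; '~' expansion in parameters is shell-dependent.
                'checkpoint_path': '/home/user/ros2_ws/DP/checkpoints/model_best-2.pth',
                'model_dir': '/home/user/ros2_ws/DP',
                'inference_rate': 10.0,
                'device': 'cuda',
                'image_topic': '/camera_head/color/image_raw',
            }],
        ),
    ])
```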
Example invocation with explicit parameters:

```bash
ros2 run diffusion_policy_inference model_inference_node \
    --ros-args \
    -p checkpoint_path:=~/ros2_ws/DP/checkpoints/model_best-2.pth \
    -p model_dir:=~/ros2_ws/DP \
    -p inference_rate:=10.0 \
    -p device:=cuda \
    -p image_topic:=/camera_head/color/image_raw
```

```bash
# Check CUDA availability
python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"

# Monitor GPU usage during inference
nvidia-smi -l 1
```

The system is optimized for real-time performance:
- **Target Rate**: 10 Hz (100 ms per inference)
- **Typical Performance**: 50-80 ms on RTX 3070
- **Bottlenecks**: Image preprocessing and diffusion sampling
- **Model Size**: ~25 MB (ResNet34 + MLP)
- **GPU Memory**: ~2 GB during inference
- **CPU Memory**: ~1 GB for image processing
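To sanity-check latency on your own hardware, you can benchmark the vision backbone alone at the policy's input resolution. This sketch uses torchvision's stock `resnet34` as a stand-in; the full policy adds the MLP head and the diffusion sampling loop on top of this:

```python
# Rough latency check for a ResNet34 forward pass at 424x240 input.
import time
import torch
from torchvision.models import resnet34

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = resnet34().to(device).eval()
x = torch.randn(1, 3, 240, 424, device=device)

with torch.no_grad():
    for _ in range(10):  # warm-up iterations
        model(x)
    if device == 'cuda':
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(100):
        model(x)
    if device == 'cuda':
        torch.cuda.synchronize()

print(f'{(time.perf_counter() - start) * 10:.1f} ms per forward pass')
```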
1. Virtual Environment Not Activated:

```bash
# Check current environment
echo $VIRTUAL_ENV
# Should show: /home/username/.venvs/diffusion_policy

# If not, activate:
source ~/.venvs/diffusion_policy/bin/activate
```

2. Model Loading Errors:
```bash
# Inspect the checkpoint file ('~' is not expanded inside Python, so expand it explicitly)
python3 -c "
import os, torch
checkpoint = torch.load(os.path.expanduser('~/ros2_ws/DP/checkpoints/model_best-2.pth'), map_location='cpu')
print('Keys:', checkpoint.keys())
print('Args:', checkpoint.get('args', 'Missing args'))
"
```

3. CUDA Issues:
```bash
# Check CUDA installation
nvidia-smi
python3 -c "import torch; print(torch.version.cuda)"

# Force CPU mode if needed
ros2 run diffusion_policy_inference model_inference_node \
    --ros-args -p device:=cpu
```

4. Topic Connection Issues:
```bash
# Check available topics
ros2 topic list | grep -E 'camera|joint'

# Monitor image topic
ros2 topic hz /camera_head/color/image_raw

# Check joint commands
ros2 topic echo /joint_group_position_controller/commands
```

```bash
# Monitor inference node
ros2 node info /model_inference_node

# Check node parameters
ros2 param list /model_inference_node

# View inference logs (ROS 2 writes them under ~/.ros/log)
grep -r model_inference ~/.ros/log/
```
```bash
# Test model loading
cd ~/ros2_ws/DP
python3 -c "
from model import DiffusionPolicyModel
model = DiffusionPolicyModel(state_dim=7)
print('Model loaded successfully')
"
```

```bash
# Monitor system resources
htop

# Check inference timing
ros2 topic hz /joint_group_position_controller/commands

# Monitor GPU usage
watch -n 1 nvidia-smi
```
```python
# Custom joint names for different robots
joint_names = [
    "shoulder_pan_joint",
    "shoulder_lift_joint",
    "elbow_joint",
    "wrist_1_joint",
    "wrist_2_joint",
    "wrist_3_joint",
]
```

```bash
# Different camera configurations
ros2 launch diffusion_policy_inference simulation_inference.launch.py \
    image_topic:=/front_camera/image_raw \
    checkpoint_path:=~/models/custom_model.pth
```

ROS 2 Packages:
- rclpy
- sensor_msgs
- std_msgs
- cv_bridge
Python Packages:
- torch >= 1.12.0
- torchvision >= 0.13.0
- Pillow >= 8.0.0
- opencv-python >= 4.5.0
- numpy >= 1.21.0
- Diffusion Policy Training - Model training and architecture
- myCobot Stacking Project - Simulation environment
- Main Project README - Overall project documentation