This document describes the data collection system for the cube stacking project, designed to gather trajectory data for training visuomotor diffusion policies.
The data collection system captures trajectories generated by the MoveIt2-driven stacking_manager_node as it autonomously performs the cube stacking task. This provides expert demonstrations for training visuomotor policies.
The system records:
- Robot joint states (6 arm joints)
- Gripper positions (mapped to 0-100 range)
- Camera images (424x240 resolution)
All three streams are recorded in sync at 10 Hz.
Joint angles are saved in degrees, which this project uses for better model sensitivity.
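The unit handling described above can be sketched roughly as follows. This is an illustrative sketch, not the State Logger's actual code: it assumes the incoming joint positions are in radians (the usual ROS convention), and `raw_open` / `raw_closed` are hypothetical calibration values for the gripper, since the real raw range is not stated here.

```python
import math


def joints_to_degrees(positions_rad):
    """Convert a sequence of joint positions from radians to degrees."""
    return [math.degrees(p) for p in positions_rad]


def map_gripper(raw, raw_open, raw_closed):
    """Linearly map a raw gripper reading onto 0 (open) .. 100 (closed).

    raw_open and raw_closed are hypothetical calibration endpoints;
    readings outside that range are clamped to the 0-100 interval.
    """
    if raw_open == raw_closed:
        raise ValueError("raw_open and raw_closed must differ")
    scaled = (raw - raw_open) / (raw_closed - raw_open) * 100.0
    return max(0.0, min(100.0, scaled))
```

For example, `map_gripper(0.5, 0.0, 1.0)` maps a reading halfway between the two endpoints to the midpoint of the 0-100 range.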
The data collection system consists of two main components:
- Trajectory Data Interfaces Package
  - Custom ROS 2 interfaces for data collection
  - Service definitions for starting and stopping episode recording
- Trajectory Data Collector Package
  - State Logger Node: Records joint states, gripper positions, and camera images
  - Provides services for starting and stopping data collection episodes
Data is saved in the following directory structure:
~/mycobot_episodes_degrees/
└── episode_YYYYMMDD_HHMMSS_mmm/
    ├── states.json    # Joint states and gripper positions
    └── frame_dir/     # Camera images
        ├── image_00000.png
        ├── image_00001.png
        └── ...
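As a quick way to inspect this layout, the sketch below walks the output directory and reports, for each episode, whether `states.json` exists and how many frames were saved. The function name `summarize_episodes` is hypothetical; it relies only on the directory and file names shown above and is not a replacement for check_episodes.py.

```python
from pathlib import Path


def summarize_episodes(root):
    """Return (episode_name, has_states_json, frame_count) per episode directory."""
    root = Path(root).expanduser()
    summary = []
    for episode in sorted(root.glob("episode_*")):
        if not episode.is_dir():
            continue
        # Frames are expected under frame_dir/ as image_NNNNN.png.
        frames = list((episode / "frame_dir").glob("image_*.png"))
        has_states = (episode / "states.json").is_file()
        summary.append((episode.name, has_states, len(frames)))
    return summary


if __name__ == "__main__":
    for name, has_states, frame_count in summarize_episodes("~/mycobot_episodes_degrees"):
        print(f"{name}: states.json={'yes' if has_states else 'no'}, frames={frame_count}")
```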
The states.json file contains an array of entries, each with:
- angles: Array of 6 joint angles in degrees
- gripper_value: Gripper position mapped to 0-100 range (0=open, 100=closed)
- image: Path to the corresponding image file
The primary and recommended method for collecting multiple episodes of data is using the collect_multiple_episodes.sh script located in the root of your ros2_ws workspace.
cd ~/ros2_ws
./collect_multiple_episodes.sh
This script automates the process by:
- Allowing configuration of the output directory, total number of episodes, and cube randomization parameters.
- Launching collect_data.launch.py for each episode.
- Monitoring each episode for completion or critical failures (e.g., planning, gripper, perception issues).
- Generating a detailed failure report (FAILED_EPISODE_*.txt) if an episode aborts due to critical errors, preserving the data for manual review.
- Recommending the use of check_episodes.py for managing and cleaning up collected data.
This script orchestrates the collect_data.launch.py file, which in turn runs the stacking_manager_node (using MoveIt2 for autonomous stacking) and the state_logger_node (for recording the demonstration).
For detailed behavior of the script, including failure detection patterns and reporting, refer to the comments within collect_multiple_episodes.sh itself.
If you need to collect a single episode, for instance, for testing or debugging purposes, you can still use:
# Single episode (saves to ~/mycobot_episodes_degrees by default)
# This will also use MoveIt2 via stacking_manager_node to perform the task.
ros2 launch mycobot_stacking_project collect_data.launch.py
However, for bulk data collection, collect_multiple_episodes.sh is strongly preferred.
The states.json file contains an array of entries in the following format:
[
  {
    "angles": [10.5, -15.2, 30.8, -45.1, 60.3, -75.6],
    "gripper_value": [50],
    "image": "frame_dir/image_00000.png"
  }
]
- angles: Joint angles in degrees for the 6 robot joints
- gripper_value: Gripper position mapped to 0-100 range (0=open, 100=closed)
- image: Relative path to the corresponding image file
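The format above lends itself to a simple consistency check. The following is a minimal sketch, assuming each entry has exactly the three keys shown and that `image` is relative to the episode directory; `validate_states` is a hypothetical helper and does not reflect how check_episodes.py works internally.

```python
import json
from pathlib import Path


def validate_states(episode_dir):
    """Return a list of human-readable problems found in an episode's states.json."""
    episode_dir = Path(episode_dir).expanduser()
    with open(episode_dir / "states.json") as f:
        entries = json.load(f)

    problems = []
    for i, entry in enumerate(entries):
        angles = entry.get("angles", [])
        if len(angles) != 6:
            problems.append(f"entry {i}: expected 6 angles, got {len(angles)}")

        gripper = entry.get("gripper_value", [])
        # The example stores the gripper value as a one-element list;
        # a bare number is accepted here as well.
        values = gripper if isinstance(gripper, list) else [gripper]
        if not values or any(not 0 <= v <= 100 for v in values):
            problems.append(f"entry {i}: gripper_value outside 0-100: {gripper}")

        image = entry.get("image")
        if not image or not (episode_dir / image).is_file():
            problems.append(f"entry {i}: missing image file {image}")
    return problems
```

An empty list means every entry passed these three checks; anything else is worth inspecting before the episode is used for training.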
The Stacking Manager Node (from mycobot_stacking_project package), driven by MoveIt2, executes the cube stacking task. During data collection:
- It calls the State Logger service to start recording at the beginning of its MoveIt2-planned stacking task.
- It calls the State Logger service to stop recording when the MoveIt2 task is complete or fails.
- It passes a unique episode identifier based on the current timestamp.
The collect_multiple_episodes.sh script automates running this MoveIt2-driven process multiple times.
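A timestamp-based identifier of this kind could be built as sketched below. The exact format used by the Stacking Manager Node is an assumption here: the sketch follows the `episode_YYYYMMDD_HHMMSS_mmm` directory naming, reads `mmm` as milliseconds, and uses a hypothetical helper name `make_episode_id`.

```python
from datetime import datetime


def make_episode_id(now=None):
    """Build an identifier shaped like episode_YYYYMMDD_HHMMSS_mmm.

    Assumes the trailing mmm field is the millisecond part of the timestamp.
    """
    now = now or datetime.now()
    millis = now.microsecond // 1000
    return f"episode_{now:%Y%m%d_%H%M%S}_{millis:03d}"
```

Including milliseconds keeps identifiers distinct when episodes start within the same second, which matters when a script launches episodes back to back.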
- Launch Errors: Always run process cleanup before launching
- Missing Data: Ensure robot and camera are publishing on expected topics
- Data Location: Episodes save to ~/mycobot_episodes_degrees/ in degrees format
To check if data was collected successfully:
# Check episodes directory
ls -la ~/mycobot_episodes_degrees/
# Validate episodes
python3 check_episodes.py scan --dir ~/mycobot_episodes_degrees
# Check JSON data structure
head -n 10 ~/mycobot_episodes_degrees/episode_*/states.json
The collected data is used to train visuomotor diffusion policy models:
- Training: Use the training system in the DP/ directory
- Format: Degrees format for better model sensitivity
- Processing: Compatible with PyTorch and standard ML frameworks
This data collection system is licensed under the MIT License - see the LICENSE file for details.