Sizhe Yang*, Wenye Yu*, Jia Zeng, Jun Lv, Kerui Ren, Cewu Lu, Dahua Lin, Jiangmiao Pang
* Equal Contributions
Shanghai AI Laboratory, The Chinese University of Hong Kong, Shanghai Jiao Tong University
Robotics: Science and Systems (RSS) 2025
robosplat.mp4
RoboSplat is framework that leverages 3D Gaussian Splatting (3DGS) to generate novel demonstrations for RGB-based policy learning.
Starting from a single expert demonstration and multi-view images, our method generates diverse and visually realistic data for policy learning, enabling robust performance across six types of generalization (object poses, object types, camera views, scene appearance, lighting conditions, and embodiments) in the real world.
Compared to previous 2D data augmentation methods, our approach achieves significantly better results across various generalization types.
First, clone this repository.
git clone https://github.com/OpenRobotLab/RoboSplat.git
(Optional) Use conda to manage the python environment.
conda create -n robosplat python=3.10 -y
conda activate robosplat
Install dependencies.
# Install PyTorch according to your CUDA version. For example, if your CUDA version is 11.8:
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu118
# Install other dependencies
pip install -r requirements.txt
# Install diff-gaussian-rasterization (adapted from https://github.com/graphdeco-inria/diff-gaussian-rasterization)
pip install third_party/diff-gaussian-rasterization
# Install PyTorch3D. If you encounter any problems, please refer to the detailed installation instructions at https://github.com/facebookresearch/pytorch3d.
cd third_party
git clone https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d
conda install -c bottler nvidiacub -y
curl -LO https://github.com/NVIDIA/cub/archive/1.16.0.tar.gz
tar xzf 1.16.0.tar.gz
export CUB_HOME=$PWD/cub-1.16.0
pip install -e . # this may take a long time
cd ../..
We provide the reconstructed and preprocessed 3D Gaussians for demonstration generation. Please follow the steps below:
-
Download
data.zip
from the Google Drive folder. -
Place
data.zip
in the root directory of the repository. Then, unzip it by runningunzip data.zip
.
Generate novel demonstrations for the Pick task by running:
python data_aug/generate_demo.py \
--image_size 256 \
--save True \
--save_video True \
--ref_demo_path data/source_demo/real_000000.h5 \
--xy_step_str '[10, 10]' \
--augment_lighting False \
--augment_appearance False \
--augment_camera_pose False \
--output_path data/generated_demo/pick_100
Description of Arguments:
--image_size
: (int) The size of the images in the generated demonstrations.--save
: (bool) Whether to save the generated demonstrations.--save_video
: (bool) Whether to save videos for visualizing the generated demonstrations.--ref_demo_path
: (str) Path to the reference expert demonstration file.--xy_step_str
: (str) A string in the form of '[x
,y
]', which specifies the density of object placement. Here,x
is the number of object positions along the x-axis, andy
is the number of positions along the y-axis. The total number of generated demonstrations will bex
*y
.--augment_lighting
: (bool) Whether to apply lighting augmentation.--augment_appearance
: (bool) Whether to apply appearance augmentation.--augment_camera_pose
: (bool) Whether to apply camera pose augmentation.--output_path
: (str) Path to save the generated demonstrations and the videos.
- Release the code for object pose augmentation, lighting augmentation, appearance augmentation, and camera pose augmentation.
- Release the code for object type augmentation, and embodiment augmentation.
- Release the code for preprocessing 3D Gaussians.
If you find our work helpful, please cite:
@article{robosplat,
title={Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation},
author={Yang, Sizhe and Yu, Wenye and Zeng, Jia and Lv, Jun and Ren, Kerui and Lu, Cewu and Lin, Dahua and Pang, Jiangmiao},
journal={arXiv preprint arXiv:2504.13175},
year={2025}
}
This repository is released under the Apache 2.0 license.
Our code is built upon 3DGS and diff-gaussian-rasterization. We thank the authors for open-sourcing their code and for their significant contributions to the community.