Convert PandaSet to SemanticKITTI format.
The script provided in this repo converts point clouds and semantic annotations from PandaSet format to SemanticKITTI format (.bin for scans, .label for labels) for 3D semantic segmentation.
The old version of the script is available at the old_version branch.
Visit the PandaSet website, sign up, and download the dataset.
Also download the raw Pandar64 LiDAR data from the link provided in this Issue.
PandaSet scene structure:
.
├── LICENSE.txt
├── annotations
│   ├── cuboids
│   │   └── XX.pkl.gz
│   └── semseg  // Semantic Segmentation is available for specific scenes
│       ├── XX.pkl.gz
│       └── classes.json
├── camera
│   ├── back_camera
│   │   ├── XX.jpg
│   │   ├── intrinsics.json
│   │   ├── poses.json
│   │   └── timestamps.json
│   ├── front_camera
│   │   └── ...
│   ├── front_left_camera
│   │   └── ...
│   ├── front_right_camera
│   │   └── ...
│   ├── left_camera
│   │   └── ...
│   └── right_camera
│       └── ...
├── lidar
│   ├── XX.pkl.gz
│   ├── poses.json
│   └── timestamps.json
└── meta
    ├── gps.json
    └── timestamps.json
The raw Panda dataset consists of a list of sequences, each containing 80 scans stored in .pkl.gz format.
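To inspect a single scan before converting, a minimal sketch like the following may help. It assumes only what the extension suggests, namely that each file is a gzip-compressed Python pickle; the helper name `load_pkl_gz` is hypothetical and not part of this repo. If the stored object turns out to be a pandas DataFrame, pandas must be installed for unpickling to succeed.

```python
import gzip
import pickle
from pathlib import Path


def load_pkl_gz(path):
    """Load one gzip-compressed pickle file and return the stored object.

    Hypothetical helper: assumes the file is a plain pickle wrapped in gzip.
    Only unpickle files from a source you trust.
    """
    with gzip.open(Path(path), "rb") as f:
        return pickle.load(f)
```

For example, `scan = load_pkl_gz("/path/to/pandaset/<sequence>/lidar/XX.pkl.gz")` followed by `print(type(scan))` shows what kind of object a scan file holds.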
Output dataset sequence format:
.
├── labels
│   └── XX.label
└── velodyne
    └── XX.bin
Python packages required:
- NumPy
- tqdm
- SciPy
Install them using pip install package_name.
Run the convert_pandaset.py script, specifying the path to PandaSet, the path to the raw Pandar64 data, and the output path for the converted dataset, as follows:
python3 convert_pandaset.py --dataset /path/to/pandaset --raw /path/to/pandar64 --output_path /path/to/converted_dataset
Raw data are utilized to extract the laser ID for each point, to facilitate the spherical projection of LiDAR data into the range image. This is particularly important due to the non-uniform resolution of the Pandar64 LiDAR sensor.
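To illustrate why the laser ID helps, here is a rough sketch of a spherical projection that uses the laser ID directly as the image row, so the uneven vertical spacing of the beams never has to be estimated from elevation angles. This is not the code used by this repo: the function name, the default `width=1800`, and the fill value `-1` are assumptions, and the points are assumed to be expressed in the sensor frame.

```python
import numpy as np


def project_to_range_image(points, laser_id, width=1800, num_beams=64):
    """Project (N, 3) points into a (num_beams, width) range image.

    Row = laser ID, column = azimuth bin. Empty pixels hold -1.
    `width` is an assumed horizontal resolution, not a value from the repo.
    """
    points = np.asarray(points, dtype=np.float64)
    rows = np.asarray(laser_id).astype(np.int64)

    depth = np.linalg.norm(points[:, :3], axis=1)
    azimuth = np.arctan2(points[:, 1], points[:, 0])  # in [-pi, pi]

    # Map azimuth to a column index: +pi -> column 0, 0 -> width / 2.
    cols = np.floor(0.5 * (1.0 - azimuth / np.pi) * width).astype(np.int64)
    cols = np.clip(cols, 0, width - 1)

    image = np.full((num_beams, width), -1.0, dtype=np.float32)

    # Write far points first so that nearer points land last and are kept
    # when several points fall into the same pixel.
    order = np.argsort(depth)[::-1]
    image[rows[order], cols[order]] = depth[order]
    return image
```

With a uniform-resolution sensor the row is often computed from the elevation angle instead; indexing by laser ID avoids that step for a sensor whose beams are not evenly spaced.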
Each 3D point of the converted point clouds includes the following features: x, y, z, i, laser_id, where i is the intensity and laser_id is an integer in the range [0, 63] identifying the laser beam that produced the point.
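A converted scan and its labels could be read back along these lines. This is a sketch under assumptions: it supposes the .bin file stores five float32 values per point in the order listed above, and that the .label file follows the SemanticKITTI convention of one uint32 per point with the semantic class in the lower 16 bits. The function names are hypothetical; check the dtype and column count against convert_pandaset.py before relying on them.

```python
import numpy as np


def read_scan(bin_path, num_features=5):
    """Read a .bin scan as an (N, num_features) float32 array.

    Assumes columns x, y, z, i, laser_id stored as float32.
    """
    scan = np.fromfile(bin_path, dtype=np.float32)
    return scan.reshape(-1, num_features)


def read_labels(label_path):
    """Read a .label file and return per-point semantic class IDs."""
    raw = np.fromfile(label_path, dtype=np.uint32)
    # SemanticKITTI convention: lower 16 bits = semantic label,
    # upper 16 bits = instance ID.
    return raw & 0xFFFF
```

A quick sanity check after conversion is that `read_scan(...)` and `read_labels(...)` return the same number of points for matching XX.bin and XX.label files.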
Warning: The script only converts scenes with available semantic segmentation labels; all others will be skipped. Additionally, only point clouds generated by the Pandar64 LiDAR are processed, while data from the front LiDAR are ignored. However, with minor modifications to the code, it should be possible to include the front LiDAR data if needed.