This repository demonstrates how to predict 3D bounding boxes in point clouds from an RGB-D dataset by leveraging 2D instance segmentation masks. It includes tools for creating datasets, training, evaluating, and visualizing 3D bounding box predictions using Mask R-CNN and YOLO models.
- Dataset Preparation: Convert raw data into YOLO/Mask R-CNN compatible instance segmentation datasets.
- Training: Train Mask R-CNN or YOLO models for instance segmentation.
- Evaluation: Evaluate 3D bounding box predictions and segmentation masks.
- Visualization: Visualize predictions and ground truth for qualitative analysis.
- ONNX Export: Export trained models to ONNX format for inference.
.
├── maskrcnn_utils
│ ├── coco_eval.py
│ ├── coco_utils.py
│ ├── engine.py
│ └── maskrcnn_utils.py
├── models
│ ├── maskrcnn.onnx
│ └── yolo11.onnx
├── scripts
│ ├── boundingbox3d_predictor.py
│ ├── create_instance_segmentation_dataset.py
│ ├── dataset.py
│ ├── evaluate_predictions.py
│ ├── export_maskrcnn_to_onnx.py
│ ├── train_maskrcnn_instance_segmentataion.py
│ ├── train_yolo_instance_segmentataion.py
│ ├── utils.py
│ └── visualize_predictions.py
├── README.md
└── requirements.txt
- Clone the repository:

      git clone <repo-url>
      cd <repo-directory>

- Install dependencies:

      pip install -r requirements.txt

Prepare your dataset in the data/ directory, then run:

    python scripts/create_instance_segmentation_dataset.py --data data --data-segment instance_segmentation_dataset

This generates YOLO-format segmentation masks and splits the dataset into train/val sets.
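
For reference, a YOLO segmentation label file holds one line per instance: a class index followed by the polygon vertices, normalized by the image width and height. The sketch below illustrates that layout together with a reproducible train/val split. `polygon_to_yolo_line` and `split_train_val` are hypothetical helpers written for illustration, not functions from `create_instance_segmentation_dataset.py`, and the 80/20 split ratio is an assumption.

```python
import random


def polygon_to_yolo_line(class_id, polygon, width, height):
    """Format one instance as a YOLO segmentation label line.

    polygon is a list of (x, y) pixel coordinates. The output is the class
    index followed by x/y pairs normalized to [0, 1].
    """
    coords = []
    for x, y in polygon:
        coords.append(f"{x / width:.6f}")
        coords.append(f"{y / height:.6f}")
    return " ".join([str(class_id)] + coords)


def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle items reproducibly and return (train, val) lists.

    The 0.2 validation fraction is an assumed default, not the repo's value.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# A triangle in a 640x480 image, labeled as class 0.
line = polygon_to_yolo_line(0, [(64, 48), (320, 48), (320, 240)], 640, 480)
print(line)

frames = [f"frame_{i:03d}" for i in range(10)]
train, val = split_train_val(frames)
print(len(train), len(val))
```

Fixing the seed keeps the split stable across runs, which makes later evaluation numbers comparable.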
Train Mask R-CNN for instance segmentation:

    python scripts/train_maskrcnn_instance_segmentataion.py --dataset instance_segmentation_dataset --output models --epochs 10

Train YOLO for instance segmentation:

    python scripts/train_yolo_instance_segmentataion.py --dataset instance_segmentation_dataset --model yolo11m-seg.pt --epochs 10 --imsize 640

Export the trained Mask R-CNN checkpoint to ONNX:

    python scripts/export_maskrcnn_to_onnx.py --model-path models/checkpoint_maskrcnn.pth --output models/maskrcnn.onnx

Evaluate 3D bounding box predictions:

    python scripts/evaluate_predictions.py --model models/yolo11.onnx --data data --iou-threshold 0.25 --conf-threshold 0.5

Evaluation results using a 0.25 IoU threshold:

| Model | Metric | Score |
|---|---|---|
| YOLOv11 | 3D IoU | 0.4828 |
| Mask R-CNN | 3D IoU | 0.4667 |
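
To make the 3D IoU metric concrete, here is a minimal sketch for axis-aligned boxes, each given as a `(min_xyz, max_xyz)` pair. This is a simplification: the boxes produced by this repository may be oriented, in which case the intersection volume needs a more general computation. `box_volume` and `iou_3d` are hypothetical names, and treating the IoU threshold as a match criterion is an assumption about how `evaluate_predictions.py` uses `--iou-threshold`.

```python
def box_volume(box):
    """Volume of an axis-aligned box given as (min_xyz, max_xyz)."""
    (x0, y0, z0), (x1, y1, z1) = box
    # Clamp each extent at zero so an empty overlap yields zero volume.
    return max(0.0, x1 - x0) * max(0.0, y1 - y0) * max(0.0, z1 - z0)


def iou_3d(box_a, box_b):
    """Intersection over union of two axis-aligned 3D boxes."""
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    inter_min = tuple(max(a, b) for a, b in zip(a_min, b_min))
    inter_max = tuple(min(a, b) for a, b in zip(a_max, b_max))
    inter = box_volume((inter_min, inter_max))
    union = box_volume(box_a) + box_volume(box_b) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2x2 cubes offset by 1 along x: overlap 4, union 12.
pred = ((0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
gt = ((1.0, 0.0, 0.0), (3.0, 2.0, 2.0))
score = iou_3d(pred, gt)

# Assumed use of the threshold: a prediction matches when IoU >= 0.25.
is_match = score >= 0.25
print(round(score, 4), is_match)
```
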
Visualize predictions and ground truth:

    python scripts/visualize_predictions.py --model models/maskrcnn.onnx --data data --visualization-number 10 --shuffle

Key files:

- `scripts/train_maskrcnn_instance_segmentataion.py`: Train Mask R-CNN for instance segmentation.
- `scripts/train_yolo_instance_segmentataion.py`: Train YOLO for instance segmentation.
- `scripts/create_instance_segmentation_dataset.py`: Prepare and split dataset for training.
- `scripts/evaluate_predictions.py`: Evaluate 3D bounding box and segmentation predictions.
- `scripts/visualize_predictions.py`: Visualize model predictions and ground truth.
- `scripts/utils.py`: Utility functions for dataset handling, model setup, and more.
- `scripts/boundingbox3d_predictor.py`: Classes for 3D bounding box prediction using Mask R-CNN and YOLO ONNX models.
- `maskrcnn_utils/`: Helper modules for Mask R-CNN training and evaluation.

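
The general idea of going from a 2D instance mask to a 3D box can be sketched as follows: keep the pixels inside the mask, lift them to 3D using the depth values, and enclose the resulting points in a box. The code below is an illustrative sketch only, not the implementation in `scripts/boundingbox3d_predictor.py`. It assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a depth map aligned with the mask, and an axis-aligned box in the camera frame; all function names are hypothetical.

```python
def backproject_masked_pixels(depth, mask, fx, fy, cx, cy):
    """Lift masked pixels with valid depth to 3D camera coordinates.

    depth and mask are row-major nested lists of equal shape. Pixels with
    zero depth are treated as missing measurements and skipped.
    """
    points = []
    for v, (depth_row, mask_row) in enumerate(zip(depth, mask)):
        for u, (z, inside) in enumerate(zip(depth_row, mask_row)):
            if inside and z > 0:
                # Pinhole model (assumed): x = (u - cx) * z / fx, same for y.
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def axis_aligned_box(points):
    """Return (min_xyz, max_xyz) enclosing all points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Toy 3x3 depth map (meters) and instance mask; the bottom-right pixel
# is inside the mask but has no depth reading.
depth = [
    [2.0, 2.0, 2.0],
    [2.0, 2.0, 4.0],
    [2.0, 2.0, 0.0],
]
mask = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
]
points = backproject_masked_pixels(depth, mask, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
box = axis_aligned_box(points)
print(box)
```

On real data, depth noise along mask edges tends to inflate the box, so filtering outliers before fitting is usually worthwhile.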
Requirements:

- Python 3.8+
- PyTorch
- torchvision
- albumentations
- OpenCV
- Open3D
- onnx, onnxruntime
- ultralytics
- pycocotools
- tqdm
- numpy
- pyyaml