- [2025-02-19] Full code release finished!
- [2025-02-19] Updated version on arXiv (v3)!
- [2024-12-11] Poster presentation at NeurIPS 2024!
- [2024-10-31] Camera-ready release on arXiv (v2)!
- [2024-09-25] UNION has been accepted for NeurIPS 2024!
- [2024-05-24] Paper release on arXiv (v1)!
The pipeline builds on top of multiple open-source projects. In step 1, RANSAC is used for ground removal (BSD-3-Clause license) and HDBSCAN is used for spatial clustering (BSD-3-Clause license). Step 2 uses ICP-Flow to get motion estimates (Apache-2.0 license). Lastly, step 3 uses DINOv2 for encoding the camera images (Apache-2.0 license).
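For intuition, the sketch below shows what steps 1 and 3 might look like in Python. It is not the pipeline's actual code: the libraries (Open3D for RANSAC, the hdbscan package, DINOv2 via torch.hub) follow the description above, but all parameter values, the 2D clustering, and the crop-based encoding are illustrative assumptions.

```python
# Minimal sketch of steps 1 and 3; parameters are illustrative, not the
# pipeline's actual settings.
import numpy as np
import open3d as o3d
import hdbscan
import torch

def remove_ground_and_cluster(points: np.ndarray):
    """Step 1 sketch: RANSAC ground removal + HDBSCAN spatial clustering.

    points: (N, 3) LiDAR points.
    Returns the non-ground points and per-point cluster labels (-1 = noise).
    """
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    # Fit a single ground plane with RANSAC and drop its inliers.
    _, ground_idx = pcd.segment_plane(distance_threshold=0.2,
                                      ransac_n=3,
                                      num_iterations=1000)
    non_ground = np.delete(points, ground_idx, axis=0)
    # Cluster the remaining points in the horizontal plane (assumption).
    labels = hdbscan.HDBSCAN(min_cluster_size=10).fit_predict(non_ground[:, :2])
    return non_ground, labels

def encode_crops(crops: torch.Tensor) -> torch.Tensor:
    """Step 3 sketch: encode image crops of clusters with DINOv2.

    crops: (B, 3, H, W) normalized image crops, H and W multiples of 14.
    Returns (B, 384) CLS-token embeddings for appearance-based grouping.
    """
    model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
    model.eval()
    with torch.no_grad():
        return model(crops)  # forward pass returns the CLS embedding
```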
Create and activate an environment named `UNION-Env` using the commands below. You may need to install gcc by running `sudo apt-get install gcc` to be able to build wheels for pycocotools (required for nuscenes-devkit).
```shell
conda env create -f conda/environment.yml
conda activate UNION-Env
```
The nuScenes dataset can be downloaded here.
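As an optional sanity check (not part of the original instructions), the download can be verified with the nuscenes-devkit; the dataroot below is a placeholder:

```python
from nuscenes.nuscenes import NuScenes

# Placeholder path; point this at your extracted nuScenes directory.
nusc = NuScenes(version='v1.0-trainval', dataroot='/path/to/nuscenes', verbose=True)
print(len(nusc.scene), 'scenes')  # expect 850 scenes for trainval
```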
The UNION pipeline is implemented in the Jupyter notebook `UNION-pipeline__Get-mobile-objects__nuScenes.ipynb`. Start JupyterLab with the `UNION-Env` conda environment activated and execute the entire notebook to discover mobile objects.
```shell
conda activate UNION-Env
jupyter lab
```
Create and activate an environment named `openmmlab` using the commands below.
```shell
conda create --name openmmlab python=3.8
conda activate openmmlab
conda install pytorch=2.1 torchvision=0.16 torchaudio=2.1 pytorch-cuda=12.1 -c pytorch -c nvidia
conda install fsspec=2024.6
conda install numpy=1.23
pip install -U openmim
mim install mmengine==0.9.0
mim install mmcv==2.1.0
mim install mmdet==3.2.0
```
Clone the mmdetection3d repository and check out version v1.4 using the commands below. After that, install the package in editable mode.
```shell
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
git checkout fe25f7a51d36e3702f961e198894580d83c4387b
pip install -v -e .
```
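Optionally, you can check that the OpenMMLab stack imports with the pinned versions (a quick sanity check, not an official setup step):

```python
# Confirm the pinned OpenMMLab versions installed correctly.
import mmengine, mmcv, mmdet, mmdet3d

print('mmengine', mmengine.__version__)  # expected 0.9.0
print('mmcv    ', mmcv.__version__)      # expected 2.1.0
print('mmdet   ', mmdet.__version__)     # expected 3.2.0
print('mmdet3d ', mmdet3d.__version__)   # 1.4.x at the pinned commit
```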
Make a soft link to your nuScenes directory in the data folder of mmdetection3d. After that, process the dataset to generate the `nuscenes_infos_train.pkl` and `nuscenes_infos_val.pkl` files.
```shell
ln -s PUT_YOUR_DIRECTORY_HERE_TO_NUSCENES/nuscenes data/nuscenes
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes
```
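To confirm the info files were written, you can peek at one of them; the exact structure (e.g. `metainfo`/`data_list` keys) depends on the mmdetection3d version, so treat this as a rough check:

```python
import pickle

# Rough check of a generated info file; key names vary across
# mmdetection3d versions.
with open('data/nuscenes/nuscenes_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)
print(type(infos).__name__)
if isinstance(infos, dict):
    print(list(infos.keys()))
```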
When the UNION pipeline has been executed, the mmdetection3d files for UNION can be generated. This is implemented in the Jupyter notebook `UNION-pipeline__Generate-mmdet3d-files__nuScenes.ipynb`. Start JupyterLab with the `UNION-Env` conda environment activated and execute the entire notebook.
```shell
conda activate UNION-Env
jupyter lab
```
Some files need to be added to the mmdetection3d repository and some need to be replaced. This can be done using the commands below. The files are located in the `mmdetection3d-files` folder.
```shell
cp mmdetection3d-files/CenterPoint*Training*.py mmdetection3d/configs/centerpoint/
cp mmdetection3d-files/CenterPoint*Model*.py mmdetection3d/configs/_base_/models/
cp mmdetection3d-files/nuscenes_metric.py mmdetection3d/mmdet3d/evaluation/metrics/nuscenes_metric.py
cp mmdetection3d-files/nuscenes_dataset.py mmdetection3d/mmdet3d/datasets/nuscenes_dataset.py
```
Train CenterPoint using the created `.pkl` files.
```shell
conda activate openmmlab
cd mmdetection3d
```
All training commands are listed below:
```shell
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Class-Agnostic-Training__Labels-GT__UNION-file.py
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Class-Agnostic-Training__Labels-HDBSCAN__UNION-file.py
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Class-Agnostic-Training__Labels-Scene-Flow__UNION-file.py
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Class-Agnostic-Training__Labels-UNION__UNION-file.py
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Multi-Class-003-Training__Labels-GT__UNION-file.py
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Multi-Class-005pc-Training__Labels-UNION-005pc__UNION-file.py
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Multi-Class-010pc-Training__Labels-UNION-010pc__UNION-file.py
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Multi-Class-015pc-Training__Labels-UNION-015pc__UNION-file.py
python tools/train.py configs/centerpoint/CenterPoint-Pillar0200__second-secfpn-8xb4-cyclic-20e-nus-3d__Multi-Class-020pc-Training__Labels-UNION-020pc__UNION-file.py
```
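After training, a checkpoint can be smoke-tested with mmdetection3d's inferencer API; the config, checkpoint, and point-cloud paths below are placeholders, and the exact output format may differ slightly across mmdetection3d versions.

```python
from mmdet3d.apis import LidarDet3DInferencer

# Placeholder paths; pick one of the configs above and the checkpoint
# that tools/train.py wrote to its work_dirs folder.
inferencer = LidarDet3DInferencer(
    model='configs/centerpoint/PUT_CONFIG_NAME_HERE.py',
    weights='work_dirs/PUT_RUN_NAME_HERE/epoch_20.pth')
# Run on a single nuScenes LiDAR sweep (.bin file).
results = inferencer(dict(points='PUT_PATH_TO_A_LIDAR_SWEEP_HERE.bin'))
print(results['predictions'][0])  # predicted 3D boxes, scores, and labels
```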
When the trainings have finished, the results can be computed. This is implemented in the Jupyter notebook `UNION-pipeline__Do-evaluation-after-training__nuScenes.ipynb`. Start JupyterLab with the `UNION-Env` conda environment activated and execute the entire notebook.
```shell
conda activate UNION-Env
jupyter lab
```
Class-agnostic 3D object detection performance on the nuScenes validation split (150 scenes). For each object discovery method, CenterPoint has been trained with the method's generated pseudo-bounding boxes on the nuScenes training split (700 scenes). AAE is set to 1.0 by default for all methods. L and C stand for LiDAR and camera, respectively. ST stands for self-training.
Method | Conference | Labels | ST | AP ↑ | NDS ↑ | ATE ↓ | ASE ↓ | AOE ↓ | AVE ↓ |
---|---|---|---|---|---|---|---|---|---|
HDBSCAN | JOSS'17 | L | ❎ | 13.8 | 15.7 | 0.583 | 0.531 | 1.517 | 1.556 |
OYSTER | CVPR'23 | L | ☑️ | 9.1 | 11.5 | 0.784 | 0.521 | 1.514 | - |
LISO | ECCV'24 | L | ☑️ | 10.9 | 13.9 | 0.750 | 0.409 | 1.062 | - |
UNION (ours) | NeurIPS'24 | L+C | ❎ | 39.5 | 31.7 | 0.590 | 0.506 | 0.876 | 0.837 |
Multi-class 3D object detection performance on the nuScenes validation split (150 scenes). For each object discovery method, CenterPoint has been trained with the method's generated pseudo-bounding boxes on the nuScenes training split (700 scenes), and class-agnostic predictions are assigned to real classes based on their size, i.e. the size prior (SP, sketched below the table). The vehicle (Veh.), pedestrian (Ped.), and cyclist (Cyc.) classes are used; see the paper for more details. AAE is set to 1.0 by default for all methods and classes. UNION-Xpc stands for UNION trained with X pseudo-classes. L and C stand for LiDAR and camera, respectively. †Without clipping the precision-recall curve; clipping is the default for nuScenes evaluation.
Method | Labels | mAP ↑ | NDS ↑ | Veh. AP ↑ | Ped. AP ↑ | Cyc. AP ↑ | Cyc. AP† ↑ |
---|---|---|---|---|---|---|---|
HDBSCAN+SP | L | 4.9 | 12.8 | 14.1 | 0.4 | 0.0 | 1.5 |
UNION+SP | L+C | 13.0 | 19.7 | 35.2 | 3.7 | 0.0 | 1.5 |
UNION-05pc (ours) | L+C | 25.1 | 24.4 | 31.0 | 44.2 | 0.0 | 0.7 |
UNION-10pc (ours) | L+C | 20.4 | 22.1 | 27.6 | 33.7 | 0.0 | 0.5 |
UNION-15pc (ours) | L+C | 18.9 | 21.2 | 25.6 | 31.1 | 0.0 | 0.4 |
UNION-20pc (ours) | L+C | 19.0 | 21.9 | 25.1 | 31.9 | 0.0 | 2.2 |
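For intuition, the size prior (SP) baseline maps each class-agnostic box to a class from its dimensions alone. The hypothetical sketch below illustrates the idea; the thresholds are made up for illustration and are not the values used in the paper.

```python
# Hypothetical size-prior (SP) assignment; these thresholds are
# illustrative and not the values used in the paper.
def assign_class_by_size(length: float, width: float) -> str:
    if length > 3.0 or width > 2.0:
        return 'vehicle'      # large footprint
    if length < 1.0 and width < 1.0:
        return 'pedestrian'   # small, roughly square footprint
    return 'cyclist'          # elongated but narrow footprint
```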
If UNION is useful for your research, please recognize our contributions by citing our paper.
```bibtex
@inproceedings{lentsch2024union,
  title={{UNION}: Unsupervised {3D} Object Detection using Object Appearance-based Pseudo-Classes},
  author={Lentsch, Ted and Caesar, Holger and Gavrila, Dariu M},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  pages={22028--22046},
  volume={37},
  year={2024}
}
```