This repository implements joint description and detection of local features for pixel and point matching, as presented in P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching by Bing Wang, Changhao Chen, Zhaopeng Cui, Jie Qin, Chris Xiaoxuan Lu, Zhengdi Yu, Peijun Zhao, Zhen Dong, Fan Zhu, Niki Trigoni, Andrew Markham.
Accurate detection and description of 2D and 3D keypoints are vital for establishing image-to-point-cloud correspondences. We propose a dual fully convolutional framework that directly matches pixels and points, enabling fine-grained correspondence establishment. By integrating an ultra-wide reception mechanism and a novel loss function, our approach mitigates the information variations between pixel and point local regions.
Create the environment and install the required libraries:
conda create -n P2NET python==3.8
conda activate P2NET
conda install --file requirements.txt
Compile the C++ extension modules for Python located in cpp_wrappers. Open a terminal in that folder and run:
sh compile_wrappers.sh
The code has been tested with Python 3.8, PyTorch 1.13.1, GCC 11.3, and CUDA 11.7, but it should work with other configurations.
(1) 7Scenes
The dataset can be downloaded from here.
Let's take the '7-scenes-chess' scene as an example of the data format: it contains a file named 'camera-intrinsics.txt' holding the intrinsic parameters of the scene's camera, plus multiple folders named 'seq-xx', each containing frames saved as '.color.png', '.depth.png', and '.pose.txt' files.
7-Scenes
|--7-scenes-chess
|  |--camera-intrinsics.txt
|  |--seq-01
|  |  |--*.color.png
|  |  |--*.depth.png
|  |  |--*.pose.txt
|  |--seq-02
|  |--...
|--7-scenes-fire
|--...
Although the 7Scenes dataset itself doesn't provide point clouds, you can generate them from these files by running the '7scenes_gen.py' script located in the 'data/tools' directory. The script generates point cloud data from the 7Scenes frames and produces the pairs of '.pkl' files required for training and validation.
python 7scenes_gen.py
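The heavy lifting in such a generation script is back-projecting each depth frame into a 3D point cloud using the camera intrinsics and pose. Below is a minimal sketch of that step, not the repository's code; the millimeter depth scale and the 65535 invalid-depth marker follow the 7-Scenes convention:

```python
import numpy as np
import imageio.v2 as imageio

def depth_to_pointcloud(depth_png, intrinsics_txt, pose_txt):
    """Back-project one 7-Scenes depth frame into a world-frame point cloud."""
    depth = imageio.imread(depth_png).astype(np.float32)
    depth[depth == 65535] = 0        # 7-Scenes marks invalid depth as 65535
    depth /= 1000.0                  # millimeters -> meters
    K = np.loadtxt(intrinsics_txt)   # 3x3 intrinsic matrix
    pose = np.loadtxt(pose_txt)      # 4x4 camera-to-world transform

    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    pts_cam = np.stack([x[valid], y[valid], depth[valid]], axis=1)

    # Rotate and translate into the world frame
    return pts_cam @ pose[:3, :3].T + pose[:3, 3]
```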
(2) RGB-D Scenes V2
The dataset can be downloaded from here.
RGB-D Scenes V2 contains 14 scenes, ranging from scene_01 to scene_14, with the first 10 scenes used for training and the last 4 used for testing.
The format of this dataset is the same as that of 7Scenes. Simply run the 'RGB-D_gen.py' script located in the 'data/tools' directory; it generates point cloud data from the dataset and produces the '.pkl' file pairs required for training and validation.
python RGB-D_gen.py
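If you want to sanity-check the generated files, they are ordinary pickles. A quick inspection snippet; the file name here is hypothetical, so use whatever 'RGB-D_gen.py' actually writes:

```python
import pickle

# "train_pairs.pkl" is a placeholder name -- point this at one of the
# .pkl files produced by the generation script.
with open("train_pairs.pkl", "rb") as f:
    pairs = pickle.load(f)

print(type(pairs))  # typically a list/dict describing image-point cloud pairs
```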
(3) 3DMatch
The dataset can be downloaded from here.
The 3DMatch dataset consists of data from 62 scenes, with 54 scenes used for training and 8 used for evaluation. The specific scene names are listed in train.txt and test.txt.
Generate the '.pkl' format file pairs required for training and validation:
python 3DMatch_gen.py
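The split files are plain text with one scene name per line, so collecting the scene folders programmatically is straightforward. A minimal reader, with paths assumed rather than taken from the repository:

```python
from pathlib import Path

def read_split(split_file, data_root="./3DMatch"):
    """Return the scene directories listed in train.txt / test.txt."""
    names = [ln.strip() for ln in Path(split_file).read_text().splitlines()
             if ln.strip()]
    return [Path(data_root) / name for name in names]

train_scenes = read_split("train.txt")
test_scenes = read_split("test.txt")
print(f"{len(train_scenes)} training scenes, {len(test_scenes)} test scenes")
```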
(4) ScanNet
The dataset can be downloaded from here.
ScanNet is an RGB-D video dataset. We use the smaller subset provided by the authors, scannet_frames_25k, which includes both the training and test sets.
Organize this dataset into the 7Scenes format (a sketch of one way to do this follows below).
Generate the '.pkl' format file pairs required for training and validation:
python ScanNet_gen.py
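For the reorganization step above, here is a hedged sketch for a single scene, assuming the common scannet_frames_25k layout of per-scene color/, depth/, and pose/ subfolders; adjust paths and extensions if your copy differs:

```python
import shutil
from pathlib import Path

# Flatten one scannet_frames_25k scene into 7Scenes-style
# frame-XXXXXX.{color,depth,pose} files. The source layout is an
# assumption; ScanNet color frames are .jpg, whereas 7Scenes uses .png,
# so convert afterwards if the tools expect .png.
src = Path("./scannet_frames_25k/scene0000_00")
dst = Path("./ScanNet/scene0000_00/seq-01")
dst.mkdir(parents=True, exist_ok=True)

for color in sorted((src / "color").glob("*.jpg")):
    stem = color.stem  # e.g. "000000"
    shutil.copy(color, dst / f"frame-{stem}.color.jpg")
    shutil.copy(src / "depth" / f"{stem}.png", dst / f"frame-{stem}.depth.png")
    shutil.copy(src / "pose" / f"{stem}.txt", dst / f"frame-{stem}.pose.txt")
```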
(5) Kitti-DC
Kitti-DC is an outdoor dataset with 342 RGB-3D point cloud pairs from four distinct urban scenes, collected with a 64-beam LiDAR scanner mounted on a moving vehicle.
The dataset is accessible on Google Cloud: Kitti-DC.
Organize this dataset into the 7Scenes format.
Generate the '.pkl' format file pairs required for training and validation:
python Kitti-DC_gen.py
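For reference, KITTI depth-completion depth maps are 16-bit PNGs where depth in meters is the pixel value divided by 256 and 0 means no LiDAR return. A minimal sketch of lifting one into a camera-frame point cloud; the intrinsics are placeholders to be read from the KITTI calibration files:

```python
import numpy as np
import imageio.v2 as imageio

def kitti_depth_to_points(depth_png, fx, fy, cx, cy):
    """Lift a KITTI depth-completion map to an (N, 3) camera-frame point cloud."""
    d = imageio.imread(depth_png).astype(np.float32) / 256.0  # uint16 -> meters
    v, u = np.nonzero(d)              # keep only pixels with a LiDAR return
    z = d[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```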
(1) Taking the 7Scenes dataset as an example, training can be done by running:
python train_p2net.py --opt_p2net data_dir="./7Scenes" img_dir="./7Scenes"
(2) Training on the RGB-D Scenes V2 dataset can be done by running:
python train_p2net.py --opt_p2net data_dir="./RGB-D" img_dir="./RGB-D"
(3) Training on the 3DMatch dataset can be done by running:
python train_p2net.py --opt_p2net data_dir="./3DMatch" img_dir="./3DMatch"
(4) Training on the ScanNet dataset can be done by running:
python train_p2net.py --opt_p2net data_dir="./ScanNet" img_dir="./ScanNet"
(5) Training on the Kitti-DC dataset can be done by running:
python train_p2net.py --opt_p2net data_dir="./Kitti-DC" img_dir="./Kitti-DC"
(1) 7Scenes
Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:
python test_p2net.py --run extractor --opt_evaluation data_dir="./7Scenes" img_dir="./7Scenes"
Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:
python test_p2net.py --run evaluator --opt_evaluation data_dir="./7Scenes" img_dir="./7Scenes"
(2) RGB-D Scenes V2
Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:
python test_p2net.py --run extractor --opt_evaluation data_dir="./RGB-D" img_dir="./RGB-D"
Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:
python test_p2net.py --run evaluator --opt_evaluation data_dir="./RGB-D" img_dir="./RGB-D"
(3) 3DMatch
Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:
python test_p2net.py --run extractor --opt_evaluation data_dir="./3DMatch" img_dir="./3DMatch"
Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:
python test_p2net.py --run evaluator --opt_evaluation data_dir="./3DMatch" img_dir="./3DMatch"
(4) ScanNet
Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:
python test_p2net.py --run extractor --opt_evaluation data_dir="./ScanNet" img_dir="./ScanNet"
Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:
python test_p2net.py --run evaluator --opt_evaluation data_dir="./ScanNet" img_dir="./ScanNet"
(5) Kitti-DC
Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:
python test_p2net.py --run extractor --opt_evaluation data_dir="./Kitti-DC" img_dir="./Kitti-DC"
Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:
python test_p2net.py --run evaluator --opt_evaluation data_dir="./Kitti-DC" img_dir="./Kitti-DC"
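For context, an evaluator of this kind typically matches the extracted 2D and 3D descriptors by mutual nearest neighbor before estimating a pose. A generic sketch of that matching step, not the repository's exact implementation:

```python
import numpy as np
from scipy.spatial.distance import cdist

def mutual_nn_matches(desc_2d, desc_3d):
    """Mutual nearest-neighbor matches between (N, D) pixel descriptors
    and (M, D) point descriptors; returns (K, 2) index pairs."""
    dists = cdist(desc_2d, desc_3d)    # (N, M) L2 distance matrix
    nn_12 = dists.argmin(axis=1)       # best 3D match for each pixel
    nn_21 = dists.argmin(axis=0)       # best 2D match for each point
    idx = np.arange(len(desc_2d))
    mutual = nn_21[nn_12] == idx       # keep only mutually consistent pairs
    return np.stack([idx[mutual], nn_12[mutual]], axis=1)
```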
Download the pretrained models from Baidu Disk: models
Evaluation on Three Benchmarks:
| PnP | FMR (%) | IR (%) | IN (#) | RR (%) |
|---|---|---|---|---|
| 3DMatch | 99.4 | 47.9 | 196.2 | 66.9 |
| ScanNet | 97.6 | 57.5 | 143.6 | 76.7 |
| Kitti-DC | 100 | 67.9 | 175.1 | 98.8 |
| Kabsch | FMR (%) | IR (%) | IN (#) | RR (%) |
|---|---|---|---|---|
| 3DMatch | 100 | 51.1 | 196.2 | 94.9 |
| ScanNet | 97.6 | 57.5 | 143.6 | 85.9 |
| Kitti-DC | 100 | 67.9 | 175.1 | 97.7 |
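The two tables differ in the pose solver applied to the matched correspondences: PnP estimates the camera pose directly from 2D-3D matches (e.g. with RANSAC), while Kabsch rigidly aligns 3D-3D matches once pixels have been lifted with depth. A standard SVD-based Kabsch sketch, generic rather than the repository's code:

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) aligning src to dst, both (N, 3)."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c                        # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t
```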
Evaluation on the RGB-D Scenes V2 dataset:
|  | Scene 11 | Scene 12 | Scene 13 | Scene 14 |
|---|---|---|---|---|
| FMR (%) | 94.6 | 96.7 | 88.1 | 69.4 |
| RR (%) | 79.5 | 76.2 | 56.3 | 45.6 |
| IR (%) | 38.8 | 47.2 | 38.2 | 26.6 |
Evaluation on the 7Scenes dataset:
|  | Chess | Fire | Heads | Office | Pumpkin | Kitchen | Stairs |
|---|---|---|---|---|---|---|---|
| FMR (%) | 100 | 99.2 | 88.1 | 91.8 | 89.5 | 92.5 | 64.3 |
| RR (%) | 95.8 | 84.1 | 94.6 | 83.8 | 70.7 | 68.9 | 65.8 |
| IR (%) | 72.1 | 69.2 | 59.2 | 69.0 | 62.0 | 62.8 | 44.8 |
Below are some sample visualization results of our method:
If you find this project useful, please cite:
@inproceedings{wang2021p2net,
  author    = {Wang, Bing and Chen, Changhao and Cui, Zhaopeng and Qin, Jie and Lu, Chris Xiaoxuan and Yu, Zhengdi and Zhao, Peijun and Dong, Zhen and Zhu, Fan and Trigoni, Niki and Markham, Andrew},
  title     = {{P2-Net}: Joint Description and Detection of Local Features for Pixel and Point Matching},
  booktitle = {2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year      = {2021},
  pages     = {15984-15993},
  doi       = {10.1109/ICCV48922.2021.01570}
}