
P2-Net

This repository accompanies the paper P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching by Bing Wang, Changhao Chen, Zhaopeng Cui, Jie Qin, Chris Xiaoxuan Lu, Zhengdi Yu, Peijun Zhao, Zhen Dong, Fan Zhu, Niki Trigoni, and Andrew Markham.

Introduction

Accurate 2D and 3D keypoint detection and description are vital for establishing image-point cloud correspondences. We propose a dual fully convolutional framework to directly match pixels and points, enabling fine-grained correspondence establishment. Our approach, integrating an ultra-wide reception mechanism and novel loss function, mitigates information variations between pixel and point local regions.
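To make "directly match pixels and points" concrete, the sketch below pairs L2-normalized 2D and 3D descriptors by mutual nearest neighbour under cosine similarity. This is an illustrative baseline, not the paper's exact matcher; the function name and array shapes are assumptions.

```python
import numpy as np

def mutual_nn_matches(desc_2d, desc_3d):
    """Match L2-normalized pixel descriptors (N, D) against point
    descriptors (M, D), keeping only mutual nearest neighbours."""
    sim = desc_2d @ desc_3d.T            # cosine similarity, shape (N, M)
    nn_12 = sim.argmax(axis=1)           # best point for each pixel
    nn_21 = sim.argmax(axis=0)           # best pixel for each point
    ids = np.arange(desc_2d.shape[0])
    mutual = nn_21[nn_12] == ids         # pixel i -> point j -> pixel i
    return np.stack([ids[mutual], nn_12[mutual]], axis=1)
```

Mutual-nearest-neighbour filtering is a standard way to suppress ambiguous cross-modal correspondences before pose estimation.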

Network Architecture

(Figure: P2-Net network architecture)

Installation

Create the environment and install the required libraries:

conda create -n P2NET python==3.8
conda activate P2NET
conda install --file requirements.txt

Compile the C++ extension modules for Python located in cpp_wrappers. Open a terminal in that folder and run:

sh compile_wrappers.sh

The code has been tested on Python 3.8, PyTorch 1.3.1, GCC 11.3 and CUDA 11.7, but it should work with other configurations.

Dataset Download

Indoor Datasets

(1) 7Scenes

The dataset can be downloaded from here.

Take the '7-scenes-chess' scene as an example of the data format: it contains a 'camera-intrinsics.txt' file with the intrinsic parameters of the scene's camera, plus several 'seq-xx' folders, each holding frames saved as '.color.png', '.depth.png', and '.pose.txt' files.

7-Scenes
|--7-scenes-chess
|  |--camera-intrinsics.txt
|  |--seq-01
|  |  |--*.color.png
|  |  |--*.depth.png
|  |  |--*.pose.txt
|  |--seq-02
|  |--...
|--7-scenes-fire
|--...
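Both metadata files are plain-text matrices; a minimal loader, assuming 'camera-intrinsics.txt' holds a whitespace-separated 3x3 matrix and each '.pose.txt' a 4x4 camera-to-world matrix (the helper names are hypothetical):

```python
import numpy as np

def load_intrinsics(path):
    """Read a whitespace-separated 3x3 intrinsics matrix from camera-intrinsics.txt."""
    return np.loadtxt(path).reshape(3, 3)

def load_pose(path):
    """Read a 4x4 camera-to-world matrix from a .pose.txt file."""
    return np.loadtxt(path).reshape(4, 4)
```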

Although the 7Scenes dataset does not itself provide point cloud data, you can generate it from these files: run the '7scenes_gen.py' script located in the 'data/tools' directory. It generates point cloud data from the 7Scenes dataset and produces the pairs of '.pkl' files required for training and validation.

python 7scenes_gen.py
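Under the hood, generating points from RGB-D frames amounts to back-projecting each valid depth pixel through the camera intrinsics. The sketch below illustrates the idea; it is not the actual script, and the function name, the millimetre depth scale, and the camera-to-world pose convention are assumptions.

```python
import numpy as np

def depth_to_points(depth, K, pose=None, depth_scale=1000.0):
    """Back-project a depth map (H, W, in millimetres) into a point cloud
    using camera intrinsics K (3x3); optionally lift the points into the
    world frame with a 4x4 camera-to-world pose."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth.astype(np.float64) / depth_scale       # depth in metres
    valid = z > 0                                    # drop missing depth
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x[valid], y[valid], z[valid]], axis=1)
    if pose is not None:                             # camera -> world
        pts = pts @ pose[:3, :3].T + pose[:3, 3]
    return pts
```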

(2) RGB-D Scenes V2

The dataset can be downloaded from here.

RGB-D Scenes V2 contains 14 scenes, ranging from scene_01 to scene_14, with the first 10 scenes used for training and the last 4 scenes used for testing.

The format of this dataset is the same as that of the 7Scenes dataset. Simply run the 'RGB-D_gen.py' script located in the 'data/tools' directory. The script generates point cloud data from the dataset and produces the '.pkl' file pairs required for training and validation.

python RGB-D_gen.py

(3) 3DMatch

The dataset can be downloaded from here.

The 3DMatch dataset consists of data from 62 scenes, with 54 scenes used for training and 8 scenes used for evaluation. The specific scene names can be found in train.txt and test.txt.

Generate the '.pkl' format file pairs required for training and validation:

python 3DMatch_gen.py

(4) ScanNet

The dataset can be downloaded from here.

ScanNet is an RGB-D video dataset. We use the smaller subset provided by the authors, scannet_frames_25k, which includes both the training and test sets.

Organize this dataset into the 7Scenes format described above.

Generate the '.pkl' format file pairs required for training and validation:

python ScanNet_gen.py

Outdoor Datasets

Kitti-DC

Kitti-DC is an outdoor dataset with 342 RGB-3D point cloud pairs from 4 distinct urban scenes, collected using a 64-line LiDAR scanner mounted on a moving vehicle.

The dataset is accessible in Google Cloud: Kitti-DC.

Organize this dataset into the 7Scenes format described above.

Generate the '.pkl' format file pairs required for training and validation:

python Kitti-DC_gen.py

Training

(1) Taking the 7Scenes dataset as an example, training can be run with:

python train_p2net.py  --opt_p2net data_dir="./7Scenes"  img_dir="./7Scenes"

(2) Training on the RGB-D Scenes V2 dataset can be run with:

python train_p2net.py  --opt_p2net data_dir="./RGB-D"  img_dir="./RGB-D"

(3) Training on the 3DMatch dataset can be run with:

python train_p2net.py  --opt_p2net data_dir="./3DMatch"  img_dir="./3DMatch"

(4) Training on the ScanNet dataset can be run with:

python train_p2net.py  --opt_p2net data_dir="./ScanNet"  img_dir="./ScanNet"

(5) Training on the Kitti-DC dataset can be run with:

python train_p2net.py  --opt_p2net data_dir="./Kitti-DC"  img_dir="./Kitti-DC"

Testing

(1) 7Scenes

Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:

python test_p2net.py --run extractor  --opt_evaluation data_dir="./7Scenes"  img_dir="./7Scenes"

Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:

python test_p2net.py --run evaluator  --opt_evaluation data_dir="./7Scenes"  img_dir="./7Scenes"

(2) RGB-D Scenes V2

Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:

python test_p2net.py --run extractor  --opt_evaluation data_dir="./RGB-D"  img_dir="./RGB-D"

Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:

python test_p2net.py --run evaluator  --opt_evaluation data_dir="./RGB-D"  img_dir="./RGB-D"

(3) 3DMatch

Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:

python test_p2net.py --run extractor  --opt_evaluation data_dir="./3DMatch"  img_dir="./3DMatch"

Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:

python test_p2net.py --run evaluator  --opt_evaluation data_dir="./3DMatch"  img_dir="./3DMatch"

(4) ScanNet

Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:

python test_p2net.py --run extractor  --opt_evaluation data_dir="./ScanNet"  img_dir="./ScanNet"

Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:

python test_p2net.py --run evaluator  --opt_evaluation data_dir="./ScanNet"  img_dir="./ScanNet"

(5) Kitti-DC

Using the model trained in the previous step, extract keypoints and descriptors and compute scores for the dataset:

python test_p2net.py --run extractor  --opt_evaluation data_dir="./Kitti-DC"  img_dir="./Kitti-DC"

Finally, execute the evaluation function and pass in the paths to the extracted results and the data as parameters:

python test_p2net.py --run evaluator  --opt_evaluation data_dir="./Kitti-DC"  img_dir="./Kitti-DC"

Pretrained models

Download pretrained models from Baidu-Disk: models

Results

Evaluation on Three Benchmarks:

Metrics: FMR = feature matching recall (%), IR = inlier ratio (%), IN = inlier number, RR = registration recall (%).

PnP:

| Dataset | FMR | IR | IN | RR |
| --- | --- | --- | --- | --- |
| 3DMatch | 99.4 | 47.9 | 196.2 | 66.9 |
| ScanNet | 97.6 | 57.5 | 143.6 | 76.7 |
| Kitti-DC | 100 | 67.9 | 175.1 | 98.8 |

Kabsch:

| Dataset | FMR | IR | IN | RR |
| --- | --- | --- | --- | --- |
| 3DMatch | 100 | 51.1 | 196.2 | 94.9 |
| ScanNet | 97.6 | 57.5 | 143.6 | 85.9 |
| Kitti-DC | 100 | 67.9 | 175.1 | 97.7 |

Evaluation on the RGB-D Scenes V2 dataset:

| | Scenes11 | Scenes12 | Scenes13 | Scenes14 |
| --- | --- | --- | --- | --- |
| FMR | 94.6 | 96.7 | 88.1 | 69.4 |
| RR | 79.5 | 76.2 | 56.3 | 45.6 |
| IR | 38.8 | 47.2 | 38.2 | 26.6 |

Evaluation on the 7Scenes dataset:

| | Chess | Fire | Heads | Office | Pumpkin | Kitchen | Stairs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FMR | 100 | 99.2 | 88.1 | 91.8 | 89.5 | 92.5 | 64.3 |
| RR | 95.8 | 84.1 | 94.6 | 83.8 | 70.7 | 68.9 | 65.8 |
| IR | 72.1 | 69.2 | 59.2 | 69.0 | 62.0 | 62.8 | 44.8 |

Visualization

Below are some sample visualization results of our method.

(Figures: matching visualizations 1-3)

Citation

If you find this project useful, please cite:

@inproceedings{wang2021p2net,
  author    = {Wang, Bing and Chen, Changhao and Cui, Zhaopeng and Qin, Jie and Lu, Chris Xiaoxuan and Yu, Zhengdi and Zhao, Peijun and Dong, Zhen and Zhu, Fan and Trigoni, Niki and Markham, Andrew},
  booktitle = {2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  title     = {{P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching}},
  year      = {2021},
  pages     = {15984-15993},
  doi       = {10.1109/ICCV48922.2021.01570}
}
