A novel hybrid architecture combining CNNs and Swin Transformers, to address limitations in 3D voxel reconstruction.

SandeepaInduwaraSamaranayake/SwinVox

SwinVox

This repository contains the source code for SwinVox, a novel deep learning model that reconstructs 3D voxel-based shapes from multiple 2D input images (views). Building on the Pix2Vox++ (https://gitlab.com/hzxie/Pix2Vox) architecture, SwinVox uses the Swin Transformer (https://github.com/microsoft/Swin-Transformer) for feature extraction and adds an optional Cross-View Attention mechanism to improve the fusion of multi-view information. The project aims to improve the fidelity and detail of 3D reconstructions, particularly in complex scenarios.

[Figure: SwinVox algorithm design diagram]

Datasets

We use the ShapeNet dataset in our experiments. It is available below:

Extracted (Recommended)

Archived (Alternative)

Separate Rendering images & voxelized models (Extracted | Alternative)

Pretrained Models

The pretrained models on ShapeNet are available as follows:

Prerequisites

Clone the Code Repository

git clone https://github.com/SandeepaInduwaraSamaranayake/SwinVox.git

Install Python Dependencies

cd SwinVox
pip install -r requirements.txt

Update Settings in config.py

You need to update the file path of the datasets:

__C.DATASETS.SHAPENET.RENDERING_PATH        = '/path/to/Datasets/ShapeNet/ShapeNetRendering/%s/%s/rendering/%02d.png'
__C.DATASETS.SHAPENET.VOXEL_PATH            = '/path/to/Datasets/ShapeNet/ShapeNetVox32/%s/%s/model.binvox'
__C.DATASETS.PASCAL3D.ANNOTATION_PATH       = '/path/to/Datasets/PASCAL3D/Annotations/%s_imagenet/%s.mat'
__C.DATASETS.PASCAL3D.RENDERING_PATH        = '/path/to/Datasets/PASCAL3D/Images/%s_imagenet/%s.JPEG'
__C.DATASETS.PASCAL3D.VOXEL_PATH            = '/path/to/Datasets/PASCAL3D/CAD/%s/%02d.binvox'
__C.DATASETS.PIX3D.ANNOTATION_PATH          = '/path/to/Datasets/Pix3D/pix3d.json'
__C.DATASETS.PIX3D.RENDERING_PATH           = '/path/to/Datasets/Pix3D/img/%s/%s.%s'
__C.DATASETS.PIX3D.VOXEL_PATH               = '/path/to/Datasets/Pix3D/model/%s/%s/%s.binvox'
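
The path settings above are printf-style templates, so each `%s` and `%02d` is filled in when a sample is loaded. The following is a minimal sketch of how the two ShapeNet templates expand. It assumes the placeholders are, in order, a category (taxonomy) ID, a model ID, and a zero-based view index; that order is inferred from the directory layout and is not confirmed by this README. The function name `expand_shapenet_paths` and the model ID `example_model` are hypothetical and used only for illustration.

```python
# Templates copied from the config.py settings above; the root is a placeholder.
RENDERING_PATH = '/path/to/Datasets/ShapeNet/ShapeNetRendering/%s/%s/rendering/%02d.png'
VOXEL_PATH = '/path/to/Datasets/ShapeNet/ShapeNetVox32/%s/%s/model.binvox'


def expand_shapenet_paths(taxonomy_id, sample_id, n_views):
    """Fill the templates for one sample (hypothetical helper, not part of SwinVox).

    Assumes the placeholder order is (taxonomy_id, sample_id, view_index).
    """
    renderings = [
        RENDERING_PATH % (taxonomy_id, sample_id, view_index)
        for view_index in range(n_views)
    ]
    voxel = VOXEL_PATH % (taxonomy_id, sample_id)
    return renderings, voxel


# Illustrative IDs only: 'example_model' is not a real ShapeNet model ID.
renderings, voxel = expand_shapenet_paths('02691156', 'example_model', 3)
for path in renderings:
    print(path)
print(voxel)
```

If a printed path does not match a file on disk after you edit the root directory, the template in config.py most likely still points at the wrong location.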

Get Started

To train SwinVox, you can simply use the following command:

python3 runner.py

To test SwinVox, you can use the following command:

python3 runner.py --test --weights=/path/to/pretrained/model.pth
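
When evaluating several checkpoints, it can help to build the commands above from a script. The following is a minimal sketch that uses only the `--test` and `--weights` options shown in this README. The helper name `build_runner_command` and the checkpoint paths are hypothetical, and the sketch only constructs and prints the commands; it does not run them.

```python
def build_runner_command(weights=None, test=False, python='python3'):
    """Build the argument list for runner.py (hypothetical helper).

    Only the --test and --weights options documented above are used.
    """
    cmd = [python, 'runner.py']
    if test:
        cmd.append('--test')
    if weights is not None:
        cmd.append('--weights=%s' % weights)
    return cmd


# Training uses no extra arguments.
train_cmd = build_runner_command()

# Placeholder checkpoint paths; replace them with real files.
checkpoints = ['/path/to/pretrained/model.pth', '/path/to/other/model.pth']
test_cmds = [build_runner_command(weights=w, test=True) for w in checkpoints]

print(' '.join(train_cmd))
for cmd in test_cmds:
    print(' '.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root to execute it.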

License

This project is open-sourced under the MIT license.