CyberAgentAILab/KaoLRM

KaoLRM

Repurposing Pre-trained Large Reconstruction Models for Parametric 3D Face Reconstruction (3DV 2026)

Overview

KaoLRM is a parametric 3D face reconstruction approach that adapts pre-trained Large Reconstruction Models (LRMs) for high-quality face modeling. The system combines FLAME parametric face models with 2D Gaussian Splatting to reconstruct 3D faces from single facial images.

Installation

Requirements: Ubuntu 22.04, CUDA 12.6, NVIDIA A100 (or equivalent)

1. Environment Setup

We recommend using a dedicated conda environment:

conda create -n kaolrm python=3.10 -y
conda activate kaolrm

2. Install PyTorch

pip install torch==2.9.1 torchvision==0.24.1 --index-url https://download.pytorch.org/whl/cu126
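
After installing, it can help to confirm that the interpreter in the active environment sees the intended PyTorch build before moving on to the remaining dependencies. The following is a minimal sketch, not part of the repository; the helper name `describe_torch` is hypothetical, and the script only reports what it finds rather than enforcing the versions listed above.

```python
import importlib.util


def describe_torch():
    """Report whether PyTorch is importable and, if so, its CUDA status.

    Hypothetical helper for illustration; it is not shipped with KaoLRM.
    """
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        # Version string of the installed wheel.
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        # True only if a usable GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as false on a GPU machine, the driver or the wheel's CUDA build is the likely place to look first.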

3. Install Dependencies

# Install main dependencies
pip install --no-build-isolation -r requirements.txt

# Install xformers 
pip install xformers==0.0.33.post2 --index-url https://download.pytorch.org/whl/cu126

4. Download FLAME Models

FLAME models require registration at https://flame.is.tue.mpg.de/

bash fetch_data.sh

You will be prompted to enter your FLAME account credentials.

Pre-trained Models

Download the pre-trained checkpoints from the repository's Releases page. Note that the checkpoint files are licensed under CC BY-NC 4.0.

Place the downloaded checkpoints for the monocular and multi-view models in the releases/mono/ and releases/multiview/ directories, respectively.
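
A quick way to confirm the layout before running inference is to list what each directory contains. This is a minimal sketch under the assumption that the two directories sit under releases/ relative to the current working directory; the function name `list_checkpoints` is hypothetical, and no checkpoint file names are assumed because the source does not give them.

```python
from pathlib import Path


def list_checkpoints(root="releases", variants=("mono", "multiview")):
    """Map each variant directory name to the sorted file names inside it.

    A missing directory maps to None so it can be told apart from an
    existing but empty one. Hypothetical helper, not part of KaoLRM.
    """
    found = {}
    for variant in variants:
        directory = Path(root) / variant
        if directory.is_dir():
            found[variant] = sorted(
                p.name for p in directory.iterdir() if p.is_file()
            )
        else:
            found[variant] = None
    return found


if __name__ == "__main__":
    for variant, files in list_checkpoints().items():
        if files is None:
            print(f"releases/{variant}/ is missing")
        elif not files:
            print(f"releases/{variant}/ exists but is empty")
        else:
            print(f"releases/{variant}/: {', '.join(files)}")
```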

Inference

Input Image Preparation

Input images should follow the OpenLRM convention. Remove the background with a background removal tool before running inference.

Sample images are provided in data/sample_input/.
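
As a sanity check on prepared inputs, the sketch below tests whether a PNG file declares an alpha channel in its header. It assumes that background-removed inputs are stored as PNGs with transparency, which the text above does not state; consult the OpenLRM documentation for the exact convention. The function name `png_has_alpha` is hypothetical, and only the Python standard library is used.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG color types that carry an alpha channel: 4 = grayscale+alpha, 6 = RGBA.
# Palette images (type 3) that use a tRNS chunk are not detected here.
ALPHA_COLOR_TYPES = {4, 6}


def png_has_alpha(path):
    """Return True if the PNG header declares an alpha channel.

    Only the first 26 bytes are read: the 8-byte signature, the IHDR chunk
    length and type, then width, height, bit depth, and color type.
    """
    with open(path, "rb") as f:
        header = f.read(26)
    if (
        len(header) < 26
        or header[:8] != PNG_SIGNATURE
        or header[12:16] != b"IHDR"
    ):
        raise ValueError(f"{path} is not a valid PNG file")
    _width, _height, _bit_depth, color_type = struct.unpack(
        ">IIBB", header[16:26]
    )
    return color_type in ALPHA_COLOR_TYPES


if __name__ == "__main__":
    from pathlib import Path

    # Assumption: the sample inputs include PNG files.
    for image in sorted(Path("data/sample_input").glob("*.png")):
        status = "has alpha" if png_has_alpha(image) else "no alpha channel"
        print(f"{image.name}: {status}")
```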

Running Inference

# For (in-the-wild) frontal views
sh infer_mono.sh

# For profile views
sh infer_multiview.sh

Inference Outputs

Results are saved to dumps/releases/{model_type}/:

  • 3D meshes (.ply files) and FLAME parameters (.npy files)
  • Animations (.gif files) and visualizations (.png files)
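
To take stock of a finished run, the output files listed above can be grouped by extension. This is a minimal sketch; the function name `group_outputs` is hypothetical, and using "mono" as the value of {model_type} is an assumption based on the checkpoint directory names, not something the source confirms.

```python
from collections import defaultdict
from pathlib import Path


def group_outputs(output_dir):
    """Group files under output_dir (searched recursively) by extension.

    Returns a dict such as {".ply": [...], ".npy": [...]}, with lower-cased
    extensions as keys. Hypothetical helper, not part of KaoLRM.
    """
    groups = defaultdict(list)
    for path in sorted(Path(output_dir).rglob("*")):
        if path.is_file():
            groups[path.suffix.lower()].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Assumption: {model_type} resolves to "mono" for the monocular script.
    outputs = group_outputs("dumps/releases/mono")
    for suffix in (".ply", ".npy", ".gif", ".png"):
        print(f"{suffix}: {len(outputs.get(suffix, []))} file(s)")
```

The .npy files can then be read with `numpy.load`. The layout of the stored FLAME parameter arrays is not documented here, so inspect the loaded object's shape and dtype before relying on any particular ordering.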

Acknowledgement

The code is heavily based on the following projects.

  • OpenLRM: as the strong backbone
  • 2DGS: as the representation of the visualized geometries
  • PyTorch3D: for the differentiable rendering of FLAME meshes
  • DECA: for the reference of loss term design

We also used the following repositories during the project.

License

The source code of this project is licensed under the Apache License 2.0.

However, this project depends on several components with additional restrictions that limit the effective license to non-commercial research use only:

Component                            | License               | Scope
KaoLRM source code                   | Apache 2.0            | kaolrm/ and scripts/
EG3D-derived code (triplane decoder) | NVIDIA Non-Commercial | kaolrm/models/gaussian_decoder.py
Pre-trained model weights            | CC BY-NC 4.0          | releases/
FLAME model code                     | MPI Non-Commercial    | kaolrm/models/flame.py
DINOv2 (vendored)                    | Apache 2.0            | kaolrm/models/encoders/dinov2/
diff-surfel-rasterization            | Non-Commercial        | installed via requirements.txt

Note: Commercial use of this project is prohibited due to the NVIDIA EG3D license and the Max Planck Institute FLAME model license. For commercial licensing of FLAME, contact ps-license@tuebingen.mpg.de.

No copyleft licenses (GPL/LGPL/AGPL) are used in this project.

Citation

@article{zhu2026kaolrm,
  title={KaoLRM: Repurposing Pre-trained Large Reconstruction Models for Parametric 3D Face Reconstruction},
  author={Zhu, Qingtian and Cao, Xu and Wang, Zhixiang and Zheng, Yinqiang and Taketomi, Takafumi},
  journal={International Conference on 3D Vision},
  year={2026}
}
