This repository contains the official implementation of the paper "FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video".
- Pre-requisites
- Installation
- Dataset
- Getting the Dataset
- Evaluation
- Training
- Creating a new experiment
- CAD Models
- Citation
## Pre-requisites

- uv as a package manager
The code has been tested only on Debian 12.0 with NVIDIA GPUs and CUDA 11.8.
It should work on any OS and any accelerator, as long as a bash shell is available.
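If uv is not installed on your system yet, you can typically get it with its official standalone installer (any other installation method from the uv documentation works just as well):

```bash
# Install uv via the official standalone installer script.
# Any other installation method (pipx, homebrew, etc.) works as well.
curl -LsSf https://astral.sh/uv/install.sh | sh
```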
## Installation

From the root folder of the repository, run the following command:
```bash
uv sync --all-extras
```

This will install all the dependencies, including the additional ones required to download and extract the dataset.
To activate the environment, run:
```bash
source .venv/bin/activate
```

## Dataset

This repository provides a script to download and extract the dataset used in the paper.
Although that is the recommended approach, it is also possible to download it manually.
The dataset is publicly available and hosted on Edmond.
It can be downloaded in two different resolutions: 256x256 and 384x384.
Both versions work with the provided code, but the 256x256 version is the one used in the paper.
## Getting the Dataset

To download the dataset automatically, run the following command:
```bash
frame download dataset --output-path <path/to/output/folder> --resolution 256
```

This command will prompt you to accept the license agreement and then download the zip file containing the dataset.
The script leverages playwright to open a browser in headless mode and download the dataset.
If you have never used playwright before, it might ask you to install the required browsers. You can do that by running:
```bash
playwright install
```

If you prefer to download the dataset manually, you can do so by going to the dataset page and clicking on the `frame_v002_256.zip` entry.
Given the dataset zip file, you can extract it using the provided script.
```bash
frame extract --file <path/to/dataset_256x256.zip> --output-folder <path/to/output/folder>
```

This will extract the zip file into the specified output folder and convert the mp4 files into individual JPEG images.
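As a quick sanity check after extraction, you can count the extracted frames. This is just a sketch: it assumes the frames are written as .jpg/.jpeg files somewhere under the output folder, and the exact directory layout may differ.

```bash
# Count the extracted frames (assumes .jpg/.jpeg files under the output folder;
# the exact directory layout may differ).
find <path/to/output/folder> -type f \( -name '*.jpg' -o -name '*.jpeg' \) | wc -l
```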
We provide a helper script to manually inspect the dataset. It can be run as follows:
```bash
python scripts/loop.py --help
```

## Evaluation

In order to evaluate a model, you can run the following command:
```bash
python scripts/eval.py --data <path/to/dataset> --experiment <name>
```

Here, `<name>` is the name of the experiment you want to evaluate, or the path to a checkpoint file.
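For example, both of the following invocations are supported (the experiment name and the checkpoint path shown here are illustrative):

```bash
# Evaluate by experiment name (illustrative):
python scripts/eval.py --data <path/to/dataset> --experiment backbone

# Or point directly at a checkpoint file (illustrative path):
python scripts/eval.py --data <path/to/dataset> --experiment checkpoints/backbone.ckpt
```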
In order to download the checkpoints used in the paper, you can run the following command:
```bash
frame download models
```

This will download the checkpoint files (for the backbone and the STF) into the `checkpoints` folder.
If for any reason you want to download them manually, you can find them at the same dataset page.
You can evaluate the model in an end-to-end fashion by running:
```bash
python scripts/end2end.py --data <path/to/dataset>
```

Keep in mind that in order to run eval.py with the STF model, you need to have the backbone model cached and to have run the Cross Training step beforehand.
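Concretely, a full run that ends with evaluating the STF model looks roughly like this (a sketch based on the training commands below; the experiment name passed to eval.py is illustrative):

```bash
# 1. Train the backbone model (or download the provided checkpoints instead).
python scripts/train.py --data <path/to/dataset> --experiment backbone

# 2. Cross Training, then caching of the results.
./scripts/crosstraining.sh -d <path/to/dataset>
./scripts/crosscache.sh -d <path/to/dataset>

# 3. Train the STF model, then evaluate it (experiment name for eval.py is illustrative).
python scripts/train.py --data <path/to/dataset> --experiment stf
python scripts/eval.py --data <path/to/dataset> --experiment stf
```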
## Training

As highlighted in the paper, the training process is divided into three steps:
- Train the backbone model
- Cross Caching
- Train the STF model
First, we train the backbone model:

```bash
python scripts/train.py --data <path/to/dataset> --experiment backbone
```

Then, we do the Cross Caching (Section 4.4 of the paper):
```bash
./scripts/crosstraining.sh -d <path/to/dataset>
```

And we cache the results:
```bash
./scripts/crosscache.sh -d <path/to/dataset>
```

Finally, we can train the STF model:
```bash
python scripts/train.py --data <path/to/dataset> --experiment stf
```

## Creating a new experiment

This repository is based on hydra for configuration management.
In order to create a new experiment, create a new .yaml file in the configs/experiments folder.
It will be loaded automatically whenever you launch a training run with --experiment <name>, where <name> is the name of the new .yaml file.
A new experiment inherits all the parameters from the default.yaml file, and you can override or change them in the new experiment file (see the sketch below).
We refer to the hydra and omegaconf documentation for more details on how to use them.
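As a minimal sketch, here is how a hypothetical experiment could be created and launched (the file name and the parameter keys below are purely illustrative; check default.yaml for the keys actually used by this codebase):

```bash
# Create a hypothetical experiment config; the keys below are illustrative,
# not the real parameter names from default.yaml.
cat > configs/experiments/my_experiment.yaml <<'EOF'
# Everything not listed here is inherited from default.yaml
batch_size: 16
lr: 1.0e-4
EOF

# Launch a training run with the new experiment:
python scripts/train.py --data <path/to/dataset> --experiment my_experiment
```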
## CAD Models

Instructions on how to print the CAD models can be found here.
## Acknowledgements

This project would not have been possible without some amazing open source projects.
## Citation

If you use this code in your research, please consider citing our paper:
```bibtex
@inproceedings{boscolo2025frame,
    title     = {FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video},
    author    = {Boscolo Camiletto, Andrea and Wang, Jian and Alvarado, Eduardo and Dabral, Rishabh and Beeler, Thabo and Habermann, Marc and Theobalt, Christian},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2025},
}
```