Weichao Zhao, Hezhen Hu, Wengang Zhou, Yunyao Mao, Min Wang, and Houqiang Li
This repository contains the Python (PyTorch) implementation of this paper.
Accepted by TCSVT 2024
python==3.8.13
torch==1.8.1+cu111
torchvision==0.9.1+cu111
tensorboard==2.9.0
scikit-learn==1.1.1
tqdm==4.64.0
numpy==1.22.4
Please refer to the bash scripts.
- Download the original datasets: SLR500, NMFs_CSL, WLASL, and MSASL.
- Use the off-the-shelf pose estimator MMPose, with the Topdown Heatmap + HRNet + Dark configuration trained on COCO-WholeBody, to extract 2D keypoints for the sign language videos.
- The final data is organized as follows:

Data
├── NMFs_CSL
├── SLR500
├── WLASL
└── MSASL
    ├── Video
    ├── Pose
    └── Annotations
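COCO-WholeBody pose estimators output 133 keypoints per frame (17 body, 6 feet, 68 face, and 21 per hand). Sign language models typically keep only a subset, such as the body and hand points. The helper below is an illustrative sketch, not part of the released code; only the index ranges follow the standard COCO-WholeBody layout.

```python
# Keypoint index ranges in the 133-point COCO-WholeBody layout:
# body 0-16, feet 17-22, face 23-90, left hand 91-111, right hand 112-132.
COCO_WHOLEBODY_GROUPS = {
    "body": range(0, 17),
    "feet": range(17, 23),
    "face": range(23, 91),
    "left_hand": range(91, 112),
    "right_hand": range(112, 133),
}

def select_keypoints(frame_keypoints, groups=("body", "left_hand", "right_hand")):
    """Keep only the (x, y, score) triples belonging to the requested groups.

    frame_keypoints: sequence of 133 (x, y, score) tuples for one frame,
    in COCO-WholeBody order. Returns the selected triples, concatenated
    in the order the groups are listed.
    """
    indices = [i for g in groups for i in COCO_WHOLEBODY_GROUPS[g]]
    return [frame_keypoints[i] for i in indices]
```

For example, the default selection of body plus both hands keeps 17 + 21 + 21 = 59 of the 133 points; which subset the paper actually uses should be checked against the released preprocessing scripts.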
You can download the pretrained model from this link: pretrained model on four ISLR datasets
If you find this work useful for your research, please consider citing our work:
@article{zhao2024masa,
title={MASA: Motion-aware Masked Autoencoder with Semantic Alignment for Sign Language Recognition},
author={Zhao, Weichao and Hu, Hezhen and Zhou, Wengang and Mao, Yunyao and Wang, Min and Li, Houqiang},
journal={arXiv},
year={2024}
}
